
How to find the number of files that failed to ingest for a specific index

athorat
Communicator

How do I find the number of files that failed to ingest for a specific index?
I am trying to compare files ingested vs. files that failed to ingest for a specific index in Splunk.


woodcock
Esteemed Legend

If you have a list of files and how many events are in them, then you can do something like this and cross-reference:

| tstats count AS EventsInThisFile WHERE index=YourIndexNameHere BY source

richgalloway
SplunkTrust

Are you looking for files that were once successfully ingested and are no longer being read or files that were never ingested at all?

The former case is a matter of searching some long period (like 30 days) to build a list of expected files then searching a short period (like today) to build a list of current files and comparing the two.
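As a rough sketch of that comparison, a single tstats search can record when each source was last seen over the long window and keep only the sources with nothing new today. The index name is the placeholder from the earlier reply, and the 30-day window and "since midnight" cutoff are assumptions to adjust:

```
| tstats latest(_time) AS lastSeen WHERE index=YourIndexNameHere earliest=-30d latest=now BY source
| where lastSeen < relative_time(now(), "@d")
| eval lastSeen=strftime(lastSeen, "%Y-%m-%d %H:%M:%S")
| sort lastSeen
```

Each row is a file that was ingested at some point in the last 30 days but has produced no events since midnight today.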

The latter case is more challenging. A source that was never read will not be in Splunk, but you may find an error message in _internal for files that could not be read, perhaps because of permissions. It's possible, of course, for a file to be silently skipped if it's not part of the monitor pattern, for instance.

Please clarify your requirements and we'll try to help.

---
If this reply helps you, Karma would be appreciated.

athorat
Communicator

Thanks for the reply, @richgalloway, and apologies for the delayed response. Yes, we are trying to find the files that never reached Splunk, either:
1. because of a permissions issue, or
2. because the files did not match the whitelist pattern, or for unknown reasons.

We already have the count of files per index that are being read and indexed by the Splunk UF:
| tstats dc(source) WHERE host=10a*pd-* OR host=ew1a-* OR host=dub01pd-* OR host=uw2*-* OR host=ue1-* index=prod-online* by index

Now we want to list the number of files that errored out or were not read by the Splunk UF. The same hosts are used as the filter, but how do we get that count by index name and build a bar chart comparing it with the query above? Our failed attempt is below:
index=_internal sourcetype=splunkd splunk_server=usw* host=10a*pd-* OR host=ew1a-* OR host=dub01pd-* OR host=uw2*-* OR host=ue1-* log_level=ERROR
| rex field=message "((?.*))" | stats dc(message) by host | sort - message


richgalloway
SplunkTrust

Sources that fail are not indexed so you can't get stats by index. I suggest generating stats by source or host.
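A minimal sketch of stats by host, not a definitive search: it assumes the file-monitoring messages in splunkd.log come from the TailReader, TailingProcessor, and WatchedFile components and that the file path appears in single quotes in the raw event. Both assumptions should be checked against your own _internal data, and the field name failed_file is made up for illustration. Add the same host filters used in your tstats search.

```
index=_internal sourcetype=splunkd (log_level=ERROR OR log_level=WARN) (component=TailReader OR component=TailingProcessor OR component=WatchedFile)
| rex field=_raw "'(?<failed_file>[^']+)'"
| stats count AS Messages dc(failed_file) AS FilesMentioned BY host component
```

Because the result is keyed by host rather than index, it can be charted next to a per-host version of the ingested-file count (dc(source) BY host), but not next to a per-index one.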
