how to find Number of files failed to ingest for a specific Index

Communicator

How do I find the number of files that failed to ingest for a specific index?
I am trying to compare files ingested against files that failed to ingest for a specific index in Splunk.

Re: how to find Number of files failed to ingest for a specific Index

SplunkTrust

Are you looking for files that were once successfully ingested and are no longer being read or files that were never ingested at all?

The former case is a matter of searching some long period (like 30 days) to build a list of expected files then searching a short period (like today) to build a list of current files and comparing the two.
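As a rough sketch of that comparison, a single `tstats` pass over the long window can stand in for the two separate searches: it records when each source was last seen and keeps only those with nothing indexed today. `YourIndexNameHere` is a placeholder, and the 30-day window and "since midnight" cutoff are assumptions to adjust.

```spl
| tstats max(_time) AS last_seen WHERE index=YourIndexNameHere earliest=-30d@d latest=now BY source
| where last_seen < relative_time(now(), "@d")
| eval last_seen=strftime(last_seen, "%Y-%m-%d %H:%M:%S")
| sort last_seen
```

Each row is a source that was indexed at some point in the last 30 days but has produced no events since midnight. Files that legitimately stop being written (rotated logs, for example) will also show up here, so the list is a starting point rather than proof of a failure.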

The latter case is more challenging. A source that was never read will not be in Splunk, but you may find an error message in _internal for files that could not be read, perhaps because of permissions. It's possible, of course, for a file to be silently skipped if it's not part of the monitor pattern, for instance.
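A search along these lines may surface those messages. The `component` values listed are ones that commonly appear for file monitoring in splunkd.log; treat them as an assumption and check them against what your forwarders actually log.

```spl
index=_internal sourcetype=splunkd (log_level=ERROR OR log_level=WARN)
    (component=TailReader OR component=TailingProcessor OR component=WatchedFile)
| stats count AS messages BY host, component, log_level
| sort - messages
```

This counts log messages rather than distinct files, and it only works if the forwarders send their internal logs to the indexers.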

Please clarify your requirements and we'll try to help.

---
If this reply helps you, an upvote would be appreciated.

Re: how to find Number of files failed to ingest for a specific Index

Communicator

Thanks for replying, @richgalloway, and apologies for the delayed reply. Yes, we are trying to find the files that never reached Splunk, either:
1. because of a permissions issue, or
2. because the file does not match the whitelist pattern, or for unknown reasons.

We have the count of files per index that are being read and indexed by the Splunk UF:
| tstats dc(source) WHERE host=10apd- OR host=ew1a-* OR host=dub01pd-* OR host=uw2- OR host=ue1-* index=prod-online* by index

Now we want to list the number of files that errored out or were not read by the Splunk UF. The same hosts are used as the filter, but how do we get that count by index name and show it in a bar chart alongside the query above? Our failed attempt is below:
index=_internal sourcetype=splunkd splunk_server=usw* host=10apd- OR host=ew1a-* OR host=dub01pd-* OR host=uw2- OR host=ue1-* log_level=ERROR
| rex field=message "((?.*))"
| stats dc(message) by host
| sort - message

Re: how to find Number of files failed to ingest for a specific Index

SplunkTrust

Sources that fail are never indexed, so you can't get stats by index. I suggest generating stats by source or host instead.
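One possible way to get both numbers side by side per host is sketched below. It assumes the `host` value on the indexed data matches the forwarder's `host` in `_internal`, and the `component` filter is an assumption about which splunkd messages relate to file monitoring. Add your own host filters to both halves.

```spl
| tstats dc(source) AS files_indexed WHERE index=prod-online* BY host
| append
    [ search index=_internal sourcetype=splunkd log_level=ERROR
        (component=TailReader OR component=TailingProcessor OR component=WatchedFile)
      | stats count AS read_errors BY host ]
| stats sum(files_indexed) AS files_indexed, sum(read_errors) AS read_errors BY host
| fillnull value=0 files_indexed read_errors
```

Rendered as a bar or column chart, this gives one pair of bars per host. Note that `read_errors` is a count of error messages, not of distinct files, so the two bars are not measured in the same unit.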


Re: how to find Number of files failed to ingest for a specific Index

Esteemed Legend

If you have a list of files and how many events are in them, then you can do something like this and cross-reference:

| tstats count AS EventsInThisFile WHERE index=YourIndexNameHere BY source
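
The cross-reference itself could look something like the sketch below. It assumes the expected files have been uploaded as a lookup named `expected_files.csv` with columns `source` and `expected_events`; both the lookup name and the column names are hypothetical.

```spl
| inputlookup expected_files.csv
| join type=left source
    [| tstats count AS EventsInThisFile WHERE index=YourIndexNameHere BY source ]
| fillnull value=0 EventsInThisFile
| eval status=case(EventsInThisFile == 0, "missing",
                   EventsInThisFile < tonumber(expected_events), "partial",
                   true(), "complete")
| stats count AS files BY status
```

Files with status "missing" are the ones that never reached the index. The `source` values in the lookup have to match the indexed `source` exactly (full path), and `join` is subject to subsearch row limits, so very large file lists may need a different approach.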