How to find the number of files that failed to ingest for a specific index

athorat · ‎12-31-2018

How do I find the number of files that failed to ingest for a specific index?
I am trying to compare files ingested against files that failed to ingest for a specific index in Splunk.

woodcock · ‎12-31-2018

If you have a list of files and how many events are in them, then you can do something like this and cross-reference:

| tstats count AS EventsInThisFile WHERE index=YourIndexNameHere BY source
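One way to do that cross-reference, as a rough sketch: it assumes the expected files are kept in a lookup file, here given the hypothetical name expected_files.csv, with a source column and an ExpectedEvents column. Neither the lookup nor its column names come from this thread.

```spl
| inputlookup expected_files.csv
| join type=left source
    [| tstats count AS EventsInThisFile WHERE index=YourIndexNameHere BY source ]
| fillnull value=0 EventsInThisFile
| where EventsInThisFile < ExpectedEvents
```

Rows that remain are expected files with fewer indexed events than listed, including files with no events at all. The source values in the lookup must match the indexed source paths exactly for the join to line up.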

richgalloway · ‎12-31-2018

Are you looking for files that were once successfully ingested and are no longer being read or files that were never ingested at all?

The former case is a matter of searching some long period (like 30 days) to build a list of expected files then searching a short period (like today) to build a list of current files and comparing the two.
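A minimal sketch of that comparison in a single search, assuming a 30-day lookback and "today" as the short period (both are just the example windows mentioned above, and YourIndexNameHere is a placeholder):

```spl
| tstats max(_time) AS lastSeen WHERE index=YourIndexNameHere earliest=-30d@d latest=now BY source
| where lastSeen < relative_time(now(), "@d")
| eval lastSeen=strftime(lastSeen, "%F %T")
```

This lists every source that produced events in the last 30 days but has produced none since midnight today. Adjust both windows to match how often the files are expected to be written.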

The latter case is more challenging. A source that was never read will not be in Splunk, but you may find an error message in _internal for files that could not be read, perhaps because of permissions. It's possible, of course, for a file to be silently skipped if it's not part of the monitor pattern, for instance.
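As a starting point for the _internal search, something like the following may surface read failures. It assumes the forwarders send their own splunkd logs to the indexers, and the component names shown are the file-monitoring components usually seen in splunkd.log; they may differ by version, so treat them as a guess to refine.

```spl
index=_internal sourcetype=splunkd (log_level=ERROR OR log_level=WARN)
    (component=TailReader OR component=TailingProcessor OR component=WatchedFile)
| stats count BY host, component, log_level
| sort - count
```

Files that are skipped because they do not match the monitor pattern will not appear here, since no error is logged for them.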

Please clarify your requirements and we'll try to help.

---
If this reply helps you, Karma would be appreciated.

athorat · ‎01-03-2019

Thanks for replying on this @richgalloway, and apologies for the delayed reply. Yes, we are trying to find the files which never reached Splunk, either because of:
1. a permissions issue, or
2. files not matching the whitelist pattern, or unknown reasons.

We have got the count of files per index which are being read/indexed by the Splunk UF:
| tstats dc(source) WHERE (host=10a*pd-* OR host=ew1a-* OR host=dub01pd-* OR host=uw2*-* OR host=ue1-*) index=prod-online* by index

Now we want to list the number of files which errored out or were not read by the Splunk UF. The same hosts are used as the filter, but how do we get that by index name and build a bar chart comparing it with the query above?
Our failed attempt is below:
index=_internal sourcetype=splunkd splunk_server=usw* (host=10a*pd-* OR host=ew1a-* OR host=dub01pd-* OR host=uw2*-* OR host=ue1-*) log_level=ERROR
| rex field=message "((?.*))" | stats dc(message) by host | sort - message

richgalloway · ‎01-04-2019

Sources that fail are not indexed so you can't get stats by index. I suggest generating stats by source or host.
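A rough sketch of a per-host comparison, reusing the index pattern from the question. Note the assumption: error_events counts ERROR log lines from splunkd, not failed files, so the two columns are not the same unit and the chart is only a coarse indicator.

```spl
| tstats dc(source) AS files_indexed WHERE index=prod-online* BY host
| append
    [ search index=_internal sourcetype=splunkd log_level=ERROR
      | stats count AS error_events BY host ]
| stats sum(files_indexed) AS files_indexed sum(error_events) AS error_events BY host
| fillnull value=0 files_indexed error_events
```

Rendered as a bar chart, this shows both values side by side for each host. Add the host filters from the question to both halves to limit it to the same forwarders.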

