Hello folks,
I have a compliance control requirement to alert when there is a log ingestion failure to Splunk. The desire is to focus at the sourcetype level as opposed to the host level (too many false positives) or the index level (loses granularity as sourcetypes increase). The keys to the requirement are that it dynamically expand as new sourcetypes come online and that the results consider the frequency of events on a per-sourcetype basis. For example, a generic 4-hour window wouldn't suffice for a sourcetype getting multiple events every second, nor would it properly handle a sourcetype that receives events once or twice per day.
I've tried the Meta Woot app, and while it's beneficial for other issues, it does not address the control requirements. Has anyone developed a query with reasonable performance times, or found another app to handle compliance logging failures, that considers the variance in event frequency rather than an absolute window?
Thanks!
The TrackMe app is powerful and would do what you want - requires a bit of investment in time to set it up.
https://splunkbase.splunk.com/app/4621
I've rolled my own with a regular saved search that uses tstats to collect index/sourcetype pairs and saves the results to a lookup, calculating the average latency and min/max gaps between events for each. Alerts then run to check current ingestion against those metrics per index/sourcetype.
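A rough sketch of that baseline search, for illustration (untested - the lookup name, the 7-day window, the 1-hour bucket size, and the 1.5x headroom are arbitrary choices you would tune, and the latency calculation is left out):
| tstats count where index=* earliest=-7d@d latest=@d by index sourcetype _time span=1h
``` tstats only returns non-empty buckets, so the spacing between rows reflects real quiet periods (at 1-hour resolution) ```
| sort 0 index sourcetype _time
| streamstats window=2 range(_time) as gap by index sourcetype
``` Drop the zero gap on each pair's first bucket - pairs with a single bucket fall out here ```
| where gap > 0
| stats min(gap) as min_gap avg(gap) as avg_gap max(gap) as max_gap by index sourcetype
``` Threshold = largest observed gap plus headroom, floored at 10 minutes ```
| eval threshold=max(600, ceiling(max_gap * 1.5))
| table index sourcetype min_gap avg_gap max_gap threshold
| outputlookup baseline.csv
The alert side can then compare now() - last_seen against each pair's threshold.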
There's an investment in time either way - but TrackMe is a good place to start.
Unfortunately, this solution exceeds my budget.
I hadn't realised they had switched to a licensed model.
The basic idea behind a roll-your-own technique is to have a lookup file that contains the index, the sourcetype, and the threshold in seconds within which you need to see data. You can create a simple example from all index/sourcetype pairs seen in the previous hour and give them a threshold of 10 minutes, e.g. like this:
| tstats latest(_time) as last_seen count where index=* earliest=-1h@h latest=@h by index sourcetype
| sort index sourcetype
| table index sourcetype
| eval threshold=600
| outputlookup monitor.csv
Now you have a control set that you use to look for missing data outside the threshold.
Now initialise the results file - it makes the SPL easier if all pairs are present there at the start.
| inputlookup monitor.csv
| fields - threshold
| eval last_seen=now(), missing_data=0
| outputlookup monitor_results.csv
Then you can run this as a scheduled alert at whatever frequency you want - in this example, every minute.
| tstats max(_time) as last_seen count where [ | inputlookup monitor.csv | fields index sourcetype ] earliest=-1m@m latest=@m by index sourcetype
``` We have data so reset missing indicator ```
| eval missing_data = 0
``` Grab all previous results and combine with what we found ```
| inputlookup monitor_results.csv append=t
| fields - threshold
| stats first(*) as * by index sourcetype
``` Get the threshold and see if the last seen exceeds the configured threshold ```
| lookup monitor.csv index sourcetype OUTPUT threshold
| eval exceeds_threshold = if(now() - last_seen > threshold, 1, 0)
``` Now work out if we need to alert - only alert the first time we exceed the threshold ```
| eval alert=if(exceeds_threshold = 1 AND missing_data = 0, 1, 0)
``` Increment the missing data counter to avoid continual alerts ```
| eval missing_data=if(exceeds_threshold = 1, missing_data + 1, missing_data)
``` Write out these results ```
| outputlookup monitor_results.csv
| where alert = 1
The logic is that this will alert the first time an index/sourcetype has not been seen for the given number of threshold seconds.
NB: This is a starting point, but gives you the principles of how to manage it.
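To cover the dynamic-expansion part of the requirement, you could also schedule something like this hourly (a sketch, untested - it reuses monitor.csv from above; new pairs get the 600-second default and any thresholds you have hand-tuned are preserved):
| tstats count where index=* earliest=-1h@h latest=@h by index sourcetype
``` Any pair seen this hour gets the default threshold ```
| eval threshold=600
| fields index sourcetype threshold
``` The existing control set rows come last in the pipeline ```
| inputlookup monitor.csv append=t
``` last() keeps the existing threshold for known pairs and the 600 default for new ones ```
| stats last(threshold) as threshold by index sourcetype
| outputlookup monitor.csv
New pairs should then flow into monitor_results.csv the first time the scheduled alert sees them.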
Hope this helps
Excellent starting point, very much appreciate the suggestion and the level of detail explaining the thought process.
Hi @b17gunnr
I think creating a search yourself might end up being cumbersome and make it hard to cover the variance. Have you seen the Splunkbase app TrackMe?
TrackMe is good for monitoring anomalies in ingestion (per host, sourcetype, etc.) and looks at things like event count, size, frequency, and lag/delay.
This solution is outside my current budget.