Scheduled Summary Index searches not firing on time

stephanbuys
Path Finder

We use summary indexing to improve search performance and to avoid unnecessary lookups and field extractions. It is supposed to run every 5 minutes and summarize the previous 5 minute window.

We schedule the saved search with these values:

earliest = -10m@m
latest = -5m@m
frequency = every 5 minutes

When investigating

index="_internal" sourcetype="scheduler"
it becomes apparent that the scheduler is not firing our saved searches reliably every 5 minutes. Sometimes a search starts only 6 or 7 minutes after the previous one. This creates small gaps in the data (of 1 or 2 minutes) that are impossible to backfill with the backfill script provided, and it renders the summary index useless.

Is there a way to snap to a more accurate 5 minute window? Or a way to force the scheduler to run more reliably?

1 Solution

Lowell
Super Champion

What's your setting for realtime_schedule in your savedsearches.conf entry?

I think that in more recent releases, creating a new scheduled saved search that populates a summary index causes realtime_schedule to be set to 0. Generally this is what you want, since it means that any missed runs get executed later (for example, after a splunkd restart). It also means that these saved searches can be delayed; however, that should not result in gaps in your summary index. If anything, it should help prevent them.

If you search your summary index for the summary events in question, you should see that search_now always reflects the precise 5 minute interval you scheduled your searches for, whereas info_search_time reflects the real (wall clock) time at which the search was actually kicked off. So even though your search was delayed by a minute or two (which does seem high), you shouldn't be losing any data, because each search should still cover the originally designated window.
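To make that concrete, here is a minimal Python sketch (not Splunk code) of how earliest = -10m@m and latest = -5m@m resolve when they are anchored to the scheduled time versus the wall-clock start time. The function names, the dates, and the delays are made up for illustration, and snap_to_minute is only a simplified stand-in for the @m snap.

```python
from datetime import datetime, timedelta

def snap_to_minute(t):
    """Simplified stand-in for the '@m' snap: drop seconds and microseconds."""
    return t.replace(second=0, microsecond=0)

def window(anchor):
    """Resolve earliest=-10m@m, latest=-5m@m relative to an anchor time."""
    return (snap_to_minute(anchor - timedelta(minutes=10)),
            snap_to_minute(anchor - timedelta(minutes=5)))

def is_contiguous(windows):
    """True when every window starts exactly where the previous one ended."""
    return all(prev[1] == cur[0] for prev, cur in zip(windows, windows[1:]))

# Hypothetical data: three runs scheduled 5 minutes apart, each started late.
scheduled = [datetime(2010, 5, 1, 12, 0) + timedelta(minutes=5 * i) for i in range(3)]
delays = [timedelta(seconds=20), timedelta(minutes=2, seconds=5), timedelta(seconds=50)]
started = [s + d for s, d in zip(scheduled, delays)]

by_schedule = [window(s) for s in scheduled]   # anchored on the scheduled time
by_wall_clock = [window(t) for t in started]   # anchored on the actual start time

print(is_contiguous(by_schedule))    # windows tile with no gaps
print(is_contiguous(by_wall_clock))  # a 2 minute delay shifts one window
```

With the scheduled time as the anchor, a late start changes when the search runs but not which 5 minutes it covers.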

You may also want to look into your limits.conf settings for the concurrency of saved searches. (I think there are some questions about that floating around on this site already.)


BTW, are you seeing your saved search show up as being "skipped"? In that case I would expect to see events being dropped. You can search with:

index="_internal" sourcetype="scheduler" status=skipped

Another thing to consider: Is it possible that you simply don't have any events to summarize for the 5 minute window in question? If this happens, you will see no new events in the summary index (which looks like a "gap"). This may or may not be likely based on your event data, but you should be able to confirm this very quickly with the search:

index="_internal" sourcetype="scheduler" result_count=0

Of course, if you have some sort of conditional logic, then perhaps this would be a better search:

index="_internal" sourcetype="scheduler" NOT alert_actions="*summary_index*"
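The checks above can be combined. This is a rough Python sketch, assuming you have exported two things by hand: the scheduler runs with their result_count, and the window start times that actually have events in the summary index. The function name, the labels, and the sample data are hypothetical.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=5)

def classify_windows(first, last, scheduler_runs, summarized):
    """Label every expected 5 minute window start between first and last.

    scheduler_runs: dict of window start -> result_count from the scheduler log
    summarized:     set of window starts that have events in the summary index
    """
    labels = {}
    start = first
    while start <= last:
        if start in summarized:
            labels[start] = "summarized"
        elif scheduler_runs.get(start) == 0:
            labels[start] = "ran, no events"        # looks like a gap but is not one
        elif start in scheduler_runs:
            labels[start] = "ran, summary missing"  # worth a closer look
        else:
            labels[start] = "never ran"             # skipped or not scheduled
        start += STEP
    return labels

# Hypothetical sample: four windows, one empty run and one run that never happened.
t0 = datetime(2010, 5, 1, 12, 0)
runs = {t0: 40, t0 + STEP: 0, t0 + 3 * STEP: 12}
summary = {t0, t0 + 3 * STEP}

labels = classify_windows(t0, t0 + 3 * STEP, runs, summary)
for start, label in labels.items():
    print(start.strftime("%H:%M"), label)
```

Only the "never ran" and "ran, summary missing" windows are real gaps; "ran, no events" just means there was nothing to summarize.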

Lowell
Super Champion

Yeah, that can be tricky to spot. I assume you know about the _indextime field (added in 4.0), which can be quite helpful in tracking down this kind of issue. I think the general rule of thumb is to simply delay your summary indexing searches to the point at which you are certain all your events are loaded, but that may not be an option for you. (The file polling / indexing performance of 4.1 is much better than earlier versions, so if you're running an older version and you're mostly looking at events coming from log files, then upgrading may help here.) Best of luck!
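One way to pick that delay is to measure how late events typically arrive. A minimal Python sketch, assuming you have exported (event time, index time) pairs, where the index time comes from _indextime; the function name, the one minute margin, and the sample values are my own assumptions.

```python
import math
from datetime import datetime, timedelta

def suggested_lag_minutes(events, margin_minutes=1):
    """Largest observed indexing delay, rounded up to whole minutes, plus a margin."""
    worst = max(index_time - event_time for event_time, index_time in events)
    return math.ceil(worst.total_seconds() / 60) + margin_minutes

# Hypothetical sample: one event indexed within seconds, one 6.5 minutes late.
sample = [
    (datetime(2010, 5, 1, 12, 0, 0), datetime(2010, 5, 1, 12, 0, 4)),
    (datetime(2010, 5, 1, 12, 2, 0), datetime(2010, 5, 1, 12, 8, 30)),
]
lag = suggested_lag_minutes(sample)
print(f"earliest = -{lag + 5}m@m, latest = -{lag}m@m")
```

Using the maximum is the conservative choice; if a handful of events arrive hours late, a high percentile may be more practical than the worst case.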

stephanbuys
Path Finder

We think we found our issue: some of the events get logged a lot later but have timestamps that sometimes fall in a summary indexing window that has already passed. At least we can confirm that summary indexing seems to work reliably. Will raise a new question for this backfill challenge. Thanks!
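For anyone hitting the same thing, the effect can be sketched in Python as follows. This is only an illustration: it assumes 5 minute windows, a search that runs exactly on schedule 5 minutes after each window closes (latest = -5m@m), and hand-made (event time, index time) pairs standing in for _time and _indextime. The function names are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
LAG = timedelta(minutes=5)   # latest = -5m@m: a window is searched 5 minutes after it closes

def window_start(ts):
    """Floor a timestamp to the start of its 5 minute window."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)

def missed_by_summary(events):
    """Return events that were indexed after the run that summarized their window.

    events: list of (event_time, index_time) pairs.
    Assumes each run starts exactly at its scheduled time.
    """
    missed = []
    for event_time, index_time in events:
        run_time = window_start(event_time) + WINDOW + LAG
        if index_time > run_time:
            missed.append((event_time, index_time))
    return missed

# Hypothetical sample data.
events = [
    (datetime(2010, 5, 1, 12, 1, 30), datetime(2010, 5, 1, 12, 1, 35)),  # indexed promptly
    (datetime(2010, 5, 1, 12, 3, 10), datetime(2010, 5, 1, 12, 14, 0)),  # indexed after its window was summarized
]
print(missed_by_summary(events))
```

Both events belong to the 12:00 window, which is summarized at 12:10, so only the second one is missed.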

Lowell
Super Champion

Is it possible that no events occurred within a 5 minute window? I've added a search above to check for that.

stephanbuys
Path Finder

realtime_schedule is set to 0 for the saved searches in question.

stephanbuys
Path Finder

I found some skipped saved searches using your search, but not for the day in question. I verified that the scheduled search events' scheduled_time field was correct (i.e. 5 minute intervals).
Will need to dig deeper to find out why our summary index is missing events.
