1) What will you do when there is a delay in the indexer?
2) How long is the delay period? (Is there any maximum time cap, or will you wait for the complete delay to be cleared in the indexer?)
3) Will you send any notifications regarding the indexer delay?
If yes: i) What information can you include in that notification (e.g., a tentative time for the next alert schedule)?
ii) If there is a continuous delay and you miss 2-5 time intervals, can you send a mail for each time period, or a single mail with all the information?
4) If there is a 2-hour delay in the indexer, do you check the missed intervals after the delay is cleared, or only from the current time period onwards? (For example, the RunFrequency is 5 minutes and there is a delay starting at 10 AM that clears at 11 AM. Do you scan from 10 AM or from 11 AM?)
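Regarding question 4, one common way to avoid losing events to an indexer delay is to schedule the alert over index time rather than event time, so that late-arriving events are caught by a later run. A minimal sketch, assuming a hypothetical index name your_index and a 5-minute run frequency; _index_earliest and _index_latest are Splunk's index-time search modifiers, and the wide earliest window is there so the event-time filter doesn't exclude delayed events:
index=your_index earliest=-24h latest=now _index_earliest=-5m@m _index_latest=@m
| stats count BY host sourcetype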
Let me add one thing to your response. The delay between _indextime and _time does not necessarily reflect a delay in Splunk processing. While _indextime is generated by Splunk (and you do have your time synchronized across your Splunk environment, right?), the _time value might be parsed from the event, so it might not reflect the actual time the event was generated (because the clock on the source system is wrong) or the time the event was actually "picked up" by the whole ingesting pipeline, because, for example, you might be batch-reading a whole day's worth of data from a file that gets synced from a source once a day.
So the _time vs _indextime delay might indicate many things, not just Splunk-induced delays.
In my case there was a batch of sources which had around a 2-hour delay because of a wrongly set timezone. And for many of the Windows sources the delay was around 10-15 minutes because the events were retrieved using WEF, which works by pulling events periodically.
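To tell these cases apart, it helps to break the delay down by sourcetype and host: a near-constant offset of about 2 hours points to a timezone problem, while a 10-15 minute spread points to periodic pulling such as WEF. A minimal sketch of such a breakdown, assuming a hypothetical index name your_index:
index=your_index earliest=-4h
| eval delay=_indextime-_time
| stats avg(delay) AS avg_delay perc95(delay) AS p95_delay max(delay) AS max_delay BY sourcetype host
| sort - avg_delay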
Hi @Daniel11,
in log indexing there's usually a small delay (typically 30 seconds at most) because Forwarders send data with a configurable frequency (the default is 30 seconds, and this value usually isn't modified).
If you have a longer delay between ingestion and indexing, there's something to analyze:
The first and second problems must be solved outside Splunk.
The third one requires analysis, and it's possible to change some configuration parameters, e.g.:
Anyway, you can analyze delays with a simple search like this:
index=your_index
| eval delay=_indextime-_time, indextime=strftime(_indextime,"%Y-%m-%d %H:%M:%S.%3N")
| table _time indextime delay
and apply a threshold to use in an alert, e.g. a delay greater than 60 seconds:
| where delay>60
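If you prefer the alert to report which data sources are late rather than listing every delayed event, the same check can be aggregated per sourcetype; a minimal sketch, assuming a hypothetical index name your_index and the 60-second threshold above:
index=your_index earliest=-15m
| eval delay=_indextime-_time
| stats max(delay) AS max_delay avg(delay) AS avg_delay count BY sourcetype
| where max_delay>60
Scheduled at the same frequency as the check window, this returns one row per late sourcetype, so a single notification per run can summarize all of them.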
So, answering your questions:
I hope I have answered all your questions.
Ciao.
Giuseppe