I'm having difficulty with my real-time alert. When the alert triggers, it reports an average of 109, but when I view the events behind it, the average is 86. The time range is the same in both cases. What have I gotten wrong in my alert configuration?
[Screenshots: the trigger result, the events of that result (2 images merged), and the trigger condition.]
This is one of the main reasons that I absolutely HATE real-time searches. Almost everyone using them makes an unconscious assumption that there is no latency in events as they make their way into Splunk, but that is NEVER the case! If you have a 1-minute window and many of your events have a latency of between 1 and 2 minutes, then your real-time search/alert will miss most of them. But when you go back and look later (even 1 minute later), those events have reached the indexers, so you get more events back and very different values.
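Here is the arithmetic with made-up numbers (paste it into a search bar if you want to play with it; eventTime/indexTime are purely illustrative values, not your data):

| makeresults
| eval eventTime=strptime("2024-01-01 12:00:30", "%Y-%m-%d %H:%M:%S")
| eval indexTime=eventTime + 90
| eval lagSeconds=indexTime - eventTime
| eval seenByOneMinuteRealtimeWindow=if(lagSeconds > 60, "no", "yes")
| table eventTime indexTime lagSeconds seenByOneMinuteRealtimeWindow

With a 90-second lag, the event only becomes searchable after the 1-minute real-time window has already moved past it, while a later historical search over that same minute returns it just fine.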
All events have a field called _indextime, which is the time the event was indexed. You can start with a search extension like this:
... | eval lagSeconds=_indextime - _time | stats min(lagSeconds) max(lagSeconds) avg(lagSeconds)
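For example, to profile the lag on the data that feeds your alert (index and sourcetype below are placeholders, swap in your own):

index=your_index sourcetype=your_sourcetype earliest=-60m@m latest=@m
| eval lagSeconds=_indextime - _time
| stats count min(lagSeconds) AS minLag avg(lagSeconds) AS avgLag perc95(lagSeconds) AS p95Lag max(lagSeconds) AS maxLag

If p95Lag comes back larger than your real-time window, that alone explains why the alert fires on different numbers than you see when you look at the events later.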
Now, for each event, you have the Splunk latency characteristics to PROVE whether your event latency (lagSeconds) is bigger than your real-time window. There are HUGE problems with real-time searches, and by the time I finish explaining all of them to a client, 100% of the time we ditch real-time and architect a different approach.
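For what it's worth, the usual replacement is a scheduled alert whose time window is pushed back far enough to cover the lag you measured above. A minimal sketch, assuming made-up names (your_index, responseTime) and a cron schedule of every minute, with a window that ends 2 minutes in the past:

index=your_index sourcetype=your_sourcetype earliest=-3m@m latest=-2m@m
| stats avg(responseTime) AS avgResponse
| where avgResponse > 100

Because the window closes 2 minutes before the search runs, events with up to about 2 minutes of lag are already indexed when the alert evaluates them, so the number the alert triggers on matches what you see when you inspect the events afterwards.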