False alert - delay in log writing?


We are getting occasional false alerts from a Splunk (6.5.2) search that checks whether a certain string is missing from a log file within the last 15 minutes.

When we investigated and ran the search manually, the string was present for the alert period, so the alert shouldn't have fired.

We couldn't find any relevant errors in the splunkd log on the forwarder, but I did notice these two consecutive entries in metrics.log:

01-25-2019 16:55:01.800 +1100 INFO  Metrics - group=per_source_thruput, series="/XXX/systemerr.log", kbps=10.196221, eps=0.193555, kb=316.072266, ev=6, avg_age=1389.166667, max_age=1667
01-25-2019 16:22:59.801 +1100 INFO  Metrics - group=per_source_thruput, series="/XXX/systemerr.log", kbps=6.268667, eps=0.161285, kb=194.334961, ev=5, avg_age=211.600000, max_age=265

We got the false alert around 4:54. If I understand correctly, looking at the time gap between the entries and the avg_age values, the alert may have been triggered because the data was only read after 4:55; there were no new lines written to the file between 4:22 and 4:55.

So the question is: is my understanding correct? Is the problem caused by a delay in writing the data to the source log file, or by a processing delay within Splunk itself?
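For reference, the alert search follows this general pattern (the index name and search string below are placeholders, not our actual values):

index=app source="/XXX/systemerr.log" "EXPECTED_STRING" earliest=-15m | stats count

The alert is configured to fire when count is 0 for the 15-minute window.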

Appreciate any advice.




To understand delayed indexing, you could look at the delta between the event time and the index time:

| eval delta=_indextime-_time

If the delta value is high, you are likely experiencing a delay in log ingestion.
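For example, something along these lines should show how far behind ingestion runs for each source (index name is a placeholder; _time and _indextime are standard Splunk fields):

index=app source="/XXX/systemerr.log" earliest=-24h | eval delta=_indextime-_time | stats avg(delta) as avg_delay_secs, max(delta) as max_delay_secs by source

A max_delay_secs approaching or exceeding 900 would explain a false alert on a 15-minute window.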



Thanks for that, I'm sure it will be useful. In my case, though, I suspect the _time value itself may not be reliable, since it's parsed from the log file content rather than the actual time the forwarder reads the file. So even though the delta shows a delay of about 15 minutes on the oldest set of records, the true delay might be larger, because the log line itself may not have appeared in the target log until 15 minutes after the actual occurrence.

I guess I should rephrase the question: based on the metrics.log output, what does the time gap between the two entries shown in my original post mean? In a normal situation there's rarely a gap of more than a minute between these lines. I can only guess that new lines were being added to the file after 4:55 in this case, but I'm not entirely sure.
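One way I could check this is to chart the per_source_thruput entries for that file from the forwarder's internal logs (this assumes _internal logs are forwarded to the indexer; group, series, ev, and max_age are fields present in these metrics.log events):

index=_internal source=*metrics.log* group=per_source_thruput series="/XXX/systemerr.log" | timechart span=1m sum(ev) as events, max(max_age) as max_age_secs

A gap in events combined with a spike in max_age_secs around 4:55 would support the theory that the lines were written to the file late rather than read late.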
