help to calculate _indextime

jip31
Motivator

Hi

Could anybody give me a search to calculate the average _indextime for my events?

Once that's done, what do I have to do in the cron parameters of my alert to take this metric into account?

Thanks

ITWhisperer
SplunkTrust

What does that even mean? _indextime is not calculated, it is the time when the event was indexed. It is like asking what the average hour of the day is.

Earliest and latest relate to _time, not _indextime. Usually, _indextime is after _time, as it takes time for the event to be logged, transmitted, parsed and indexed. Having said that, _time usually comes from the data in the event, and that timestamp could even be in the future as far as the event is concerned.

Please explain what your goal is in more detail.
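
In the meantime, a quick sketch (your_index is a placeholder) to show the indexed time next to the event time for your events:

index=your_index
| eval index_time=strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| table _time index_time sourcetype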

jip31
Motivator

I just want to calculate the latency between _indextime and _time in order to define a time range in my alert which takes this latency into account.

ITWhisperer
SplunkTrust

Another thing to bear in mind is what the _time (stamp) means. If it is interpreted from the data in the event, then it is the time that the application has chosen to put into the data.

For example, with Apache HTTPD logs (and other logs), the timestamp is when the request was received, but the logged event is written when the response was sent back, so is already lagging by whatever the response time of the request was.

12:01:00 - request received by Apache (_time)
12:01:10 - response received by Apache
12:01:11 - event logged by Apache (request time and duration time of 10 seconds)
12:01:14 - event indexed by Splunk (_indextime)

As you can see in this example, the difference between _time and _indextime is 14 seconds, but the lag between when the event was written and when it was indexed is only 3 seconds.

So, unless the _time value is (or is as close as possible to) the time that the application wrote the event, i.e. when it became available for the forwarders to send to Splunk, the difference between _time and _indextime can reflect a number of factors, and you need to understand what the values represent to determine whether they are of any value.

Having said that, comparing the difference with historic differences may at least give you an insight as to whether there is any degradation/variation, which might be worth investigating.
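
As a rough sketch of that kind of historical comparison (index name and time range are placeholders), you could chart the latency over time and look for changes:

index=your_index earliest=-7d@d
| eval latency=_indextime - _time
| timechart span=1h avg(latency) AS avg_latency max(latency) AS max_latency perc95(latency) AS p95_latency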

PickleRick
SplunkTrust

As I wrote several times before - _indextime is a field which _can_ be used to troubleshoot your ingestion process _if_ you know your data pipeline and data characteristics: if you know whether your event time is reliable, if you have properly configured timestamp extraction, and if you know the latency between the event itself and the time the source emits it (the Apache example is a great one here).

 

ITWhisperer
SplunkTrust

You can calculate the latency like this

| eval latency=_indextime - _time

However, this is for the events already in the event pipeline. You could use this to find a maximum latency over a period and apply this statically to your earliest value in your next search. However, this is still only a static value and there is no guarantee that you won't have missed some events with higher latencies.

You could periodically rerun the latency calculator to see if you are missing any events and adjust your search accordingly.
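
For example, something like this (index name and lookback window are placeholders) would give you the maximum and 95th percentile latency over the last 24 hours, which you could then apply to the earliest value of your alert:

index=your_index earliest=-24h@h latest=now
| eval latency=_indextime - _time
| stats max(latency) AS max_latency_seconds perc95(latency) AS p95_latency_seconds avg(latency) AS avg_latency_seconds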

jip31
Motivator

OK, is the latency expressed in seconds?

Imagine the latency is 180.

Does it mean I have to put -3m@m in earliest and now() in latest?

PickleRick
SplunkTrust

Since _time and _indextime are expressed in seconds, their difference will be in seconds as well.

But to make things more complicated, while for many sources a low latency is the desired state, there can be cases where significant latency is normal (especially if events are ingested in batches).
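
A quick sketch to illustrate both points (your_index is a placeholder): the latency comes out in seconds, and breaking it down by source can reveal batched inputs where a large latency is perfectly normal:

index=your_index earliest=-4h
| eval latency=_indextime - _time
| stats median(latency) AS median_latency_seconds max(latency) AS max_latency_seconds by source
| eval max_latency_readable=tostring(max_latency_seconds, "duration")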

VatsalJagani
SplunkTrust

No.

Let me explain with an example.

  • Event with timestamp 3:14 arrives at 3:17
  • Event with timestamp 3:15 arrives at 3:18
  • That means if you run a search at 3:18 with earliest=-2m@m latest=now
    • It will search events with timestamps between 3:16 and 3:18
    • Logically, this will never include any events, because events always arrive 3 minutes late.

The solution is to never search the last 3 minutes and let those events be picked up by the next scheduled run, for example (see the sketch after this list):

  • earliest=-63m@m latest=-3m@m
  • OR
  • earliest=-13m@m latest=-3m@m
  • OR
  • earliest can be anything, but keep latest set so that the search never covers the most recent events, to avoid missing any.
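
As a minimal sketch of the first option (index, sourcetype and the stats line are placeholders), an alert scheduled at minute 3 of every hour (cron 3 * * * *) could run:

index=your_index sourcetype=your_sourcetype earliest=-63m@m latest=-3m@m
| stats count AS event_count by host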

 

There is another solution as well, using _index_earliest and _index_latest, but that's a topic for another time (it's a bit more complicated).
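
For completeness, a sketch of that approach (index name and windows are placeholders): _index_earliest and _index_latest filter on when events were indexed, so you can keep a wider event-time window while only picking up newly indexed events:

index=your_index earliest=-24h@h latest=now _index_earliest=-10m@m _index_latest=now
| stats count by host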

 

I hope this helps!!
