hi
Could anybody give me a search to calculate the average _indextime for my events?
Once that's done, what do I have to put in the cron parameters of my alert to take this metric into account?
thanks
What does that even mean? _indextime is not calculated; it is the time when the event was indexed. It is like asking what the average hour of the day is.
Earliest and latest relate to _time, not _indextime. Usually, _indextime is after _time, as it takes time for the event to be logged, transmitted, parsed and indexed. Having said that, _time usually comes from the data in the event, and can even be in the future relative to when the event was written.
Please explain what your goal is in more detail.
I just want to calculate the latency between _indextime and _time in order to define a time range in my alert which takes this latency into account.
Another thing to bear in mind is what the _time (stamp) means. If it is interpreted from the data in the event, then it is the time that the application has chosen to put into the data.
For example, with Apache HTTPD logs (and other logs), the timestamp is when the request was received, but the logged event is written when the response was sent back, so it is already lagging by whatever the response time of the request was.
12:01:00 - request received by Apache (_time)
12:01:10 - response received by Apache
12:01:11 - event logged by Apache (containing the request time and a duration of 10 seconds)
12:01:14 - event indexed by Splunk (_indextime)
As you can see in this example, the difference between _time and _indextime is 14 seconds, but the lag between when the event was written and when it was indexed is only 3 seconds.
So, unless the _time value is (or is as close as possible to) the time that the application wrote the event, and was therefore available for the forwarders to send to Splunk, the difference between _time and _indextime can represent a number of factors. You need to understand what the values represent to determine whether they are of any value.
Having said that, comparing the difference with historic differences may at least give you an insight as to whether there is any degradation/variation, which might be worth investigating.
As I wrote several times before: _indextime is a field which _can_ be used to troubleshoot your ingestion process _if_ you know your data pipeline and data characteristics - if you know whether your event time is reliable, whether you have properly configured timestamp extraction, and what the latency is between the event itself and the time the source emits the event (the Apache example is a great one here).
You can calculate the latency like this:
| eval latency=_indextime - _time
However, this only covers events that have already made it through the indexing pipeline. You could use it to find a maximum latency over a period and apply that statically to the earliest value in your next search. However, this is still only a static value and there is no guarantee that you won't miss some events with higher latencies.
You could periodically rerun the latency calculator to see if you are missing any events and adjust your search accordingly.
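For example, here is a minimal sketch of such a latency report (the index name is a placeholder, and the 24-hour window is just an illustration; adjust both to your environment):

index=your_index earliest=-24h@h latest=@h
| eval latency=_indextime - _time
| stats avg(latency) AS avg_latency max(latency) AS max_latency perc95(latency) AS p95_latency

Running something like this on a regular basis shows you whether the static value you picked still covers the bulk of your events.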
OK, is the latency expressed in seconds?
Imagine the latency is 180.
Does it mean I have to put -3m@m in earliest and now() in latest?
Since _time and _indextime are expressed in seconds their difference will be in seconds as well.
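If you want it human-readable, you can convert the seconds to a duration, e.g. (a small sketch):

| eval latency=_indextime - _time
| eval latency_readable=tostring(latency, "duration")

tostring with the "duration" option renders the value as HH:MM:SS.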
But to make things more complicated: while for many sources low latency is the desired state, there can be cases where significant latency is normal (especially if events are ingested in batches).
No.
Let me explain with an example.
The solution is to never search the last 3 minutes; write the search so that those events get picked up by the next scheduled run, with a time range like the one sketched below:
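For instance, assuming the alert runs on a 5-minute cron schedule (the schedule itself is an assumption here; adjust to your own), shift the whole window back by the observed latency:

earliest=-8m@m latest=-3m@m

Each run still covers a full 5 minutes, but the window ends 3 minutes in the past, so events that arrive up to 180 seconds late are already indexed by the time they are searched.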
There is another solution as well with _index_earliest and _index_latest, but that's a topic for another time (it's a bit complicated).
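Just as a rough sketch of that approach (the index name is a placeholder): you filter on when events were indexed rather than on _time, and keep a wider _time window to catch latecomers:

index=your_index _index_earliest=-5m@m _index_latest=@m earliest=-30m@m latest=now

This way each run picks up whatever arrived in the last 5 minutes, no matter how late the events' timestamps are (within the wider window).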
I hope this helps!!