Hello,
I'm trying to measure the time between when data was ingested and when it showed up in my search. I read that I have to use latency. I don't fully understand what it does, but after looking at a couple of examples I came up with this so far:
| tstats count where index= * by _time _indextime index span=1ms
| eval latency=abs(_indextime-_time)
| stats sum min(latency) avg(latency) max(latency) by index
My question is: is this the correct way to measure the gap between the timestamp and the time the data showed up in search? If not, what is the right way?
I'm also trying to get the results in milliseconds, so I used span=1ms. Is this correct? Sorry for asking a lot of questions, and thank you in advance!
What exactly do you mean by "time it showed up on my search"?
I think your search more or less works. The problem is that span does not support subsecond accuracy, so if you need that, tstats isn't the way to go.
I've never done this using a tstats by _time and _indextime before though, so I might be missing some nuances. Usually I run something like this. I am typically investigating a specific data feed (so a specific index) and want to see which hosts are having delays, timezone issues, etc.
index=myIndex
| eval latency=abs(_indextime-_time)
| stats min(latency) avg(latency) max(latency) by host
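If the result is wanted in milliseconds, one option is to scale the difference inside eval rather than relying on span. This is only a sketch: myIndex is the placeholder index from the search above, and latency_ms, min_ms, avg_ms and max_ms are field names made up for illustration. Keep in mind that _indextime is stored in whole seconds, so the numbers are only meaningful to roughly one second even when _time carries subsecond precision.

```spl
index=myIndex
| eval latency_ms=round((_indextime - _time) * 1000)
| stats min(latency_ms) AS min_ms avg(latency_ms) AS avg_ms max(latency_ms) AS max_ms by host
```

The abs() is dropped here on purpose, so a negative value (an event timestamped later than it was indexed) stays visible instead of being folded into the positive range.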
You could use the Meta Woot! app:
https://splunkbase.splunk.com/app/2949/
The Meta Woot! Splunk app from Discovered Intelligence provides superior levels of insight and intelligence from your Splunk metadata and now your Splunk license data too!
The app maintains a near real-time state table of host, sourcetype and index metadata. Meta Woot! is accurate at scale and allows users to instantly report on host, sourcetype and/or index together. The app includes summary based event count trending, correlation of event volumes against license and includes compliance reporting on both data latency and indexing.
Thank you, I'll check that out!
Googling for a Splunk latency definition, we get:
"Latency is the difference between the time assigned to an event (usually parsed from the text) and the time it was written to the index. ... In most production Splunk instances, the latency is usually just a few seconds."
When data is being streamed in constantly, the time assigned to an event is practically the system time, and therefore _indextime - _time measures the latency.
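To see what that difference looks like per source, a signed version of the calculation can help, since negative values usually point at timezone or timestamp-parsing problems rather than real delay. This is a rough sketch, assuming a placeholder index named myIndex; future_events and the other output names are invented for the example.

```spl
index=myIndex
| eval latency=_indextime - _time
| stats count min(latency) AS min_latency max(latency) AS max_latency count(eval(latency < 0)) AS future_events by host sourcetype
```

A host with a large future_events count, or a min_latency near a whole number of hours, is worth checking for a timezone misconfiguration.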
I'm trying to find the time when the data got ingested, stored, and became available for analysis in a search query. I want to see how long this process takes in ms or seconds and view the results on a dashboard. However, I just haven't found an efficient way to do that yet. So I was trying what I posted, but I'm not sure if that's the right approach or not. Please help, thank you!
That is the right approach to measure the latency.
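For reference, the tstats search from the question could be adjusted roughly as follows. It is a sketch, not a verified search: it assumes a one-second span is acceptable (given the subsecond limitation mentioned earlier in the thread), and the names weighted, events, total and avg_latency are made up here. Because tstats returns one row per combination of _time, _indextime and index, the average is weighted by count so that busy seconds are not under-represented.

```spl
| tstats count where index=* by _time _indextime index span=1s
| eval latency=abs(_indextime - _time)
| eval weighted=latency * count
| stats min(latency) AS min_latency max(latency) AS max_latency sum(weighted) AS total sum(count) AS events by index
| eval avg_latency=round(total / events, 2)
| fields index min_latency avg_latency max_latency
```

Running it over a short time range first is a reasonable precaution, since index=* touches every index the user can search.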
Maybe I've asked this before, but just to make sure: does latency tell you when the data was ingested and when it showed up in the search query, or does it mean something else? Because I'm trying to see how long it took for the data to be ingested and show up in my Splunk search. Thank you!
No, it tells you when the logs were ingested (_indextime) versus when the logs were forwarded from the source system, provided the timestamp (_time) is configured correctly from the logs.
Indeed, what you're measuring with that search you shared is the delay between when the event took place on the source (_time) and when the event was stored in Splunk (_indextime).
Once it is indexed, it is available for search. Unless something is seriously wrong with your setup, or you are worried about microsecond real-timeness, I don't really see why you'd want to analyse the delay between index time and when the data is available in search results (if it is even possible to have such a delay).
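For the dashboard part of the question, the same latency field can be charted over time. A minimal sketch, assuming the placeholder index myIndex and an arbitrary 5-minute bucket; avg_latency and max_latency are illustrative names.

```spl
index=myIndex
| eval latency=_indextime - _time
| timechart span=5m avg(latency) AS avg_latency max(latency) AS max_latency
```

Note that timechart buckets by _time (the event timestamp), so a delayed event is plotted at the time it happened on the source, not at the time it was indexed.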
Thank You!