Hello, is there a way to detect gaps in data by some id?
And also check whether the gap is greater than 4 hours, and then show the last event and the one before it?
Hi @sarit_s,
Yes you can!
Here's an easy command that you can apply to your data with a 900-second gap:
... | streamstats current=f last(_time) as last_time by host
| eval gap = last_time - _time
| where gap > 900
| convert ctime(last_time) as last_time
| table _time last_time host
from here :
https://answers.splunk.com/answers/69728/search-to-find-gaps-in-data.html
Cheers,
David
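To make the mechanics concrete, here is a minimal Python sketch of what that streamstats-based gap search does; the events, hosts, and timestamps are made up for illustration:

```python
from datetime import datetime

# Hypothetical sample events, newest first (as Splunk returns them): (_time, host)
events = [
    (datetime(2019, 5, 8, 9, 19, 2), "web01"),
    (datetime(2019, 5, 7, 15, 23, 57), "web01"),
    (datetime(2019, 5, 8, 9, 0, 0), "web02"),
    (datetime(2019, 5, 8, 8, 58, 0), "web02"),
]

GAP_SECONDS = 900  # the 900 s threshold from the search

# streamstats current=f last(_time) as last_time by host:
# for each event, remember the _time of the previously seen (more recent)
# event for the same host, then keep rows where the gap exceeds the threshold.
last_time_by_host = {}
gaps = []
for _time, host in events:
    last_time = last_time_by_host.get(host)
    if last_time is not None:
        gap = (last_time - _time).total_seconds()
        if gap > GAP_SECONDS:
            gaps.append((host, _time, last_time, gap))
    last_time_by_host[host] = _time

for host, start, end, gap in gaps:
    print(f"{host}: gap of {gap:.0f}s between {start} and {end}")
```

Here only web01 has a gap above the threshold; web02's two events are 120 seconds apart, so they are filtered out.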
The dedup at the start will only keep the last 2 events for every "source" field, and timeDeltaS returns the time gap between the last one and the current one. The numbers you see (0, 1, 2) are the gaps in seconds, and the | where timeDeltaS > 14400 should be applied at the end to keep only values larger than 14400 seconds, since now you're keeping all deltas.
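As a rough illustration of that dedup-then-delta approach, here is a Python sketch with made-up epoch timestamps and source names:

```python
from collections import defaultdict

# Hypothetical (_time, source) pairs, epoch seconds, newest first
events = [
    (1557307142, "a.log"),
    (1557242637, "a.log"),
    (1557242600, "a.log"),   # dropped by dedup 2: only the last 2 per source are kept
    (1557300000, "b.log"),
    (1557299999, "b.log"),
]

# dedup 2 source: keep only the first two rows seen per source
kept = defaultdict(list)
for row in events:
    if len(kept[row[1]]) < 2:
        kept[row[1]].append(row)

# timeDeltaS: gap between the last event and the one before it
deltas = {src: rows[0][0] - rows[1][0] for src, rows in kept.items() if len(rows) == 2}

# where timeDeltaS > 14400 (4 hours)
big_gaps = {src: d for src, d in deltas.items() if d > 14400}
print(big_gaps)
```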
I'm not sure I understood..
if I see some SerialNumber with timeDeltaS 1, does that mean I have a gap of one second between two events?
how can I know which event it is?
can I add the source?
I added values(source) as source to the stats, but it gives me too many files there
It's always the gap between the last two events, so it's the time since the last time this serial number appeared. You can also keep the source field if needed, since it's the same as your serial number field anyway.
is there no way to find all the gaps within this SerialNumber?
it's possible that I will have more than one gap for each one of them
ouch... yeah in that case you'll have to get rid of the dedup and you're back where we started:
index=ssys_*_pj
| streamstats current=f last(_time) as last_time by source
| eval timeDeltaS = last_time - _time
| eval SerialNumber = mvindex(split(source,"/") ,7)
| stats last(timeDeltaS) as timeDeltaS by SerialNumber
From there, the best way to optimize is to reduce the number of results from index=ssys_*_pj, by either limiting the time frame or specifying exactly which events you're looking for.
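The eval SerialNumber = mvindex(split(source,"/"), 7) step picks the element at index 7 (0-based) of the "/"-split source path. A Python sketch of the same extraction, with an entirely hypothetical path layout (the real paths in this index may differ):

```python
# mvindex(split(source, "/"), 7): take the 0-based element 7 of the split path.
# Note that a leading "/" makes the first element an empty string.
source = "/data/ssys/pj/site1/rack2/logs/SN12345/events.log"  # hypothetical path

parts = source.split("/")
serial_number = parts[7] if len(parts) > 7 else None
print(serial_number)
```

The index to use depends entirely on how many directory levels precede the serial number in the actual source paths.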
thanks !
you're most welcome
please upvote and accept if it was helpful !
Hi @DavidHourani
can we go back to my question for a sec?
is it possible to show the source of _time and the source of last_time?
I tried to add source to streamstats, but I only get the last source
Yeah, you could add values(_time) to the last stats command; this should add the time field for you. Not sure what values you'll get though, could you have a look and let me know?
I'm talking about wanting to see the source files..
also, I found out that this search:
index=ssys_*_pj sourcetype=*log*
| `SerialNumber`
| streamstats current=f last(_time) as last_time values(source) as source by SerialNumber
| eval gap = last_time - _time
| where gap > 14400
| convert timeformat="%Y-%m-%d %H:%M:%S" ctime(last_time) as last_time
| table _time last_time SerialNumber source
works better for me.. it's true that it is much slower, but it gives me the results in a more user-friendly way..
maybe it is possible to make this one run faster?
anyway, the important thing is to have the source
This could be a bit faster; the best would be to reduce the size of the data set from the start and then apply everything from there:
index=ssys_*_pj sourcetype=*log*
| streamstats current=f last(_time) as last_time by source
| eval gap = last_time - _time
| where gap > 14400
| eval SerialNumber = mvindex(split(source,"/") ,7)
| stats last(gap) as gap by SerialNumber, source,last_time,_time
| convert timeformat="%Y-%m-%d %H:%M:%S" ctime(last_time) as last_time
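Roughly, that search pairs each event with the previous one per series and keeps both sides of any large gap; a Python sketch of that pairing, with hypothetical events, also shows how each side can carry its own source file:

```python
from datetime import datetime

# Hypothetical events for one SerialNumber: (_time, source), newest first
events = [
    (datetime(2019, 5, 8, 9, 19, 2), "SN1/b.log"),
    (datetime(2019, 5, 7, 15, 23, 57), "SN1/a.log"),
    (datetime(2019, 5, 7, 15, 0, 0), "SN1/a.log"),
]

GAP_SECONDS = 14400  # 4 hours

# For every pair of consecutive events, keep both sides of any large gap,
# so each row records the source where the gap starts AND where it ends.
gaps = []
for newer, older in zip(events, events[1:]):
    gap = (newer[0] - older[0]).total_seconds()
    if gap > GAP_SECONDS:
        gaps.append({
            "gap_start_time": older[0], "gap_start_source": older[1],
            "gap_end_time": newer[0], "gap_end_source": newer[1],
            "gap_seconds": gap,
        })

for g in gaps:
    print(g)
```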
this one is faster but returns no results
fixed, try now
I have results now
but it is slow again
and I still see only one source..
is it possible to see both source files?
I want to see the source file of where the gap starts and where it ended
I also see that this query:
index=ssys_*_pj sourcetype=*log*
| `SerialNumber`
| streamstats current=f last(_time) as last_time values(source) as source by SerialNumber
| eval gap = last_time - _time
| where gap > 14400
| convert timeformat="%Y-%m-%d %H:%M:%S" ctime(last_time) as last_time
| table _time last_time SerialNumber source
returns more results than the new one
so.. any idea how to get the source path for each time?
hey @DavidHourani
I found out that tstats can help me make the query faster, but I think I'm missing something, since I'm not getting results.
this is the query:
| tstats values WHERE (index=ssys_*_pj AND sourcetype=*log*) by source
| streamstats current=f last(_time) as last_time values(source) as source last(source) as last_source by SerialNumber
| eval gap = last_time - _time
| where gap > 14400
| convert timeformat="%Y-%m-%d %H:%M:%S" ctime(last_time) as last_time
| table SerialNumber _time last_time last_source source
also, I added "last_source", but the results returned are the same as "source"
any suggestions?
Ah yes, tstats is great, and to make it even faster use this option: summariesonly=true; this way it runs on accelerated data only.
Now back to the query: it seems like you were missing the | `SerialNumber` part after the tstats, so it should be like this:
| tstats values WHERE (index=ssys_*_pj AND sourcetype=*log*) by source
| `SerialNumber`
| streamstats current=f last(_time) as last_time values(source) as source last(source) as last_source by SerialNumber
| eval gap = last_time - _time
| where gap > 14400
| convert timeformat="%Y-%m-%d %H:%M:%S" ctime(last_time) as last_time
| table SerialNumber _time last_time last_source source
If it still doesn't work, I advise you to go through the search part by part to check that it's working well: start with the first part, then add the next, and so on, until you reach the part that doesn't give results.
Hi @DavidHourani
thanks !
summariesonly sounds good, but since I haven't collected a summary yet, can I still use this option?
I didn't add the | `SerialNumber` since I added this field as a calculated field, so I won't need the macro anymore
I have a problem: when I'm running this query with
tstats values
I'm getting this error:
Job terminated unexpectedly
when I'm running it with count, it stops returning results when running this part:
where gap > 14400
I'll give an example:
I have logs from 07/05/19 15:23:57 for some SerialNumber,
and the next time I have a log is 08/05/19 09:19:02.
I want to catch such cases.
is it possible?
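For that concrete example (reading the dates as dd/mm/yy), the gap works out comfortably above the 14400-second (4-hour) threshold, so the | where gap > 14400 filter should indeed catch it. A quick check in Python:

```python
from datetime import datetime

# The example from the thread: last log at 07/05/19 15:23:57,
# next log at 08/05/19 09:19:02 (dd/mm/yy)
last_event = datetime(2019, 5, 7, 15, 23, 57)
next_event = datetime(2019, 5, 8, 9, 19, 2)

gap = (next_event - last_event).total_seconds()
print(gap)          # 64505.0 seconds, i.e. almost 18 hours
print(gap > 14400)  # True: well over the 4-hour threshold
```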