Splunk Search

How do I find the first and last time a message is seen in a repetitive sequence of events?

rogercruz
Engager

I would like to create a table that displays the first and last event from a duplicate set of events.  A duplicate run may be interrupted by a non-duplicate event in which case I also want to the display the first and last message when the duplicate events appear again.

For example, consider this list of events

 

| makeresults count=10
| streamstats count
| eval _time=1599978591-(count*60)
| eval Message = case(count=10, "MessageA", count=9, "MessageA", count=8, "MessageA", count=7, "MessageB", count=6, "MessageB", count=5, "MessageB", count=4, "MessageA", count=3, "MessageA", count=2, "MessageA", count=1, "MessageB")
| reverse
| table _time, Message

 

which will create a table similar to:

_timeMessage
2020-09-13 06:19:51MessageA
2020-09-13 06:20:51MessageA
2020-09-13 06:21:51MessageA
2020-09-13 06:22:51MessageB
2020-09-13 06:23:51MessageB
2020-09-13 06:24:51MessageB
2020-09-13 06:25:51MessageA
2020-09-13 06:26:51MessageA
2020-09-13 06:27:51MessageA
2020-09-13 06:28:51MessageB

 

Now, I would like to display the first and last time a message is seen, removing any duplicates.. but it needs to consider that the same message may be seen again in another sequence and should be considered a different run to be displayed.

I thought of adding this

 

| stats earliest(_time) as Earliest, latest(_time) as Latest BY Message 
| eval FirstEvent=strftime(Earliest,"%+"), LastEvent=strftime(Latest,"%+") 
| table  FirstEvent, LastEvent, Message

 

but it doesn't take into account that there may be multiple sequences of repeated events so it generates

FirstEventLastEventMessage
Sun Sep 13 06:19:51 UTC 2020Sun Sep 13 06:19:51 UTC 2020MessageA
Sun Sep 13 06:22:51 UTC 2020Sun Sep 13 06:22:51 UTC 2020MessageB


But what I desire is the following output

FirstEventLastEventMessage
Sun Sep 13 06:19:51 UTC 2020Sun Sep 13 06:21:51 UTC 2020MessageA
Sun Sep 13 06:22:51 UTC 2020Sun Sep 13 06:24:51 UTC 2020MessageB
Sun Sep 13 06:25:51 UTC 2020Sun Sep 13 06:27:51 UTC 2020MessageA
Sun Sep 13 06:28:51 UTC 2020Sun Sep 13 06:28:51 UTC 2020MessageB


Any help is extremely appreciated.  Thanks in advance.

Roger Cruz

Labels (2)
Tags (3)
0 Karma

thambisetty
Super Champion

you are not only the one who struggles with streamstas.

| makeresults count=10
| streamstats count
| eval _time=1599978591-(count*60)
| eval Message = case(count=10, "MessageA", count=9, "MessageA", count=8, "MessageA", count=7, "MessageB", count=6, "MessageB", count=5, "MessageB", count=4, "MessageA", count=3, "MessageA", count=2, "MessageA", count=1, "MessageB")
| reverse
| table _time, Message
| streamstats last(Message) as prev_Message by Message current=false global=false reset_on_change=true # finding first event of Message with the help of current=false - this will ignore first match in sequence ( for example MessageA, MessageA, MessageA) and reset_on_change=true - This will reset function it can be last , count, or any aggregated function when it encounters different value for a field given in by clause
| eval prev_Message=if(isnull(prev_Message),"first",null())
| sort 0 - _time # reversing events so last event will be first after this command execution 
| streamstats last(Message) as prev_Message_1 by Message current=false global=false reset_on_change=true # performing same streamstats this time it will last event
| sort 0 _time
| eval prev_Message_1=if(isnull(prev_Message_1) AND (isnull(prev_Message) OR prev_Message=="first"),"last",null())
| where isnotnull(prev_Message) OR isnotnull(prev_Message_1)
| eval when=coalesce(prev_Message,prev_Message_1)
| fields - prev*
| xyseries Message when _time grouped=true
| convert ctime(first) ctime(last)
| eval last=if(isnull(last),first,last)

check some more info at below link:

https://www.youtube.com/watch?v=xzgFTqSFx7w

————————————
If this helps, give a like below.
0 Karma

thambisetty
Super Champion

@rogercruz 

little heavy, you can give a try

| makeresults count=10
| streamstats count
| eval _time=1599978591-(count*60)
| eval Message = case(count=10, "MessageA", count=9, "MessageA", count=8, "MessageA", count=7, "MessageB", count=6, "MessageB", count=5, "MessageB", count=4, "MessageA", count=3, "MessageA", count=2, "MessageA", count=1, "MessageB")
| reverse
| table _time, Message
| streamstats last(Message) as prev_Message by Message current=false global=false reset_on_change=true
| eval prev_Message=if(isnull(prev_Message),"first",null())
| sort 0 - _time
| streamstats last(Message) as prev_Message_1 by Message current=false global=false reset_on_change=true
| sort 0 _time
| eval prev_Message_1=if(isnull(prev_Message_1) AND (isnull(prev_Message) OR prev_Message=="first"),"last",null())
| where isnotnull(prev_Message) OR isnotnull(prev_Message_1)
| eval when=coalesce(prev_Message,prev_Message_1)
| fields - prev*
| xyseries Message when _time grouped=true
| convert ctime(first) ctime(last)
| eval last=if(isnull(last),first,last)
————————————
If this helps, give a like below.

rogercruz
Engager

@thambisetty 

I must admit that I need a Splunk PhD to understand that query but it seems to work!  I'm glad my confusing request was understood.   Thank you so much!

Could you explain the logic behind this query in layman's terms?

I tried working with dedup and stats but nothing I came up with could deal with the fact that a repeating sequence could be interrupted and I needed to then reset the start + end.  My Google search magic also failed me in finding any other person asking the same.   Am I the first one to ever find to find out the first + last events of repeating sequences?

0 Karma
Take the 2021 Splunk Career Survey

Help us learn about how Splunk has
impacted your career by taking the 2021 Splunk Career Survey.

Earn $50 in Amazon cash!