I would like to create a table that displays the first and last event from a set of duplicate events. A duplicate run may be interrupted by a non-duplicate event, in which case I also want to display the first and last message when the duplicate events appear again.
For example, consider this list of events
| makeresults count=10
| streamstats count
| eval _time=1599978591-(count*60)
| eval Message = case(count=10, "MessageA", count=9, "MessageA", count=8, "MessageA", count=7, "MessageB", count=6, "MessageB", count=5, "MessageB", count=4, "MessageA", count=3, "MessageA", count=2, "MessageA", count=1, "MessageB")
| reverse
| table _time, Message
which will create a table similar to:
_time | Message |
2020-09-13 06:19:51 | MessageA |
2020-09-13 06:20:51 | MessageA |
2020-09-13 06:21:51 | MessageA |
2020-09-13 06:22:51 | MessageB |
2020-09-13 06:23:51 | MessageB |
2020-09-13 06:24:51 | MessageB |
2020-09-13 06:25:51 | MessageA |
2020-09-13 06:26:51 | MessageA |
2020-09-13 06:27:51 | MessageA |
2020-09-13 06:28:51 | MessageB |
Now, I would like to display the first and last time a message is seen, removing any duplicates, but it needs to consider that the same message may be seen again in another sequence, which should be treated as a different run and displayed separately.
I thought of adding this
| stats earliest(_time) as Earliest, latest(_time) as Latest BY Message
| eval FirstEvent=strftime(Earliest,"%+"), LastEvent=strftime(Latest,"%+")
| table FirstEvent, LastEvent, Message
but it doesn't take into account that there may be multiple sequences of repeated events, so it generates
FirstEvent | LastEvent | Message |
Sun Sep 13 06:19:51 UTC 2020 | Sun Sep 13 06:27:51 UTC 2020 | MessageA |
Sun Sep 13 06:22:51 UTC 2020 | Sun Sep 13 06:28:51 UTC 2020 | MessageB |
But what I desire is the following output
FirstEvent | LastEvent | Message |
Sun Sep 13 06:19:51 UTC 2020 | Sun Sep 13 06:21:51 UTC 2020 | MessageA |
Sun Sep 13 06:22:51 UTC 2020 | Sun Sep 13 06:24:51 UTC 2020 | MessageB |
Sun Sep 13 06:25:51 UTC 2020 | Sun Sep 13 06:27:51 UTC 2020 | MessageA |
Sun Sep 13 06:28:51 UTC 2020 | Sun Sep 13 06:28:51 UTC 2020 | MessageB |
Any help is extremely appreciated. Thanks in advance.
Roger Cruz
You are not the only one who struggles with streamstats.
| makeresults count=10
| streamstats count
| eval _time=1599978591-(count*60)
| eval Message = case(count=10, "MessageA", count=9, "MessageA", count=8, "MessageA", count=7, "MessageB", count=6, "MessageB", count=5, "MessageB", count=4, "MessageA", count=3, "MessageA", count=2, "MessageA", count=1, "MessageB")
| reverse
| table _time, Message
| streamstats last(Message) as prev_Message by Message current=false global=false reset_on_change=true # finds the first event of each run: current=false makes prev_Message null for the first event of a sequence (e.g. MessageA, MessageA, MessageA), and reset_on_change=true resets the aggregation (last, count, or any other function) whenever the value of a by-clause field changes
| eval prev_Message=if(isnull(prev_Message),"first",null())
| sort 0 - _time # reverse the events so the last event of each run comes first
| streamstats last(Message) as prev_Message_1 by Message current=false global=false reset_on_change=true # same streamstats again; on the reversed order it flags the last event of each run
| sort 0 _time
| eval prev_Message_1=if(isnull(prev_Message_1) AND (isnull(prev_Message) OR prev_Message=="first"),"last",null())
| where isnotnull(prev_Message) OR isnotnull(prev_Message_1)
| eval when=coalesce(prev_Message,prev_Message_1)
| fields - prev*
| xyseries Message when _time grouped=true
| convert ctime(first) ctime(last)
| eval last=if(isnull(last),first,last)
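An equivalent way to think about the same logic (an untested sketch; the field names prev_Message, new_run, and run_id here are my own) is to flag every point where Message changes, number the runs with a running sum, and then aggregate per run:

| streamstats current=false window=1 last(Message) as prev_Message
| eval new_run=if(isnull(prev_Message) OR Message!=prev_Message, 1, 0) # 1 marks the start of a new contiguous run
| streamstats sum(new_run) as run_id # running sum turns the flags into a run number: 1, 1, 1, 2, 2, 2, 3, ...
| stats earliest(_time) as Earliest, latest(_time) as Latest by run_id, Message
| eval FirstEvent=strftime(Earliest,"%+"), LastEvent=strftime(Latest,"%+")
| sort run_id
| table FirstEvent, LastEvent, Message

On the sample data this should produce one row per contiguous run (MessageA, MessageB, MessageA, MessageB), since a single-event run gets identical Earliest and Latest values.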
Check out some more info at the link below:
https://www.youtube.com/watch?v=xzgFTqSFx7w
It's a little heavy, but you can give it a try:
| makeresults count=10
| streamstats count
| eval _time=1599978591-(count*60)
| eval Message = case(count=10, "MessageA", count=9, "MessageA", count=8, "MessageA", count=7, "MessageB", count=6, "MessageB", count=5, "MessageB", count=4, "MessageA", count=3, "MessageA", count=2, "MessageA", count=1, "MessageB")
| reverse
| table _time, Message
| streamstats last(Message) as prev_Message by Message current=false global=false reset_on_change=true
| eval prev_Message=if(isnull(prev_Message),"first",null())
| sort 0 - _time
| streamstats last(Message) as prev_Message_1 by Message current=false global=false reset_on_change=true
| sort 0 _time
| eval prev_Message_1=if(isnull(prev_Message_1) AND (isnull(prev_Message) OR prev_Message=="first"),"last",null())
| where isnotnull(prev_Message) OR isnotnull(prev_Message_1)
| eval when=coalesce(prev_Message,prev_Message_1)
| fields - prev*
| xyseries Message when _time grouped=true
| convert ctime(first) ctime(last)
| eval last=if(isnull(last),first,last)
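The xyseries step above leaves columns named Message, first, and last. To match the exact column layout asked for in the question, one could append something like this (untested sketch):

| rename first as FirstEvent, last as LastEvent
| table FirstEvent, LastEvent, Message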
@thambisetty
I must admit that I need a Splunk PhD to understand that query, but it seems to work! I'm glad my confusing request was understood. Thank you so much!
Could you explain the logic behind this query in layman's terms?
I tried working with dedup and stats, but nothing I came up with could deal with the fact that a repeating sequence could be interrupted, requiring me to reset the start and end. My Google search magic also failed to find any other person asking the same question. Am I the first one ever to need the first and last events of repeating sequences?