Is there a Splunk search idiom I can use to get only the events at which a particular field A changes value over time, tracked separately for each value of another field B?
For example, if I have a dataset of users and their availability statuses, how can I get the events (times) at which each user's status changes:
t1 Bob available
t2 Bob available
t3 Bob busy
t4 Bob busy
t5 Bob busy
t6 Bob available
Is there a search that will return just the events at t1, t3, and t6?
Hello @twotimepad
You can do this with streamstats pretty easily. Here's a run-anywhere example using the sample data you provided:
| makeresults count=1
| eval data="t1 Bob available
t2 Bob available
t3 Bob busy
t4 Bob busy
t5 Bob busy
t6 Bob available"
| rex max_match=0 field=data "(?<data>[^\n\e]+)"
| mvexpand data
| eval data=trim(data)
| rex field=data "(?<t>[^\s]+) (?<Name>[^\s]+) (?<Status>[^\e]+)"
| table t Name Status
| streamstats window=1 current=f global=f values(Status) as LastStatus by Name
| where Status!=LastStatus OR isnull(LastStatus)
The streamstats command attaches to each event the previous status reported for the same Name, as LastStatus. The where clause then keeps an event only if there is no previous status (it is the first event for that Name) or if the status differs from the previous one.
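To show why the by Name clause matters, here is a minimal sketch with two interleaved users. The second user (Alice) and her statuses are made-up sample data, not part of the original question, and the split/mvexpand setup is just one assumed way of generating events. Because global=f is set, streamstats keeps a separate one-event window per Name, so each event is compared against that user's own previous status rather than against whichever event happened to come before it.

```spl
| makeresults count=1
| rename COMMENT AS "Hypothetical sample data: Alice is an assumption added for illustration"
| eval data=split("t1 Bob available;t1 Alice busy;t2 Bob available;t2 Alice available;t3 Bob busy;t3 Alice available", ";")
| mvexpand data
| rex field=data "(?<t>\S+) (?<Name>\S+) (?<Status>\S+)"
| table t Name Status
| rename COMMENT AS "Previous status per user, then keep first events and changes only"
| streamstats window=1 current=f global=f values(Status) as LastStatus by Name
| where Status!=LastStatus OR isnull(LastStatus)
```

With this sample, the search should keep both t1 events (no previous status), Alice at t2 (busy to available), and Bob at t3 (available to busy). Note that streamstats works in the order the events arrive; on real indexed events, which Splunk returns newest first, you would probably want something like | sort 0 _time ahead of streamstats so that "previous" means earlier in time.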
If you don't like implementing it through streamstats, I have found another way. Check my answer on this post:
https://answers.splunk.com/answers/775204/data-comparison-between-fields.html#answer-776231
Like this:
| makeresults
| eval _raw="time who state
t1 Bob available
t2 Bob available
t3 Bob busy
t4 Bob busy
t5 Bob busy
t6 Bob available"
| multikv forceheader=1
| sort 0 - time
| rename COMMENT AS "Everything above generates sample events; everything below is your solution"
| streamstats current=f last(state) AS next_state count AS _serial
| streamstats count(eval(state!=next_state)) AS sessionID
| eventstats last(_serial) AS keeper BY sessionID
| where keeper==_serial
Thank you! Is there a reason to use values(Status) instead of last(Status)?
Reading the documentation of streamstats, it seems like last() should get the most recent value to compare against?
In this use case, it honestly doesn't matter. With window=1 and current=f, the window holds only the single previous event for that Name, so there aren't multiple values for last() to choose among. values() and last() both return that one event's status.
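If you want to check this yourself, here is a rough sketch that computes both aggregations side by side on the sample data from the question. The field names PrevByValues, PrevByLast, and same are made up for illustration, and the split-based setup is just an assumed shortcut for generating the events.

```spl
| makeresults count=1
| eval data=split("t1 Bob available;t2 Bob available;t3 Bob busy;t4 Bob busy;t5 Bob busy;t6 Bob available", ";")
| mvexpand data
| rex field=data "(?<t>\S+) (?<Name>\S+) (?<Status>\S+)"
| table t Name Status
| rename COMMENT AS "Compute the previous status both ways (field names are illustrative)"
| streamstats window=1 current=f global=f values(Status) as PrevByValues last(Status) as PrevByLast by Name
| rename COMMENT AS "coalesce guards the first event, where neither field has a value yet"
| eval same=if(coalesce(PrevByValues, "none")==coalesce(PrevByLast, "none"), "yes", "no")
```

Every row should come back with same=yes. The two would only diverge with a larger window (or no window at all), where values() returns every distinct status seen in the window while last() returns only the most recent one.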