The way I read your premise, this sounds like a transaction logic. So, let me first clarify your use case.
You data look like
_time | id | message | state |
1969-12-31 16:00:00 | 101 | executed | started |
1969-12-31 16:00:04 | 102 | activity printed | started |
1969-12-31 16:00:09 | 101 | null | in progress |
1969-12-31 16:00:10 | 102 | null | in progress |
1969-12-31 16:00:18 | 102 | none | completed |
1969-12-31 16:00:24 | 101 | none | completed |
Note I added some time interleave between 101 and 102 to make the transaction nature more obvious. (Never mind the date is from 1969; that is just for ease of emulation.) You want to use some results like
_time | duration | eventcount | id | message | state |
1969-12-31 16:00:04 | 14 | 3 | 102 | activity printed | completed<-in progress<-started |
1969-12-31 16:00:00 | 24 | 3 | 101 | executed | completed<-in progress<-started |
Here, I ignored the format of the expected output in your earlier comment, just want to clarify that "state" goes through "started", "in progress", and "completed" to form a transaction for each unique "id". Your material requirement is to obtain a single value for "message" that is NEITHER "null" nor "none". Is this correct? The result as illustrated here can be obtained with
| transaction id startswith="state=started" endswith="state=completed"
| eval message = mvfilter(NOT message IN ("none", "null"))
| eval state = mvjoin(state, "<-")
The first two commands literally implements my interpretation of your intentions. The third line is just a visual element to make state transition obvious for each .
In my mind, the above results table is sufficient, and is more representative of the problem. But if you really want to list each event, like
_time | id | message | state |
1969-12-31 16:00:00 | 101 | executed | started |
1969-12-31 16:00:04 | 102 | activity printed | started |
1969-12-31 16:00:09 | 101 | executed | in progress |
1969-12-31 16:00:10 | 102 | activity printed | in progress |
1969-12-31 16:00:18 | 102 | activity printed | completed |
1969-12-31 16:00:24 | 101 | executed | completed |
You can either use eventstats
| eventstats values(message) as message by id| eval message = mvfilter(NOT message IN ("none", "null"))
| eval message = mvfilter(NOT message IN ("none", "null"))
or streamstats as @bowesmana suggested
| streamstats values(message) as message by id| eval message = mvfilter(NOT message IN ("none", "null"))
| eval message = mvfilter(NOT message IN ("none", "null"))
To emulate input, I added _time into @bowesmana's formula because it's just simpler.
| makeresults format=csv data="id,message,state,_time
101,executed,started,0
102,activity printed,started,4
101,null,in progress,9
102,null,in progress,10
102,none,completed,18
101,none,completed,24"
| eval _raw = "doesn't matter" ``` mock field _raw is important for transaction ```
``` data mockup above ```
Is the logic that IFF there is a previous message=executed for ID X, then if state=completed, message should then be changed to 'executed' or should it always be executed if state=completed?
| eval message=if(state="completed", "executed", message)
will just change message toexecuted if state is completed.
If you ONLY want to change completed to executed if there is a previous "started", then it is important to understand your data a bit better, as ordering becomes significant - you have
for ID 101 - so I am guessing that those are not in the order of occurrence.
You would look at using streamstats, stats, eventstats or transaction to solve this - but can you give more about your existing search an data
@bowesmana thanks for your quick response,
the value of massage field is different as per ID as you shown below.
current data:
expected output:
Use streamstats. Here's an example - use the last 3 lines with your data
| makeresults format=csv data="ID,message,state
101,executed,started
101,null,in progress
101,none,completed
102,activity printed,started
102,null,in progress
102,activity printed,completed"
| eval needs_fill=if(message="executed" AND state="started", 1, 0)
| streamstats max(needs_fill) as needs_fill by ID
| eval message=if(needs_fill=1 AND state="completed", "executed", message)
| makeresults
| eval state="started"
| eval message="executed"
|eval id="101"
|append [| makeresults
| eval state="inprogess"
| eval message="null"
|eval id="101"]
|append [| makeresults
| eval state="completed"
| eval message="none"
|eval id="101"]
|append [| makeresults
| eval state="started"
| eval message="activity printed "
|eval id="102"]
|append [| makeresults
| eval state="inprogess"
| eval message="null"
|eval id="102"]
|append [| makeresults
| eval state="completed"
| eval message="none"
|eval id="102"]| eval needs_fill=if(message="executed" AND state="started", 1, 0)
| streamstats max(needs_fill) as needs_fill by ID
| eval message=if(needs_fill=1 AND state="completed", "executed", message)
its not working as expected, as mentioned value of massage field is vary per ID's only value of state field remains same for all ID's
Not sure why you are doing all those appends/makeresults - but look at your id field - the streamstats logic uses ID, not id - fields are case sensitive
yes corrected its only working for where message="executed" but not where message values are different for other ID's. please be noted that massage value could be anything for IDs and values of state field are same.
| makeresults format=csv data="ID,message,state
101,executed,started
101,null,in progress
101,none,completed
102,activity printed,started
102,null,in progress
102,none,completed"
| eval startedMessage=if(state=="started",message,null())
| eventstats values(startedMessage) as startedMessage by ID
| eval message=if(state=="completed", startedMessage, message)