Hi, please consider this example.csv:
device result _time
apollo1 passed 2019-06-28
apollo1 failed 2019-06-28
zeus8 passed 2019-06-23
zeus8 failed 2019-06-23
zeus8 error 2019-06-23
hermes3 failed 2019-06-23 //Edit #1
hermes3 failed 2019-06-23 //Edit #1
apollo1 passed 2019-06-20 //Edit #1
Note: This is just an example. My real CSV has thousands of devices.
I want to return the latest entry based on the device
value. When a device appears more than once with a duplicated timestamp and different result
value, the event where result = passed should appear. Else, show the event where result is failed or error. What query do I need to do that?
My current query ( |inputlookup example.csv |stats latest by device
) does not guarantee the event where result = passed will appear when a device shows up multiple times with with the same latest timestamp and different result
values.
//Edit #1 below
The query I need would produce the table below:
device result _time
apollo1 passed 2019-06-28
zeus8 passed 2019-06-23
hermes3 failed 2019-06-23
The query would 1) show only the latest device with the latest _time
, 2) if a device is listed twice with the same _time
and a different result
, show just the device where result = passed. Else display the event with either failed or error, and 3) since conditions 1 and 2 are followed, each device should only appear once.
Like this:
| makeresults | eval raw="device=apollo1,result=passed,_time=2019-06-28 device=apollo1,result=failed,_time=2019-06-28 device=zeus8,result=passed,_time=2019-06-23 device=zeus8,result=failed,_time=2019-06-23 device=zeus8,result=error,_time=2019-06-23 device=hermes3,result=failed,_time=2019-06-23 device=hermes3,result=failed,_time=2019-06-23 device=apollo1,result=passed,_time=2019-06-20"
| makemv raw
| mvexpand raw
| rename raw AS _raw
| kv
| eval _time = strptime(time, "%Y-%m-%d")
| fields - _raw time
| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"
| eval sortme = if(result="passed", 0, 1)
| sort 0 _time device sortme
| fields - sortme
| dedup _time device
Try this:
|inputlookup example.csv
| sort - device _time result
| dedup device
Note: this does assume error,failed and passed are the only options, where passed conveniently happens to be the furthest down in alphabetic order.
This solution does not prefer result=passed
over all other values in a time-host tie.
Why not? result is taken along in the sort, so for any items with same device and _time, passed would end up on top.
Read what OP said. This is the key detail on his request.
Yes, and unless I'm completely missing something, my solution does just that. It will sort by device, within that by time and within that by result (all decending) and then takes the top row per device using dedup. Sorting result decending means if there was a "passed" on the last date, it will show on top.
He is clearly implying that the passed
event might not be on top. Otherwise the solution is trivial.
That is why I'm including a sort.
He says also that they have the "same time".
Yes, again: that is why I include the result field into the sorting. Since passed comes after error and fail in the alphabet, if there are multiple entries with the same time for some device, the pass entry will end up on top.
Ah, you are assuming that no states are farther down the alphabet than p
.
Exactly, as clearly mentioned in my answer 😉
@russell120- Try below
|inputlookup example.csv |sort - _time device |streamstats count as i by device| dedup device
@Vijeta Hi, that does not show the latest event that prioritizes where result
= passed
when a device has 2 (or more) events with the same timestamp and different result
value. Please reference this condition from the original question:
When a device appears more than once with a duplicated timestamp and different result value, the event where result = passed should appear. Else, show the event where result is failed or error. What query do I need to do that?
@russell120 Sorry for misinterpreting the question , can you try like below-
|inputlookup example.csv|stats count(eval(result="passed")) as PS, count(eval(result!="passed")) as NP , values(result) as result by _time device|eval result=if(PS>=1 and NP>0,"passed",result)| fields _time device result
It's okay. I edited my initial post and added "//Edit #1" to where I updated it to more accurately reflect my real CSV. What you most recently posted almost accomplished the solution but there are duplicated devices from previous dates.
Hi @russell120 you can use stats latest at the end of above search like this-
|inputlookup example.csv|stats count(eval(result="passed")) as PS, count(eval(result!="passed")) as NP , values(result) as result by _time device|eval result=if(PS>=1 and NP>0,"passed",result)| fields _time device result| stats latest by device