Hello guys,
does anyone know whether it is possible to match search results against previous results of the same search?
I have a machine that can enter different modes. For the sake of the example, let's say the machine can enter mode A, B or C.
I receive a heartbeat every few seconds from hundreds of these machines, which leads to a very large dataset. I am not interested in the heartbeats themselves, though; I am interested in the transitions between modes.
Example:
Time | Machine_ID | Mode |
10:00:00 | 1 | A |
10:00:01 | 2 | C |
10:00:02 | 2 | C |
10:00:03 | 1 | B |
10:00:04 | 2 | B |
So what I am basically interested in here is the transition of machine 1 from mode A to B and of machine 2 from C to B.
In other words: I am searching for heartbeats where the mode differs from the mode of the previous heartbeat of the same machine_ID.
In the end, my result would look something like this:
_time | _time_old_Mode | machine_ID | new_mode | old_Mode |
10:00:03 | 10:00:00 | 1 | B | A |
10:00:04 | 10:00:02 | 2 | B | C |
I have tried subsearches, but I was not successful. The simplified search for getting the heartbeat is currently:
index="heartbeat" | rex field=_raw "......(?P<MODE>.......)"| fields _time ID MODE
Performance is not crucial, as it is planned to run this at night for a summary index.
Thanks in advance!
Best Regards
If I understand you correctly, you want to only detect points in time when there is a transition between states.
That's relatively easy, but not obvious to someone not used to Splunk (I struggled with a similar problem myself lately).
The key to the solution is to arrange your data so that it can be processed easily in a streaming manner. In this case that means sorting the data.
So you start with your initial search
index="heartbeat" | rex field=_raw "......(?P<MODE>.......)"| fields _time ID MODE
and sort it so that each batch of events contains ordered events from a single host
| sort 0 ID _time
(I'm not sure whether you need a reverse sort by _time instead; you'll have to check.)
Now that the data is sorted, you can autoregress over the series
| autoregress ID as oldID | autoregress MODE as oldMODE
This way in each event you get the information about previous values of ID and MODE.
Now you can filter to show only the data points where the MODE value has changed compared to the previous event but the ID hasn't (if the ID changes, we're switching from the events of one host to those of another).
| where ID=oldID AND MODE!=oldMODE
To tidy up, it's convenient to drop the oldID field, since we don't need it anymore
| fields - oldID
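Put together, the steps above would look roughly like this as a single search. This is only a sketch: the rex pattern is the simplified placeholder from the question, and the plain ascending sort is assumed to be the right one, since autoregress copies values from the event just before the current one in the result order, so the older heartbeat has to come first.

```
index="heartbeat"
| rex field=_raw "......(?P<MODE>.......)"
| fields _time ID MODE
| sort 0 ID _time
| autoregress ID as oldID
| autoregress MODE as oldMODE
| where ID=oldID AND MODE!=oldMODE
| fields - oldID
```

Each remaining event is then a heartbeat at which the mode changed, with MODE as the new mode and oldMODE as the mode of the previous heartbeat of the same ID.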
Thank you very much ! This solved my issue in a simple way 😀
See if this helps. It looks for more than one Mode for each Machine_ID and displays the last Time field as the time of the change.
| makeresults | eval _raw="Time Machine_ID Mode
10:00:00 1 A
10:00:01 2 C
10:00:02 2 C
10:00:03 1 B
10:00:04 2 B" | multikv forceheader=1 | fields - _* linecount
```Everything above just sets up test data```
```Convert Time to integer```
| eval eTime=strptime(Time,"%H:%M:%S")
```Get the list of Modes and last Time for each Machine_ID```
| stats list(Mode) as Modes max(eTime) as ChangeTime by Machine_ID
```Remove duplicate Mode values```
| eval Modes=mvdedup(Modes)
```Filter out Machine_IDs with no Mode change```
| where mvcount(Modes) > 1
```Display ChangeTime in readable form```
| fieldformat ChangeTime=strftime(ChangeTime,"%H:%M:%S")
If you have _time rather than Time, use that instead and skip the strptime and strftime calls.
Thanks a lot for the reply! It goes in the right direction. Sometimes though, I get one change time as a result, but 3-4 modes. It looks like this:
ID | Modes | ChangeTime |
123 | A B | 1630616559.136 |
Do you have an idea how I can prevent the search from returning more than one mode in the last result?
PS: A machine can have several mode changes within a short time, and I would need to get every mode change separately.
My search looks like this currently:
.... | fields _time ID MODE | stats list(MODE) as Modes max(_time) as ChangeTime by ID | eval Modes=mvdedup(Modes)| where mvcount(Modes) > 1
Is it correct to assume that the first value in Modes is the new mode (A in the example)? Or is it the last one (B in the example)?
Thanks and best regards!
Hi
I think this is doable with a simple query based on that mode information. Can you give some sample events so we can see how it could be done?
r. Ismo
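To get every mode change as its own row, as asked in the follow-up above, one option is streamstats instead of stats, because stats collapses all heartbeats of a machine into a single row. A minimal sketch, reusing the test data from the earlier reply; the output field names old_Mode and Time_old_Mode are only illustrative, borrowed from the desired result in the question:

```
| makeresults | eval _raw="Time Machine_ID Mode
10:00:00 1 A
10:00:01 2 C
10:00:02 2 C
10:00:03 1 B
10:00:04 2 B" | multikv forceheader=1 | fields - _* linecount
| eval eTime=strptime(Time,"%H:%M:%S")
| sort 0 eTime
| streamstats current=f last(Mode) as old_Mode last(Time) as Time_old_Mode by Machine_ID
| where Mode!=old_Mode
| table Time Time_old_Mode Machine_ID Mode old_Mode
```

With current=f the current event is left out of the calculation, so last() picks up the most recent earlier heartbeat of the same Machine_ID. The first heartbeat of each machine has no old_Mode and is dropped by the where clause. On the real data this would presumably become a sort 0 _time followed by the same streamstats using _time, ID and MODE, without the strptime step.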