Join event to the closest in time matching event

jbdumoulin · ‎07-26-2018

Hello splunkers.

I have the following case, I need to retrieve an event that happens periodically. Then after getting that event (let's call it X) I need to check if an another event (Y) happened in a 20 minutes timeframe relatively to X happening.

I tried doing it with a subsearch like this

However, if I do this, my join will only return the last < event Y > found, and so if there are multiple Y events, that could match with some X events, this won't work. For instance if I have the following events :

Event Y _time
1 11 : 10 : 05
2 11 : 35 : 30
3 11 : 47 : 22

Event X _time
4 11 : 12 : 25
5 11 : 32 : 56
6 11 : 52 : 02

I'd like to exclude event 4 (happened 2 minutes after event 1), event 6 (happened 5 minutes after event 3) but not event 5 (happened 22 minutes after event 1 and before event 2).

Yet, with my current search, only event 6 will be excluded, because event 4 and 5 are compared to the time for event 3. Only the last event in the subsearch is compared to the main search. In an ideal world I'd like to have each event be compared to each time and if one match then exclude it, or at least each event being compared to the closest event and not to the last one !

I also tried to use append and selfjoin to create a "one to many" relationship with the subsearch instead of the "one to one" from the join, but the results are pure non sense when I use this.

But by doing so, without any filter on the _time, I end up with event 1, 2, 3 and 4, but not 5 and 6 with no real reason, they are joined on the same field. They all have the same hostname in this specific case.

I thought about transactions to do the trick, but I need to specify precise boundaries. If event Y happens after X, then it should be kept, while if it happens before AND during a 20 minutes tilmespan it can be excluded safely. I think transactions can't do that, at least I didn't find parameters in the documentation to give a start time and end time for the transaction.

Thanks for your help.

BonMot · ‎09-06-2018

Part of the problem is that sub-searches, joins, appends, etc. all run once per search. What you really want to do is runs a search per EVENT

Try this:

< find event X > 
| eval start=_time, end=relative_time(_time,"+20m") 
| table start end host <any other fields you want to reference in map search>
|  map 
    search="search host=$host$ earliest=$start$ latest=$end$ <search for Y>  " 
| stats count  AS "Count Event Y" by host

In this search your X search is just collecting variables for the map search. The map search then replaces the source event with all of the matches of the map search, unique to each source event. This means each source event gets its own timerange for the map search and can return more than one row. If you want run stats against each individual map search instead of at the end, you can add pipes in the the search term and then you only get one row back per X event.

As you can imagine, map searches are expensive so you want your source search to be as small as possible and your map search to be as efficient as possible

Join event to the closest in time matching event

Index This | When is October more than just the tenth month?

Observe and Secure All Apps with Splunk

What’s New & Next in Splunk SOAR

Are you a member of the Splunk Community?

Join event to the closest in time matching event

Index This | When is October more than just the tenth month?

Observe and Secure All Apps with Splunk

What’s New & Next in Splunk SOAR