Splunk Search

How to compare events with "milestone" lookup?

yuanliu
SplunkTrust
SplunkTrust

I have a really simple task but haven't figured out how.  This is a simple table of milestones

milestone1milestone2milestone3release
2022-01-302022-02-282022-03-251_0
2022-04-202022-05-102022-05-251_1
2022-07-022022-07-212022-08-141_2
2022-09-202022-10-142022-11-031_3
2022-12-212023-01-112023-01-312_0

I need to determine the "release" cycle a given event is in, and perform some calculations in relationship to milestones.  For illustration purposes, let's say if an event is in between milestone1 of 1_1 and 1_2 (2022-04-20 and 2022-07-02), I'll say it belongs in release cycle 1_1.  In other words, milestone2, milestone3, etc., can be considered mere attributes that I need to retrieve. (In the real world, some columns are not dates.)

Initially I thought a simple lookup would suffice.  But after various trials, I have made little progress.  If I could devise a macro based on the lookup table to output the value of release, I can certainly then lookup in the table to obtain the rest of attributes.  I even thought of adding a dummy (constant value) column so I could retrieve the entire table with every event.  But even with that, I still couldn't find an easy way to match event with row.

The best I have come up with so far is to determine the current release by comparing | inputlookup with now(), like this

 

| inputlookup release
| where now() > strptime(milestone1, "%F")
| eventstats max(release) as current_release
| where release == current_release

 

If milestone1 is in epoc time, the search could be simpler but in any case, this only gives now, and I cannot really use it in a macro unless the macro is placed in a subsearch of sorts. (And if the macro is in a subsearch, I cannot pass event time as a parameter, this means that I still don't get to match with events.)

Labels (2)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

@yuanliu I saw your post on Slack. Your original solution is not optimal for a number of reasons - although it may not be the solution, I believe this is a more efficient solution to your lookup version given that

  • you only ever want a zero or one values of milestone1 from the lookup, where
    • zero if _time is < the earliest milestone1
    • one, where the value is the closest value (in the past) of milestone1 to _time 

so, there's no need to use mvexpand/where/eventstats/where at all, as you can use mvmap/eval to filter milestone1 values to zero or one values and then lookup/where to filter those events where _time<milestone1

| makeresults count=1000
| eval _time=now() - ((random() % 240) * 86400)
| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
``` this filters to only those dates after _time ```
| eval milestone1=mvmap(milestone1, if(_time > strptime(milestone1, "%F"), milestone1, null()))
``` and this takes the latest of them ```
| eval milestone1=max(milestone1)
| lookup release milestone1
| where isnotnull(release)
| sort _time

I used your data set of milestones for my lookup. 

I'm not sure if this is the best solution, but it avoids those expensive mvexpand/eventstats combinations.

View solution in original post

yuanliu
SplunkTrust
SplunkTrust

This is wonderful, @bowesmana!  I didn't realize that statistical functions apply to any multivalue data, not just in stats commands.

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Via the magic of posting😉, I made some progress that I hadn't been able to make in days.  But the performance is so poor.

As mentioned, I added a field "dummy" (with value "dumb") to the lookup so I can retrieve the entire table into every event.

| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
| mvexpand milestone1
| where _time > strptime(milestone1, "%F")
| eventstats max(milestone1) by _time
| where milestone1 == 'max(milestone1)'
| lookup release milestone1

This kind of get what I needed, but a search that returned < 9,000 events more than doubled search time (from 1.7s to 4s).  And this is only dealing with 5 values of milestone1.  As expected, eventstats seems to be the most expensive, followed by mvexpand.

Is there a better way to do this?

0 Karma

bowesmana
SplunkTrust
SplunkTrust

@yuanliu I saw your post on Slack. Your original solution is not optimal for a number of reasons - although it may not be the solution, I believe this is a more efficient solution to your lookup version given that

  • you only ever want a zero or one values of milestone1 from the lookup, where
    • zero if _time is < the earliest milestone1
    • one, where the value is the closest value (in the past) of milestone1 to _time 

so, there's no need to use mvexpand/where/eventstats/where at all, as you can use mvmap/eval to filter milestone1 values to zero or one values and then lookup/where to filter those events where _time<milestone1

| makeresults count=1000
| eval _time=now() - ((random() % 240) * 86400)
| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
``` this filters to only those dates after _time ```
| eval milestone1=mvmap(milestone1, if(_time > strptime(milestone1, "%F"), milestone1, null()))
``` and this takes the latest of them ```
| eval milestone1=max(milestone1)
| lookup release milestone1
| where isnotnull(release)
| sort _time

I used your data set of milestones for my lookup. 

I'm not sure if this is the best solution, but it avoids those expensive mvexpand/eventstats combinations.

Get Updates on the Splunk Community!

Stay Connected: Your Guide to July and August Tech Talks, Office Hours, and Webinars!

Dive into our sizzling summer lineup for July and August Community Office Hours and Tech Talks. Scroll down to ...

Edge Processor Scaling, Energy & Manufacturing Use Cases, and More New Articles on ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Get More Out of Your Security Practice With a SIEM

Get More Out of Your Security Practice With a SIEMWednesday, July 31, 2024  |  11AM PT / 2PM ETREGISTER ...