Splunk Search

How to compare events with "milestone" lookup?

yuanliu
SplunkTrust
SplunkTrust

I have a really simple task but haven't figured out how.  This is a simple table of milestones

milestone1milestone2milestone3release
2022-01-302022-02-282022-03-251_0
2022-04-202022-05-102022-05-251_1
2022-07-022022-07-212022-08-141_2
2022-09-202022-10-142022-11-031_3
2022-12-212023-01-112023-01-312_0

I need to determine the "release" cycle a given event is in, and perform some calculations in relationship to milestones.  For illustration purposes, let's say if an event is in between milestone1 of 1_1 and 1_2 (2022-04-20 and 2022-07-02), I'll say it belongs in release cycle 1_1.  In other words, milestone2, milestone3, etc., can be considered mere attributes that I need to retrieve. (In the real world, some columns are not dates.)

Initially I thought a simple lookup would suffice.  But after various trials, I have made little progress.  If I could devise a macro based on the lookup table to output the value of release, I can certainly then lookup in the table to obtain the rest of attributes.  I even thought of adding a dummy (constant value) column so I could retrieve the entire table with every event.  But even with that, I still couldn't find an easy way to match event with row.

The best I have come up with so far is to determine the current release by comparing | inputlookup with now(), like this

 

| inputlookup release
| where now() > strptime(milestone1, "%F")
| eventstats max(release) as current_release
| where release == current_release

 

If milestone1 is in epoc time, the search could be simpler but in any case, this only gives now, and I cannot really use it in a macro unless the macro is placed in a subsearch of sorts. (And if the macro is in a subsearch, I cannot pass event time as a parameter, this means that I still don't get to match with events.)

Labels (2)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

@yuanliu I saw your post on Slack. Your original solution is not optimal for a number of reasons - although it may not be the solution, I believe this is a more efficient solution to your lookup version given that

  • you only ever want a zero or one values of milestone1 from the lookup, where
    • zero if _time is < the earliest milestone1
    • one, where the value is the closest value (in the past) of milestone1 to _time 

so, there's no need to use mvexpand/where/eventstats/where at all, as you can use mvmap/eval to filter milestone1 values to zero or one values and then lookup/where to filter those events where _time<milestone1

| makeresults count=1000
| eval _time=now() - ((random() % 240) * 86400)
| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
``` this filters to only those dates after _time ```
| eval milestone1=mvmap(milestone1, if(_time > strptime(milestone1, "%F"), milestone1, null()))
``` and this takes the latest of them ```
| eval milestone1=max(milestone1)
| lookup release milestone1
| where isnotnull(release)
| sort _time

I used your data set of milestones for my lookup. 

I'm not sure if this is the best solution, but it avoids those expensive mvexpand/eventstats combinations.

View solution in original post

yuanliu
SplunkTrust
SplunkTrust

This is wonderful, @bowesmana!  I didn't realize that statistical functions apply to any multivalue data, not just in stats commands.

0 Karma

yuanliu
SplunkTrust
SplunkTrust

Via the magic of posting😉, I made some progress that I hadn't been able to make in days.  But the performance is so poor.

As mentioned, I added a field "dummy" (with value "dumb") to the lookup so I can retrieve the entire table into every event.

| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
| mvexpand milestone1
| where _time > strptime(milestone1, "%F")
| eventstats max(milestone1) by _time
| where milestone1 == 'max(milestone1)'
| lookup release milestone1

This kind of get what I needed, but a search that returned < 9,000 events more than doubled search time (from 1.7s to 4s).  And this is only dealing with 5 values of milestone1.  As expected, eventstats seems to be the most expensive, followed by mvexpand.

Is there a better way to do this?

0 Karma

bowesmana
SplunkTrust
SplunkTrust

@yuanliu I saw your post on Slack. Your original solution is not optimal for a number of reasons - although it may not be the solution, I believe this is a more efficient solution to your lookup version given that

  • you only ever want a zero or one values of milestone1 from the lookup, where
    • zero if _time is < the earliest milestone1
    • one, where the value is the closest value (in the past) of milestone1 to _time 

so, there's no need to use mvexpand/where/eventstats/where at all, as you can use mvmap/eval to filter milestone1 values to zero or one values and then lookup/where to filter those events where _time<milestone1

| makeresults count=1000
| eval _time=now() - ((random() % 240) * 86400)
| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
``` this filters to only those dates after _time ```
| eval milestone1=mvmap(milestone1, if(_time > strptime(milestone1, "%F"), milestone1, null()))
``` and this takes the latest of them ```
| eval milestone1=max(milestone1)
| lookup release milestone1
| where isnotnull(release)
| sort _time

I used your data set of milestones for my lookup. 

I'm not sure if this is the best solution, but it avoids those expensive mvexpand/eventstats combinations.

Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...