I have a really simple task but haven't figured out how. This is a simple table of milestones
milestone1 | milestone2 | milestone3 | release |
2022-01-30 | 2022-02-28 | 2022-03-25 | 1_0 |
2022-04-20 | 2022-05-10 | 2022-05-25 | 1_1 |
2022-07-02 | 2022-07-21 | 2022-08-14 | 1_2 |
2022-09-20 | 2022-10-14 | 2022-11-03 | 1_3 |
2022-12-21 | 2023-01-11 | 2023-01-31 | 2_0 |
I need to determine the "release" cycle a given event is in, and perform some calculations in relationship to milestones. For illustration purposes, let's say if an event is in between milestone1 of 1_1 and 1_2 (2022-04-20 and 2022-07-02), I'll say it belongs in release cycle 1_1. In other words, milestone2, milestone3, etc., can be considered mere attributes that I need to retrieve. (In the real world, some columns are not dates.)
Initially I thought a simple lookup would suffice. But after various trials, I have made little progress. If I could devise a macro based on the lookup table to output the value of release, I can certainly then lookup in the table to obtain the rest of attributes. I even thought of adding a dummy (constant value) column so I could retrieve the entire table with every event. But even with that, I still couldn't find an easy way to match event with row.
The best I have come up with so far is to determine the current release by comparing | inputlookup with now(), like this
| inputlookup release
| where now() > strptime(milestone1, "%F")
| eventstats max(release) as current_release
| where release == current_release
If milestone1 is in epoc time, the search could be simpler but in any case, this only gives now, and I cannot really use it in a macro unless the macro is placed in a subsearch of sorts. (And if the macro is in a subsearch, I cannot pass event time as a parameter, this means that I still don't get to match with events.)
@yuanliu I saw your post on Slack. Your original solution is not optimal for a number of reasons - although it may not be the solution, I believe this is a more efficient solution to your lookup version given that
so, there's no need to use mvexpand/where/eventstats/where at all, as you can use mvmap/eval to filter milestone1 values to zero or one values and then lookup/where to filter those events where _time<milestone1
| makeresults count=1000
| eval _time=now() - ((random() % 240) * 86400)
| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
``` this filters to only those dates after _time ```
| eval milestone1=mvmap(milestone1, if(_time > strptime(milestone1, "%F"), milestone1, null()))
``` and this takes the latest of them ```
| eval milestone1=max(milestone1)
| lookup release milestone1
| where isnotnull(release)
| sort _time
I used your data set of milestones for my lookup.
I'm not sure if this is the best solution, but it avoids those expensive mvexpand/eventstats combinations.
This is wonderful, @bowesmana! I didn't realize that statistical functions apply to any multivalue data, not just in stats commands.
Via the magic of posting😉, I made some progress that I hadn't been able to make in days. But the performance is so poor.
As mentioned, I added a field "dummy" (with value "dumb") to the lookup so I can retrieve the entire table into every event.
| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
| mvexpand milestone1
| where _time > strptime(milestone1, "%F")
| eventstats max(milestone1) by _time
| where milestone1 == 'max(milestone1)'
| lookup release milestone1
This kind of get what I needed, but a search that returned < 9,000 events more than doubled search time (from 1.7s to 4s). And this is only dealing with 5 values of milestone1. As expected, eventstats seems to be the most expensive, followed by mvexpand.
Is there a better way to do this?
@yuanliu I saw your post on Slack. Your original solution is not optimal for a number of reasons - although it may not be the solution, I believe this is a more efficient solution to your lookup version given that
so, there's no need to use mvexpand/where/eventstats/where at all, as you can use mvmap/eval to filter milestone1 values to zero or one values and then lookup/where to filter those events where _time<milestone1
| makeresults count=1000
| eval _time=now() - ((random() % 240) * 86400)
| eval dummy="dumb"
| lookup release dummy OUTPUT milestone1
``` this filters to only those dates after _time ```
| eval milestone1=mvmap(milestone1, if(_time > strptime(milestone1, "%F"), milestone1, null()))
``` and this takes the latest of them ```
| eval milestone1=max(milestone1)
| lookup release milestone1
| where isnotnull(release)
| sort _time
I used your data set of milestones for my lookup.
I'm not sure if this is the best solution, but it avoids those expensive mvexpand/eventstats combinations.