Solved: Can I save mvexpand when matching a multivalue loo...

yuanliu · ‎06-18-2021

I have a lookup that can return multivalue for two fields, one of them a timestamp, like this

key	text	date
key1	abc\|def	2021-04-06
key1	efg\|hij	2021-06-04
...

A lookup on key1 will return

key

text

date

key1

abc|def

efg|hij

2021-04-06

2021-12-31

I then want to match the text field as regex, and perform calculation on the matching date. For example, match(text, mytext) where mytext = "abc", and compare now() > strptime(date, "%Y-%m-%d").

I saw many mvexpand solutions in the past, and some mvjoin() solution. mvexpand is expensive for my data, and mvjoin() only works for regex on text.

Is there some way to do this without mvexpand?

ITWhisperer · ‎06-19-2021

Here is a runanywhere example - the first part sets up some dummy data with different combinations of text and dates; the second part effectively is the results of the lookup (and should be replaced with the real lookup; and, the last part does the mvzip and mvmap to process the matching to return which entry matched the text and date.

| makeresults 
| eval field=split("abc123,def234,cde345,abcdef987,efg234,abchij",",")
| mvexpand field
| eval time=split("2021-04-05,2021-04-06,2021-06-04,2021-06-05",",")
| mvexpand time


| appendcols 
    [| makeresults 
    | eval _raw="key	text	date
key1	abc|def	2021-04-06
key1	efg|hij	2021-06-04"
    | multikv forceheader=1
    | fields - _* linecount
    | stats list(*) as * by key]
| filldown key text date


| eval comb=mvzip(text,date,",")
| eval match=mvmap(comb,if(match(field,mvindex(split(comb,","),0)),if(time=mvindex(split(comb,","),1),comb,null),null))
| table time field key text date comb match

View solution in original post

yuanliu · ‎06-18-2021

One crazy solution is to use mvzip() to "tie" those potentially multivalue fields together, perform the match against tied string (considerations in not create false matches, too), then split the matched jumbo string to return text, date in the match.

| eval crazymv = mvzip(text, date, "|:crazy:|"),
crazymatch = mvfind(crazymv, "abc"),
crazysv = mvindex(crazymv, crazymatch)
crazysplit = split(crazysv, "|:crazy:|"), 1),
text = mvindex(crazysplit, 0),
date = mvindex(crazysplit, 1)

ITWhisperer · ‎06-18-2021

Can you mvzip text and date, then mvmap the zipped field where the function uses the two halves for a match and date comparison?

In what way is mvexpand "expensive"? If you need an alternative to mvexpand, I posted a solution here, although this was more to do with avoiding the limitations of mvexpand and may be just as "expensive" in your case as there are a lot of steps involved to perform the same action, but it does have the benefit of being able to expand more than one field at the same time.

yuanliu · ‎06-18-2021

Yes, mvzip(), then unzip (split()) is what I ended up doing. The whole operation feels crazy.

@ITWhisperer wrote:
In what way is mvexpand "expensive"? If you need an alternative to mvexpand, I posted a solution here, although this was more to do with avoiding the limitations of mvexpand and may be just as "expensive" in your case as there are a lot of steps involved to perform the same action, but it does have the benefit of being able to expand more than one field at the same time.

mvexpand multiples total number of events down the stream. In my example, there will be 4x original events. BTW, that link is broken.

ITWhisperer · ‎06-19-2021

Try this address https://community.splunk.com/t5/Splunk-Search/mvexpand-limits/m-p/549178

yuanliu · ‎06-18-2021

mvexpand multiples total number of events down the stream. In my example, there will be 4x original

I had thought that if I did mvexpand in a subsearch that had very few events, or even just synthetic events, but incorporated the lookup, mvexpand would be cheap inside. But I couldn't figure out how to make it do what I wanted. The mvzip() method was from a long time ago, but isn't very SPL-like.

ITWhisperer · ‎06-19-2021

Here is a runanywhere example - the first part sets up some dummy data with different combinations of text and dates; the second part effectively is the results of the lookup (and should be replaced with the real lookup; and, the last part does the mvzip and mvmap to process the matching to return which entry matched the text and date.

| makeresults 
| eval field=split("abc123,def234,cde345,abcdef987,efg234,abchij",",")
| mvexpand field
| eval time=split("2021-04-05,2021-04-06,2021-06-04,2021-06-05",",")
| mvexpand time


| appendcols 
    [| makeresults 
    | eval _raw="key	text	date
key1	abc|def	2021-04-06
key1	efg|hij	2021-06-04"
    | multikv forceheader=1
    | fields - _* linecount
    | stats list(*) as * by key]
| filldown key text date


| eval comb=mvzip(text,date,",")
| eval match=mvmap(comb,if(match(field,mvindex(split(comb,","),0)),if(time=mvindex(split(comb,","),1),comb,null),null))
| table time field key text date comb match

yuanliu · ‎06-19-2021

So, this solution took two elements that I wasn't familiar.

appendcols - compared to mvexpand, it doesn't increase number of events.
mvmap() - an enumerable feature in many languages that skipped my mind when you first mentioned it. Much cleaner than the crazy mvindex(), split() sequence that I had used.

To illustrate in pseudo code,

`THE search to return key and textfield for matching lookup`

| appendcols [
  lookup key_text_date key
]
| filldown key text date

| eval comb=mvzip(text, date, ",")
| eval match=mvmap(comb, if(match(textfield, mvindex(split(comb, ","), 0)), if(time=mvindex(split(comb, ","), 1), comb, null), null))

This is the SPLeque solution I am looking for. Thanks, @ITWhisperer

yuanliu · ‎06-19-2021

To illustrate in pseudo code,

...
| eval match=mvmap(comb, if(match(textfield, mvindex(split(comb, ","), 0)), if(time=mvindex(split(comb, ","), 1), comb, null), null))

Mmm I didn't inspect the code close enough. The last piece tries to match a nonexistent "time" field with the date field from lookup. The real use case of the date field is to be compared with something, not a precise match. `THE Search to return key and textfield for matching lookup` only returns "key" to match the exact "key" in lookup, and a "textfield" to be matched as regex against "text" in lookup. Like so

| makeresults count=4
| eval keyfield="key1", textfield=split("abc123,def234,cde345,abcdef987,efg234,abchij",",")
| mvexpand textfield
| streamstats count
| eval _time=_time - count * 864000

"date" in lookup is not used for matching events, but for comparison or other purposes, e.g., comparison and display.

As such, the general method of using mvzip - mvmap still stands, although another split is warranted to extract matching "date" for display purposes. (I'll use comparison _time <date as matching criterion in this illustration, although it can be any operation.)

`THE search to return key and textfield for matching lookup`

| lookup key_text_date key

| eval comb=mvzip(text, date, ",")
| eval match=mvmap(comb,
 if(match(textfield, mvindex(split(comb, ","), 0)),
 if(_time < strptime(mvindex(split(comb, ","), 1), "%Y-%m-%d"), comb, null), null)
)
| eval date = mvindex(split(match, ","), 1)

(I also realize that appendcol is just to simulate lookup; lookup effectively appends columns to search results.) If I need matching text value for display, another mvindex() will be needed. The code will look similarly complex, but this solution doesn't require consideration for the mvjoin() and split() delimiter.

yuanliu · ‎06-20-2021

(I'll use comparison _time <date as matching criterion in this illustration, although it can be any operation.)

| eval match=mvmap(comb,
 if(match(textfield, mvindex(split(comb, ","), 0)),
 if(_time < strptime(mvindex(split(comb, ","), 1), "%Y-%m-%d"), comb, null), null)
)
| eval date = mvindex(split(match, ","), 1)

Note on application: Sometimes it is better to leave the operation on date out of mvmap() altogether; instead, perform that operation AFTER obtaining single-value "date" field. This can be important if you are using flexible selectors in a dashboard. In fact, the resultant code is outright more readable.

`THE search to return key and textfield for matching lookup`
| lookup key_text_date key

| eval comb=mvzip(text, date, ",")
| eval textmatch=mvmap(comb, if(match(textfield, mvindex(split(comb, ","), 0)), comb, null))
| eval date = mvindex(split(textmatch, ","), 1)
| eval fullmatch=if(_time < strptime(date, "%Y-%m-%d"), "yes", null)

(One more note on syntax. I have gotten into the habit of using null() in evals, not realizing that a side effect of SPL syntax makes unquoted token "null" equivalent to null(). Of course, using side effect has its side effects. The unquoted token may as well be "full" and it will still be interpreted as null(); e.g., the last eval above can be written as

| eval fullmatch=if(_time < strptime(date, "%Y-%m-%d"), "yes", yes)

and remains semantically identical.)

ITWhisperer · ‎06-20-2021

Yes, the first part of my solution is replaced by your search, the second part is replaced by your lookup, the third part is modified to fit the comparisons you need in your specific use case. Sometimes, it is better to leave you with something to do, to give you an opportunity to understand the concepts behind the solution. 😀

Can I save mvexpand when matching a multivalue lookup?

lookup

Stay Connected: Your Guide to November Tech Talks, Office Hours, and Webinars!

Transform your security operations with Splunk Enterprise Security

Splunk Admins and App Developers | Earn a $35 gift card!