Splunk Search

Help finding new solution when the lookup file contains strings and regular expressions

tehong
Explorer

Hello guys!!

I have a question about the lookup command when the lookup file contains strings and regular expressions.

The following is an example.

field var_1 : String
field var_2 : String
field var_3 : Regex or String
field var_4 : String

lookup file

------lookup file-----------------------------
var_1, var_2, var_3, var_4
data10, data11, .+(:?aaa|bbb), data13
data20, data21, .+(:?ccc|ddd|eee), data23
data30, data31, .+(:?eee)fff+(:?ggg|hhh), data33
--------------------------------------------------

I would like to return var_4 when var_1, var_2, and var_3 are matched by the lookup command, but var_3 may contain a regular expression, and the lookup needs to match the condition of the regular expression.
As you know, regular expressions are not allowed in the lookup-field in the lookup command.

↓↓↓ Regular expressions cannot be used ↓↓↓

| makeresults
| eval var_1 = "data10", var_2 = "data11" , var_3 = "ABC123aaa"
| lookup var_1 var_2 var_3 OUTPUT var_4

It is necessary to use the lookup file (csv). If the lookup command is not the best way to solve this problem, then another command such as join is fine to use. Obviously, I don’t intend to use only the lookup command. I’m looking for other ways to do it as well.

Can someone please help me with this?

Thanks in advance!!

Labels (5)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

Not long ago, ITWhisperer helped me solve the same problem (https://community.splunk.com/t5/Splunk-Search/Can-I-save-mvexpand-when-matching-a-multivalue-lookup/...).  There were other solutions involving mvexpand or mvjoin in the past. (My use case is complicated by the requirement that date comparison is required for one of the returned fields.)  It was my desire to avoid mvexpand.  ITWhisperer's solution uses appendcols, filldown, mvmap, mvzip, mvindex, and split.

In my real code, I decided to skip appendcols and filldown, just stick with crazy zip and unzip. (I have more fields to extract.) To apply to your situation, 

 

| lookup mylookup var_1 var_2 OUTPUTNEW var_3 AS var_3_regex var_4
| eval crazystring = mvzip(mvzip(mvzip(var_3_regex, var_1, "|:crazy:|"), var_2, "|:crazy:|"), var_3_regex, "|:crazy:|")
| eval crazymatch = mvmap(crazystring, if(match(var_3, crazystring), crazystring, null()))
| eventstats values(crazymatch) as crazymatch_sorted by var_1 var_2 var_4 ``` can substitute with stats to improve performance ```
| eval crazymatch = mvindex(crazymatch_sorted, -1)
| eval var4 = mvmap(crazymatch, mvindex(split(crazymatch, "|:crazy:|"), 3))
| fields - crazy*

 

Note: the string "|:crazy:|" used in mvzip is to avoid incidental matches; using conventional delimiters like "," can easily land you in such trouble.

The idea is the same as PickleRick mentioned, except the combination of mvmap() and mvindex() allows extraction when multiple var_1 and var_2 values match.

View solution in original post

Tags (4)

yuanliu
SplunkTrust
SplunkTrust

Not long ago, ITWhisperer helped me solve the same problem (https://community.splunk.com/t5/Splunk-Search/Can-I-save-mvexpand-when-matching-a-multivalue-lookup/...).  There were other solutions involving mvexpand or mvjoin in the past. (My use case is complicated by the requirement that date comparison is required for one of the returned fields.)  It was my desire to avoid mvexpand.  ITWhisperer's solution uses appendcols, filldown, mvmap, mvzip, mvindex, and split.

In my real code, I decided to skip appendcols and filldown, just stick with crazy zip and unzip. (I have more fields to extract.) To apply to your situation, 

 

| lookup mylookup var_1 var_2 OUTPUTNEW var_3 AS var_3_regex var_4
| eval crazystring = mvzip(mvzip(mvzip(var_3_regex, var_1, "|:crazy:|"), var_2, "|:crazy:|"), var_3_regex, "|:crazy:|")
| eval crazymatch = mvmap(crazystring, if(match(var_3, crazystring), crazystring, null()))
| eventstats values(crazymatch) as crazymatch_sorted by var_1 var_2 var_4 ``` can substitute with stats to improve performance ```
| eval crazymatch = mvindex(crazymatch_sorted, -1)
| eval var4 = mvmap(crazymatch, mvindex(split(crazymatch, "|:crazy:|"), 3))
| fields - crazy*

 

Note: the string "|:crazy:|" used in mvzip is to avoid incidental matches; using conventional delimiters like "," can easily land you in such trouble.

The idea is the same as PickleRick mentioned, except the combination of mvmap() and mvindex() allows extraction when multiple var_1 and var_2 values match.

Tags (4)

PickleRick
SplunkTrust
SplunkTrust

Nice idea (forgot about the mvzip completely) but it needs a properly "cooked" lookup. You effectively have a lookup by two fields and then additional contidion matching on third field (done pretty nice I admit :-)).

PickleRick
SplunkTrust
SplunkTrust

If you could guarantee uniqueness of var_1,var_2 pairs with var_3 as an additional condition whether that single var_1, var_2 pair matches or not, you could do that by looking up var_3 as an output from the lookup together with var_4 and verifying the match. But if you want to have many pairs of the same var_1 and var_2 with different var_3... I don't see any reasonable way to do so short of writing your own external lookup command.

Yes, theoretically you could doing strange aerobatics with inputlookup, rendering the whole lookup to a flar vector, then iterating over the fields to find match manually but this idea is wrong on so many levels...

 

Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...