Hello Splunkers,
I am trying to compare two multi value ID columns, and return true when at least of the values matches between these 2 ID columns.
For example:
ID1 | ID2 | Match |
402830 602369 |
602369 244633 |
TRUE |
402830 840317 602369 |
602369 244633 |
TRUE |
152893 443482 |
602369 244633 |
FALSE |
227213 244633 |
602369 244633 |
TRUE |
422210 442824 |
602369 244633 |
FALSE |
The question is how to create the Match column by comparing ID1 to ID2. They are both multi value fields, and one field could contain up to 25 values. As long as there is one match between ID1 and ID2, the match returns TRUE.
I have tried match() and mvfind(), but haven't found any luck.
Thanks all!
Similar to @bowesmana's idea but using a "feature" (or side effect if you like) of SPL's array operation, you can do
| eval Match = mvmap(ID1, if(ID2 == ID1, "TRUE", null())) ``` null only if all tests fail ```
| eval Match = if(isnull(Match), "FALSE", "TRUE")
Interestingly, when SPL evaluates equality between a single value with multivalue, it evaluates to true when any one of multivalue is true. The advantage here is regex match in mvfind can give you false match unless you format the parameters very carefully, or if all IDs are of equal length. For example, "123" can match "12345". (Additionally, regex is more expensive than simple ASCII equality.)
Here is a demonstration with emulation of your sample data:
| makeresults
| eval _raw="ID1 ID2
402830,602369 602369,244633
402830,840317,602369 602369,244633
152893,443482 602369,244633
227213,244633 602369,244633
422210,442824 602369,244633"
| multikv forceheader=1
| fields - _time _raw linecount
| foreach ID*
[eval <<FIELD>> = split(<<FIELD>>, ",")]
``` data emulation above ```
| eval Match = mvmap(ID1, if(ID2 == ID1, "TRUE", null()))
| eval Match = if(isnull(Match), "FALSE", "TRUE")
You get
ID1 | ID2 | Match |
402830 602369 | 602369 244633 | TRUE |
402830 840317 602369 | 602369 244633 | TRUE |
152893 443482 | 602369 244633 | FALSE |
227213 244633 | 602369 244633 | TRUE |
422210 442824 | 602369 244633 | FALSE |
Use this
| eval match=max(mvmap(ID1, if(isnotnull(mvfind(ID2, ID1)), 1, 0)))
value 1 is a match
Here's a working example with your data
| makeresults
| eval t=split("402830,602369|602369,244633;402830,840317,602369|602369,244633;152893,443482|602369,244633;227213,244633|602369,244633;422210,442824|602369,244633", ";")
| mvexpand t
| eval IDs=split(t, "|")
| eval ID1=split(mvindex(IDs, 0), ","), ID2=split(mvindex(IDs, 1), ",")
| table ID1 ID2
| eval match=max(mvmap(ID1, if(isnotnull(mvfind(ID2, ID1)), 1, 0)))
Combine the multivalued fields, take a count, then dedup and count again. If the count goes down after deduping, you have a match.
<base_search>
| eval id_combined=MVAPPEND(ID1, ID2)
| eval id_ct=MVCOUNT(id_combined)
| eval id_combined=MVDEDUP(id_combined)
| eval id_dc=MVCOUNT(id_combined)
| eval match=IF(id_ct>id_dc, "TRUE", "FALSE")