Splunk Search

How to search for field values that appears in another field and vice versa?

chensy
Engager

Basically what I'm trying to ask is, for instance if my data is like this

index, field1, field2
1, apple, boy
2, apple, girl
3, boy, apple
4, boy, girl
5, girl, apple

How do i format my query to give me a results that says:

1 matches with 3
2 matches with 5
3 matches with 1 (optional but good to have)
5 matches with 2 (optional but good to have)

Thanks for your help.

0 Karma
1 Solution

DalJeanis
SplunkTrust
SplunkTrust

This part makes your test data

| makeresults 
| eval mydata="1,apple,boy 2,apple,girl 3,boy,apple 4,boy,girl 5,girl,apple"
| makemv mydata 
| mvexpand mydata 
| makemv delim="," mydata 
| eval theindex=mvindex(mydata,0)
| eval field1=mvindex(mydata,1) 
| eval field2=mvindex(mydata,2)
| table theindex field1 field2

This part builds a synthetic key to match by, finds all matches, then reports on them.

| eval concatvalues=if(field1<field2,field1."!!!!".field2,field2."!!!!".field1)
| stats values(theindex) as theindex by concatvalues
| where mvcount(theindex)>1
| eval saveindex = theindex
| mvexpand theindex
| mvexpand saveindex
| where theindex!=saveindex
| sort 0 theindex saveindex
| eval mymessage=theindex." matches with ".saveindex
| table mymessage

Note- the above will also work appropriately with multiples (more than two matches). For instance, a match-3 will result in 6 lines, a match-4 results in 12 lines, etc. To collapse those multiple lines into one line per index would require a little additional code as below -

 | makeresults 
 | eval mydata="1,apple,boy 2,apple,girl 3,boy,apple 4,boy,girl 5,girls,apple 6,boy,apple 7,apple,boy"
 | makemv mydata 
 | mvexpand mydata 
 | makemv delim="," mydata 
 | eval theindex=mvindex(mydata,0)
 | eval field1=mvindex(mydata,1) 
 | eval field2=mvindex(mydata,2)
 | table theindex field1 field2


 | eval concatvalues=if(field1<field2,field1."!!!!".field2,field2."!!!!".field1)
 | stats values(theindex) as theindex by concatvalues
 | where mvcount(theindex)>1
 | eval saveindex = theindex
 | mvexpand theindex
 | mvexpand saveindex
 | where theindex!=saveindex
 | stats values(saveindex) as saveindex by theindex
 | eval saveindex=mvjoin(saveindex,", ")
 | sort 0 theindex
 | eval mymessage=theindex." matches with ".saveindex
 | table mymessage

produces

1 matches with 3, 6, 7
2 matches with 5
3 matches with 1, 6, 7
5 matches with 2
6 matches with 1, 3, 7
7 matches with 1, 3, 6

View solution in original post

DalJeanis
SplunkTrust
SplunkTrust

This part makes your test data

| makeresults 
| eval mydata="1,apple,boy 2,apple,girl 3,boy,apple 4,boy,girl 5,girl,apple"
| makemv mydata 
| mvexpand mydata 
| makemv delim="," mydata 
| eval theindex=mvindex(mydata,0)
| eval field1=mvindex(mydata,1) 
| eval field2=mvindex(mydata,2)
| table theindex field1 field2

This part builds a synthetic key to match by, finds all matches, then reports on them.

| eval concatvalues=if(field1<field2,field1."!!!!".field2,field2."!!!!".field1)
| stats values(theindex) as theindex by concatvalues
| where mvcount(theindex)>1
| eval saveindex = theindex
| mvexpand theindex
| mvexpand saveindex
| where theindex!=saveindex
| sort 0 theindex saveindex
| eval mymessage=theindex." matches with ".saveindex
| table mymessage

Note- the above will also work appropriately with multiples (more than two matches). For instance, a match-3 will result in 6 lines, a match-4 results in 12 lines, etc. To collapse those multiple lines into one line per index would require a little additional code as below -

 | makeresults 
 | eval mydata="1,apple,boy 2,apple,girl 3,boy,apple 4,boy,girl 5,girls,apple 6,boy,apple 7,apple,boy"
 | makemv mydata 
 | mvexpand mydata 
 | makemv delim="," mydata 
 | eval theindex=mvindex(mydata,0)
 | eval field1=mvindex(mydata,1) 
 | eval field2=mvindex(mydata,2)
 | table theindex field1 field2


 | eval concatvalues=if(field1<field2,field1."!!!!".field2,field2."!!!!".field1)
 | stats values(theindex) as theindex by concatvalues
 | where mvcount(theindex)>1
 | eval saveindex = theindex
 | mvexpand theindex
 | mvexpand saveindex
 | where theindex!=saveindex
 | stats values(saveindex) as saveindex by theindex
 | eval saveindex=mvjoin(saveindex,", ")
 | sort 0 theindex
 | eval mymessage=theindex." matches with ".saveindex
 | table mymessage

produces

1 matches with 3, 6, 7
2 matches with 5
3 matches with 1, 6, 7
5 matches with 2
6 matches with 1, 3, 7
7 matches with 1, 3, 6

chensy
Engager

@DalJeanis thank you for your help! works great.

1 question though, how can I then differentiate between duplicated entries from matched entries?

e.g.

1 matches with 3, 6 
1 is duplicate of 7 
2 matches with 5 
3 matches with 1, 7
3 is duplicate of 6 
5 matches with 2 
6 matches with 1, 7
6 is duplicate of 3
7 matches with 3, 6 
7 is duplicate of 1

appreciate your expertise in this area

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

It's a little more complicated, probably could be tightened up but it works.

| makeresults 
| eval mydata="1,apple,boy 2,apple,girl 3,boy,apple 4,boy,girl 5,girl,apple 6,boy,apple"
| makemv mydata 
| mvexpand mydata 
| makemv delim="," mydata 
| eval theindex=mvindex(mydata,0)
| eval field1=mvindex(mydata,1) 
| eval field2=mvindex(mydata,2)
| table theindex field1 field2

| eval matchsame=field2."!!!!".field1
| eval matchall=if(field1<field2,field1."!!!!".field2,field2."!!!!".field1)
| eventstats count as matchallcount by matchall
| where matchallcount>1
| stats list(theindex) as theindex list(matchsame) as matchsame by matchall
| eval theindex = mvzip(theindex,matchsame)
| table theindex
| eval saveindex=theindex
| mvexpand theindex
| mvexpand saveindex
| where theindex!=saveindex
| makemv delim="," theindex
| eval matchsame=mvindex(theindex,1)
| eval theindex=mvindex(theindex,0)
| makemv delim="," saveindex
| eval matchother=mvindex(saveindex,1)
| eval matchindex=mvindex(saveindex,0)
| table theindex matchsame matchindex matchother
| stats values(matchindex) as matchindex by theindex matchsame matchother
| sort 0 theindex
| eval matchstatement=if(matchsame=matchother," is a duplicate of "," matches with ") 
| eval matchindex=mvjoin(matchindex,", ")
| eval mymessage=theindex.matchstatement.matchindex
| table mymessage

producing the following results -

1 matches with 3, 6
2 matches with 5
3 is a duplicate of 6
3 matches with 1
5 matches with 2
6 is a duplicate of 3
6 matches with 1

By the way, there's nothing splunky or magical about those four exclamation points ("!!!!"). I use those as a delimiter simply because for my particular installation, that particular combination of characters is highly unlikely to happen in my data. If your organization deals with tweets or teens or excitable people, then substitute something else - six percent signs, or whatever.

0 Karma

niketn
Legend

You can apply bubble sort before adding the two fields. Using eval if perform a string comparison of field1 and field2 and whichever is smaller should be first field and bigger should be second field.

<Your Base Search returning id field1 and field2>
| eval sortedFields=if(field1>field2,field2+field1,field1+field2)
| stats count as match values(id) as id values(field1) as field1 values(field2) as field2 by sortedFields

All the rows with count greater than 1 are matches and id column has the IDs that matched.

| search match>1
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

niketn
Legend

@chensy... Please confirm if you were able to try out above and whether it solves the issue faced by you or not.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

chensy
Engager

hello @niketnilay, thanks for your help. Please see @DalJeanis solution from below

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In the last month, the Splunk Threat Research Team (STRT) has had 2 releases of new security content via the ...

Announcing the 1st Round Champion’s Tribute Winners of the Great Resilience Quest

We are happy to announce the 20 lucky questers who are selected to be the first round of Champion's Tribute ...

We’ve Got Education Validation!

Are you feeling it? All the career-boosting benefits of up-skilling with Splunk? It’s not just a feeling, it's ...