This is a little more complicated than it needs to be and could probably be tightened up, but it works.
| makeresults
| eval mydata="1,apple,boy 2,apple,girl 3,boy,apple 4,boy,girl 5,girl,apple 6,boy,apple"
| makemv mydata
| mvexpand mydata
| makemv delim="," mydata
| eval theindex=mvindex(mydata,0)
| eval field1=mvindex(mydata,1)
| eval field2=mvindex(mydata,2)
| table theindex field1 field2
| eval matchsame=field2."!!!!".field1
| eval matchall=if(field1<field2,field1."!!!!".field2,field2."!!!!".field1)
| eventstats count as matchallcount by matchall
| where matchallcount>1
| stats list(theindex) as theindex list(matchsame) as matchsame by matchall
| eval theindex = mvzip(theindex,matchsame)
| table theindex
| eval saveindex=theindex
| mvexpand theindex
| mvexpand saveindex
| where theindex!=saveindex
| makemv delim="," theindex
| eval matchsame=mvindex(theindex,1)
| eval theindex=mvindex(theindex,0)
| makemv delim="," saveindex
| eval matchother=mvindex(saveindex,1)
| eval matchindex=mvindex(saveindex,0)
| table theindex matchsame matchindex matchother
| stats values(matchindex) as matchindex by theindex matchsame matchother
| sort 0 theindex
| eval matchstatement=if(matchsame=matchother," is a duplicate of "," matches with ")
| eval matchindex=mvjoin(matchindex,", ")
| eval mymessage=theindex.matchstatement.matchindex
| table mymessage
That produces the following results:
1 matches with 3, 6
2 matches with 5
3 is a duplicate of 6
3 matches with 1
5 matches with 2
6 is a duplicate of 3
6 matches with 1
By the way, there's nothing splunky or magical about those four exclamation points ("!!!!"). I use them as a delimiter simply because, in my installation, that combination of characters is highly unlikely to occur in the data. If your organization deals with tweets or teens or excitable people, then substitute something else: six percent signs, or whatever.
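As a rough sketch of that substitution (not tested against your data), the two eval lines that build matchsame and matchall could take the delimiter from a field instead of hard-coding it. The field name pairdelim is just a placeholder I made up for illustration; pick any name and any string that won't appear in field1 or field2.

```
| eval pairdelim="%%%%%%"
| eval matchsame=field2.pairdelim.field1
| eval matchall=if(field1<field2,field1.pairdelim.field2,field2.pairdelim.field1)
```

These would stand in for the original matchsame and matchall evals. Everything downstream keys off matchsame and matchall, so the rest of the search should not need to change. The one thing to avoid is a delimiter containing a comma, since the later mvzip and makemv delim="," steps rely on commas to separate the index from the pair.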