Splunk Search

How to compare search results with lookup for duplicates?

solaced
Explorer

I have a lookup which I want to compare search results against and find duplicate values.   How do I ignore duplicates found that already exist in my dataset? And only identify duplicates using results from my search compared to the dataset?

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2

Labels (1)
0 Karma
1 Solution

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

View solution in original post

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

solaced
Explorer

Thank you.  I added eval from_lookup="0" to the base search which accomplishes the same as your answer. Though yours is more elegant 🙂 

I appreciate your help, thank you so much!

0 Karma

solaced
Explorer

Thanks @scelikok  I think it's on the right track.  Unfortunately the output does not contain any results now from my index search.  Which I would expect show an empty  from_lookup value.

It only contains events from the inputlookup table which have from_lookup = 1

0 Karma

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

Can you please try the below search? If I understood correctly I added an extra field to be able to filter values from lookup.

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND from_lookup="1"

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.
0 Karma

solaced
Explorer

To be clear, I'm looking at the value phoneAndEmail and if it is used on multiple userName

I want to compare results from my index against my dataset, and not compare results WITHIN the dataset.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Unlock What’s Next: The Splunk Cloud Platform at .conf25

In just a few days, Boston will be buzzing as the Splunk team and thousands of community members come together ...