Splunk Search

How to compare search results with lookup for duplicates?

solaced
Explorer

I have a lookup which I want to compare search results against and find duplicate values.   How do I ignore duplicates found that already exist in my dataset? And only identify duplicates using results from my search compared to the dataset?

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2

Labels (1)
0 Karma
1 Solution

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

View solution in original post

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

solaced
Explorer

Thank you.  I added eval from_lookup="0" to the base search which accomplishes the same as your answer. Though yours is more elegant 🙂 

I appreciate your help, thank you so much!

0 Karma

solaced
Explorer

Thanks @scelikok  I think it's on the right track.  Unfortunately the output does not contain any results now from my index search.  Which I would expect show an empty  from_lookup value.

It only contains events from the inputlookup table which have from_lookup = 1

0 Karma

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

Can you please try the below search? If I understood correctly I added an extra field to be able to filter values from lookup.

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND from_lookup="1"

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.
0 Karma

solaced
Explorer

To be clear, I'm looking at the value phoneAndEmail and if it is used on multiple userName

I want to compare results from my index against my dataset, and not compare results WITHIN the dataset.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Calling All Security Pros: Ready to Race Through Boston?

Hey Splunkers, .conf25 is heading to Boston and we’re kicking things off with something bold, competitive, and ...

Beyond Detection: How Splunk and Cisco Integrated Security Platforms Transform ...

Financial services organizations face an impossible equation: maintain 99.9% uptime for mission-critical ...

Customer success is front and center at .conf25

Hi Splunkers, If you are not able to be at .conf25 in person, you can still learn about all the latest news ...