Splunk Search

How to compare search results with lookup for duplicates?

solaced
Explorer

I have a lookup which I want to compare search results against and find duplicate values.   How do I ignore duplicates found that already exist in my dataset? And only identify duplicates using results from my search compared to the dataset?

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2

Labels (1)
0 Karma
1 Solution

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

View solution in original post

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

solaced
Explorer

Thank you.  I added eval from_lookup="0" to the base search which accomplishes the same as your answer. Though yours is more elegant 🙂 

I appreciate your help, thank you so much!

0 Karma

solaced
Explorer

Thanks @scelikok  I think it's on the right track.  Unfortunately the output does not contain any results now from my index search.  Which I would expect show an empty  from_lookup value.

It only contains events from the inputlookup table which have from_lookup = 1

0 Karma

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

Can you please try the below search? If I understood correctly I added an extra field to be able to filter values from lookup.

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND from_lookup="1"

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.
0 Karma

solaced
Explorer

To be clear, I'm looking at the value phoneAndEmail and if it is used on multiple userName

I want to compare results from my index against my dataset, and not compare results WITHIN the dataset.

0 Karma
Get Updates on the Splunk Community!

CX Day is Coming!

Customer Experience (CX) Day is on October 7th!! We're so excited to bring back another day full of wonderful ...

Strengthen Your Future: A Look Back at Splunk 10 Innovations and .conf25 Highlights!

The Big One: Splunk 10 is Here!  The moment many of you have been waiting for has arrived! We are thrilled to ...

Now Offering the AI Assistant Usage Dashboard in Cloud Monitoring Console

Today, we’re excited to announce the release of a brand new AI assistant usage dashboard in Cloud Monitoring ...