Splunk Search

How to compare search results with lookup for duplicates?

solaced
Explorer

I have a lookup which I want to compare search results against and find duplicate values.   How do I ignore duplicates found that already exist in my dataset? And only identify duplicates using results from my search compared to the dataset?

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2

Labels (1)
0 Karma
1 Solution

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

View solution in original post

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

solaced
Explorer

Thank you.  I added eval from_lookup="0" to the base search which accomplishes the same as your answer. Though yours is more elegant 🙂 

I appreciate your help, thank you so much!

0 Karma

solaced
Explorer

Thanks @scelikok  I think it's on the right track.  Unfortunately the output does not contain any results now from my index search.  Which I would expect show an empty  from_lookup value.

It only contains events from the inputlookup table which have from_lookup = 1

0 Karma

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

Can you please try the below search? If I understood correctly I added an extra field to be able to filter values from lookup.

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND from_lookup="1"

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.
0 Karma

solaced
Explorer

To be clear, I'm looking at the value phoneAndEmail and if it is used on multiple userName

I want to compare results from my index against my dataset, and not compare results WITHIN the dataset.

0 Karma
Get Updates on the Splunk Community!

Splunk Observability as Code: From Zero to Dashboard

For the details on what Self-Service Observability and Observability as Code is, we have some awesome content ...

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Shape the Future of Splunk: Join the Product Research Lab!

Join the Splunk Product Research Lab and connect with us in the Slack channel #product-research-lab to get ...