Splunk Search

How to search for field values that are returned in one search that don't appear in another search?

djconroy
Path Finder

I am trying to come up with the search syntax that returns the values of a field that exist in one search but not in another. For example:

SourceIP="*" earliest=-48h@h latest=-24h@h | stats count by SourceIP | fields SourceIP

and

SourceIP="*" earliest=-24h@h | stats count by SourceIP | fields SourceIP

I'm looking for the values of SourceIP that were present in search 1 that are not present in search 2.

Is there a way to do this that can process quickly? There should be about 11,000 unique values coming from each search. I just want to know which ones didn't occur today.

1 Solution

djconroy
Path Finder

After looking through the set diff docs and searching Google, I found this previous question:
http://answers.splunk.com/answers/33791/compare-field-from-2-sources-and-return-when-source1-has-no-...

That pointed me to an outer join combined with an if(match()) eval to find the clients that should have been communicating in the last 24 hours but were not:

index=myindex source=mysource sourcetype=mysourcetype * SourceIP="*" earliest=-48h@h latest=-24h@h
| dedup SourceIP
| eval Client=SourceIP
| eval status1="Active"
| join type=outer Client [search index=myindex source=mysource sourcetype=mysourcetype * SourceIP="*" earliest=-24h@h
| dedup SourceIP
| eval Client=SourceIP
| eval status2="Active" ]
| eval Status=if(match(status1,status2), "Active", "Not Responding")
| table Client Status
| where Status="Not Responding"
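
The same outer join can also flag the missing clients with an explicit null check instead of if(match()). This is only a sketch: it reuses the placeholder index, source, and sourcetype from the search above and relies on status2 being absent for any Client the subsearch did not return. The fields command inside the subsearch is an addition that limits what the join brings back.

```spl
index=myindex source=mysource sourcetype=mysourcetype SourceIP="*" earliest=-48h@h latest=-24h@h
| dedup SourceIP
| eval Client=SourceIP
| join type=outer Client
    [ search index=myindex source=mysource sourcetype=mysourcetype SourceIP="*" earliest=-24h@h
      | dedup SourceIP
      | eval Client=SourceIP
      | eval status2="Active"
      | fields Client status2 ]
| where isnull(status2)
| eval Status="Not Responding"
| table Client Status
```

Filtering with where before table keeps only the rows that failed to match, so no intermediate "Active" status is needed.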

woodcock
Esteemed Legend

All other answers are subject to very low limits (50K rows) and will be incorrect for even modest set sizes. Here is a way to do it in a vastly less limited way:

|multisearch
[ SourceIP="*" earliest=-48h@h latest=-24h@h | stats count by SourceIP | fields SourceIP | eval type="keepers" | outputcsv keepers.csv]
[ SourceIP="*" earliest=-24h@h | stats count by SourceIP | fields SourceIP | eval type="droppers" | outputcsv droppers.csv]
| search thisFieldWillNeverExist="So this will drop all events"
| appendpipe [|inputcsv keepers.csv]
| appendpipe [|inputcsv droppers.csv]
| stats values(*) AS * BY SourceIP
| search type="keepers" NOT type="droppers"
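
For comparison, a subsearch-free way to get the same difference is to search the whole 48-hour window once and let stats record the periods in which each SourceIP appeared. This is a sketch under assumptions: the field name period and its values "previous" and "current" are made up for illustration, and the base search would need the same index and sourcetype filters as the real data.

```spl
SourceIP="*" earliest=-48h@h
| eval period=if(_time < relative_time(now(), "-24h@h"), "previous", "current")
| stats values(period) AS period BY SourceIP
| where mvcount(period)=1 AND period="previous"
| fields SourceIP
```

Because there is no subsearch, the subsearch row limits do not come into play; the cost is that every event in the 48-hour window is scanned in one pass.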

aljohnson_splun
Splunk Employee

Have you looked at the diff command?

djconroy
Path Finder

I am looking at it now... theoretically, how would I send the values from the search into the set diff command?

Essentially I would need to pipe 11,000 different values into each subsearch.

Also, I see that the set diff command only works for fewer than 10,000 results.
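
For reference, the set command takes each search as a bracketed subsearch rather than as piped-in values, so a set diff version of the two searches from the question would look roughly like the sketch below. Two caveats, going by the command's documented behavior: diff returns the values that appear in either subsearch but not in both, so SourceIPs that are new today would show up as well, and each subsearch is subject to the result limits noted above.

```spl
| set diff
    [ search SourceIP="*" earliest=-48h@h latest=-24h@h | stats count by SourceIP | fields SourceIP ]
    [ search SourceIP="*" earliest=-24h@h | stats count by SourceIP | fields SourceIP ]
```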
