Splunk Search

How to search for field values that are returned in one search that don't appear in another search?

djconroy
Path Finder

I am trying to come up with the search syntax that returns the values of a field that exist in one search but not in another. For example:

SourceIP="*" earliest=-48h@h latest=-24h@h | stats count by SourceIP | fields SourceIP

and

SourceIP="*" earliest=-24h@h | stats count by SourceIP | fields SourceIP

I'm looking for the values of SourceIP that were present in search 1 that are not present in search 2.

Is there a way to do this that can process quickly? There should be about 11,000 unique values coming from each search. I just want to know which ones didn't occur today.

1 Solution

djconroy
Path Finder

After looking through the set diff docs and searching Google, I found this previous question:
http://answers.splunk.com/answers/33791/compare-field-from-2-sources-and-return-when-source1-has-no-...

That led me to use an outer join and an if(match()) expression to find the clients that should have been communicating in the last 24 hours but were not:

index=myindex source=mysource sourcetype=mysourcetype * SourceIP="*" earliest=-48h@h latest=-24h@h
| dedup SourceIP
| eval Client=SourceIP
| eval status1="Active"
| join type=outer Client [search index=myindex source=mysource sourcetype=mysourcetype * SourceIP="*" earliest=-24h@h
| dedup SourceIP
| eval Client=SourceIP
| eval status2="Active" ]
| eval Status=if(match(status1,status2), "Active", "Not Responding")
| table Client Status
| where Status="Not Responding"


woodcock
Esteemed Legend

The other answers rely on subsearches, which are subject to low result limits (50K rows), so they can return incorrect results even for modest set sizes. Here is an approach that is far less limited:

|multisearch
[ SourceIP="*" earliest=-48h@h latest=-24h@h | stats count by SourceIP | fields SourceIP | eval type="keepers" | outputcsv keepers.csv]
[ SourceIP="*" earliest=-24h@h | stats count by SourceIP | fields SourceIP | eval type="droppers" | outputcsv droppers.csv]
| search thisFieldWillNeverExist="So this will drop all events"
| appendpipe [|inputcsv keepers.csv]
| appendpipe [|inputcsv droppers.csv]
| stats values(*) AS * BY SourceIP
| search type="keepers" NOT type="droppers"
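
For this particular case, where both sets come from the same data and differ only by time range, a single search over the full 48 hours may avoid subsearches altogether. A minimal sketch, assuming the index, source, and sourcetype placeholders used elsewhere in this thread and a hypothetical field name last_seen:

```
index=myindex source=mysource sourcetype=mysourcetype SourceIP="*" earliest=-48h@h
| stats max(_time) AS last_seen BY SourceIP
| where last_seen < relative_time(now(), "-24h@h")
| fields SourceIP
```

stats keeps one row per SourceIP, and the where clause keeps only the addresses whose most recent event falls before the -24h@h boundary, that is, those seen in the earlier window but not in the last 24 hours. This is untested against the poster's data, so treat it as a starting point.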

aljohnson_splun
Splunk Employee

Have you looked at the diff command?


djconroy
Path Finder

I am looking at it now... theoretically, how would I send the values from the search into the set diff command?

Essentially I would need to pipe 11,000 different values into each subsearch.

Also, I see that the set diff command only works for fewer than 10,000 results.
