Splunk Search

How to search for field values that are returned by one search but don't appear in another search?

djconroy
Path Finder

I am trying to come up with the search syntax that would get me the values of a field that exist in one search but don't exist in another search. For example:

SourceIP="*" earliest=-48h@h latest=-24h@h | stats count by SourceIP | fields SourceIP

and

SourceIP="*" earliest=-24h@h | stats count by SourceIP | fields SourceIP

I'm looking for the values of SourceIP that were present in search 1 that are not present in search 2.

Is there a way to do this that can process quickly? There should be about 11,000 unique values coming from each search. I just want to know which ones didn't occur today.

1 Solution

djconroy
Path Finder

After looking through the set diff docs and searching Google I found this previous question:
http://answers.splunk.com/answers/33791/compare-field-from-2-sources-and-return-when-source1-has-no-...

That gave me the guidance to use an outer join and an if(match()) eval expression to determine which clients that should be communicating had not done so in the last 24 hours:

index=myindex source=mysource sourcetype=mysourcetype * SourceIP="*" earliest=-48h@h latest=-24h@h
| dedup SourceIP
| eval Client=SourceIP
| eval status1="Active"
| join type=outer Client [search index=myindex source=mysource sourcetype=mysourcetype * SourceIP="*" earliest=-24h@h
| dedup SourceIP
| eval Client=SourceIP
| eval status2="Active" ]
| eval Status=if(match(status1,status2), "Active", "Not Responding")
| table Client Status
| where Status="Not Responding"


woodcock
Esteemed Legend

All other answers are subject to very low limits (50K rows) and will be incorrect for even modest set sizes. Here is a way to do it in a vastly less limited way:

|multisearch
[ SourceIP="*" earliest=-48h@h latest=-24h@h | stats count by SourceIP | fields SourceIP | eval type="keepers" | outputcsv keepers.csv]
[ SourceIP="*" earliest=-24h@h | stats count by SourceIP | fields SourceIP | eval type="droppers" | outputcsv droppers.csv]
| search thisFieldWillNeverExist="So this will drop all events"
| appendpipe [|inputcsv keepers.csv]
| appendpipe [|inputcsv droppers.csv]
| stats values(*) AS * BY SourceIP
| search type="keepers" NOT type="droppers"
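For sets of this size, a single search over the full 48 hours avoids subsearches and their limits altogether. This is a minimal sketch, not a tested solution: it assumes the same base search as in the question, and `last_seen` is an illustrative field name that does not appear elsewhere in this thread. It keeps only the SourceIP values whose most recent event is older than the 24-hour boundary, which are the values present in the first window but absent from the second:

```spl
SourceIP="*" earliest=-48h@h
| stats max(_time) AS last_seen BY SourceIP
| where last_seen < relative_time(now(), "-24h@h")
| fields SourceIP
```

Because `stats` runs once over all events in the range, no join or subsearch row limit should come into play.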

aljohnson_splun
Splunk Employee

Have you looked at the diff command?


djconroy
Path Finder

I am looking at it now... theoretically, how would I send the values from the search into the set diff command?

Essentially I would need to pipe 11,000 different values into each subsearch.

Also, I see that the set diff command only works on fewer than 10,000 results.
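For reference, `set` takes its inputs as subsearches rather than as piped values. Below is a sketch built from the two searches in the question. As far as the `set` documentation describes it, `diff` returns results that appear in either subsearch but not in both, so this would also list values seen only in the last 24 hours:

```spl
| set diff
    [search SourceIP="*" earliest=-48h@h latest=-24h@h | stats count by SourceIP | fields SourceIP]
    [search SourceIP="*" earliest=-24h@h | stats count by SourceIP | fields SourceIP]
```

With about 11,000 unique values per side, each subsearch would exceed the result limit mentioned above, so the output could not be trusted here.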
