Splunk Search

How do I find rows in a table with more fields, which are NOT EQUAL to any row in table with less fields?

aovsiannikov
Explorer

I.e.

<search1>: ... | table id, f1, f2, f3
<search2>: ... | table id, f1, f2

I need to find all records in <search1> that are not equal to any record in <search2>

If I do something like

<search1> | search NOT [<search2>]

Splunk will not consider record, let's say <record1>: id=someID, f1=1, f2=2, f3=3 in <search1> different from record <record2>: id=someID, f1=1, f2=2 from <search2> , because field f3 will not be presented in the boolean expression generated by the subsearch. So it will not pass <record1> to final recordset, but I need it there.

So what is the best approach there?

0 Karma

aovsiannikov
Explorer

Guys, thanks a lot for replies, I'd like to experiment a little bit before accepting. But I'd like to avoid to relay on exact fieldset in search 1 and 2, I'd like to have some unified approach. Means, I'd like to avoid using f3 explicitly in search. Let's imagine that fieldset for search 1 is defined in some variable, I don't know exact fieldset, it's configurable, I don't want to parse it. So the spl should be independent from what is in configuration. Also I have no _raw field, both search are came from lookup.

Now all I can imagine is to search index1 twice, something like this:
index=index1 | search NOT [index=index2 | search [index=index1]]

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi aovsiannikov,
if results in search2 are less than 50,000 you can use a subsearch like the following

index=index1 NOT [ search index=index2 | fields id f1 f2 f3]
| ...

where id, f1, f2, f3 are the fields to compare results from the searches.
If you want to compare the full record you can use as row the field _raw.

If instead results in search2 are more than 50,000, you have to use a different approach, something like this:

index=index1 OR index=index2
| stats count BY index id f1 f2 f3
| where index=index1 AND count=1

if instead you want to compare the full raw you can run something like this:

index=index1 OR index=index2
| stats count BY index _raw
| where index=index1 AND count=1

Bye.
Giuseppe

FrankVl
Ultra Champion

Assuming that id is actually an identifier of a record, it could be as simple as this, right?

search1 OR search2
| evenstats count by id
| search f3=* count=1

(the f3=* is to return the rows belonging to search1. if f3 is not always populated for those items, then use some other characterizing property of the items that came from search1).

If you also need to take f1 and f2 into account for matching items between the two data sets:

search1 OR search2
| evenstats count by id,f1,f2
| search f3=* count=1
0 Karma

briancronrath
Contributor

Assuming these tables aren't over 50k rows (default max of a join subsearch) you could join the results together, and do some eval tricks to compare the data and filter it out. So something like:

 <search1>: ... | table id, f1, f2, f3 | join type=left id [ <search2>: ... | table id, f1, f2 | rename f1 as f1_x | rename f2 as f2_x] | eval filter_out=if(f1=f1_x AND f2=f2_x,"filter","keep") | search filter_out=keep
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...