Splunk Search

Compare search results with a lookup table and identify unobserved matches

DEAD_BEEF
Builder

I have a query that shows observed category of domains (search engines, social media, streaming, etc.). I'd like to compare this to a lookup table that lists ALL possible categories. From here I'd like to finally generate a table showing the categories that are on the lookup table but NOT observed in the initial query.

Basically, generate a table that shows all unobserved web categories.

Observed Categories

index=web | dedup category | table category

Lookup Table

| inputlookup=all_categories.csv

Current Query

index=web | dedup category | table category | sort +category | join type=outer [inputlookup all_categories.csv]

I tried running a query for yesterday and I'm getting the same results for a query of observed yesterday and my lookup table comparison query (Current Query).

0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

Try like this. This should give you all the categories which are in lookup but not returned by index=web query)

index=web | stats count by category | table category | eval Observed=1
| append [|inputlookup all_categories.csv | table category  | eval Observed=0 ]
| stats max(Observed) as Observed by category | where Observed=0

View solution in original post

somesoni2
SplunkTrust
SplunkTrust

Try like this. This should give you all the categories which are in lookup but not returned by index=web query)

index=web | stats count by category | table category | eval Observed=1
| append [|inputlookup all_categories.csv | table category  | eval Observed=0 ]
| stats max(Observed) as Observed by category | where Observed=0

DEAD_BEEF
Builder

This works exactly how I wanted! Thank you!

Would you mind explaining how the Observed works in this query? I understand the logic of the query, but I'm confused how you can eval the Observed and somehow the lookup table assigned the '0' to each category while the first line associates the '1' value to the categories.

Is it basically making a second column called "Observed" where the value is '1' for the first part and '0' for the append portion? The stats max is also a bit confusing because if Observed = '1' then wouldn't it just return nothing since the lookup table made every category = '0' ? I'm sorry for all the questions, I'm trying to learn.

0 Karma

somesoni2
SplunkTrust
SplunkTrust

So you get a result set (list of categories) from index=web. We added a new field Observed with value 1 so that all categories that appeared in index=web will have Observed=1 (or true). Using numeric value for easier comparison. The we append 2nd result set, which is all categories from your lookup with a field Observed with value 0 (say Observed=0 means they are from Lookup table only). Since we append two result sets, there can be two entries for a category (one from index=web and one from lookup) so we add the stats command. The final stats checks what's the max value of Observed column. So if
1) Observed=1, means the category was available in index=web. It may or may not be available in lookup but since you say your lookup contains all the categories, Observed=1 means the category appeared in both index=web and lookup table.
2) Observed=0, means the category only appeared in lookup table.

0 Karma

DEAD_BEEF
Builder

This makes much more sense to me now. I was thinking it made two sets of lists, web with 1 and all with 0, but with max, it will sort and show only what I want. Again, thank you for further elaborating.

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...