Splunk Search

Compare search results with a lookup table and identify unobserved matches

DEAD_BEEF
Builder

I have a query that shows observed category of domains (search engines, social media, streaming, etc.). I'd like to compare this to a lookup table that lists ALL possible categories. From here I'd like to finally generate a table showing the categories that are on the lookup table but NOT observed in the initial query.

Basically, generate a table that shows all unobserved web categories.

Observed Categories

index=web | dedup category | table category

Lookup Table

| inputlookup=all_categories.csv

Current Query

index=web | dedup category | table category | sort +category | join type=outer [inputlookup all_categories.csv]

I tried running a query for yesterday and I'm getting the same results for a query of observed yesterday and my lookup table comparison query (Current Query).

0 Karma
1 Solution

somesoni2
Revered Legend

Try like this. This should give you all the categories which are in lookup but not returned by index=web query)

index=web | stats count by category | table category | eval Observed=1
| append [|inputlookup all_categories.csv | table category  | eval Observed=0 ]
| stats max(Observed) as Observed by category | where Observed=0

View solution in original post

somesoni2
Revered Legend

Try like this. This should give you all the categories which are in lookup but not returned by index=web query)

index=web | stats count by category | table category | eval Observed=1
| append [|inputlookup all_categories.csv | table category  | eval Observed=0 ]
| stats max(Observed) as Observed by category | where Observed=0

DEAD_BEEF
Builder

This works exactly how I wanted! Thank you!

Would you mind explaining how the Observed works in this query? I understand the logic of the query, but I'm confused how you can eval the Observed and somehow the lookup table assigned the '0' to each category while the first line associates the '1' value to the categories.

Is it basically making a second column called "Observed" where the value is '1' for the first part and '0' for the append portion? The stats max is also a bit confusing because if Observed = '1' then wouldn't it just return nothing since the lookup table made every category = '0' ? I'm sorry for all the questions, I'm trying to learn.

0 Karma

somesoni2
Revered Legend

So you get a result set (list of categories) from index=web. We added a new field Observed with value 1 so that all categories that appeared in index=web will have Observed=1 (or true). Using numeric value for easier comparison. The we append 2nd result set, which is all categories from your lookup with a field Observed with value 0 (say Observed=0 means they are from Lookup table only). Since we append two result sets, there can be two entries for a category (one from index=web and one from lookup) so we add the stats command. The final stats checks what's the max value of Observed column. So if
1) Observed=1, means the category was available in index=web. It may or may not be available in lookup but since you say your lookup contains all the categories, Observed=1 means the category appeared in both index=web and lookup table.
2) Observed=0, means the category only appeared in lookup table.

0 Karma

DEAD_BEEF
Builder

This makes much more sense to me now. I was thinking it made two sets of lists, web with 1 and all with 0, but with max, it will sort and show only what I want. Again, thank you for further elaborating.

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...