I have a lookup table with an event name with min max thresholds
I need to join this (left on the lookup) with the event count by with null fill on events not present in search
lastly - I need rowwise comparison of event count against min / max and conditional format coloring rows with counts out of band.
I am able to left join the data but I am unable to proceed beyond that as I am not able to reference the attributes for any additional evals
Any help or direction would be greatly appreciate:
| inputlookup bk_lookup.csv
| join type=left left=L right=R where L.alertCode = R.alertCode [search index=my_index log_group="/my/log/group" "*cache*"
| rex field=event.message "alertCode: (?<alertCode>.*), version: (?<version>.*)"
| stats count as invokes by alertCode]
| table L.alertCode, R.invokes, L.min, L.max
| fillnull value=0 R.invokes
So, let's start with @woodcock's code and build in SPL.
index=my_index log_group="/my/log/group" "*cache*"
| rex field=event.message "alertCode: (?<alertCode>.*), version: (?<version>.*)"
| lookup bk_lookup.csv alertCode ``` output min_threshold max_threshold ```
| stats count as invokes by alertCode min_threshold max_threshold
| eval in_threshold = if(min_threshold > invokes OR invokes > max_threshold, "no", "yes")
This accomplishes the first half of your requirement, namely
Then I would like to do a count on the eventName and check if it is outside the min/max threshold for that particular eventName from the lookup file
(I think eventName is the same as alertCode in your code illustration.) The second half is
Additionally, I would like my count table to display eventCount as "0" and not meeting threshold for eventNames in the look up data that is not available in source events.
You are correct that some sort of join is necessary here. But a simple left join is insufficient. A common trick is to append, then do another stats.
index=my_index log_group="/my/log/group" "*cache*"
| rex field=event.message "alertCode: (?<alertCode>.*), version: (?<version>.*)"
| stats count by alertCode
| append
[| inputlookup bk_lookup.csv]
| stats values(count) as invokes values(*_threshold) as *_threshold by alertCode
| eval in_threshold = case(isnull(invokes), "n/a", min_threshold > invokes OR invokes > max_threshold, "no", true(), "yes")
| fillnull invokes value=0 ``` any alertCode not in events are given value 0 ```
When using append + stats with inputlookup, lookup command is often unnecessary because stats groupby takes care of the matching.
Note: The in_threshold calculation is valid only if every alertCode has only one entry in bk_lookup.csv.
Hope this helps.
@woodcock is right. Do not think in terms of join, or any SQL operation. Maybe you can describe the actual use case/application with illustrative data and desired output. Splunk usually has a better way than emulating SQL.
Thank you @yuanliu . I am clear from what @woodcock mentioned in terms of use of avoiding use of joins.
Here's is what I am looking for.
I have external data in lookup with a list of eventName, min_threshold, max_threshold
From splunk source events, I am doing inline rex to extract the eventName field
Then I would like to do a count on the eventName and check if it is outside the min/max threshold for that particular eventName from the lookup file
Additionally, I would like my count table to display eventCount as "0" and not meeting threshold for eventNames in the look up data that is not available in source events.
This is why I was looking at left Join, but even beyond that - I am struggling on how to perform the rowwise comparison on the count stats
So, let's start with @woodcock's code and build in SPL.
index=my_index log_group="/my/log/group" "*cache*"
| rex field=event.message "alertCode: (?<alertCode>.*), version: (?<version>.*)"
| lookup bk_lookup.csv alertCode ``` output min_threshold max_threshold ```
| stats count as invokes by alertCode min_threshold max_threshold
| eval in_threshold = if(min_threshold > invokes OR invokes > max_threshold, "no", "yes")
This accomplishes the first half of your requirement, namely
Then I would like to do a count on the eventName and check if it is outside the min/max threshold for that particular eventName from the lookup file
(I think eventName is the same as alertCode in your code illustration.) The second half is
Additionally, I would like my count table to display eventCount as "0" and not meeting threshold for eventNames in the look up data that is not available in source events.
You are correct that some sort of join is necessary here. But a simple left join is insufficient. A common trick is to append, then do another stats.
index=my_index log_group="/my/log/group" "*cache*"
| rex field=event.message "alertCode: (?<alertCode>.*), version: (?<version>.*)"
| stats count by alertCode
| append
[| inputlookup bk_lookup.csv]
| stats values(count) as invokes values(*_threshold) as *_threshold by alertCode
| eval in_threshold = case(isnull(invokes), "n/a", min_threshold > invokes OR invokes > max_threshold, "no", true(), "yes")
| fillnull invokes value=0 ``` any alertCode not in events are given value 0 ```
When using append + stats with inputlookup, lookup command is often unnecessary because stats groupby takes care of the matching.
Note: The in_threshold calculation is valid only if every alertCode has only one entry in bk_lookup.csv.
Hope this helps.
Read my answer. Run the searches. It is ALL there. If you need `min()` or `max()` then just add those to the `stats` command.
You are thinking about it wrong (like SQL). See this post for the full nitty-gritty (don't forget the karma):
https://community.splunk.com/t5/Splunk-Search/What-is-the-relation-between-the-Splunk-inner-left-joi...
So in your case, you would start EITHER like this:
index=my_index log_group="/my/log/group" "*cache*"
| rex field=event.message "alertCode: (?<alertCode>.*), version: (?<version>.*)"
| lookup bk_lookup.csv alertCode
OR like this:
index=my_index log_group="/my/log/group" "*cache*"
| rex field=event.message "alertCode: (?<alertCode>.*), version: (?<version>.*)"
| inputlookup append=true bk_lookup.csv
| eval sourcetype=coalesce(sourcetype, "bk_lookup")
Then complete like shown in the other post.
DO NOT EVER USE JOIN.