Knowledge Management

Eval limitations on si index with sistats count by field1, field2 etc

Starlette
Contributor

I have a search to SI index=sec marker=01

sourcetype=cisco_firewall | bin _time span=5m | sistats count by log_level, hostname

When calling the results with

index=sec marker=01 | stats count by log_level, hostname

I get results,,,but :

index=sec marker=01  | stats count(eval(log_level="7")) AS Debugging, count(eval(log_level="6")) AS Information, count(eval(log_level="5")) AS Notification, count(eval(log_level="4")) AS Warning, count(eval(log_level="3")) AS Error, count(eval(log_level="2")) AS Critical, count(eval(log_level="1")) AS Alert, count(eval(log_level="0")) AS Emergency by hostname 

isnt,,,am i missing something? if this is not possible how can i "SI" this search?

Tags (2)
1 Solution

dwaddle
SplunkTrust
SplunkTrust

An SQL-like table pivot should serve as reasonable workaround for this. It's not pretty, but it does work.

index=sec marker=01 
| stats count by log_level, host 
| eval Debug_Count=if(log_level==7,count,0) 
| eval Info_Count=if(log_level==6,count,0) 
| eval Notify_Count=if(log_level==5,count,0) 
| eval Warning_Count=if(log_level==4,count,0) 
| eval Error_Count=if(log_level==3,count,0) 
| eval Alert_Count=if(log_level==2,count,0) 
| eval Emergency_Count=if(log_level==1,count,0) 
| stats max(Debug_Count) as Debugging, 
        max(Info_Count) as Informational, 
        max(Notify_Count) as Notification,
        max(Warning_Count) as Warning
        max(Error_Count) as Error
        max(Alert_Count) as Alert
        max(Emergency_Count) as Emergency
        by host

The general idea is to take the "rows" containing data for each log_level and pivot them up into "columns". Coming out of the first stats is a series of counts for each log_level and host combination. The series of eval operations creates new fields ("columns") that either have the actual value from a "row" if (and only if) that "row" contains data of interest in that "column". Otherwise, the value in that row is 0. Then, the second stats filters down to only the non-zero values in each new field for each host.

This is a fairly common trick done in SQL with a combination of CASE and GROUP BY. I've just adapted it a bit to Splunk's syntax.

View solution in original post

andersmholmgren
Explorer

I know it's rather late in the piece, but I just hit the same problem. The problem seems to be that the name of the field that is saved in the summary index is different to the one produced by sistats and consequently what is expected by stats running against the si.

in my case the name generated by sistats was psrsvd_ct_eval(MedianDurationAboveObjective > 0) but the name in the index is psrsvd_ct_eval_MedianDurationAboveObjective__0

So a simple workaround (what must surely be a bug) for me was

index=summary | rename "psrsvd_ct_eval_MedianDurationAboveObjective___0_" as "psrsvd_ct_eval(MedianDurationAboveObjective > 0)" | stats ...
0 Karma

dwaddle
SplunkTrust
SplunkTrust

An SQL-like table pivot should serve as reasonable workaround for this. It's not pretty, but it does work.

index=sec marker=01 
| stats count by log_level, host 
| eval Debug_Count=if(log_level==7,count,0) 
| eval Info_Count=if(log_level==6,count,0) 
| eval Notify_Count=if(log_level==5,count,0) 
| eval Warning_Count=if(log_level==4,count,0) 
| eval Error_Count=if(log_level==3,count,0) 
| eval Alert_Count=if(log_level==2,count,0) 
| eval Emergency_Count=if(log_level==1,count,0) 
| stats max(Debug_Count) as Debugging, 
        max(Info_Count) as Informational, 
        max(Notify_Count) as Notification,
        max(Warning_Count) as Warning
        max(Error_Count) as Error
        max(Alert_Count) as Alert
        max(Emergency_Count) as Emergency
        by host

The general idea is to take the "rows" containing data for each log_level and pivot them up into "columns". Coming out of the first stats is a series of counts for each log_level and host combination. The series of eval operations creates new fields ("columns") that either have the actual value from a "row" if (and only if) that "row" contains data of interest in that "column". Otherwise, the value in that row is 0. Then, the second stats filters down to only the non-zero values in each new field for each host.

This is a fairly common trick done in SQL with a combination of CASE and GROUP BY. I've just adapted it a bit to Splunk's syntax.

Starlette
Contributor

this works like a charm!

0 Karma

sideview
SplunkTrust
SplunkTrust

Oddly enough, if you change the 'sistats' to stats it'll work fine.

In general you want to be careful using the SI commands, and the rule is that if you do sistats count by log_level, hostname in your summary search, then the first reporting clause off of the summary events should be an exact copy of that "si" search but without the 'si'. ie -- stats count by log_level, hostname.

As to a full explanation of what can go off the rails when you cross the streams and why, it's a little beyond me. In this case it's so weird that it may just be a bug.

The good news is that there isnt really any reason to use sistats in such a simple use case. The fancy statistics that it preserves don't add anything here.

UPDATE: One thing you'll have to watch out if you take this approach and use stats instead of sistats, is that stats count on the search side will simply count the number of rows, and pay no attention to the fact that those rows themselves have 'count' fields. As such you often have to do things like stats sum(count) as count by host when you use stats manually in summary indexing. In this particular case with the eval syntax, I'm not sure what the equivalent would be but there is one. Watch out you dont end up with the count of summary rows rather than count of the original events.

Starlette
Contributor

Glad i reckon its not my misconfiguring!
Thanks nick

0 Karma
Get Updates on the Splunk Community!

Video | Welcome Back to Smartness, Pedro

Remember Splunk Community member, Pedro Borges? If you tuned into Episode 2 of our Smartness interview series, ...

Detector Best Practices: Static Thresholds

Introduction In observability monitoring, static thresholds are used to monitor fixed, known values within ...

Expert Tips from Splunk Education, Observability in Action, Plus More New Articles on ...

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...