I have a search populating a summary index (index=sec marker=01):
sourcetype=cisco_firewall | bin _time span=5m | sistats count by log_level, hostname
When I retrieve the results with
index=sec marker=01 | stats count by log_level, hostname
I get results, but this:
index=sec marker=01 | stats count(eval(log_level="7")) AS Debugging, count(eval(log_level="6")) AS Information, count(eval(log_level="5")) AS Notification, count(eval(log_level="4")) AS Warning, count(eval(log_level="3")) AS Error, count(eval(log_level="2")) AS Critical, count(eval(log_level="1")) AS Alert, count(eval(log_level="0")) AS Emergency by hostname
doesn't work. Am I missing something? If this is not possible, how can I "SI" this search?
An SQL-like table pivot should serve as a reasonable workaround for this. It's not pretty, but it does work.
index=sec marker=01
| stats count by log_level, hostname
| eval Debug_Count=if(log_level==7,count,0)
| eval Info_Count=if(log_level==6,count,0)
| eval Notify_Count=if(log_level==5,count,0)
| eval Warning_Count=if(log_level==4,count,0)
| eval Error_Count=if(log_level==3,count,0)
| eval Critical_Count=if(log_level==2,count,0)
| eval Alert_Count=if(log_level==1,count,0)
| eval Emergency_Count=if(log_level==0,count,0)
| stats max(Debug_Count) as Debugging,
max(Info_Count) as Informational,
max(Notify_Count) as Notification,
max(Warning_Count) as Warning,
max(Error_Count) as Error,
max(Critical_Count) as Critical,
max(Alert_Count) as Alert,
max(Emergency_Count) as Emergency
by hostname
The general idea is to take the "rows" containing data for each log_level and pivot them up into "columns". Coming out of the first stats is a series of counts for each log_level and hostname combination. The series of eval operations creates new fields ("columns") that hold the actual count from a "row" if (and only if) that "row" contains the data of interest for that "column"; otherwise the value is 0. The second stats then collapses each new field down to its single non-zero value per hostname.
This is a fairly common trick done in SQL with a combination of CASE and GROUP BY. I've just adapted it a bit to Splunk's syntax.
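As an aside, Splunk's built-in chart command can do the same pivot in a single step. A minimal sketch: the columns come out named after the raw log_level values, hence the rename to get the friendly labels.
index=sec marker=01
| chart count over hostname by log_level
| rename "0" as Emergency, "1" as Alert, "2" as Critical, "3" as Error, "4" as Warning, "5" as Notification, "6" as Informational, "7" as Debugging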
I know it's rather late in the piece, but I just hit the same problem. It seems that the name of the field saved in the summary index differs from the one produced by sistats, and consequently from what stats expects when run against the summary index.
In my case, the name generated by sistats was psrsvd_ct_eval(MedianDurationAboveObjective > 0), but the name in the index was psrsvd_ct_eval_MedianDurationAboveObjective__0.
So a simple workaround (for what must surely be a bug) for me was:
index=summary | rename "psrsvd_ct_eval_MedianDurationAboveObjective___0_" as "psrsvd_ct_eval(MedianDurationAboveObjective > 0)" | stats ...
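If you hit this, a quick way to see which psrsvd_* field names actually landed in your summary index is to table them out; just a sketch against the same summary index:
index=summary | head 5 | table _time psrsvd_*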
This works like a charm!
Oddly enough, if you change the sistats to stats, it'll work fine.
In general you want to be careful using the SI commands, and the rule is that if you do sistats count by log_level, hostname in your summary search, then the first reporting clause off of the summary events should be an exact copy of that "si" search but without the 'si', i.e. stats count by log_level, hostname.
As to a full explanation of what can go off the rails when you cross the streams and why, it's a little beyond me. In this case it's so weird that it may just be a bug.
The good news is that there isn't really any reason to use sistats in such a simple use case. The fancy statistics that it preserves don't add anything here.
UPDATE: One thing you'll have to watch out for if you take this approach and use stats instead of sistats is that stats count on the search side will simply count the number of summary rows, paying no attention to the fact that those rows themselves have 'count' fields. As such you often have to do things like stats sum(count) as count by host when you use stats manually in summary indexing. In this particular case, with the eval syntax, I'm not sure what the equivalent would be, but there is one. Watch out that you don't end up with the count of summary rows rather than the count of the original events.
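A minimal sketch of that pattern, assuming the summary-populating search from the question is switched from sistats to plain stats:
sourcetype=cisco_firewall | bin _time span=5m | stats count by log_level, hostname
Then on the reporting side, sum the stored count field rather than counting summary rows:
index=sec marker=01 | stats sum(count) as count by log_level, hostname
For the eval-style breakdown, one candidate for the equivalent would be sum(eval(if(...))) in place of count(eval(...)), along these lines:
index=sec marker=01 | stats sum(eval(if(log_level=="7",count,0))) AS Debugging, sum(eval(if(log_level=="6",count,0))) AS Information by hostname
(and so on for the remaining levels).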
Glad to hear it's not my misconfiguration after all!
Thanks, Nick.