So I'm trying to build a generic search that shows a timeline and standard deviation. Basically, pull the count of events over time (a week?), and show the average, max, min, and the actual count. That way I could pump any data into it and quickly see a sharp increase or decrease in traffic. For example, sourcetype=cisco:asa
But it's not working correctly: it's not outputting data for the Hourly_Standard_Deviation field. Any ideas? Am I even approaching this the right way?
...
| stats count as hit BY date_hour, date_mday
| stats avg(hit) AS Hourly_Average max(hit) AS Hourly_Max min(hit) AS Hourly_Min stdev(hit) AS Hourly_Standard_Deviation BY date_hour
| eval Hourly_Upper_Average=(Hourly_Average + Hourly_Standard_Deviation)
| eval Hourly_Lower_Average=(Hourly_Average - Hourly_Standard_Deviation)
| eval Hourly_Lower_Average = if(Hourly_Lower_Average<0,0,Hourly_Lower_Average)
| fields - Hourly_Standard_Deviation, Hourly_Max, Hourly_Min, Hourly_Average
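One thing to check first (a guess on my part, since the time range isn't shown above): stdev() needs more than one sample per group to measure any spread, so if the search only covers a single day, each date_hour group in the second stats sees exactly one hit value. A sketch of the same idea that buckets on _time directly, assuming a 7-day window:

```
sourcetype=cisco:asa earliest=-7d@d
| bin _time span=1h
| stats count AS hit BY _time
| eval date_hour=strftime(_time, "%H")
| stats avg(hit) AS Hourly_Average max(hit) AS Hourly_Max min(hit) AS Hourly_Min stdev(hit) AS Hourly_Standard_Deviation BY date_hour
```

That gives seven hit samples per hour-of-day, which is enough for stdev() to produce something meaningful.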
Thanks!
In my data, I replaced the first stats with a transaction, then used the counts it generated to do the rest.
... | transaction count by src_ip maxspan=5m | stats avg(count) AS Hourly_Average max(count) AS Hourly_Max min(count) AS Hourly_Min stdev(count) AS Hourly_Standard_Deviation BY date_hour, date_mday
From there, the rest is probably fixable if it's not working.
transaction count by src_ip
looks weird, are you sure?
Sorry, there was information given over IRC that I forgot to include here.
That initial transaction part isn't really part of the answer specifically. The piece I neglected to mention here was that I didn't have his dataset, so I made up my own. I used a transaction to build something out of unfiltered firewall logs that looked a tiny bit like dhorn's "hits" (which I assumed to be web hits). I explained in IRC that he would most definitely need to fix that part up first so that his data is grouped properly with respect to what he was defining as "hits." Once that is determined, the rest should follow as above.
The root problem was that, the way he was doing it, the first stats was collapsing all the detail he was trying to pull back out in the second stats command. While there is definitely more than one way to skin a cat, I used the first one that came to mind, which involved a pattern I use a lot: a transaction, then some stats off that bundle of information.
The rest does seem to work fine up to the fields part.
host=10.128.0.3
| transaction count by src_ip maxspan=5m
| stats avg(count) AS Hourly_Average max(count) AS Hourly_Max min(count) AS Hourly_Min stdev(count) AS Hourly_Standard_Deviation BY date_hour, date_mday
| eval Hourly_Upper_Average=(Hourly_Average + Hourly_Standard_Deviation)
| eval Hourly_Lower_Average=(Hourly_Average - Hourly_Standard_Deviation)
| eval Hourly_Lower_Average = if(Hourly_Lower_Average<0,0,Hourly_Lower_Average)
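Since the original goal was to see the actual count alongside the bands, here's a variant (my own sketch, not tested against this dataset) that uses eventstats instead of a second stats, so the per-hour counts survive as rows with the per-hour-of-day band attached to each one:

```
host=10.128.0.3 earliest=-7d@d
| bin _time span=1h
| stats count AS hit BY _time
| eval date_hour=strftime(_time, "%H")
| eventstats avg(hit) AS Hourly_Average stdev(hit) AS Hourly_Standard_Deviation BY date_hour
| eval Hourly_Upper_Average=Hourly_Average + Hourly_Standard_Deviation
| eval Hourly_Lower_Average=if(Hourly_Average - Hourly_Standard_Deviation < 0, 0, Hourly_Average - Hourly_Standard_Deviation)
| table _time hit Hourly_Average Hourly_Upper_Average Hourly_Lower_Average
```

Charted over _time, that shows the actual hits against the expected band, which makes sharp increases or decreases jump out.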
I'd recommend watching Jesse Trucks' .conf 2013 sessions, which you can find by digging around here:
http://conf.splunk.com/sessions/2013?r=conf_topnav_keynotessessions_2013sessions