Splunk Search

Help writing generic standard deviation query

dhorn
Path Finder

So I'm trying to build a generic search to show a timeline and standard deviation. Basically, pull the count of events over time (a week?), and show the average max and min, and the actual. So this way I could pump any data into it, and quickly see a sharp increase, or decrease, of traffic, For example, sourcetype=cisco:asa

But, it's not working correctly. It's not outputting data for the Hourly_Standard_Deviation field. Any ideas? Am I even approaching this the correct way?

...
| stats count as hit BY date_hour, date_mday 
| stats avg(hit) AS Hourly_Average max(hit) AS Hourly_Max min(hit) AS Hourly_Min stdev(hit) AS Hourly_Standard_Deviation BY date_hour
| eval Hourly_Upper_Average=(Hourly_Average + Hourly_Standard_Deviation) 
| eval Hourly_Lower_Average=(Hourly_Average - Hourly_Standard_Deviation)
| eval Hourly_Lower_Average  = if(Hourly_Lower_Average<0,0,Hourly_Lower_Average) 
| fields - Hourly_Standard_Deviation, Hourly_Max, Hourly_Min, Hourly_Average

Thanks!

Tags (3)
0 Karma

Richfez
SplunkTrust
SplunkTrust

In my data, I replaced the first stats with a transaction, then used the counts generated to do the rest.

...   | transaction count by src_ip maxspan=5m | stats avg(count) AS Hourly_Average max(count) AS Hourly_Max min(count) AS Hourly_Min stdev(count) AS Hourly_Standard_Deviation BY date_hour, date_mday

From there, the rest is probably fixable if it's not working.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

transaction count by src_ip looks weird, are you sure?

Richfez
SplunkTrust
SplunkTrust

Sorry, there was information given over IRC that I forgot to include here.

That initial transaction part isn't really part of the answer specifically. The piece I neglected to put here was that I didn't have his dataset so "made up my own". I used a transaction to build something out of unfiltered firewall logs that looked a tiny bit like dhorn's "hits" (I assumed to be web hits). I explained in IRC that he would most definitely need to fix that part up first so that he has data grouped properly with respect to what he was defining as "hits." Once that is determined, the rest should follow as above.

The root problem was that how he was doing it the first stats was collapsing all the detail he was trying to pull out in the second stats command. While there is definitely more than one way to skin a cat, I used the first one to come to mind which involved a pattern I use a lot - a transaction then some stats off that bundle of information.

0 Karma

Richfez
SplunkTrust
SplunkTrust

The rest does seem to work fine up to the fields part.

host=10.128.0.3 
| transaction count by src_ip maxspan=5m 
| stats avg(count) AS Hourly_Average max(count) AS Hourly_Max min(count) AS Hourly_Min stdev(count) AS Hourly_Standard_Deviation BY date_hour, date_mday  
| eval Hourly_Upper_Average=(Hourly_Average + Hourly_Standard_Deviation)
| eval Hourly_Lower_Average=(Hourly_Average - Hourly_Standard_Deviation)
| eval Hourly_Lower_Average = if(Hourly_Lower_Average<0,0,Hourly_Lower_Average)

I'd recommend watching the .conf 2013 sessions by Jesse Trucks, found by digging around here:
http://conf.splunk.com/sessions/2013?r=conf_topnav_keynotessessions_2013sessions

0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!