
Proper Std Dev Generation


I am a newbie and would like another user's opinion of my logic. Is this the proper syntax for generating std dev? In particular, I want to show, for the prior 7 days, the std dev of total, successful, and failed logins computed over a prior 30-day time span. I cobbled this logic together from several other answers. Thank you.

    [index source sourcetype]  %ASA-auth "AAA user authentication" earliest=-60d@d latest=now 
    | timechart span=60m count as total 
    count(eval(searchmatch("Successful"))) as success 
    count(eval(searchmatch("Rejected"))) as rejected
    | streamstats window=720 stdev(total) as stdevtotal mean(total) as meantotal
    | streamstats window=720 stdev(success) as stdevSuccess mean(success) as meanSuccess
    | streamstats window=720 stdev(rejected) as stdevRejected mean(rejected) as meanRejected
    | eval total = (total - meantotal) / stdevtotal 
    | eval success = (success - meanSuccess) / stdevSuccess
    | eval rejected = (rejected - meanRejected) / stdevRejected 
    | eval three = 3
    | fields _time total success rejected three
    | where _time > relative_time(now(),"-7d@d")
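
For anyone who wants to check the logic outside Splunk, the intent of the streamstats and eval steps can be sketched in plain Python. This is only an approximation under stated assumptions: a trailing window that includes the current bucket, and sample standard deviation (which is what Splunk's `stdev` computes, as opposed to `stdevp`). The function name `rolling_zscores` and the sample counts are made up for illustration.

```python
import statistics
from collections import deque

def rolling_zscores(counts, window):
    """Return (count - mean) / stdev for each bucket over a trailing window.

    Hypothetical helper: the window includes the current bucket, and the
    score is None when fewer than two buckets are available or stdev is 0.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` buckets
    scores = []
    for c in counts:
        recent.append(c)
        if len(recent) < 2:
            scores.append(None)  # sample stdev is undefined for one value
            continue
        mean = statistics.mean(recent)
        stdev = statistics.stdev(recent)  # sample standard deviation
        scores.append((c - mean) / stdev if stdev > 0 else None)
    return scores

# Made-up hourly totals; the last bucket is a spike.
hourly_totals = [10, 12, 11, 9, 10, 50]
scores = rolling_zscores(hourly_totals, window=720)
```

With span=60m, window=720 covers 720 hourly buckets, i.e. 30 days, so searching 60 days back leaves a full 30-day window behind every bucket in the final 7 days.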

Re: Proper Std Dev Generation


Hi bnafziger,

I'm not sure of your ultimate goal, but it looks as if you want to find "outliers" in events based on "normal" historical behavior. The problem with using plain average and standard deviation is that if the data doesn't follow a Gaussian distribution (i.e., a "bell curve"), then avg and stdev can yield misleading results.
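
As a rough illustration of that point, here is a small Python sketch (not Splunk). It compares the classic z-score with a score based on the median and the median absolute deviation (MAD). MAD is just one common robust alternative, swapped in here for comparison; it is not a description of how any particular app models the data. The hourly counts and the helper names `zscore` and `robust_zscore` are made up for illustration.

```python
import statistics

def zscore(values, x):
    """Classic z-score of x against the sample mean and standard deviation."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return (x - mean) / stdev

def robust_zscore(values, x):
    """Score x against the median and the median absolute deviation (MAD).

    The 1.4826 factor rescales MAD so it is comparable to stdev when the
    data really is normally distributed.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return (x - med) / (1.4826 * mad)

# Hypothetical hourly failed-login counts: mostly quiet, with one earlier burst of 90.
history = [2, 3, 1, 2, 4, 2, 3, 2, 1, 3, 2, 90, 2, 3, 2, 1, 2, 3, 2, 2]

new_count = 40  # a new hour that is clearly unusual next to the typical 1-4
classic = zscore(history, new_count)
robust = robust_zscore(history, new_count)

print(f"classic z-score: {classic:.2f}")
print(f"robust z-score:  {robust:.2f}")
```

In this made-up data the earlier burst inflates the mean and the stdev enough that a count of 40 stays under a 3-sigma threshold, while the median/MAD score flags it easily. Skewed, bursty counts like login failures are where this tends to show up.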

An alternative approach is to do what the Prelert Anomaly Detective app does: it uses dynamic modeling to "fit" the data better, which gives more accurate results. And it's easy to use!
