Dashboards & Visualizations

Calculate Standard Deviation for a table

ashish9433
Communicator

I have a table which is in below format

Time         ProcessTime1                           ProcessTime2                  ABC_Count
11/14/2017    100                                     112                             30000    
11/15/2017    118                                     205                             30546
11/16/2017    119                                     121                             43000
11/17/2017    141                                     192                             95000     

It produces a visualization like below, where ABC_Count is the overlay which is the red line in the below graph.

alt text

I want to have stdev plotted in the below graph, and if run the below search query it is giving me zero as stdev

mybase search | stats values(ProcessTime1), values(ProcessTime2), values(ABC_Count), stdev(ProcessTime1), stdev(ProcessTime2) by Time

Basically if just remove the "by Time" in the above search i get stdev calculated properly, but then the visualization is not available.

Any idea how stdev can be plotted in the same graph?

0 Karma
1 Solution

niketn
Legend

@ashish9433, you can check out Trendline and autoregress commands for moving averages.

You should also check out Splunk documentation on Advanced Statistics if your end goal is identifying anomalies and outliers.
Finally do check out Splunk's Machine Learning Toolkit app (MLTK) for identifying numerical and categorical outliers. Numerical outliers have examples of Interquartile Range, Mean Absolute Deviation and Standard Deviation using streamstats command.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

View solution in original post

alistairmcdouga
Explorer

As indicated in the comments, the above can be calculated used the streamstats function. Make sure you time is in ascending order, and then use streamstats to apply the stdev function on the fields you want to calculate the running stdev for (the same method applies for any other running aggregates, e.g. mean).

| streamstats stdev(ProcessTime1) AS stdev_ProcessTime1, stdev(ProcessTime2) AS stdev_ProcessTime2
| stats values(ProcessTime1), values(stdev_ProcessTime1), values(ProcessTime2), values(stdev_ProcessTime2), values(ABC_Count) BY Time
0 Karma

niketn
Legend

@ashish9433, you can check out Trendline and autoregress commands for moving averages.

You should also check out Splunk documentation on Advanced Statistics if your end goal is identifying anomalies and outliers.
Finally do check out Splunk's Machine Learning Toolkit app (MLTK) for identifying numerical and categorical outliers. Numerical outliers have examples of Interquartile Range, Mean Absolute Deviation and Standard Deviation using streamstats command.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

alistairmcdouga
Explorer

If I understand your question correctly, the standard deviation for each field will just be a single value. So do you want to plot a flat line across the whole bar chart to show the standard deviation? Or were you hoping to plot the new standard deviation from all past records as you include each new day of data?

0 Karma

ashish9433
Communicator

I was looking to plot a single line of standard deviation. Basically i was trying using stats command, but later i used streamstats command and i was able to address what i was looking for.

Thanks for your revert @alistairmcdougall

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

Sounds like you were looking for a running avg() and running stdev() or something. Please post what you ended up with, so that later searchers can find it.

0 Karma

ashish9433
Communicator

@DalJeanis You are correct, i was looking for running stdev(), but was not able to find out how to do that, so ended up with a straight line of stdev.

Do you have any idea how can we get running stdev() plotted for the above scenario?

0 Karma

niketn
Legend

@ashish9433, maybe you can add your working query and accept the same as answer so that others needing the same/similar use case may get benefited.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

dresh
New Member

with stats i was able to get the result of a tabular data in a single row, however with streamstats the stats are applied consecutively to the aggregated data on a per row level, basically aggregates fields and the applies the stats until that row and appends the stats fields to that row. Good to know, but did not help.

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

Splunk is officially part of Cisco

Revolutionizing how our customers build resilience across their entire digital footprint.   Splunk ...

Splunk APM & RUM | Planned Maintenance March 26 - March 28, 2024

There will be planned maintenance for Splunk APM and RUM between March 26, 2024 and March 28, 2024 as ...