Dashboards & Visualizations

## Calculate Standard Deviation for a table

Communicator

I have a table which is in below format

``````Time         ProcessTime1                           ProcessTime2                  ABC_Count
11/14/2017    100                                     112                             30000
11/15/2017    118                                     205                             30546
11/16/2017    119                                     121                             43000
11/17/2017    141                                     192                             95000
``````

It produces a visualization like below, where ABC_Count is the overlay which is the red line in the below graph.

I want to have stdev plotted in the below graph, and if run the below search query it is giving me zero as stdev

``````mybase search | stats values(ProcessTime1), values(ProcessTime2), values(ABC_Count), stdev(ProcessTime1), stdev(ProcessTime2) by Time
``````

Basically if just remove the "by Time" in the above search i get stdev calculated properly, but then the visualization is not available.

Any idea how stdev can be plotted in the same graph?

Tags (4)
1 Solution
Legend

@ashish9433, you can check out Trendline and autoregress commands for `moving averages`.

You should also check out Splunk documentation on Advanced Statistics if your end goal is identifying anomalies and outliers.
Finally do check out Splunk's Machine Learning Toolkit app (MLTK) for identifying numerical and categorical outliers. Numerical outliers have examples of `Interquartile Range`, `Mean Absolute Deviation` and `Standard Deviation` using `streamstats` command.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
Explorer

As indicated in the comments, the above can be calculated used the streamstats function. Make sure you time is in ascending order, and then use streamstats to apply the stdev function on the fields you want to calculate the running stdev for (the same method applies for any other running aggregates, e.g. mean).

``````| streamstats stdev(ProcessTime1) AS stdev_ProcessTime1, stdev(ProcessTime2) AS stdev_ProcessTime2
| stats values(ProcessTime1), values(stdev_ProcessTime1), values(ProcessTime2), values(stdev_ProcessTime2), values(ABC_Count) BY Time
``````
Legend

@ashish9433, you can check out Trendline and autoregress commands for `moving averages`.

You should also check out Splunk documentation on Advanced Statistics if your end goal is identifying anomalies and outliers.
Finally do check out Splunk's Machine Learning Toolkit app (MLTK) for identifying numerical and categorical outliers. Numerical outliers have examples of `Interquartile Range`, `Mean Absolute Deviation` and `Standard Deviation` using `streamstats` command.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
Explorer

If I understand your question correctly, the standard deviation for each field will just be a single value. So do you want to plot a flat line across the whole bar chart to show the standard deviation? Or were you hoping to plot the new standard deviation from all past records as you include each new day of data?

Communicator

I was looking to plot a single line of standard deviation. Basically i was trying using stats command, but later i used streamstats command and i was able to address what i was looking for.

SplunkTrust

Sounds like you were looking for a running `avg()` and running `stdev()` or something. Please post what you ended up with, so that later searchers can find it.

Communicator

@DalJeanis You are correct, i was looking for running stdev(), but was not able to find out how to do that, so ended up with a straight line of stdev.

Do you have any idea how can we get running stdev() plotted for the above scenario?

Legend

@ashish9433, maybe you can add your working query and accept the same as answer so that others needing the same/similar use case may get benefited.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
New Member

with stats i was able to get the result of a tabular data in a single row, however with streamstats the stats are applied consecutively to the aggregated data on a per row level, basically aggregates fields and the applies the stats until that row and appends the stats fields to that row. Good to know, but did not help.

Get Updates on the Splunk Community!