I have a table which is in below format
Time ProcessTime1 ProcessTime2 ABC_Count
11/14/2017 100 112 30000
11/15/2017 118 205 30546
11/16/2017 119 121 43000
11/17/2017 141 192 95000
It produces a visualization like below, where ABC_Count is the overlay which is the red line in the below graph.
I want to have stdev plotted in the below graph, and if run the below search query it is giving me zero as stdev
mybase search | stats values(ProcessTime1), values(ProcessTime2), values(ABC_Count), stdev(ProcessTime1), stdev(ProcessTime2) by Time
Basically if just remove the "by Time" in the above search i get stdev calculated properly, but then the visualization is not available.
Any idea how stdev can be plotted in the same graph?
@ashish9433, you can check out Trendline and autoregress commands for moving averages
.
You should also check out Splunk documentation on Advanced Statistics if your end goal is identifying anomalies and outliers.
Finally do check out Splunk's Machine Learning Toolkit app (MLTK) for identifying numerical and categorical outliers. Numerical outliers have examples of Interquartile Range
, Mean Absolute Deviation
and Standard Deviation
using streamstats
command.
As indicated in the comments, the above can be calculated used the streamstats function. Make sure you time is in ascending order, and then use streamstats to apply the stdev function on the fields you want to calculate the running stdev for (the same method applies for any other running aggregates, e.g. mean).
| streamstats stdev(ProcessTime1) AS stdev_ProcessTime1, stdev(ProcessTime2) AS stdev_ProcessTime2
| stats values(ProcessTime1), values(stdev_ProcessTime1), values(ProcessTime2), values(stdev_ProcessTime2), values(ABC_Count) BY Time
@ashish9433, you can check out Trendline and autoregress commands for moving averages
.
You should also check out Splunk documentation on Advanced Statistics if your end goal is identifying anomalies and outliers.
Finally do check out Splunk's Machine Learning Toolkit app (MLTK) for identifying numerical and categorical outliers. Numerical outliers have examples of Interquartile Range
, Mean Absolute Deviation
and Standard Deviation
using streamstats
command.
If I understand your question correctly, the standard deviation for each field will just be a single value. So do you want to plot a flat line across the whole bar chart to show the standard deviation? Or were you hoping to plot the new standard deviation from all past records as you include each new day of data?
I was looking to plot a single line of standard deviation. Basically i was trying using stats command, but later i used streamstats command and i was able to address what i was looking for.
Thanks for your revert @alistairmcdougall
Sounds like you were looking for a running avg()
and running stdev()
or something. Please post what you ended up with, so that later searchers can find it.
@DalJeanis You are correct, i was looking for running stdev(), but was not able to find out how to do that, so ended up with a straight line of stdev.
Do you have any idea how can we get running stdev() plotted for the above scenario?
@ashish9433, maybe you can add your working query and accept the same as answer so that others needing the same/similar use case may get benefited.
with stats i was able to get the result of a tabular data in a single row, however with streamstats the stats are applied consecutively to the aggregated data on a per row level, basically aggregates fields and the applies the stats until that row and appends the stats fields to that row. Good to know, but did not help.