Splunk Search

Filter out outliers in stats command

nterry
Path Finder

So I am trying to filter out outliers using the 3 sigma rule across some transactions. My search is as follows:

blah... blah... blah...
| eval durationMins = (duration / 60) 
| appendpipe [stats median(durationMins) as Med avg(durationMins) as Avrg stdev(durationMins) as Stddev max(durationMins) as Maxm] 
| eval threeSigmaLimit = (Avrg + (Stddev * 3))
| where durationMins < threeSigmaLimit

If i remove the appendpipe command and the eval and where stuff at the end, i get a table with the correct values, but I want to store the average and standard deviation, perform the eval statement, and filter out events where the duration of the transaction is greater than threeSigmaLimit on each event...

The above command doesn't work.... I can't even see the Med, Avrg, Stddev, and Maxm fields in the Events tab in verbose mode....

Any ideas?

0 Karma
1 Solution

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

View solution in original post

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

nterry
Path Finder

THANK YOU SO MUCH!!!!! I was banging my head against the wall for hours trying to figure this one out....

nterry
Path Finder

Quick question, do the fields Med, Avrg, etc contain their corresponding stats values after performing the where clause, or before.

If before, would I pipe it to this?

...
| where durationMins < threeSigmaLimit
| stats median(durationMins) as "Median Duration" avg(durationMins) as "Average Duration" stdev(durationMins) as "Standard Deviation" max(durationMins) as "Max Duration"
0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...