Splunk Search

Filter out outliers in stats command

nterry
Path Finder

So I am trying to filter out outliers using the 3 sigma rule across some transactions. My search is as follows:

blah... blah... blah...
| eval durationMins = (duration / 60) 
| appendpipe [stats median(durationMins) as Med avg(durationMins) as Avrg stdev(durationMins) as Stddev max(durationMins) as Maxm] 
| eval threeSigmaLimit = (Avrg + (Stddev * 3))
| where durationMins < threeSigmaLimit

If i remove the appendpipe command and the eval and where stuff at the end, i get a table with the correct values, but I want to store the average and standard deviation, perform the eval statement, and filter out events where the duration of the transaction is greater than threeSigmaLimit on each event...

The above command doesn't work.... I can't even see the Med, Avrg, Stddev, and Maxm fields in the Events tab in verbose mode....

Any ideas?

0 Karma
1 Solution

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

View solution in original post

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

nterry
Path Finder

THANK YOU SO MUCH!!!!! I was banging my head against the wall for hours trying to figure this one out....

nterry
Path Finder

Quick question, do the fields Med, Avrg, etc contain their corresponding stats values after performing the where clause, or before.

If before, would I pipe it to this?

...
| where durationMins < threeSigmaLimit
| stats median(durationMins) as "Median Duration" avg(durationMins) as "Average Duration" stdev(durationMins) as "Standard Deviation" max(durationMins) as "Max Duration"
0 Karma
Get Updates on the Splunk Community!

Fueling your curiosity with new Splunk ILT and eLearning courses

At Splunk Education, we’re driven by curiosity—both ours and yours! That’s why we’re committed to delivering ...

Splunk AI Assistant for SPL 1.1.0 | Now Personalized to Your Environment for Greater ...

Splunk AI Assistant for SPL has transformed how users interact with Splunk, making it easier than ever to ...

Unleash Unified Security and Observability with Splunk Cloud Platform

     Now Available on Microsoft AzureOn Demand Now Step boldly into the AI revolution with enhanced security ...