Splunk Search

Filter out outliers in stats command

nterry
Path Finder

So I am trying to filter out outliers using the 3 sigma rule across some transactions. My search is as follows:

blah... blah... blah...
| eval durationMins = (duration / 60) 
| appendpipe [stats median(durationMins) as Med avg(durationMins) as Avrg stdev(durationMins) as Stddev max(durationMins) as Maxm] 
| eval threeSigmaLimit = (Avrg + (Stddev * 3))
| where durationMins < threeSigmaLimit

If i remove the appendpipe command and the eval and where stuff at the end, i get a table with the correct values, but I want to store the average and standard deviation, perform the eval statement, and filter out events where the duration of the transaction is greater than threeSigmaLimit on each event...

The above command doesn't work.... I can't even see the Med, Avrg, Stddev, and Maxm fields in the Events tab in verbose mode....

Any ideas?

0 Karma
1 Solution

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

View solution in original post

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

nterry
Path Finder

THANK YOU SO MUCH!!!!! I was banging my head against the wall for hours trying to figure this one out....

nterry
Path Finder

Quick question, do the fields Med, Avrg, etc contain their corresponding stats values after performing the where clause, or before.

If before, would I pipe it to this?

...
| where durationMins < threeSigmaLimit
| stats median(durationMins) as "Median Duration" avg(durationMins) as "Average Duration" stdev(durationMins) as "Standard Deviation" max(durationMins) as "Max Duration"
0 Karma
Get Updates on the Splunk Community!

Splunk Observability Cloud | Unified Identity - Now Available for Existing Splunk ...

Raise your hand if you’ve already forgotten your username or password when logging into an account. (We can’t ...

Index This | How many sides does a circle have?

February 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

Registration for Splunk University is Now Open!

Are you ready for an adventure in learning?   Brace yourselves because Splunk University is back, and it's ...