Splunk Search

Filter out outliers in stats command

nterry
Path Finder

So I am trying to filter out outliers using the 3 sigma rule across some transactions. My search is as follows:

blah... blah... blah...
| eval durationMins = (duration / 60) 
| appendpipe [stats median(durationMins) as Med avg(durationMins) as Avrg stdev(durationMins) as Stddev max(durationMins) as Maxm] 
| eval threeSigmaLimit = (Avrg + (Stddev * 3))
| where durationMins < threeSigmaLimit

If i remove the appendpipe command and the eval and where stuff at the end, i get a table with the correct values, but I want to store the average and standard deviation, perform the eval statement, and filter out events where the duration of the transaction is greater than threeSigmaLimit on each event...

The above command doesn't work.... I can't even see the Med, Avrg, Stddev, and Maxm fields in the Events tab in verbose mode....

Any ideas?

0 Karma
1 Solution

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

View solution in original post

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

nterry
Path Finder

THANK YOU SO MUCH!!!!! I was banging my head against the wall for hours trying to figure this one out....

nterry
Path Finder

Quick question, do the fields Med, Avrg, etc contain their corresponding stats values after performing the where clause, or before.

If before, would I pipe it to this?

...
| where durationMins < threeSigmaLimit
| stats median(durationMins) as "Median Duration" avg(durationMins) as "Average Duration" stdev(durationMins) as "Standard Deviation" max(durationMins) as "Max Duration"
0 Karma
Get Updates on the Splunk Community!

Updated Team Landing Page in Splunk Observability

We’re making some changes to the team landing page in Splunk Observability, based on your feedback. The ...

New! Splunk Observability Search Enhancements for Splunk APM Services/Traces and ...

Regardless of where you are in Splunk Observability, you can search for relevant APM targets including service ...

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...