Splunk Search

Filter out outliers in stats command

nterry
Path Finder

So I am trying to filter out outliers using the 3 sigma rule across some transactions. My search is as follows:

blah... blah... blah...
| eval durationMins = (duration / 60) 
| appendpipe [stats median(durationMins) as Med avg(durationMins) as Avrg stdev(durationMins) as Stddev max(durationMins) as Maxm] 
| eval threeSigmaLimit = (Avrg + (Stddev * 3))
| where durationMins < threeSigmaLimit

If i remove the appendpipe command and the eval and where stuff at the end, i get a table with the correct values, but I want to store the average and standard deviation, perform the eval statement, and filter out events where the duration of the transaction is greater than threeSigmaLimit on each event...

The above command doesn't work.... I can't even see the Med, Avrg, Stddev, and Maxm fields in the Events tab in verbose mode....

Any ideas?

0 Karma
1 Solution

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

View solution in original post

aljohnson_splun
Splunk Employee
Splunk Employee

Change stats to eventstats. Furthermore, you don't need the appendpipe.

| eval durationMins = (duration/60)
| eventstats median(durationMins) as Med, avg(durationMins) as Avrg, stdev(durationMins) as StDev, max(durationMins) as Maxm
| eval threeSigmaLimit = (Avrg + (StDev * 3))
| where durationMins < threeSigmaLimit

Eventstats command documentation here.

nterry
Path Finder

THANK YOU SO MUCH!!!!! I was banging my head against the wall for hours trying to figure this one out....

nterry
Path Finder

Quick question, do the fields Med, Avrg, etc contain their corresponding stats values after performing the where clause, or before.

If before, would I pipe it to this?

...
| where durationMins < threeSigmaLimit
| stats median(durationMins) as "Median Duration" avg(durationMins) as "Average Duration" stdev(durationMins) as "Standard Deviation" max(durationMins) as "Max Duration"
0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

[Puzzles] Solve, Learn, Repeat: Character substitutions with Regular Expressions

This challenge was first posted on Slack #puzzles channelFor BORE at .conf23, we had a puzzle question which ...

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

[Puzzles] Solve, Learn, Repeat: Matching cron expressions

This puzzle (first published here) is based on matching timestamps to cron expressions.All the timestamps ...