Splunk Search

How can I keep only n% of results from a search?

nonspecialist
New Member

I have a set of web page performance measurements spanning quite some time, generated by an external monitoring provider. I want to find the mean page performance after removing spikes caused by external factors outside our control. I'm thinking of using a truncated mean as the best measure of central tendency, but I'm having problems with the implementation.

Here's my thinking so far:

  • find all page render times for the past 7 days
  • order by render time
  • remove the top and bottom 2.5%
  • calculate truncated mean from remaining values

I can calculate how many values I should be removing easily, but can't work out how to actually remove them. If there's a better way, I'd love to know it!

My query string (not yet working properly) so far is:

startdaysago=7 monitorid=<foo> | eventstats count(rendertime) as nresults | eval nkeep=nresults-ceil(nresults*0.05) | sort 0 -rendertime | head nkeep

but of course head won't accept a field name such as nkeep in place of a literal integer.


southeringtonp
Motivator

Have you considered using outlier to get rid of the edge cases?

     http://www.splunk.com/base/Documentation/4.1.5/SearchReference/Outlier


Alternately, how about this:

startdaysago=7 monitorid=<foo> 
| eventstats count(rendertime) as nresults
| eval low_clipping=(nresults*0.025)
| eval high_clipping=nresults-low_clipping
| sort 0 rendertime
| streamstats count as sequence_number
| where sequence_number>low_clipping AND sequence_number<high_clipping
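
To get from the filtered events to the truncated mean the question asks for, one more stats step should be enough. What follows is only a sketch under a few assumptions: every event has a rendertime field, truncated_mean is just an illustrative name for the output, and floor() with <= is my own tweak so that the same number of events is dropped from each end. It also uses sort 0 so that sort does not stop at its default limit of 10,000 results.

```
startdaysago=7 monitorid=<foo>
| eventstats count(rendertime) as nresults
| eval low_clipping=floor(nresults*0.025)
| eval high_clipping=nresults-low_clipping
| sort 0 rendertime
| streamstats count as sequence_number
| where sequence_number>low_clipping AND sequence_number<=high_clipping
| stats avg(rendertime) as truncated_mean
```

For example, with 200 events low_clipping is 5 and high_clipping is 195, so events 6 through 195 are kept and the mean is taken over the middle 190 values.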

nonspecialist
New Member

Awesome! I hadn't managed to find any reasonable examples of 'where', but that's exactly what I need. Thanks!
