Splunk Search

How can I keep only n% of results from a search?

nonspecialist
New Member

I have a set of web page performance measurements spanning quite some time, generated by an external monitoring provider. I want to be able to find the mean page performance after removing spikes caused by external factors out of our control, and am thinking along the lines of using a truncated mean as a best measure of central tendency but am having problems with the implementation.

Here's my thinking so far:

  • find all page render times for the past 7 days
  • order by render time
  • remove the top and bottom 2.5%
  • calculate truncated mean from remaining values

I can calculate how many values I should be removing easily, but can't work out how to actually remove them. If there's a better way, I'd love to know it!

My query string (not yet working properly) so far is:

startdaysago=7 monitorid=<foo> | eventstats count(rendertime) as nresults | eval nkeep=nresults-ceil(nresults*0.05) | sort 0 -rendertime | head nkeep

but of course head can't take a parameter that's not an integer.

0 Karma
1 Solution

southeringtonp
Motivator

Have you considered using outlier to get rid of the edge cases?

     http://www.splunk.com/base/Documentation/4.1.5/SearchReference/Outlier


Alternately, how about this:

startdaysago=7 monitorid=<foo> 
| eventstats count(rendertime) as nresults
| eval low_clipping=(nresults*0.025)
| eval high_clipping=nresults-low_clipping
| sort rendertime
| streamstats count as sequence_number
| where sequence_number>low_clipping AND sequence_number<high_clipping

View solution in original post

southeringtonp
Motivator

Have you considered using outlier to get rid of the edge cases?

     http://www.splunk.com/base/Documentation/4.1.5/SearchReference/Outlier


Alternately, how about this:

startdaysago=7 monitorid=<foo> 
| eventstats count(rendertime) as nresults
| eval low_clipping=(nresults*0.025)
| eval high_clipping=nresults-low_clipping
| sort rendertime
| streamstats count as sequence_number
| where sequence_number>low_clipping AND sequence_number<high_clipping

nonspecialist
New Member

Awesome! I hadn't managed to find any reasonable examples of 'where', but that's exactly what I need. Thanks!

0 Karma
Get Updates on the Splunk Community!

Index This | Why did the turkey cross the road?

November 2025 Edition  Hayyy Splunk Education Enthusiasts and the Eternally Curious!   We’re back with this ...

Enter the Agentic Era with Splunk AI Assistant for SPL 1.4

  &#x1f680; Your data just got a serious AI upgrade — are you ready? Say hello to the Agentic Era with the ...

Feel the Splunk Love: Real Stories from Real Customers

Hello Splunk Community,    What’s the best part of hearing how our customers use Splunk? Easy: the positive ...