Splunk Search

How can I keep only n% of results from a search?

nonspecialist
New Member

I have a set of web page performance measurements, collected over quite some time by an external monitoring provider. I want to find the mean page performance after removing spikes caused by external factors outside our control. I'm thinking of using a truncated mean as the best measure of central tendency, but I'm having problems with the implementation.

Here's my thinking so far:

  • find all page render times for the past 7 days
  • order by render time
  • remove the top and bottom 2.5%
  • calculate truncated mean from remaining values
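Outside Splunk, the steps above can be sketched in a few lines of Python, which may help when sanity-checking whatever number a search returns. This is only an illustration: the helper name `truncated_mean`, the `trim_fraction` parameter and the sample render times are made up for the example and do not come from the monitoring data.

```python
import math


def truncated_mean(values, trim_fraction=0.025):
    """Mean of `values` after dropping `trim_fraction` of them from each end.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(values)                 # order by render time
    n = len(ordered)
    k = math.floor(n * trim_fraction)        # number of values to drop at each end
    kept = ordered[k:n - k]                  # remove the top and bottom slices
    if not kept:
        raise ValueError("nothing left after trimming")
    return sum(kept) / len(kept)             # mean of the remaining values


# Made-up render times: 38 typical values plus one spike at each end.
samples = [100.0] * 38 + [1.0, 9000.0]
print(truncated_mean(samples))
```

With 40 samples and a 2.5% trim, exactly one value is dropped from each end, so both spikes fall out of the average.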

I can calculate how many values I should be removing easily, but can't work out how to actually remove them. If there's a better way, I'd love to know it!

My query string (not yet working properly) so far is:

startdaysago=7 monitorid=<foo> | eventstats count(rendertime) as nresults | eval nkeep=nresults-ceil(nresults*0.05) | sort 0 -rendertime | head nkeep

but of course head won't accept a field name such as nkeep in place of an integer.

1 Solution

southeringtonp
Motivator

Have you considered using the outlier command to get rid of the edge cases?

     http://www.splunk.com/base/Documentation/4.1.5/SearchReference/Outlier


Alternately, how about this:

startdaysago=7 monitorid=<foo>
| eventstats count(rendertime) as nresults
| eval low_clipping=(nresults*0.025)
| eval high_clipping=nresults-low_clipping
| sort 0 rendertime
| streamstats count as sequence_number
| where sequence_number>low_clipping AND sequence_number<=high_clipping
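One detail worth checking in a rank filter like this is whether the upper bound is inclusive. The Python sketch below mimics the pipeline outside Splunk (count, clipping bounds, sort, running sequence number, filter) so the effect can be seen on a small data set. The function name `rank_filter`, its `inclusive_upper` flag and the 200 sample values are assumptions made for the illustration, not part of the search.

```python
def rank_filter(values, trim_fraction=0.025, inclusive_upper=True):
    """Keep values whose 1-based rank lies between the clipping bounds.

    Hypothetical helper that mirrors the eventstats / sort / streamstats /
    where steps; for illustration only.
    """
    n = len(values)                          # like eventstats count
    low = n * trim_fraction                  # low_clipping
    high = n - low                           # high_clipping
    ranked = enumerate(sorted(values), start=1)  # sort, then running count
    if inclusive_upper:
        return [v for rank, v in ranked if low < rank <= high]
    return [v for rank, v in ranked if low < rank < high]


# 200 made-up values chosen so that each value equals its own rank.
values = list(range(1, 201))
print(len(rank_filter(values)))                         # inclusive upper bound
print(len(rank_filter(values, inclusive_upper=False)))  # strict upper bound
```

For 200 values the bounds are 5 and 195. The inclusive form keeps ranks 6 to 195 (190 values, five trimmed from each end), while the strict form keeps only 189 because rank 195 is dropped as well.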

nonspecialist
New Member

Awesome! I hadn't managed to find any reasonable examples of 'where', but that's exactly what I need. Thanks!
