
Keeping highest 99% of values for each field value

jrnastase
Explorer

Hello all. I have calculated measures of a given statistic for a variety of values for the field "Link", and I need to keep the top 99% of values for each Link, and then find the average/minimum of what is left over. Any idea how to do that?

I tried sorting, then dedup X Link to return the top X values, but the problem is each link has a different number of points. Any help would be greatly appreciated!


DalJeanis
Legend

Yes, you've got it precisely. It's not possible to eliminate the bottom 1% without first making a pass over the data to find the cutoff for each Link, so eventstats is required. Then you have to pass over the data again to get the new average.

In other contexts, you can look at the outlier command for one-step cleaning. By default it moves inward everything that lies outside 2.5x the interquartile range.
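
For reference, a minimal sketch of what that might look like, assuming the numeric field is named statistic as elsewhere in this thread. The options are written out explicitly: action=transform and param=2.5 are the documented defaults, and uselower=true is added on the assumption that low values should be treated as well, since by default only values above the median are considered. As far as I know, outlier has no BY clause, so the thresholds here are computed across all events rather than per Link.

```
| outlier action=transform param=2.5 uselower=true statistic
| stats avg(statistic), min(statistic) BY Link
```

Swapping in action=remove should drop the outlying events instead of clamping them. Either way this is an IQR-based rule, not a percentile cut, so it will not reproduce a "keep the top 99%" filter exactly.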

jrnastase
Explorer

Not sure if this is the most efficient solution, but here's what I have so far, and it seems to work:

| eventstats perc01(statistic) as statistic_01p BY Field
| where statistic >= statistic_01p
| stats avg(statistic), min(statistic) BY Field
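
To try the same pattern without real data, here is a self-contained sketch built on generated events. Everything in it is an assumption for illustration: the two Link values "A" and "B", the counter n used as the statistic, and the output names avg_statistic and min_statistic. It uses Link as the split field (the search above writes it as Field) and spells the percentile function as perc1.

```
| makeresults count=200
| streamstats count as n
| eval Link=if(n%2==0, "A", "B"), statistic=n
| eventstats perc1(statistic) as statistic_01p BY Link
| where statistic >= statistic_01p
| stats avg(statistic) as avg_statistic, min(statistic) as min_statistic BY Link
```

Because eventstats computes the cutoff per Link, each link is trimmed against its own 1st percentile, which is what handles links with different numbers of points.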
