Hello all. I have calculated measures of a given statistic for a variety of values for the field "Link", and I need to keep the top 99% of values for each Link, and then find the average/minimum of what is left over. Any idea how to do that?
I tried sorting, then dedup X Link to return the top X values, but the problem is each link has a different number of points. Any help would be greatly appreciated!
Yes, you've got it precisely. It's not possible to eliminate the bottom 1% without first making a pass over the data to find the cutoff for each Link, so eventstats is required. Then you have to pass over the data again to get the new average.
In other contexts, you can look at outlier for a one-step cleaning command that, by default, moves inward everything that is outside of 2.5x the interquartile range.
Not sure if this is the most efficient solution, but here's what I have so far that seems to work:
| eventstats perc01(statistic) as statistic_01p BY Field
| where statistic >= statistic_01p
| stats avg(statistic), min(statistic) BY Field
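If it helps to sanity-check the logic outside Splunk, here is a rough Python sketch of the same two passes: first find a 1st-percentile cutoff per Link, then keep everything at or above it and compute the average and minimum. The function names (nearest_rank_percentile, trimmed_stats) are made up for illustration, and the nearest-rank percentile used here is an assumption; Splunk's perc01 may compute the cutoff differently, so exact results can differ slightly.

```python
from collections import defaultdict


def nearest_rank_percentile(values, pct):
    """Return the pct-th percentile (integer pct) using the nearest-rank method.

    Assumption: nearest-rank is used for simplicity; it is not claimed to
    match how Splunk's perc01() picks its cutoff.
    """
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, clamped so the rank is at least 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def trimmed_stats(rows, pct=1):
    """rows is an iterable of (link, statistic) pairs.

    Pass 1 groups values by link and finds each link's cutoff
    (the role eventstats plays in the search above).
    Pass 2 keeps values >= cutoff and computes avg and min
    (the where + stats steps).
    """
    by_link = defaultdict(list)
    for link, value in rows:
        by_link[link].append(value)

    result = {}
    for link, values in by_link.items():
        cutoff = nearest_rank_percentile(values, pct)
        kept = [v for v in values if v >= cutoff]
        result[link] = {"avg": sum(kept) / len(kept), "min": min(kept)}
    return result
```

Because the cutoff is computed per link, it does not matter that each link has a different number of points. Note that with this method a link with fewer than about 100 points gets its own minimum as the cutoff, so nothing is dropped for it.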