Splunk Search

How to subset top N records from the number generated from eventstats

zztc2004
Explorer

Hi Splunk friends,

I am new to Splunk community and currently facing a question.
I have below table which was generated by some raw log-line data .
stats2 is actually the aggregated sum of stats1 group by ID.

ID  stat1   stats2(eventstats sum of stat1 by ID)
1   1          6
1   2          6
1   3          6
2   4          9
2   5          9
3   6          21
3   7          21
3   8          21
4   9          10
4   0          10
4   1          10

What I am looking for is, returning the subset of below table and only pick top N =2 in terms of stats2, for example:

ID  stat1   stats2(eventstats sum of stat1 by ID)
3      6       21
3      7       21
3.     8       21
4      9       10
4      0       10
4      1       10

I tried several methods that all failed, and I do not want to leverage join statement, which is not efficient in Splunk.

Thanks so much for the help.
Jay

0 Karma
1 Solution

DalJeanis
Legend

Here's one way

  your query
 | appendpipe [| dedup stats2 | sort 2 - stats2 | table stats2 | eval keepme="Y"]
 | eventstats values(keepme) as keepme by stats2 
 | where isnotnull(keepme) and isnotnull(stats1)

Of course, it would be more efficient to, instead of | eventstats to calculate stats2, to do the appendpipe strategy and summarize by ID directly, keeping the top 2.

View solution in original post

zztc2004
Explorer

I figured out another way to realize my need.
Basically, by leveraging the "sub-search" to extract and return the orgID with the top N value of stat2, and use the result of sub-search as the filter criteria to the primary search query.
For sub-search, pls refer here: http://docs.splunk.com/Documentation/SplunkCloud/6.6.3/Search/Aboutsubsearches

0 Karma

kyaparla
Path Finder

if I understand it correctly, you want rows with top 2 stats2 values right?

some thing like this should work. replace regexes or eval x= expression based on your data.

| eval x=ID+":"+stat1 | stats list(x) as x by stats2 | sort - stats2 | head 2 | mvexpand x | rex field=x "(?<ID>[^\:]+)\:(?<stat1>.+)" | fields - x

DalJeanis
Legend

@kayaparla - nice. That would work.

0 Karma

DalJeanis
Legend

Here's one way

  your query
 | appendpipe [| dedup stats2 | sort 2 - stats2 | table stats2 | eval keepme="Y"]
 | eventstats values(keepme) as keepme by stats2 
 | where isnotnull(keepme) and isnotnull(stats1)

Of course, it would be more efficient to, instead of | eventstats to calculate stats2, to do the appendpipe strategy and summarize by ID directly, keeping the top 2.

zztc2004
Explorer

Hi @DalJeanis,

Any idea, why I got error when I try to use sort within appendpipe[]?
So confused....

Jay

0 Karma

zztc2004
Explorer

Thanks, this appendpipe is very helpful for my case!

Jay

Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Thanks for the Memories! Splunk University, .conf25, and our Community

Thank you to everyone in the Splunk Community who joined us for .conf25, which kicked off with our iconic ...

Data Persistence in the OpenTelemetry Collector

This blog post is part of an ongoing series on OpenTelemetry. What happens if the OpenTelemetry collector ...

Introducing Splunk 10.0: Smarter, Faster, and More Powerful Than Ever

Now On Demand Whether you're managing complex deployments or looking to future-proof your data ...