Splunk Search

Find top n in each group

saumitra
Engager

I have a collection of records in [object_name, execution_time] format. I want to gather top 10 (i.e. first 10 in sorted sequence) execution time values for each object.

I could extract execution times grouped by object name by

index=myindex | stats values(execTime) as MaxTime by objectName

But I could not find way to sort the individual groups and getting top 10 in each of them.

Thanks,

[EDIT]

Looks like I did not explain the question well. Actually I have a database for how much time did the web service take to execute, in different calls. (I called web service as 'object_name' above). There are various web services.
e.g. I have web services w1,w2, .., wn. I have following records
[w1,e11] [w1,e12] [w1, e13] ... [w2,e21] [w2,e22] [w2,e23] .... [wn,en1] [wn,en2] [wn,en3] ....
e11 represents the time w1 took in its first call.

Now I want the following
w1: t11,t12, ...,t1n
w2: t21,t22, ...,t2n
...

t11, t12, t1n are top n for w1. Top n are the highest 'n' values of e11,e12,,e13 ... in sorted sequence.
e.g. If e11,e12,e13,... are ,8,8,8,8,10,7,7,7,9,9 then Top two are 10,9.

Now Splunk top function would not work here because I would give me 8,7 which are the most frequent two. I don't want that.

It's like list of students' scores in each subject. I want n highest scoring students for each subject. Hope that now it's clear.

Tags (3)
0 Karma
1 Solution

chris
Motivator

Like Ayn I don't quite understand what you're after. Maybe you could post some sample events with the desired output format.

Maybe one of those works for you:

index=myindex | table execTime,objectName | sort -execTime |  streamstats count by objectName |  stats  list(eval(if(count<11,execTime,null()))) as MaxTime  by objectName


index=myindex | table execTime,objectName | sort -execTime |  streamstats count by objectName |  where count < 11

View solution in original post

chris
Motivator

Like Ayn I don't quite understand what you're after. Maybe you could post some sample events with the desired output format.

Maybe one of those works for you:

index=myindex | table execTime,objectName | sort -execTime |  streamstats count by objectName |  stats  list(eval(if(count<11,execTime,null()))) as MaxTime  by objectName


index=myindex | table execTime,objectName | sort -execTime |  streamstats count by objectName |  where count < 11

View solution in original post

saumitra
Engager

Sorry for delay, After bit tweaking in your last query, following worked for me:
index=myIndex | eval time=execution_time/1000000.0 | table object_name,time | sort -time | streamstats count by object_name | stats list(eval(if(count<11,execution_time,null()))) as Values by object_name
Thanks !!

0 Karma

chris
Motivator

Hm my bad, sorry. I updated the answer. The streamstats part was not correct. This is a search that works on internal splunk data: index=_internal source="metrics.log" largest_size="" | table name,largest_size | sort -largest_size | streamstats count by name | stats list(eval(if(count<11,largest_size,null()))) as Values by name

0 Karma

saumitra
Engager

Tried these, but both gave more than 10 results for every objectName.

0 Karma

Ayn
Legend

Not entirely sure what exact output you're after. top 10 execTime by objectName?

saumitra
Engager

I have edited qn now. Top would not work for me.

0 Karma

saumitra
Engager

Any response, please?

0 Karma
Take the 2021 Splunk Career Survey

Help us learn about how Splunk has
impacted your career by taking the 2021 Splunk Career Survey.

Earn $50 in Amazon cash!