I feel like there should be an easy answer for this, but my brain isn't finding it, so hopefully someone can set me straight.
Suppose I have a log with the processing time for a number of URLs, across a number of servers. I want to toss into a summary index the top 10 longest running URLs per server, so I can later use it in a subsearch for host=foo.
In essence, this could work if top supported it:
MySearch earliest=-1d@d latest=@d
| bucket _time span=1d
| stats sum(ProcessTime) as ProcessTime by URL, host
| top limit=10 labelField=URL ProcessTime by host
| stats values(URL) by host
This also feels like something that could work if stats supported it:
MySearch earliest=-1d@d latest=@d
| bucket _time span=1d
| stats limit=10 sum(ProcessTime) as ProcessTime by URL, host
| stats values(URL) by host
How can I do what I'm trying to do?
Well, first of all, I will note that if you're using a summary index, be aware that your daily summaries won't aggregate: having the top 10 for each day in your summary does not let you derive the top 10 for, say, a whole week.
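If you do need cross-day top 10s later, one workaround (a sketch only; the index name my_summary is an assumption, and you'd normally do this via a scheduled search) is to collect the full per-URL daily sums rather than only the top 10:

MySearch earliest=-1d@d latest=@d
| stats sum(ProcessTime) as ProcessTime by URL,host
| collect index=my_summary

You can then run the top-10 pipeline below against index=my_summary over whatever span you want, at the cost of a larger summary.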
If you just want what you're asking for, though, a quick way to get this is:
MySearch earliest=-1d@d latest=@d
| bucket _time span=1d
| stats sum(ProcessTime) as ProcessTime by URL,host
| sort host,-ProcessTime
| streamstats global=f current=f window=0 count by host
| where count < 10
| fields - count
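The trick here: after sorting each host's URLs by descending ProcessTime, streamstats with current=f counts the events before the current one within each host (0 for the first), so where count < 10 keeps exactly the first ten per host. Since your stated goal is a subsearch for host=foo, once these results are in a summary index (again assuming a hypothetical index named my_summary), the later search could look something like:

MySearch host=foo
    [ search index=my_summary host=foo | dedup URL | fields URL ]
| stats sum(ProcessTime) by URL

The subsearch expands into an OR of URL=... terms, restricting the outer search to that host's top-10 URLs.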
I've turned this comment into a question of its own: http://splunk-base.splunk.com/answers/30247/top-values-by-multiple-fields-with-large-datasets
I'm working on a different scenario for the same issue now, with much greater field variability. What is the upper limit on how many values I can toss at sort | streamstats | where | fields before I start getting failures?
I'm splitting by three fields -- FieldA has 30 options, FieldB has up to 2000 and FieldC has up to 10,000. In the raw data, right now I have about 500,000 different possibilities going into the sort, with the expectation of exceeding 1,000,000 during the lifetime of the app.
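One hard limit worth knowing here, separate from streamstats: by default sort truncates its output to 10,000 results, so at 500,000 combinations the pipeline above would silently drop most rows before streamstats ever saw them. A limit of 0 removes the cap, so a sketch adapted to your fields (treating FieldA,FieldB as the group key is my assumption) would be:

MySearch
| stats sum(ProcessTime) as ProcessTime by FieldA,FieldB,FieldC
| sort 0 FieldA,FieldB,-ProcessTime
| streamstats global=f current=f window=0 count by FieldA,FieldB
| where count < 10
| fields - count

Past that, the practical ceiling is mostly search-head memory and limits.conf settings (e.g. maxresultrows), which vary by deployment, so I'd test at the 1,000,000 scale rather than assume it holds.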