Splunk Search

How can I limit search results before piping to stats?

Communicator

I have a web request log with search-time extracted fields customerid, objectid, and responsetime. I would like to craft a search that finds the top 50 most-requested objectids for a given customerid, and then finds the average of the response times for those 50 objectids. At the end I should have a table with the column headings customerid, objectid, request_count, and average response time.

This search will return the top 50:
customerid=X | stats values(customerid) count as "Number of Requests" by object_id | sort -"Number of Requests" | head 50

But if I try to pipe that output to stats avg(response_time), I get an empty table ("No results found").

I can get it to work by doing the stats avg(responsetime) before the "head", but that seems grossly inefficient since the number of unique objectids can be large. I'm sure I'm missing something simple!


Legend

The most straightforward way of grabbing the stats would be as you suggest at the end of your question: calculate it all, then single out the top 50 object_ids along with their respective response time averages. A seemingly more efficient way would be to use a subsearch that gets the top 50 object_ids first, and then use those results in an outer search that grabs the averages.

customer_id=X [search customer_id=X | top 50 object_id | fields object_id]
| stats avg(response_time) AS response_average by object_id
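For anyone following the logic outside Splunk, the two-pass shape of the subsearch approach can be sketched in plain Python. The event data and field names below are hypothetical, and this is only an illustration of the flow, not SPL:

```python
from collections import Counter
from statistics import mean

# Hypothetical sample events for one customer (illustration only)
events = [
    {"object_id": "a", "response_time": 120},
    {"object_id": "a", "response_time": 80},
    {"object_id": "b", "response_time": 200},
    {"object_id": "c", "response_time": 50},
]

# Pass 1 (the "subsearch"): find the top-N most-requested object_ids
TOP_N = 2
counts = Counter(e["object_id"] for e in events)
top_ids = {obj for obj, _ in counts.most_common(TOP_N)}

# Pass 2 (the outer search): average response_time, restricted to those ids
averages = {
    obj: mean(e["response_time"] for e in events if e["object_id"] == obj)
    for obj in sorted(top_ids)
}
```

The cost of the second pass is what the subsearch version saves in Splunk: the averaging only touches the surviving object_ids.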

However, I doubt you'll get any noticeable performance boost from this: Splunk has to retrieve all the events to get the object_id counts anyway, so it might as well calculate the response time average for each object_id while it's at it. The following search should yield exactly the same results as the one above, so use whichever feels best and performs best. I'd be curious to hear reports of any speed differences!

customer_id=X | stats count, avg(response_time) AS response_average by object_id
| sort -count | head 50 | fields - count | sort -response_average
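The single-pass version accumulates the count and the average together in one grouping, then trims to the top N. A minimal sketch of that in plain Python, again with hypothetical data standing in for the indexed events:

```python
from collections import defaultdict

# Hypothetical sample events (illustration only)
events = [
    {"object_id": "a", "response_time": 120},
    {"object_id": "a", "response_time": 80},
    {"object_id": "b", "response_time": 200},
    {"object_id": "c", "response_time": 50},
]

# One pass over the events: per object_id, keep a count and a running sum
acc = defaultdict(lambda: {"count": 0, "total": 0})
for e in events:
    a = acc[e["object_id"]]
    a["count"] += 1
    a["total"] += e["response_time"]

# Equivalent of: stats count, avg(response_time) by object_id
#                | sort -count | head N
TOP_N = 2
top = sorted(acc.items(), key=lambda kv: -kv[1]["count"])[:TOP_N]
result = {obj: a["total"] / a["count"] for obj, a in top}
```

Everything is computed while the events stream by once, which is why computing the average "too early" costs essentially nothing extra.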


Communicator

I'll let you know if I see any speed differences.
