Splunk Search

How can I limit search results before piping to stats?

Communicator

I have a web request log with search-time extracted fields customerid, objectid, and responsetime. I would like to craft a search that finds the top 50 most-requested objectids for a given customerid, and then finds the average of the response times for those 50 objectids. At the end I should have a table with the column headings customerid, objectid, request_count, and average response time.

This search will return the top 50:
customerid=X | stats values(customerid) count as "Number of Requests" by object_id | sort -"Number of Requests" | head 50

But if I try to pipe that output to stats avg(response_time), I get an empty table ("No results found").

I can get it to work by doing the stats avg(responsetime) before the "head", but that seems grossly inefficient since the number of unique objectids can be large. I'm sure I'm missing something simple!


Legend

The most straightforward way of grabbing the stats would be as you suggest at the end of your question: calculate it all, then single out the top 50 object_ids along with their respective response time averages. A seemingly more efficient way would be to use a subsearch that gets the top 50 object_ids first, and then use those results in an outer search that grabs the averages.

customer_id=X [search customer_id=X | top 50 object_id | fields object_id]
| stats avg(response_time) AS response_average by object_id
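For anyone following the logic outside Splunk, the two-pass shape of the subsearch approach can be sketched in plain Python. The event data and field names below are hypothetical, and this is only an illustration of the flow, not SPL:

```python
from collections import Counter
from statistics import mean

# Hypothetical sample events for one customer (illustration only)
events = [
    {"object_id": "a", "response_time": 120},
    {"object_id": "a", "response_time": 80},
    {"object_id": "b", "response_time": 200},
    {"object_id": "c", "response_time": 50},
]

# Pass 1 (the "subsearch"): find the top-N most-requested object_ids
TOP_N = 2
counts = Counter(e["object_id"] for e in events)
top_ids = {obj for obj, _ in counts.most_common(TOP_N)}

# Pass 2 (the outer search): average response_time, restricted to those ids
averages = {
    obj: mean(e["response_time"] for e in events if e["object_id"] == obj)
    for obj in sorted(top_ids)
}
```

The cost of the second pass is what the subsearch version saves in Splunk: the averaging only touches the surviving object_ids.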

However, I doubt you'll get any noticeable performance boost from this: Splunk has to retrieve all the events to get the object_id counts anyway, so it might as well calculate the response time average for each object_id while it's at it. The following search should yield exactly the same results as the one above, so use whichever feels best and performs best. I'd be curious to hear reports of any speed differences!

customer_id=X | stats count, avg(response_time) AS response_average by object_id
| sort -count | head 50 | fields - count | sort -response_average
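The single-pass version accumulates the count and the average together in one grouping, then trims to the top N. A minimal sketch of that in plain Python, again with hypothetical data standing in for the indexed events:

```python
from collections import defaultdict

# Hypothetical sample events (illustration only)
events = [
    {"object_id": "a", "response_time": 120},
    {"object_id": "a", "response_time": 80},
    {"object_id": "b", "response_time": 200},
    {"object_id": "c", "response_time": 50},
]

# One pass over the events: per object_id, keep a count and a running sum
acc = defaultdict(lambda: {"count": 0, "total": 0})
for e in events:
    a = acc[e["object_id"]]
    a["count"] += 1
    a["total"] += e["response_time"]

# Equivalent of: stats count, avg(response_time) by object_id
#                | sort -count | head N
TOP_N = 2
top = sorted(acc.items(), key=lambda kv: -kv[1]["count"])[:TOP_N]
result = {obj: a["total"] / a["count"] for obj, a in top}
```

Everything is computed while the events stream by once, which is why computing the average "too early" costs essentially nothing extra.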


Communicator

I'll let you know if I see any speed differences.
