Splunk Search

How can I limit search results before piping to stats?

tpsplunk
Communicator

I have a web request log with the search-time extracted fields customer_id, object_id, and response_time. I would like to craft a search that finds the top 50 most requested object_ids for a given customer_id and then finds the average of the response times for those 50 object_ids. So at the end I should have a table with the column headings customer_id, object_id, request_count, and average response time.

This search returns the top 50:
customer_id=X | stats values(customer_id) count as "Number of Requests" by object_id | sort -"Number of Requests" | head 50

But if I try to pipe that output to stats avg(response_time), I get an empty table ("No results found").

I can get it to work by doing the stats avg(response_time) before the "head", but that seems grossly inefficient since the number of unique object_ids can be large. I'm sure I'm missing something simple!
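(For what it's worth, and assuming the piped version looked roughly like the sketch below: the likely reason for the empty table is that the first stats command replaces the raw events with one summary row per object_id, and those rows only carry object_id, values(customer_id) and "Number of Requests". The response_time field no longer exists after that point, so the second stats has nothing to average.)

customer_id=X | stats values(customer_id) count as "Number of Requests" by object_id
| sort -"Number of Requests" | head 50
| stats avg(response_time) as response_average

Computing the average in the same stats call that does the counting, as the answer below does, avoids the problem.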

1 Solution

Ayn
Legend

The most straightforward way of grabbing the stats would be, as you suggest at the end of your question, to just calculate it all and then single out the top 50 object_ids along with their respective response time averages. A seemingly more efficient way would be to use a subsearch that gets the top 50 object_ids first, and then use the results from that subsearch in another search that grabs the averages.

customer_id=X [search customer_id=X | top 50 object_id | fields object_id]
| stats avg(response_time) AS response_average by object_id

However, I doubt you'll get any noticeable performance boost by doing this. Splunk has to retrieve all the events to get the object_id counts anyway, so it might as well calculate the response time average for each object_id while it's at it. The following search should yield the exact same results as the one above, so just use whichever feels best and gives the best performance. I'd be curious to hear reports of any speed differences!

customer_id=X | stats count, avg(response_time) AS response_average by object_id
| sort -count | head 50 | fields - count | sort -response_average
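If you want the output to match the exact column list from the question (customer_id, object_id, request_count, average response time), a small variation along these lines should do it. The by customer_id is only there to carry that field into the results table, since the base search already restricts to a single customer, and request_count is just a rename of count:

customer_id=X | stats count as request_count, avg(response_time) AS response_average by customer_id, object_id
| sort -request_count | head 50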

tpsplunk
Communicator

I'll let you know if I see any speed differences.
