Knowledge Management

calculate request count and duration in a single summary index

badtakemonger
New Member

I'd like to collect two pieces of information from wildfly access logs in a single summary index: the average number of requests per minute by URI, and the avg/mode/max request duration, also by URI.

Here are the pertinent fields logged in each wildfly event:
- _time
- method
- uri
- time_taken
- host

My first query looked like this:

sourcetype=wildfly _logs | bucket _time span=1m | sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri host _time

However, this resulted in a lot of noise because uri in its raw form contains unique query strings. I'm only interested in calculating time_taken stats for generic URIs (http://www.example.com/somecontroller/someaction vs http://www.example.com/somecontroller/someaction/?QueryString1=foo).

So I tried stripping off the query string portion of uri:

sourcetype=wildfly _logs | rex field=uri "^(?<uri_base_url>.+?)\?" | bucket _time span=1m | sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri_base_url host _time

This doesn't work either: request_count is under-counted because of the way I'm stripping off the query string.

I know I can achieve what I'm after by splitting this summary search into two queries, but it feels like this should be possible in a single query. Any pointers are appreciated.


richgalloway
SplunkTrust

Consider using the URL Toolbox app to parse the uri field for you. It uses an external command rather than rex and probably handles edge cases much better.
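If installing an app isn't an option, here is a minimal rex-only sketch. It assumes the under-count comes from URIs that have no query string: the original pattern requires a literal "?", so those events never get a uri_base_url value and drop out of the by clause. Matching on "everything up to the first ?" instead should give every event a value, whether or not a query string is present. The base search and field names are copied from the question; I haven't tested this against your data.

```
sourcetype=wildfly _logs
| rex field=uri "^(?<uri_base_url>[^?]+)"
| bucket _time span=1m
| sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri_base_url host _time
```

Note that this does not normalize trailing slashes, so /somecontroller/someaction and /somecontroller/someaction/ would still be counted as separate URIs.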
