
calculate request count and duration in a single summary index

New Member

I'd like to collect two pieces of information from wildfly access logs in a single summary index: the average number of requests per minute by URI, and the avg/mode/max request duration, also by URI.

Here are the pertinent fields logged in each wildfly event:
- _time
- method
- uri
- time_taken
- host

My first query looked like this:

sourcetype=wildfly _logs | bucket _time span=1m | sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri host _time

However, this resulted in a lot of noise because uri in its raw form contains unique query strings. I'm only interested in calculating time_taken stats for generic URIs (http://www.example.com/somecontroller/someaction vs http://www.example.com/somecontroller/someaction/?QueryString1=foo).

So I tried stripping off the query string portion of uri:

sourcetype=wildfly _logs | rex field=uri "^(?<uri_base_url>.+?)\?" | bucket _time span=1m | sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri_base_url host _time

This doesn't work either: request_count is under-counted because of the way I'm stripping off the query string.

I know I can achieve what I'm after by splitting this summary search into two queries, but it feels like something that can be achieved in a single query. Any pointers are appreciated.


SplunkTrust

Consider using the URL Toolbox app to parse the uri field for you. It uses an external command rather than rex and probably handles edge cases much better.
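
If installing an app isn't an option, a rex-only sketch may be enough. The under-count most likely comes from the pattern requiring a literal ? in uri: a request with no query string never matches, uri_base_url stays empty for that event, and stats-family commands leave out events whose by field is null. Matching everything up to the first ? (or the whole value when there is none) avoids that. This is untested against your data. The base search and field names are copied from your post, and the character class [^?]+ is my substitution for your pattern.

```spl
sourcetype=wildfly _logs
| rex field=uri "^(?<uri_base_url>[^?]+)"
| bucket _time span=1m
| sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri_base_url host _time
```

Note that a trailing slash before the query string (someaction/ vs someaction) would still produce two separate uri_base_url values, so you may want to normalize that as well.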
