<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic calculate request count and duration in a single summary index in Knowledge Management</title>
    <link>https://community.splunk.com/t5/Knowledge-Management/calculate-request-count-and-duration-in-a-single-summary-index/m-p/496772#M4455</link>
    <description>&lt;P&gt;I'd like to collect two pieces of information from wildfly access logs in a single summary index: the average number of requests per minute by URI &lt;STRONG&gt;and&lt;/STRONG&gt; the avg/mode/max request duration, also by URI.&lt;/P&gt;

&lt;P&gt;Here are the pertinent fields logged in each wildfly event:&lt;BR /&gt;
- _time&lt;BR /&gt;
- method&lt;BR /&gt;
- uri &lt;BR /&gt;
- time_taken &lt;BR /&gt;
- host&lt;/P&gt;

&lt;P&gt;My first query looked like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=wildfly _logs | bucket _time span=1m | sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri host _time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, this resulted in a lot of noise because uri in its raw form contains unique query strings. I'm only interested in calculating time_taken stats for generic uris (&lt;A href="http://www.example.com/somecontroller/someaction" target="_blank"&gt;http://www.example.com/somecontroller/someaction&lt;/A&gt; vs &lt;A href="http://www.example.com/somecontroller/someaction/?QueryString1=foo" target="_blank"&gt;http://www.example.com/somecontroller/someaction/?QueryString1=foo&lt;/A&gt;)&lt;/P&gt;

&lt;P&gt;So I tried stripping off the query string portion of uri:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=wildfly _logs | rex field=uri "^(?&amp;lt;uri_base_url&amp;gt;.+?)\?" | bucket _time span=1m | sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri_base_url host _time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This doesn't work either: request_count is under-counted because of the way I'm stripping off the query string.&lt;/P&gt;

&lt;P&gt;I know I can achieve what I'm after by splitting this summary search into two queries, but it &lt;STRONG&gt;feels&lt;/STRONG&gt; like something that can be achieved in a single query. Any pointers are appreciated.&lt;/P&gt;</description>
    <pubDate>Wed, 30 Sep 2020 02:27:39 GMT</pubDate>
    <dc:creator>badtakemonger</dc:creator>
    <dc:date>2020-09-30T02:27:39Z</dc:date>
    <item>
      <title>calculate request count and duration in a single summary index</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/calculate-request-count-and-duration-in-a-single-summary-index/m-p/496772#M4455</link>
      <description>&lt;P&gt;I'd like to collect two pieces of information from wildfly access logs in a single summary index: the average number of requests per minute by URI &lt;STRONG&gt;and&lt;/STRONG&gt; the avg/mode/max request duration, also by URI.&lt;/P&gt;

&lt;P&gt;Here are the pertinent fields logged in each wildfly event:&lt;BR /&gt;
- _time&lt;BR /&gt;
- method&lt;BR /&gt;
- uri &lt;BR /&gt;
- time_taken &lt;BR /&gt;
- host&lt;/P&gt;

&lt;P&gt;My first query looked like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=wildfly _logs | bucket _time span=1m | sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri host _time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, this resulted in a lot of noise because uri in its raw form contains unique query strings. I'm only interested in calculating time_taken stats for generic uris (&lt;A href="http://www.example.com/somecontroller/someaction" target="_blank"&gt;http://www.example.com/somecontroller/someaction&lt;/A&gt; vs &lt;A href="http://www.example.com/somecontroller/someaction/?QueryString1=foo" target="_blank"&gt;http://www.example.com/somecontroller/someaction/?QueryString1=foo&lt;/A&gt;)&lt;/P&gt;

&lt;P&gt;So I tried stripping off the query string portion of uri:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=wildfly _logs | rex field=uri "^(?&amp;lt;uri_base_url&amp;gt;.+?)\?" | bucket _time span=1m | sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri_base_url host _time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This doesn't work either: request_count is under-counted because of the way I'm stripping off the query string.&lt;/P&gt;
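
&lt;P&gt;A likely cause of the under-count, stated as an assumption rather than something verified against this data: the pattern in the query above requires a literal &lt;CODE&gt;?&lt;/CODE&gt;, so &lt;CODE&gt;rex&lt;/CODE&gt; leaves uri_base_url unset for requests that have no query string, and events with a missing by-field are left out of the results. A minimal sketch that instead captures everything before the first &lt;CODE&gt;?&lt;/CODE&gt;, whether or not one is present:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=wildfly _logs
| rex field=uri "^(?&amp;lt;uri_base_url&amp;gt;[^?]+)"
| bucket _time span=1m
| sistats count as request_count avg(time_taken) max(time_taken) mode(time_taken) median(time_taken) by uri_base_url host _time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This sketch does not normalize trailing slashes, so /somecontroller/someaction and /somecontroller/someaction/ would still be grouped separately.&lt;/P&gt;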

&lt;P&gt;I know I can achieve what I'm after by splitting this summary search into two queries, but it &lt;STRONG&gt;feels&lt;/STRONG&gt; like something that can be achieved in a single query. Any pointers are appreciated.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 02:27:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/calculate-request-count-and-duration-in-a-single-summary-index/m-p/496772#M4455</guid>
      <dc:creator>badtakemonger</dc:creator>
      <dc:date>2020-09-30T02:27:39Z</dc:date>
    </item>
    <item>
      <title>Re: calculate request count and duration in a single summary index</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/calculate-request-count-and-duration-in-a-single-summary-index/m-p/496773#M4456</link>
      <description>&lt;P&gt;Consider using the URL Toolbox app to parse the uri field for you.  It uses an external command rather than &lt;CODE&gt;rex&lt;/CODE&gt; and probably handles edge cases much better.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Oct 2019 13:06:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/calculate-request-count-and-duration-in-a-single-summary-index/m-p/496773#M4456</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2019-10-02T13:06:52Z</dc:date>
    </item>
  </channel>
</rss>

