<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Combining URL fields in reporting in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13058#M1136</link>
    <description>&lt;P&gt;Sadly, our site URLs have pretty wide variations in format and that's not going to work for me.&lt;/P&gt;</description>
    <pubDate>Fri, 07 May 2010 08:14:36 GMT</pubDate>
    <dc:creator>mikebrittain</dc:creator>
    <dc:date>2010-05-07T08:14:36Z</dc:date>
    <item>
      <title>Combining URL fields in reporting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13055#M1133</link>
      <description>&lt;P&gt;I'm trying to build a report of the slowest pages/scripts on our server, based on how long each one takes to serve.  This will help us track down our worst-performing scripts so we can do a bit of performance tuning.&lt;/P&gt;

&lt;P&gt;The search I'm using looks like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=".../access.log" | stats avg(response_time) by script_path | sort avg(response_time) desc
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The problem with this report is that the top script paths listed include unique IDs, something like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/view/item/12345
/view/item/12346
/view/item/12347
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I was thinking I could group these together by doing a regex on script_path to replace the digit portion with a single "0" so that the average of response_time is based on all of the similar URLs.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/view/item/0
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'm having trouble with the search syntax.  Any help?&lt;/P&gt;</description>
      <pubDate>Fri, 07 May 2010 02:40:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13055#M1133</guid>
      <dc:creator>mikebrittain</dc:creator>
      <dc:date>2010-05-07T02:40:47Z</dc:date>
    </item>
    <item>
      <title>Re: Combining URL fields in reporting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13056#M1134</link>
      <description>&lt;P&gt;Perhaps you could generalize with a field?  I don't know if it matches your data, but when I come across something that looks like http:/url/path/here&amp;amp;some_junk&amp;amp;12345&amp;amp;blahblahblah, I often create a field that extracts only the http:/url/path/here portion so I can report on that.  Does that make sense?&lt;/P&gt;</description>
      <pubDate>Fri, 07 May 2010 03:35:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13056#M1134</guid>
      <dc:creator>bfaber</dc:creator>
      <dc:date>2010-05-07T03:35:40Z</dc:date>
    </item>
    <item>
      <title>Re: Combining URL fields in reporting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13057#M1135</link>
      <description>&lt;P&gt;For quick and dirty processing, use an inline regex via the &lt;CODE&gt;rex&lt;/CODE&gt; command.  For example, if your URI path structure, in the field named &lt;CODE&gt;script_path&lt;/CODE&gt;, is usually something like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;/&amp;lt;group&amp;gt;/&amp;lt;class&amp;gt;/&amp;lt;object_id&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;where you want to generate statistics based on &lt;CODE&gt;/&amp;lt;group&amp;gt;/&amp;lt;class&amp;gt;&lt;/CODE&gt; and not &lt;CODE&gt;&amp;lt;object_id&amp;gt;&lt;/CODE&gt;, then add:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=".../access.log" | rex field=script_path "(?&amp;lt;script_class&amp;gt;(/[^/]+){1,2})"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;to your search string.  This will generate a new field called &lt;CODE&gt;script_class&lt;/CODE&gt; that contains at most the first two segments of your URI path.  You can then operate on &lt;CODE&gt;script_class&lt;/CODE&gt; just like any other field, so to complete your original search string:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=".../access.log" 
| rex field=script_path "(?&amp;lt;script_class&amp;gt;(/[^/]+){1,2})"
| stats avg(response_time) by script_class 
| sort avg(response_time) desc
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You probably don't want to type this in every time you search, so you can add this permanently to your app via transforms so the field &lt;CODE&gt;script_class&lt;/CODE&gt; is automatically extracted.&lt;/P&gt;</description>
      <pubDate>Fri, 07 May 2010 04:58:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13057#M1135</guid>
      <dc:creator>Johnvey</dc:creator>
      <dc:date>2010-05-07T04:58:09Z</dc:date>
    </item>
    <item>
      <title>Re: Combining URL fields in reporting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13058#M1136</link>
      <description>&lt;P&gt;Sadly, our site URLs have pretty wide variations in format and that's not going to work for me.&lt;/P&gt;</description>
      <pubDate>Fri, 07 May 2010 08:14:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13058#M1136</guid>
      <dc:creator>mikebrittain</dc:creator>
      <dc:date>2010-05-07T08:14:36Z</dc:date>
    </item>
    <item>
      <title>Re: Combining URL fields in reporting</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13059#M1137</link>
      <description>&lt;P&gt;This is a good start. Unfortunately, most of our URLs are not this standardized.&lt;/P&gt;
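
&lt;P&gt;One way to collapse the numeric IDs is to rewrite &lt;CODE&gt;script_path&lt;/CODE&gt; in place with a sed-style substitution before running &lt;CODE&gt;stats&lt;/CODE&gt;.  A minimal sketch, assuming the IDs are the only runs of digits in the path (this pattern would also rewrite digits elsewhere, such as a version segment like /v2/):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=".../access.log"
| rex field=script_path mode=sed "s/[0-9]+/0/g"
| stats avg(response_time) by script_path
| sort avg(response_time) desc
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With that substitution, /view/item/12345 and /view/item/12346 would both be reported as /view/item/0.&lt;/P&gt;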

&lt;P&gt;It looks like "rex" will work using mode=sed.&lt;/P&gt;</description>
      <pubDate>Fri, 07 May 2010 09:50:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Combining-URL-fields-in-reporting/m-p/13059#M1137</guid>
      <dc:creator>mikebrittain</dc:creator>
      <dc:date>2010-05-07T09:50:02Z</dc:date>
    </item>
  </channel>
</rss>

