<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to collapse variable part of url to get a list of urls? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-collapse-variable-part-of-url-to-get-a-list-of-urls/m-p/324119#M96740</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I'm trying to get a list of the URLs that users visit on each of the customer sites that we manage.&lt;/P&gt;

&lt;P&gt;I have a lookup table that links hosts to a particular customer site.&lt;/P&gt;

&lt;P&gt;I have gotten as far as:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| rex field=uri_path mode=sed "s/([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})/email/g"
| lookup host_grouping Host AS host OUTPUTNEW Customer, Environment 
| dedup Customer uri_path 
| fields _time Customer Environment uri_path
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The lookup works just fine, but many of my URLs have a variable portion that is easily described with a regex. If I tabulate the output of this search, I end up with thousands of entries.&lt;/P&gt;

&lt;P&gt;For example, in the path below, the values 28400 and 212 are the IDs of the channel and the stream, respectively.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; /api/channel/28400/stream/212/play
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Instead of listing every combination of this endpoint that has been requested, I want to count it just once, for example by matching those integers with a regex.&lt;/P&gt;

&lt;P&gt;There will be different URL formats, but the pieces I need to replace can be assumed to be either integers or email addresses.&lt;/P&gt;

&lt;P&gt;I think what I'm looking for is rex (&lt;A href="http://docs.splunk.com/Documentation/SplunkCloud/7.0.0/SearchReference/Rex"&gt;http://docs.splunk.com/Documentation/SplunkCloud/7.0.0/SearchReference/Rex&lt;/A&gt;), but it doesn't seem to accept PCRE.&lt;/P&gt;

&lt;P&gt;Any clues where I can find this in the documentation? Or could someone give an example of "flattening" fields that have predictable variable content?&lt;/P&gt;</description>
    <pubDate>Mon, 09 Apr 2018 15:26:34 GMT</pubDate>
    <dc:creator>andrewbeak</dc:creator>
    <dc:date>2018-04-09T15:26:34Z</dc:date>
    <item>
      <title>How to collapse variable part of url to get a list of urls?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-collapse-variable-part-of-url-to-get-a-list-of-urls/m-p/324119#M96740</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I'm trying to get a list of the URLs that users visit on each of the customer sites that we manage.&lt;/P&gt;

&lt;P&gt;I have a lookup table that links hosts to a particular customer site.&lt;/P&gt;

&lt;P&gt;I have gotten as far as:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| rex field=uri_path mode=sed "s/([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})/email/g"
| lookup host_grouping Host AS host OUTPUTNEW Customer, Environment 
| dedup Customer uri_path 
| fields _time Customer Environment uri_path
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The lookup works just fine, but many of my URLs have a variable portion that is easily described with a regex. If I tabulate the output of this search, I end up with thousands of entries.&lt;/P&gt;

&lt;P&gt;For example, in the path below, the values 28400 and 212 are the IDs of the channel and the stream, respectively.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; /api/channel/28400/stream/212/play
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Instead of listing every combination of this endpoint that has been requested, I want to count it just once, for example by matching those integers with a regex.&lt;/P&gt;

&lt;P&gt;There will be different URL formats, but the pieces I need to replace can be assumed to be either integers or email addresses.&lt;/P&gt;
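
&lt;P&gt;A minimal sketch of the integer case, assuming that any path segment starting with digits consists only of digits, and using the literal placeholder ID, which is an arbitrary choice. It swaps in the eval replace() function instead of rex and counts each collapsed path once per customer:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| eval uri_path=replace(uri_path, "/\d+", "/ID")
| lookup host_grouping Host AS host OUTPUTNEW Customer, Environment
| stats count BY Customer Environment uri_path
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With this, /api/channel/28400/stream/212/play should come out as /api/channel/ID/stream/ID/play.&lt;/P&gt;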

&lt;P&gt;I think what I'm looking for is rex (&lt;A href="http://docs.splunk.com/Documentation/SplunkCloud/7.0.0/SearchReference/Rex"&gt;http://docs.splunk.com/Documentation/SplunkCloud/7.0.0/SearchReference/Rex&lt;/A&gt;), but it doesn't seem to accept PCRE.&lt;/P&gt;

&lt;P&gt;Any clues where I can find this in the documentation? Or could someone give an example of "flattening" fields that have predictable variable content?&lt;/P&gt;</description>
      <pubDate>Mon, 09 Apr 2018 15:26:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-collapse-variable-part-of-url-to-get-a-list-of-urls/m-p/324119#M96740</guid>
      <dc:creator>andrewbeak</dc:creator>
      <dc:date>2018-04-09T15:26:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to collapse variable part of url to get a list of urls?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-collapse-variable-part-of-url-to-get-a-list-of-urls/m-p/324120#M96741</link>
      <description>&lt;P&gt;This query actually works, but the search results list doesn't show the edited field. If you expand a result to show its fields, you'll see that they have been changed.&lt;/P&gt;</description>
      <pubDate>Mon, 09 Apr 2018 15:49:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-collapse-variable-part-of-url-to-get-a-list-of-urls/m-p/324120#M96741</guid>
      <dc:creator>andrewbeak</dc:creator>
      <dc:date>2018-04-09T15:49:11Z</dc:date>
    </item>
  </channel>
</rss>

