<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Regex for URL parsing in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77574#M19605</link>
    <description>&lt;P&gt;Hi,&lt;BR /&gt;
It's working, but how can I extract word.aspx, word.word.word.xap, word.xap, and all other possible combinations of words and dots (.)?&lt;/P&gt;</description>
    <pubDate>Fri, 28 Jun 2013 05:57:17 GMT</pubDate>
    <dc:creator>ChhayaV</dc:creator>
    <dc:date>2013-06-28T05:57:17Z</dc:date>
    <item>
      <title>Regex for URL parsing</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77569#M19600</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I want to extract URLs from the events as a separate field.&lt;/P&gt;

&lt;P&gt;Here is a sample from the log file:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;04/15/2013 17:51:58.09  w3wp.exe (0x113C)                           0x3D50  SharePoint Foundation           Monitoring                      nasq    Medium      Entering monitored scope (Request (GET:https://www.abc.co.in:443/GEOMETRIC/SitePages/MyEnrollment.aspx))
04/15/2013 17:51:58.26  w3wp.exe (0x113C)                           0x4AA0  SharePoint Foundation           Monitoring                      nasq    Medium      Entering monitored scope (Request (GET:https://www.abc.co.in:443/PublicSite/images/header.jpg)) 
04/15/2013 17:59:25.20  w3wp.exe (0x113C)                           0x14B0  SharePoint Foundation           Monitoring                      nasq    Medium      Entering monitored scope (Request (GET:https://www.abc.co.in:443/_LAYOUTS/ClientPortal/SilverlightWebParts/PROD/MyBenefits.xap?ver=5.19))
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I just want to extract the URLs of the .aspx and .xap pages, like:&lt;BR /&gt;
&lt;CODE&gt;&lt;A href="https://www.abc.co.in:443/GEOMETRIC/SitePages/MyEnrollment.aspx" target="test_blank"&gt;https://www.abc.co.in:443/GEOMETRIC/SitePages/MyEnrollment.aspx&lt;/A&gt;&lt;BR /&gt;
&lt;A href="https://www.abc.co.in:443/_LAYOUTS/ClientPortal/SilverlightWebParts/PROD/MyBenefits.xap?ver=5.19" target="test_blank"&gt;https://www.abc.co.in:443/_LAYOUTS/ClientPortal/SilverlightWebParts/PROD/MyBenefits.xap?ver=5.19&lt;/A&gt;&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;If I write the regex as &lt;CODE&gt;(?i)\(GET:(?P&amp;lt; FIELDNAME&amp;gt;[^\?]+)&lt;/CODE&gt;, the URL is not extracted properly.&lt;/P&gt;

&lt;P&gt;Please help with the regex.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2013 09:46:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77569#M19600</guid>
      <dc:creator>ChhayaV</dc:creator>
      <dc:date>2013-06-27T09:46:05Z</dc:date>
    </item>
    <item>
      <title>Re: Regex for URL parsing</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77570#M19601</link>
      <description>&lt;P&gt;I'm not sure your second example is an aspx file, but I'm not a web developer. However, the following regex will capture those that end in "&lt;CODE&gt;.aspx&lt;/CODE&gt;":&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;"GET:\w+://(?P&amp;lt;url&amp;gt;[^\)]+\.aspx)"
&lt;/CODE&gt;&lt;/PRE&gt;
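
&lt;P&gt;As a rough check of how a pattern like this behaves on the sample events, here is a minimal Python sketch. It is an illustration only, not Splunk syntax: the helper name extract_page_url and the pattern URL_PATTERN are hypothetical, the pattern is widened to accept both .aspx and .xap, and it assumes the URL should stop before any query string.&lt;/P&gt;

```python
import re

# Hypothetical pattern (an assumption, not taken from the thread):
# capture a URL whose path ends in .aspx or .xap, stopping before any
# query string or closing parenthesis. A numbered group is used here;
# a Splunk rex would need a named group instead.
URL_PATTERN = re.compile(
    r"\([A-Z]+:(\w+://[^?)\s]+\.(?:aspx|xap))(?=[?)])",
    re.IGNORECASE,
)


def extract_page_url(event):
    """Return the first .aspx/.xap URL found in a log event, or None."""
    match = URL_PATTERN.search(event)
    return match.group(1) if match else None


# Sample events shortened from the log lines in the question.
events = [
    "Entering monitored scope (Request (GET:https://www.abc.co.in:443/GEOMETRIC/SitePages/MyEnrollment.aspx))",
    "Entering monitored scope (Request (GET:https://www.abc.co.in:443/PublicSite/images/header.jpg))",
    "Entering monitored scope (Request (GET:https://www.abc.co.in:443/_LAYOUTS/ClientPortal/SilverlightWebParts/PROD/MyBenefits.xap?ver=5.19))",
]

for event in events:
    # Prints the captured URL, or None for the .jpg request.
    print(extract_page_url(event))
```

&lt;P&gt;Because the character class excludes the question mark, a query string such as ?ver=5.19 is left out of the capture.&lt;/P&gt;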

&lt;P&gt;You can try out regular expressions on the following site, which is a handy tool:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://gskinner.com/RegExr/"&gt;http://gskinner.com/RegExr/&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2013 10:21:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77570#M19601</guid>
      <dc:creator>MHibbin</dc:creator>
      <dc:date>2013-06-27T10:21:45Z</dc:date>
    </item>
    <item>
      <title>Re: Regex for URL parsing</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77571#M19602</link>
      <description>&lt;P&gt;This should work:&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;rex "\(GET:(?&amp;lt;fieldname&amp;gt;[^\)]+\.(xap|aspx))"&lt;/CODE&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2013 10:28:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77571#M19602</guid>
      <dc:creator>kristian_kolb</dc:creator>
      <dc:date>2013-06-27T10:28:39Z</dc:date>
    </item>
    <item>
      <title>Re: Regex for URL parsing</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77572#M19603</link>
      <description>&lt;P&gt;All current answers rely on the HTTP request being a GET request. HTTP has several request methods (GET, POST, and HEAD being the most common), and if you want all URLs to be captured, you need to take this into consideration.&lt;/P&gt;

&lt;P&gt;The following regex would probably be a better choice to catch all HTTP methods and all URLs, regardless of unusual formats. It assumes that no query parameters are appended to the URL; if they are, you need to account for them.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(?i)\(Request \([A-Z]+:(?&amp;lt;fieldname&amp;gt;.*\.(aspx|xap))\)\)$
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 27 Jun 2013 12:13:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77572#M19603</guid>
      <dc:creator>burkmat</dc:creator>
      <dc:date>2013-06-27T12:13:25Z</dc:date>
    </item>
    <item>
      <title>Re: Regex for URL parsing</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77574#M19605</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;
It's working, but how can I extract word.aspx, word.word.word.xap, word.xap, and all other possible combinations of words and dots (.)?&lt;/P&gt;</description>
      <pubDate>Fri, 28 Jun 2013 05:57:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77574#M19605</guid>
      <dc:creator>ChhayaV</dc:creator>
      <dc:date>2013-06-28T05:57:17Z</dc:date>
    </item>
    <item>
      <title>Re: Regex for URL parsing</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77575#M19606</link>
      <description>&lt;P&gt;The regex should cover that. It does not cover query parameters, though, as burkmat said.&lt;/P&gt;</description>
      <pubDate>Fri, 28 Jun 2013 08:04:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77575#M19606</guid>
      <dc:creator>Ayn</dc:creator>
      <dc:date>2013-06-28T08:04:35Z</dc:date>
    </item>
    <item>
      <title>Re: Regex for URL parsing</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77576#M19607</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;
I want to restrict my regex to the first match only.&lt;/P&gt;

&lt;P&gt;Leaving Monitored Scope (Request (GET:&lt;A href="https://www.abc/_layouts/ClientPortal/abc/CustomPages/LoginPage.aspx?ReturnUrl=%2f_layouts%2fAuthenticate.aspx%3fSource%3d%252F&amp;amp;Source=%2F)"&gt;https://www.abc/_layouts/ClientPortal/abc/CustomPages/LoginPage.aspx?ReturnUrl=%2f_layouts%2fAuthenticate.aspx%3fSource%3d%252F&amp;amp;Source=%2F)&lt;/A&gt;). Execution Time=17.1800154751023&lt;BR /&gt;
If this is my log entry, I should get only "LoginPage.aspx", but the result is "LoginPage.aspx?ReturnUrl=%2f_layouts%2fAuthenticate.aspx".&lt;/P&gt;</description>
      <pubDate>Tue, 02 Jul 2013 09:38:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-for-URL-parsing/m-p/77576#M19607</guid>
      <dc:creator>ChhayaV</dc:creator>
      <dc:date>2013-07-02T09:38:38Z</dc:date>
    </item>
  </channel>
</rss>

