<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Regex expression help! in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103986#M26890</link>
    <description>&lt;P&gt;&lt;STRONG&gt;Updated to address comment&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;I've updated the Regular Expression to address the data you're working with, and I believe the following will work:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex "\)\s(?&amp;lt;Message&amp;gt;[^&amp;lt;]+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In the sample data you provided (thanks for that!), it extracts the following data to the &lt;STRONG&gt;Message&lt;/STRONG&gt; field:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Message = before Mandai Ave Exit with congestion till BKE Entrance. Avoid lane 4.
Message = after Toa Payoh Exit. Avoid lane 1.
Message = after Thomson Rd.
Message = before Mandai Ave Exit with congestion till BKE Entrance. Avoid lane 4.
Message = before Kallang Way.
Message = after Toa Payoh Exit. Avoid lane 1.
Message = after Thomson Rd.
Message = before Mandai Ave Exit with congestion till BKE Entrance. Avoid lane 4.
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Is this what you're looking for?&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Update 2&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;Try this regex:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex "\)\s(?&amp;lt;Message&amp;gt;.*Exit|[^.]+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This basically looks for everything up to and including the word "Exit", or everything up to the first "." in the message field. I don't know if it will work in all possible cases, but it will work in the sample you provided.&lt;/P&gt;

&lt;P&gt;As to using a Splunk Generated Pattern (regex), I don't really use that feature so unfortunately I don't know the answer.  &lt;/P&gt;</description>
    <pubDate>Tue, 23 Jul 2013 12:11:06 GMT</pubDate>
    <dc:creator>wpreston</dc:creator>
    <dc:date>2013-07-23T12:11:06Z</dc:date>
    <item>
      <title>Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103985#M26889</link>
      <description>&lt;P&gt;I used regex &lt;CODE&gt;(?i)Area&amp;gt;(?P&amp;lt;Message&amp;gt;[^&amp;lt;]+)&lt;/CODE&gt; to extract the whole field below. &lt;/P&gt;

&lt;P&gt;Originally &lt;CODE&gt;&amp;lt;d:Message&amp;gt;(22/7)17:53 Accident on AYE (towards Tuas) after Jurong Port Rd Exit. Avoid lanes 2 and 3.&amp;lt;/d:Message&amp;gt;&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;How can I extract only the part starting from the word "after" up to the word "Exit" (i.e. "after Jurong Port Rd Exit")? The data is updated daily at 5-minute intervals. Thanks if you guys can help! &lt;span class="lia-unicode-emoji" title=":grinning_face_with_big_eyes:"&gt;😃&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;More of my XML is here: &lt;A href="http://pastebin.com/98zg3tRX"&gt;Xml Data&lt;/A&gt; (I only need to extract the accident events)&lt;/P&gt;

&lt;P&gt;This picture shows a search by Type="Accident".&lt;/P&gt;

&lt;P&gt;&lt;IMG src="http://splunk-base.splunk.com//storage/extracttime_until_dot.png" alt="alt text" /&gt;&lt;/P&gt;

&lt;P&gt;I have 6 Types in total.&lt;/P&gt;

&lt;P&gt;&lt;IMG src="http://splunk-base.splunk.com//storage/type.png" alt="alt text" /&gt;&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;After using &lt;CODE&gt;| rex "\)\s(?&amp;lt;Message&amp;gt;.*Exit|[^.]+)"  | dedup Message&lt;/CODE&gt;, there are still duplicates. Note that the following values&lt;BR /&gt;
after Buona Vista Exit&lt;BR /&gt;
after Buona Vista Exit with congestion till Buona Vista Exit&lt;BR /&gt;
after Buona Vista Exit with congestion till Clementi Ave 2 Exit&lt;BR /&gt;
after Buona Vista Exit with congestion till Clementi Ave 6 Exit&lt;BR /&gt;
after Buona Vista Exit with congestion till Jurong Town Hall Exit&lt;BR /&gt;
all refer to the same accident at Buona Vista Exit.&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;&lt;IMG src="http://splunk-base.splunk.com//storage/accident_and_location.png" alt="alt text" /&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2013 11:38:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103985#M26889</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-23T11:38:19Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103986#M26890</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Updated to address comment&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;I've updated the Regular Expression to address the data you're working with, and I believe the following will work:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex "\)\s(?&amp;lt;Message&amp;gt;[^&amp;lt;]+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In the sample data you provided (thanks for that!), it extracts the following data to the &lt;STRONG&gt;Message&lt;/STRONG&gt; field:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Message = before Mandai Ave Exit with congestion till BKE Entrance. Avoid lane 4.
Message = after Toa Payoh Exit. Avoid lane 1.
Message = after Thomson Rd.
Message = before Mandai Ave Exit with congestion till BKE Entrance. Avoid lane 4.
Message = before Kallang Way.
Message = after Toa Payoh Exit. Avoid lane 1.
Message = after Thomson Rd.
Message = before Mandai Ave Exit with congestion till BKE Entrance. Avoid lane 4.
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Is this what you're looking for?&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Update 2&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;Try this regex:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex "\)\s(?&amp;lt;Message&amp;gt;.*Exit|[^.]+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This basically looks for everything up to and including the word "Exit", or everything up to the first "." in the message field. I don't know if it will work in all possible cases, but it will work in the sample you provided.&lt;/P&gt;
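
&lt;P&gt;If you want to try this without touching your real data, here is a small self-contained sketch. It assumes a Splunk version that has the &lt;CODE&gt;makeresults&lt;/CODE&gt; command; the field names &lt;CODE&gt;text&lt;/CODE&gt;, &lt;CODE&gt;Greedy&lt;/CODE&gt; and &lt;CODE&gt;Lazy&lt;/CODE&gt; are made up, and the second sample event is one pieced together purely for illustration. Note that &lt;CODE&gt;.*Exit&lt;/CODE&gt; is greedy, so it runs to the last "Exit" in the message, while the &lt;CODE&gt;.*?Exit&lt;/CODE&gt; variant should stop at the first one:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval text="(22/7)17:53 Accident on AYE (towards Tuas) after Jurong Port Rd Exit. Avoid lanes 2 and 3.;(22/7)18:10 Accident on AYE (towards Tuas) after Buona Vista Exit with congestion till Clementi Ave 2 Exit."
| makemv delim=";" text
| mvexpand text
| rex field=text "\)\s(?&amp;lt;Greedy&amp;gt;.*Exit|[^.]+)"
| rex field=text "\)\s(?&amp;lt;Lazy&amp;gt;.*?Exit|[^.]+)"
| table text Greedy Lazy
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For the second event, Greedy should come back as "after Buona Vista Exit with congestion till Clementi Ave 2 Exit" and Lazy as just "after Buona Vista Exit", which may matter if you plan to dedup on the field.&lt;/P&gt;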

&lt;P&gt;As to using a Splunk Generated Pattern (regex), I don't really use that feature so unfortunately I don't know the answer.  &lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2013 12:11:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103986#M26890</guid>
      <dc:creator>wpreston</dc:creator>
      <dc:date>2013-07-23T12:11:06Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103987#M26891</link>
      <description>&lt;P&gt;Here you go:&lt;/P&gt;

&lt;P&gt;Including the word exit:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;after\s(?&amp;lt; yourfield &amp;gt;(\w|\s)+)\.
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Without the "exit":&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;after\s(?&amp;lt; yourfield &amp;gt;(\w|\s)+)\sExit\.
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;*Remove the blanks before and after "yourfield"&lt;/P&gt;
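
&lt;P&gt;For example, with the blanks removed and wrapped in a rex command, the two variants might look like this (just a sketch; &lt;CODE&gt;yourfield&lt;/CODE&gt; is a placeholder name, so use whatever field name you like):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex "after\s(?&amp;lt;yourfield&amp;gt;(\w|\s)+)\."
... your search ... | rex "after\s(?&amp;lt;yourfield&amp;gt;(\w|\s)+)\sExit\."
&lt;/CODE&gt;&lt;/PRE&gt;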

&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2013 12:15:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103987#M26891</guid>
      <dc:creator>gfuente</dc:creator>
      <dc:date>2013-07-23T12:15:08Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103988#M26892</link>
      <description>&lt;P&gt;Check out the update, sorry for the brief summary earlier. &lt;span class="lia-unicode-emoji" title=":grinning_face_with_big_eyes:"&gt;😃&lt;/span&gt; I tried the expression but it won't work. The error is: Invalid regex: syntax error&lt;BR /&gt;
Regex does not extract any named fields.&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2013 13:47:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103988#M26892</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-23T13:47:02Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103989#M26893</link>
      <description>&lt;P&gt;Check out the update, sorry for the brief summary earlier. I tried the regex after removing the spaces before and after the field name, but it gives me: Invalid regex: syntax error&lt;BR /&gt;
Regex does not extract any named fields.&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2013 13:48:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103989#M26893</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-23T13:48:27Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103990#M26894</link>
      <description>&lt;P&gt;Is it possible to do Message = before Mandai Ave Exit, Message = after Toa Payoh Exit, Message = after Thomson Rd, Message = before Mandai Ave Exit, Message = before Kallang Way, Message = after Toa Payoh Exit, Message = after Thomson Rd, Message = before Mandai Ave Exit ? Without the Avoid.&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2013 14:39:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103990#M26894</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-23T14:39:07Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103991#M26895</link>
      <description>&lt;P&gt;I had some data (22/7)23:38 Accident at Guillemard Road/Mountbatten Road Junction, (21/7)9:03 Accident on Dairy Farm Road (towards Bukit Timah Expressway) after Petir Road. Avoid left lane. It is NULL because of the / and (). How can I solve that ?&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2013 14:41:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103991#M26895</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-23T14:41:10Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103992#M26896</link>
      <description>&lt;P&gt;Have you tried using the Interactive Field Extraction (IFX) feature and having Splunk do the heavy lifting with regex while you feed it examples to train it? &lt;A href="http://docs.splunk.com/Documentation/Splunk/5.0.3/Knowledge/ExtractfieldsinteractivelywithIFX"&gt;http://docs.splunk.com/Documentation/Splunk/5.0.3/Knowledge/ExtractfieldsinteractivelywithIFX&lt;/A&gt; &lt;BR /&gt;
A second advantage is that this creates a persistent field definition, unlike the rex command, which is transient.&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jul 2013 21:37:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103992#M26896</guid>
      <dc:creator>paddygriffin</dc:creator>
      <dc:date>2013-07-23T21:37:00Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103993#M26897</link>
      <description>&lt;P&gt;Thank you sooo much ! Have a great day ! Good job !&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jul 2013 00:42:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103993#M26897</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-24T00:42:55Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103994#M26898</link>
      <description>&lt;P&gt;thanks will try it out !&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jul 2013 01:23:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103994#M26898</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-24T01:23:13Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103995#M26899</link>
      <description>&lt;P&gt;I just realised that there is still a little duplication; check out the last picture in the update. Is there any way to remove it? I used | dedup Message and it is not helping.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jul 2013 07:15:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103995#M26899</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-24T07:15:52Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103996#M26900</link>
      <description>&lt;P&gt;&lt;CODE&gt;rex "\)\s(?&amp;lt;message&amp;gt;.*Exit|[^.]+)"  | dedup Message&lt;/CODE&gt;&lt;BR /&gt;
Case sensitivity in field names: I notice you used "message" (all lower case) in the regex but "Message" in the dedup. Field names are case sensitive, so this may be part of your problem.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jul 2013 12:21:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103996#M26900</guid>
      <dc:creator>paddygriffin</dc:creator>
      <dc:date>2013-07-24T12:21:09Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103997#M26901</link>
      <description>&lt;P&gt;Adding another answer here to avoid confusing the issue with all the different regular expressions.  I'll go ahead and apologize now since this will be a pretty long-winded answer.  There may be a much simpler or easier way to extract these that some Splunk ninja out there knows, but this is what I could come up with.  I'd recommend testing each of these regular expressions on the command line first and, if they work for you, putting them into your transforms.conf so that you don't have to enter them in the search bar every time you need them.  &lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Createandmaintainsearch-timefieldextractionsthroughconfigurationfiles"&gt;This page&lt;/A&gt; in the Knowledge Manager manual explains how to put them into transforms.conf and reference them in props.conf if you have any questions about it.&lt;/P&gt;
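
&lt;P&gt;As a rough sketch of what that could look like, using the Message regex from my earlier answer (the sourcetype name &lt;CODE&gt;traffic_xml&lt;/CODE&gt; and the stanza and class name &lt;CODE&gt;traffic_message&lt;/CODE&gt; are placeholders I made up, so substitute your own):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# transforms.conf (stanza name is a placeholder)
[traffic_message]
REGEX = \)\s(?&amp;lt;Message&amp;gt;.*Exit|[^.]+)

# props.conf (sourcetype name is a placeholder)
[traffic_xml]
REPORT-traffic_message = traffic_message
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For a single regex like this, an inline &lt;CODE&gt;EXTRACT-&amp;lt;class&amp;gt; = &amp;lt;regex&amp;gt;&lt;/CODE&gt; line in props.conf alone should also do the job.&lt;/P&gt;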

&lt;P&gt;I've been thinking a lot about this one and I think I've figured out the pattern your events follow.  I'll write out the structure I see, then how to extract each part of it, using the following event as an example (my field names might not match yours, but go with me for a minute): &lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;example&lt;/STRONG&gt;:  &lt;CODE&gt;&amp;lt;d:Message&amp;gt;(22/7)19:55 Accident on ECP (towards Changi Airport) after Maxwell Rd Entrance. Avoid lane 1.&amp;lt;/d:Message&amp;gt;&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;This event is made up of:&lt;/P&gt;

&lt;P&gt;One or more Accident Locations:  &lt;STRONG&gt;(22/7)19:55 Accident on ECP&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(date)time Accident at|on &amp;lt;Accident_Location&amp;gt;  
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Followed by a direction modifier:  &lt;STRONG&gt;(towards Changi Airport)&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(towards &amp;lt;Direction_Modifier&amp;gt;)  
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Followed by a location modifier:  &lt;STRONG&gt;after Maxwell Rd Entrance.&lt;/STRONG&gt;  Note that this field is difficult to extract because its endpoint is arbitrary, i.e. does the field stop at Mandai Ave or at Mandai Ave Exit?  Sometimes it stops at Entrance, as in the example.  &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;before|after &amp;lt;Location_Modifier&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Optionally followed by a condition modifier:  (missing from this event, but something like "with congestion blah blah".)  Note that this field is difficult to extract since it appears that this field starts with arbitrary key words, like "with".  Extracting this one will be an evolving experience for you as you come across the arbitrary key words and continue to add them to the rex.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;with &amp;lt;Condition_Modifier&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Optionally followed by traffic Advice:  Avoid lane 1.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;Advice&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;They seem to basically follow this formula, so you can use the following regexes to extract all of these fields.  They should account for the special cases where there is more than one Location in the same record.  Adjust the field names to match the field names you want to use.   (Again, note that these extractions may not cover every possible instance since I don't know your data; this is just how it appears to me.  You know your data much better than I do and can adapt the regexes to meet your needs.)&lt;/P&gt;

&lt;P&gt;To extract the &lt;STRONG&gt;Location&lt;/STRONG&gt; field:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex ":\d+\sAccident\s(on|at)\s(?&amp;lt;Location&amp;gt;(\w|\s|[?\/])+)?,?\s\("
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;To extract the &lt;STRONG&gt;Direction_Modifier&lt;/STRONG&gt; field: &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex "\(towards\s(?&amp;lt;Direction_Modifier&amp;gt;[^\)]+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;EM&gt;Be sure to add in any additional keywords that start this field, like "towards".  For example, if there is some data where this field starts with "near", modify the rex like this to account for it:&lt;/EM&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex "\((towards|near)\s(?&amp;lt;Direction_Modifier&amp;gt;[^\)]+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Extracting the &lt;STRONG&gt;Location_Modifier&lt;/STRONG&gt; field will take some work on your part, and will be an evolving experience for you as you come across the arbitrary key words and continue to add them to the rex.  The rex I set up below ends the field capture after a street designator, an entry or exit designator, or a period (.), or before the word "with" (since the Condition_Modifier field seems to always start with it).  You will need to add any other street types or abbreviations into the piped list inside the rex if there are any that I missed.  You will also need to add any other words besides "with" that start the Condition_Modifier field:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... your search ... | rex "\D\)\s(?&amp;lt;Location_Modifier&amp;gt;[^\.]*?(Exit|Road|Entrance|Avenue|Junction|Parkway|Rd|Pkwy|Ave|Way)\s?(Exit|Entrance)?)\s?(with)?"
&lt;/CODE&gt;&lt;/PRE&gt;
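
&lt;P&gt;To check all three extractions against the example event in one go, a sketch along these lines might help. It assumes your Splunk version has the &lt;CODE&gt;makeresults&lt;/CODE&gt; command, and &lt;CODE&gt;text&lt;/CODE&gt; is just a made-up field name holding the sample message:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval text="(22/7)19:55 Accident on ECP (towards Changi Airport) after Maxwell Rd Entrance. Avoid lane 1."
| rex field=text ":\d+\sAccident\s(on|at)\s(?&amp;lt;Location&amp;gt;(\w|\s|[?\/])+)?,?\s\("
| rex field=text "\(towards\s(?&amp;lt;Direction_Modifier&amp;gt;[^\)]+)"
| rex field=text "\D\)\s(?&amp;lt;Location_Modifier&amp;gt;[^\.]*?(Exit|Road|Entrance|Avenue|Junction|Parkway|Rd|Pkwy|Ave|Way)\s?(Exit|Entrance)?)\s?(with)?"
| table Location Direction_Modifier Location_Modifier
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For this event, I would expect Location to come out as "ECP", Direction_Modifier as "Changi Airport" and Location_Modifier as "after Maxwell Rd Entrance".&lt;/P&gt;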

&lt;P&gt;I'm going to leave off the traffic Advice field since I've rambled on for long enough.  Hopefully this gets you what you need and I think it covers all the cases you've posted about so far.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jul 2013 15:55:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103997#M26901</guid>
      <dc:creator>wpreston</dc:creator>
      <dc:date>2013-07-24T15:55:22Z</dc:date>
    </item>
    <item>
      <title>Re: Regex expression help!</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103998#M26902</link>
      <description>&lt;P&gt;Thanks will figure out on all cases &lt;span class="lia-unicode-emoji" title=":grinning_face_with_big_eyes:"&gt;😃&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Jul 2013 01:21:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Regex-expression-help/m-p/103998#M26902</guid>
      <dc:creator>kailun92</dc:creator>
      <dc:date>2013-07-25T01:21:39Z</dc:date>
    </item>
  </channel>
</rss>

