<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Strange  behavior in regex extraction in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116340#M30881</link>
    <description>&lt;P&gt;That regex will probably miss the last STEP.  That's why my regex string included &lt;CODE&gt;|$&lt;/CODE&gt;.&lt;/P&gt;</description>
    <pubDate>Thu, 21 May 2015 15:58:28 GMT</pubDate>
    <dc:creator>richgalloway</dc:creator>
    <dc:date>2015-05-21T15:58:28Z</dc:date>
    <item>
      <title>Strange  behavior in regex extraction</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116336#M30877</link>
      <description>&lt;P&gt;Hi&lt;BR /&gt;
I want to extract the multi-value field "step". This is what my event looks like:&lt;/P&gt;

&lt;P&gt;STEP:      1005&lt;BR /&gt;
RESULT:    PASS&lt;BR /&gt;
ACTUAL:&lt;BR /&gt;
RETRIES:   1&lt;BR /&gt;&lt;BR /&gt;
STEP:      1006&lt;BR /&gt;
RESULT:    PASS&lt;BR /&gt;
ACTUAL:&lt;BR /&gt;&lt;BR /&gt;
STEP:      1009&lt;BR /&gt;
RESULT:    PASS&lt;BR /&gt;
EXPECTED:  90.5&lt;BR /&gt;
ACTUAL:    91.0&lt;BR /&gt;
STEP:      1011&lt;BR /&gt;
RESULT:    PASS&lt;BR /&gt;
ACTUAL:&lt;BR /&gt;&lt;BR /&gt;
STEP:      1015&lt;BR /&gt;
RESULT:    PASS&lt;BR /&gt;
ACTUAL:&lt;/P&gt;

&lt;P&gt;I have the following regex:&lt;BR /&gt;
... | rex "(?&amp;lt;step&amp;gt;STEP:\s{6}\d+[\w\W\n]+?)STEP:\s{6}" max_match=0&lt;/P&gt;

&lt;P&gt;But for some strange reason this regex skips every other step, so I only extracted steps 1005, 1009, and 1015. I believe the problem is the way the regex consumes the text. After a step is extracted, the regex has already consumed the "STEP:\s{6}" of the next step, so it cannot match there and continues forward until it reaches the following step.&lt;/P&gt;

&lt;P&gt;This is what I extracted in the field "step":&lt;BR /&gt;
STEP:      1005&lt;BR /&gt;
RESULT:    PASS&lt;BR /&gt;
ACTUAL:&lt;BR /&gt;&lt;BR /&gt;
RETRIES:   1&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;STEP:      1009&lt;BR /&gt;
RESULT:    PASS&lt;BR /&gt;
EXPECTED:  90.5&lt;BR /&gt;
ACTUAL:    91.0&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;STEP:      1015&lt;BR /&gt;
RESULT:    PASS&lt;BR /&gt;
ACTUAL:&lt;/P&gt;

&lt;P&gt;As you can see I am catching the correct pattern with this regex. Please let me know what I could do to extract all the values for this field. &lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 15:02:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116336#M30877</guid>
      <dc:creator>edrivera3</dc:creator>
      <dc:date>2015-05-21T15:02:47Z</dc:date>
    </item>
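<!--
The skipping described in the question can be reproduced outside Splunk. The
following is a minimal sketch, assuming Python's re module behaves like the
PCRE engine behind rex for this pattern. EVENT is a shortened stand-in built
from values in the post, step_numbers is a hypothetical helper, and Python
spells the named group as (?P<step>...).

```python
import re

# Shortened stand-in for the event in the question (values copied from the post).
EVENT = (
    "STEP:      1005\nRESULT:    PASS\n"
    "STEP:      1006\nRESULT:    PASS\n"
    "STEP:      1009\nRESULT:    PASS\n"
    "STEP:      1011\nRESULT:    PASS\n"
    "STEP:      1015\nRESULT:    PASS\n"
)

# Same shape as the rex pattern: the trailing "STEP:" is part of the match,
# so it is consumed and the next search starts after it.
CONSUMING = re.compile(r"(?P<step>STEP:[\w\W]+?)STEP:")


def step_numbers(pattern, text):
    """Return the step number at the start of each captured block."""
    return [m.group("step").split()[1] for m in pattern.finditer(text)]


print(step_numbers(CONSUMING, EVENT))
```

With only these five steps the sketch finds 1005 and 1009. Every second
"STEP:" is used up as the terminator of the previous match, and the final
block has no following "STEP:" to end it.
-->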
    <item>
      <title>Re: Strange  behavior in regex extraction</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116337#M30878</link>
      <description>&lt;P&gt;I just realized that I can reduce my regex to simply:&lt;BR /&gt;
... | rex "(?&amp;lt;step&amp;gt;STEP:[\w\W\n]+?)STEP:" max_match=0&lt;/P&gt;

&lt;P&gt;This regex gives me the same results, so it doesn't change anything. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 15:08:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116337#M30878</guid>
      <dc:creator>edrivera3</dc:creator>
      <dc:date>2015-05-21T15:08:54Z</dc:date>
    </item>
    <item>
      <title>Re: Strange  behavior in regex extraction</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116338#M30879</link>
      <description>&lt;P&gt;Rex was skipping STEPs because your regex string called for two instances of "STEP" to constitute a match.  Using lookahead helps.  Regex101.com works with this regex string: &lt;CODE&gt;(?&amp;lt;step&amp;gt;STEP:[\w\W\n]+?)(?=STEP|$)&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 15:23:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116338#M30879</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2015-05-21T15:23:14Z</dc:date>
    </item>
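<!--
The lookahead suggested in this reply can be tried outside Splunk. This is a
rough sketch, assuming Python's re module treats the lookahead the same way
as the PCRE engine behind rex. EVENT is a shortened stand-in built from values
in the question, and Python spells the named group as (?P<step>...).

```python
import re

# Shortened stand-in for the event in the question (values copied from the post).
EVENT = (
    "STEP:      1005\nRESULT:    PASS\n"
    "STEP:      1006\nRESULT:    PASS\n"
    "STEP:      1009\nRESULT:    PASS\n"
    "STEP:      1011\nRESULT:    PASS\n"
    "STEP:      1015\nRESULT:    PASS\n"
)

# The lookahead only checks that "STEP" (or the end of the input) comes next.
# Nothing is consumed, so the next match can start at that same "STEP".
LOOKAHEAD = re.compile(r"(?P<step>STEP:[\w\W]+?)(?=STEP|$)")

blocks = [m.group("step") for m in LOOKAHEAD.finditer(EVENT)]
numbers = [block.split()[1] for block in blocks]
print(numbers)
```

In this sketch all five steps are captured, including the last one, which is
ended by the "$" alternative instead of a following "STEP".
-->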
    <item>
      <title>Re: Strange  behavior in regex extraction</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116339#M30880</link>
      <description>&lt;P&gt;Thank you. That's what I needed: a lookahead! This is my regex now:&lt;BR /&gt;
(?&amp;lt;step&amp;gt;[\w\W\n]+?)(?=STEP)&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 15:55:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116339#M30880</guid>
      <dc:creator>edrivera3</dc:creator>
      <dc:date>2015-05-21T15:55:32Z</dc:date>
    </item>
    <item>
      <title>Re: Strange  behavior in regex extraction</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116340#M30881</link>
      <description>&lt;P&gt;That regex will probably miss the last STEP.  That's why my regex string included &lt;CODE&gt;|$&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 15:58:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116340#M30881</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2015-05-21T15:58:28Z</dc:date>
    </item>
    <item>
      <title>Re: Strange  behavior in regex extraction</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116341#M30882</link>
      <description>&lt;P&gt;You are right, I am missing the last STEP, but when I include "|$" I only extract the first line of each step.&lt;BR /&gt;
This is what I extracted in the field "step":&lt;BR /&gt;
STEP: 1005&lt;BR /&gt;
STEP: 1006&lt;BR /&gt;
STEP: 1009&lt;BR /&gt;
STEP: 1011&lt;BR /&gt;
STEP: 1015&lt;/P&gt;

&lt;P&gt;So I would rather miss the last step than lose information from all the other steps. Do you have any idea how to avoid this?&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 17:02:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116341#M30882</guid>
      <dc:creator>edrivera3</dc:creator>
      <dc:date>2015-05-21T17:02:59Z</dc:date>
    </item>
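<!--
One possible explanation for the "first line only" result, not confirmed
anywhere in this thread, is that "$" was matching at the end of every line
(multiline mode) instead of only at the end of the event. The sketch below
reproduces that symptom in Python and compares it with an end-of-input anchor.
It assumes Python's re module as a stand-in for PCRE. EVENT and the blocks
helper are hypothetical, and Python's \Z corresponds roughly to PCRE's \z.
None of this has been tested in Splunk.

```python
import re

# Shortened stand-in for the event in the question (values copied from the post).
EVENT = (
    "STEP:      1005\nRESULT:    PASS\n"
    "STEP:      1006\nRESULT:    PASS\n"
    "STEP:      1009\nRESULT:    PASS\n"
    "STEP:      1011\nRESULT:    PASS\n"
    "STEP:      1015\nRESULT:    PASS\n"
)

DOLLAR = r"(?P<step>STEP:[\w\W]+?)(?=STEP|$)"         # "$" can mean end of line
ABSOLUTE_END = r"(?P<step>STEP:[\w\W]+?)(?=STEP|\Z)"  # \Z is end of input only


def blocks(pattern, flags=0):
    """Return every captured step block for the given pattern and flags."""
    return [m.group("step") for m in re.finditer(pattern, EVENT, flags)]


# In multiline mode the lazy quantifier stops at the first line end it meets.
line_only = blocks(DOLLAR, re.MULTILINE)
# The end-of-input anchor is unaffected by multiline mode.
full = blocks(ABSOLUTE_END, re.MULTILINE)

print(line_only)
print(full[0])
```

If multiline matching is indeed the cause, swapping "$" for an end-of-input
anchor would keep the whole block for every step, including the last.
-->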
    <item>
      <title>Re: Strange  behavior in regex extraction</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116342#M30883</link>
      <description>&lt;P&gt;Examine your data closely to see if there is anything else you can use as a terminator.&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2015 17:14:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Strange-behavior-in-regex-extraction/m-p/116342#M30883</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2015-05-21T17:14:48Z</dc:date>
    </item>
  </channel>
</rss>

