<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why does rex match fail when input length exceeds a certain threshold? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Why-does-rex-match-fail-when-input-length-exceeds-a-certain/m-p/156665#M44070</link>
    <description>&lt;P&gt;So it is unrelated to the curly brackets; it is triggered by a greedy match (&lt;CODE&gt;.+&lt;/CODE&gt;) to the left of the capture group.  If I add such a greedy match to the first rex, the first rex fails on the long string, too.  E.g.,&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; | rex field=emul "Begi.+:(?&amp;lt;DATA&amp;gt;.*)"
 | rex field=emul "Begin:.+}} (?&amp;lt;DATA&amp;gt;.+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Output becomes&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;count   DATA
short   bar /stuff
long     
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Fri, 10 Oct 2014 19:17:55 GMT</pubDate>
    <dc:creator>yuanliu</dc:creator>
    <dc:date>2014-10-10T19:17:55Z</dc:date>
    <item>
      <title>Why does rex match fail when input length exceeds a certain threshold?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-does-rex-match-fail-when-input-length-exceeds-a-certain/m-p/156663#M44068</link>
      <description>&lt;P&gt;When input length exceeds a certain threshold, it seems that some rex match will fail while others do not.  Consider the following emulation:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=none |stats count
| eval count=mvappend("short","long") | mvexpand count
| eval emul="Begin:foo}} bar "
 + if(count="short",mvjoin(mvrange(10000,15000),"-"),
 mvjoin(mvrange(10000,20000),"-"))
| rex field=emul "Begin:(?&amp;lt;DATA&amp;gt;.*)"
| rex field=emul "Begin:.+}} (?&amp;lt;DATA&amp;gt;.+)"
| eval DATA=replace(DATA,"[\d-]+","/stuff")
| fields - emul
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Note:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;The first two lines simply produce a two-event sample.&lt;/LI&gt;
&lt;LI&gt;Field &lt;CODE&gt;emul&lt;/CODE&gt; is populated with the integers from 10000 up to 20000, joined by hyphens.  The first event contains a string roughly 5,000 × 6 bytes = 30,000 bytes long, whereas the second event contains a string roughly 10,000 × 6 bytes = 60,000 bytes long.&lt;/LI&gt;
&lt;LI&gt;The intent of the two cascaded rex commands is to utilize text behind double curly braces whenever possible.&lt;/LI&gt;
&lt;/OL&gt;
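
&lt;P&gt;The size estimates can be checked outside Splunk.  The following is a minimal sketch in Python (not SPL), assuming that &lt;CODE&gt;mvrange(start,end)&lt;/CODE&gt; excludes its end value the way Python's &lt;CODE&gt;range&lt;/CODE&gt; does and that every character is one byte; the helper name &lt;CODE&gt;emul_length&lt;/CODE&gt; is made up for illustration.&lt;/P&gt;

```python
# Rough length check for the emulated field; plain Python, not SPL.
# Assumption: mvrange(start, end) excludes `end`, like Python's range().
PREFIX = "Begin:foo}} bar "  # 16 characters


def emul_length(start, stop):
    """Length of PREFIX plus the numbers start..stop-1 joined by hyphens."""
    numbers = [str(n) for n in range(start, stop)]
    return len(PREFIX) + len("-".join(numbers))


short_len = emul_length(10000, 15000)  # the "short" event
long_len = emul_length(10000, 20000)   # the "long" event
limit_len = emul_length(10000, 19000)  # largest case reported to still work

print(short_len, long_len, limit_len)  # prints: 30015 60015 54015
```

&lt;P&gt;Each number contributes five digits plus one hyphen, which is where the factor of 6 comes from.&lt;/P&gt;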

&lt;P&gt;Because the two events are structurally identical, I expect both to produce &lt;CODE&gt;bar /stuff&lt;/CODE&gt;.  This holds as long as the second mvrange() is no larger than &lt;CODE&gt;mvrange(10000,19000)&lt;/CODE&gt;, or about 9,000 × 6 bytes = 54,000 bytes.  Not much beyond that, the output becomes&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;count   DATA
short   bar /stuff
long    foo}} bar /stuff
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In other words, the second rex fails to take effect. (The last two commands simply shorten the output and have no effect on whether the second rex fails.)  I see no error in the job inspector or elsewhere.  In my real-world search, complicated subsequent commands, including rex, do not appear to be affected, even after the equivalent of &lt;CODE&gt;"Begin:.+}} (?&amp;lt;DATA&amp;gt;.+)"&lt;/CODE&gt; fails.&lt;/P&gt;
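
&lt;P&gt;As a sanity check that the patterns themselves are satisfiable on the long string, here is a hedged sketch that replays the two cascaded extractions with Python's &lt;CODE&gt;re&lt;/CODE&gt; module.  It is only an emulation: Python is not the regex engine behind rex, the named group is replaced by a plain group, and the helpers &lt;CODE&gt;build_emul&lt;/CODE&gt; and &lt;CODE&gt;extract&lt;/CODE&gt; are hypothetical.  A later match overwrites the earlier value, mirroring how the second rex replaces &lt;CODE&gt;DATA&lt;/CODE&gt; when it succeeds.&lt;/P&gt;

```python
import re

PREFIX = "Begin:foo}} bar "
# The two cascaded patterns from the search, with unnamed capture groups.
PATTERNS = (r"Begin:(.*)", r"Begin:.+}} (.+)")


def build_emul(start, stop):
    """Emulate the eval: prefix plus hyphen-joined integers start..stop-1."""
    return PREFIX + "-".join(str(n) for n in range(start, stop))


def extract(emul, patterns=PATTERNS):
    """Apply each pattern in order; a later match overwrites the earlier one."""
    data = None
    for pattern in patterns:
        match = re.search(pattern, emul)
        if match:
            data = match.group(1)
    if data is None:
        return None
    # Same shortening step as the eval replace() in the search.
    return re.sub(r"[\d-]+", "/stuff", data)


for label, stop in (("short", 15000), ("long", 20000)):
    print(label, extract(build_emul(10000, stop)))
```

&lt;P&gt;In this emulation both events come out as &lt;CODE&gt;bar /stuff&lt;/CODE&gt;, and applying only the first pattern gives &lt;CODE&gt;foo}} bar /stuff&lt;/CODE&gt;, the same value I see for the long event in Splunk.  That suggests the difference lies in how rex executes the match rather than in what the pattern means.&lt;/P&gt;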

&lt;P&gt;Is this a bug, or is there some parameter I need to tweak?  What is special about &lt;CODE&gt;"Begin:.+}} (?&amp;lt;DATA&amp;gt;.+)"&lt;/CODE&gt;?&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2014 23:28:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-does-rex-match-fail-when-input-length-exceeds-a-certain/m-p/156663#M44068</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2014-10-08T23:28:50Z</dc:date>
    </item>
    <item>
      <title>Re: Why does rex match fail when input length exceeds a certain threshold?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-does-rex-match-fail-when-input-length-exceeds-a-certain/m-p/156664#M44069</link>
      <description>&lt;P&gt;If the first rex is removed, the second rex indeed extracts no data from the long string.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2014 23:39:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-does-rex-match-fail-when-input-length-exceeds-a-certain/m-p/156664#M44069</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2014-10-08T23:39:20Z</dc:date>
    </item>
    <item>
      <title>Re: Why does rex match fail when input length exceeds a certain threshold?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-does-rex-match-fail-when-input-length-exceeds-a-certain/m-p/156665#M44070</link>
      <description>&lt;P&gt;So it is unrelated to the curly brackets; it is triggered by a greedy match (&lt;CODE&gt;.+&lt;/CODE&gt;) to the left of the capture group.  If I add such a greedy match to the first rex, the first rex fails on the long string, too.  E.g.,&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; | rex field=emul "Begi.+:(?&amp;lt;DATA&amp;gt;.*)"
 | rex field=emul "Begin:.+}} (?&amp;lt;DATA&amp;gt;.+)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Output becomes&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;count   DATA
short   bar /stuff
long     
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 10 Oct 2014 19:17:55 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-does-rex-match-fail-when-input-length-exceeds-a-certain/m-p/156665#M44070</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2014-10-10T19:17:55Z</dc:date>
    </item>
  </channel>
</rss>

