<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Splunk Regex Engine Fails? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463502#M130668</link>
    <description>&lt;P&gt;Same issue, mate. I've used your transforms, and it still fails to capture the entire value and halts at whitespace.&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;&lt;BR /&gt;
[aap_fields_discov]&lt;BR /&gt;
REGEX = \[\s*(\S+)\s\=\s(.*?)\s\]&lt;BR /&gt;
REPEAT_MATCH = true&lt;BR /&gt;
WRITE_META = true&lt;BR /&gt;
&lt;/CODE&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 10 Feb 2020 04:00:00 GMT</pubDate>
    <dc:creator>morethanyell</dc:creator>
    <dc:date>2020-02-10T04:00:00Z</dc:date>
    <item>
      <title>Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463495#M130661</link>
      <description>&lt;P&gt;We're trying to extract fields that match this pattern, &lt;CODE&gt;[ FIELD_NAME = S0m3 Valu3 w\ reaLLy $pec!aL ch*rac+3rs ]&lt;/CODE&gt;, and write them to tsidx so that they're consumable via &lt;CODE&gt;tstats&lt;/CODE&gt;. We're using the transforms/props pairing below.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# transforms.conf
[hello_transforms]
REGEX = (?&amp;lt;key&amp;gt;[\w]+)\s\=\s(?&amp;lt;value&amp;gt;[^\]]+)
FORMAT = $1::$2
REPEAT_MATCH = true
WRITE_META = true

#props.conf
[hello]
DATETIME_CONFIG =
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
pulldown_type = 1
TRANSFORMS-capturer = hello_transforms
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;While it does what's expected for most fields (i.e., the fields are written to disk, verified through walklex), some values are not captured in full. For example,&lt;BR /&gt;
 &lt;CODE&gt;[ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]&lt;/CODE&gt;&lt;BR /&gt;
Splunk only captured "A". See screenshot below.&lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/8336iF76627C5ACC25067/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;REGEX VALID: &lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/8337iC61796994807C5B6/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;Do you think this is the fault of Splunk's regex engine, or do I have something wrong in my configs?&lt;/P&gt;

&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2020 03:04:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463495#M130661</guid>
      <dc:creator>morethanyell</dc:creator>
      <dc:date>2020-02-07T03:04:22Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463496#M130662</link>
      <description>&lt;P&gt;Sample:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults 
| eval _raw="Feb 7 11:25:20 SYD-UTIL-02 ADAuditPlus [ Category = LogonReports ] [ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]"
| rex max_match=0 "\[\s*(?&amp;lt;key&amp;gt;\S+)\s\=\s(?&amp;lt;value&amp;gt;.*?)\]"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;EM&gt;transforms.conf&lt;/EM&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; REGEX = \[\s*(\S+)\s\=\s(.*?)\]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The closing &lt;CODE&gt;]&lt;/CODE&gt; is needed.&lt;/P&gt;

&lt;P&gt;If you use &lt;CODE&gt;FORMAT&lt;/CODE&gt; in &lt;EM&gt;transforms.conf&lt;/EM&gt;, named capture groups are not needed.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Using FORMAT:
REGEX = ([a-z]+)=([a-z]+)
FORMAT = $1::$2

Not using FORMAT:
REGEX = (?&amp;lt;_KEY_1&amp;gt;[a-z]+)=(?&amp;lt;_VAL_1&amp;gt;[a-z]+)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;cf. &lt;A href="https://docs.splunk.com/Documentation/Splunk/8.0.1/Data/Configureindex-timefieldextraction"&gt;Configureindex-timefieldextraction&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2020 04:04:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463496#M130662</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-02-07T04:04:11Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463497#M130663</link>
      <description>&lt;P&gt;Same result&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2020 04:23:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463497#M130663</guid>
      <dc:creator>morethanyell</dc:creator>
      <dc:date>2020-02-07T04:23:28Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463498#M130664</link>
      <description>&lt;P&gt;@morethanyell &lt;BR /&gt;
Did you restart/refresh Splunk?&lt;BR /&gt;
At least for &lt;CODE&gt;[ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]&lt;/CODE&gt;, I don't get the same result.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2020 04:47:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463498#M130664</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-02-07T04:47:44Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463499#M130665</link>
      <description>&lt;P&gt;I edited transforms.conf with your regex, stopped Splunk, deleted the index using "clean eventdata" (don't worry, it's a dev machine), restarted Splunk, and re-indexed the file using one-shot. It still fails to capture the entire value. It stops at &lt;CODE&gt;whitespace&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2020 04:58:35 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463499#M130665</guid>
      <dc:creator>morethanyell</dc:creator>
      <dc:date>2020-02-07T04:58:35Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463500#M130666</link>
      <description>&lt;P&gt;My old regex also works with &lt;CODE&gt;| rex&lt;/CODE&gt;, but not in transforms.conf.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2020 04:59:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463500#M130666</guid>
      <dc:creator>morethanyell</dc:creator>
      <dc:date>2020-02-07T04:59:22Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463501#M130667</link>
      <description>&lt;P&gt;@morethanyell&lt;BR /&gt;
We both made a mistake. I have updated my answer.&lt;BR /&gt;
I'm sorry.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2020 05:24:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463501#M130667</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-02-07T05:24:51Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463502#M130668</link>
      <description>&lt;P&gt;Same issue, mate. I've used your transforms, and it still fails to capture the entire value and halts at whitespace.&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;&lt;BR /&gt;
[aap_fields_discov]&lt;BR /&gt;
REGEX = \[\s*(\S+)\s\=\s(.*?)\s\]&lt;BR /&gt;
REPEAT_MATCH = true&lt;BR /&gt;
WRITE_META = true&lt;BR /&gt;
&lt;/CODE&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Feb 2020 04:00:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463502#M130668</guid>
      <dc:creator>morethanyell</dc:creator>
      <dc:date>2020-02-10T04:00:00Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463503#M130669</link>
      <description>&lt;P&gt;(T_T)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;SEDCMD-whitespace = s/\s/ /g
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Why does REGEX halt at whitespace?&lt;BR /&gt;
I don't understand.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Feb 2020 09:49:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463503#M130669</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-02-10T09:49:58Z</dc:date>
    </item>
    <item>
      <title>Re: Splunk Regex Engine Fails?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463504#M130670</link>
      <description>&lt;P&gt;On paper, it should capture this:&lt;BR /&gt;
    [ FIELDNAME = The quick brown fox jumps over the lazy dog. ]&lt;BR /&gt;
If you try it with &lt;CODE&gt;| rex&lt;/CODE&gt; or on regex101.com, it works. But when implemented in transforms.conf, it only captures "The", so the field value will be "FIELDNAME = The" instead of the entire "FIELDNAME = The quick brown fox jumps over the lazy dog."&lt;/P&gt;

&lt;P&gt;Showing that the regex works via &lt;CODE&gt;| rex&lt;/CODE&gt; or regex101.com no longer helps because, as I've said before, it does work there. It just doesn't work when used in transforms.conf for index-time field extraction.&lt;/P&gt;
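
&lt;P&gt;For anyone who wants to reproduce the off-Splunk check, here is a minimal Python sketch, assuming Python's &lt;CODE&gt;re&lt;/CODE&gt; module behaves like Splunk's PCRE for this simple pattern. The function name &lt;CODE&gt;extract_pairs&lt;/CODE&gt; is made up for illustration, and the sample event is the one posted earlier in this thread. It only shows that the regex itself captures the whole value; it says nothing about how transforms.conf writes the field at index time.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;
```python
import re

# Equivalent to the REGEX used in the aap_fields_discov stanza in this
# thread, written with plain (unnamed) capture groups.
PAIR_RE = re.compile(r"\[\s*(\S+)\s=\s(.*?)\s\]")


def extract_pairs(raw):
    """Return every (key, value) pair found in a raw event string."""
    return PAIR_RE.findall(raw)


sample = (
    "Feb 7 11:25:20 SYD-UTIL-02 ADAuditPlus "
    "[ Category = LogonReports ] "
    "[ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]"
)

# Print each pair; repr() makes the whitespace inside values visible.
for key, value in extract_pairs(sample):
    print(key, "=", repr(value))
```
&lt;/CODE&gt;&lt;/PRE&gt;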

&lt;P&gt;Out of frustration, I've changed the strategy for capturing the fields: I now enclose the values in double quotes (e.g. &lt;CODE&gt;[ FIELDNAME = s0m3 vaLu3 ]&lt;/CODE&gt; becomes &lt;CODE&gt;[ FIELDNAME ="s0m3 vaLu3" ]&lt;/CODE&gt;) using SEDCMD in props.conf instead of relying on transforms.conf.&lt;/P&gt;
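
&lt;P&gt;The actual SEDCMD expression is not shown here, so the following Python sketch is only a hypothetical stand-in for that rewrite. The pattern, the replacement string, and the name &lt;CODE&gt;quote_values&lt;/CODE&gt; are all assumptions for illustration; it reproduces the before/after example above and nothing more.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;
```python
import re

# Hypothetical stand-in for the SEDCMD substitution; the real
# expression was not posted in this thread.
VALUE_RE = re.compile(r"\[\s*(\S+)\s=\s(.*?)\s\]")


def quote_values(raw):
    """Wrap each bracketed value in double quotes."""
    # \1 is the field name, \2 is the value (which may contain spaces).
    return VALUE_RE.sub(r'[ \1 ="\2" ]', raw)


print(quote_values("[ FIELDNAME = s0m3 vaLu3 ]"))
```
&lt;/CODE&gt;&lt;/PRE&gt;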

&lt;P&gt;Thanks for the help.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Feb 2020 22:28:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Splunk-Regex-Engine-Fails/m-p/463504#M130670</guid>
      <dc:creator>morethanyell</dc:creator>
      <dc:date>2020-02-10T22:28:37Z</dc:date>
    </item>
  </channel>
</rss>

