<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Why does my regular expression work with rex, but not as a configured field extraction? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237737#M70630</link>
    <description>&lt;P&gt;I'm trying to extract a value from a fairly simple XML document. My regular expression works fine in search (rex) and also in Python, but it does not work as a configured field extraction.&lt;/P&gt;

&lt;P&gt;Here is an example with some details and links omitted. The part I am interested in is simply the final "true" command outcome value. Note that the response can vary greatly, and there can be other XML elements before or after this command outcome value.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;?xml version="1.0" encoding="utf-8"?&amp;gt;&amp;lt;d:SetBookingReference xmlns:d="..." xmlns:m="..." xmlns:georss="..." ..." m:type="Framework.CommandOutcome"&amp;gt;&amp;lt;d:OK m:type="Edm.Boolean"&amp;gt;true&amp;lt;/d:OK&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Let's assume this XML is attached to each event in my "Events" index in a field called &lt;STRONG&gt;XMLField&lt;/STRONG&gt;. The following search correctly extracts the true/false value from all the responses.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=Events | rex field=XMLField "CommandOutcome[^&amp;lt;&amp;gt;]*&amp;gt;&amp;lt;[^&amp;lt;&amp;gt;]*&amp;gt;(?&amp;lt;CommandOutcome2&amp;gt;[^&amp;lt;&amp;gt;]*)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Here is how my field extraction looks. It is assigned to the correct index and is an "inline" extraction.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;CommandOutcome[^&amp;lt;&amp;gt;]*&amp;gt;&amp;lt;[^&amp;lt;&amp;gt;]*&amp;gt;(?&amp;lt;CommandOutcome&amp;gt;[^&amp;lt;&amp;gt;]*) in XMLField
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I have dozens of regular expressions working as field extractions, and this particular expression works both in search and in a Python script, so I'm out of ideas as to why it fails as a field extraction. I originally matched the quotes literally, but I replaced them with [^&amp;lt;&amp;gt;]* to avoid awkward-looking escape sequences. I've also tried escaping the "&amp;lt;" and "&amp;gt;" signs, but the regex still fails.&lt;/P&gt;

&lt;P&gt;Any ideas? Thanks!&lt;/P&gt;</description>
    <pubDate>Wed, 20 Jan 2016 12:52:20 GMT</pubDate>
    <dc:creator>jpanderson</dc:creator>
    <dc:date>2016-01-20T12:52:20Z</dc:date>
    <item>
      <title>Why does my regular expression work with rex, but not as a configured field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237737#M70630</link>
      <description>&lt;P&gt;I'm trying to extract a value from a fairly simple XML document. My regular expression works fine in search (rex) and also in Python, but it does not work as a configured field extraction.&lt;/P&gt;

&lt;P&gt;Here is an example with some details and links omitted. The part I am interested in is simply the final "true" command outcome value. Note that the response can vary greatly, and there can be other XML elements before or after this command outcome value.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;?xml version="1.0" encoding="utf-8"?&amp;gt;&amp;lt;d:SetBookingReference xmlns:d="..." xmlns:m="..." xmlns:georss="..." ..." m:type="Framework.CommandOutcome"&amp;gt;&amp;lt;d:OK m:type="Edm.Boolean"&amp;gt;true&amp;lt;/d:OK&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Let's assume this XML is attached to each event in my "Events" index in a field called &lt;STRONG&gt;XMLField&lt;/STRONG&gt;. The following search correctly extracts the true/false value from all the responses.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=Events | rex field=XMLField "CommandOutcome[^&amp;lt;&amp;gt;]*&amp;gt;&amp;lt;[^&amp;lt;&amp;gt;]*&amp;gt;(?&amp;lt;CommandOutcome2&amp;gt;[^&amp;lt;&amp;gt;]*)"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Here is how my field extraction looks. It is assigned to the correct index and is an "inline" extraction.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;CommandOutcome[^&amp;lt;&amp;gt;]*&amp;gt;&amp;lt;[^&amp;lt;&amp;gt;]*&amp;gt;(?&amp;lt;CommandOutcome&amp;gt;[^&amp;lt;&amp;gt;]*) in XMLField
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I have dozens of regular expressions working as field extractions, and this particular expression works both in search and in a Python script, so I'm out of ideas as to why it fails as a field extraction. I originally matched the quotes literally, but I replaced them with [^&amp;lt;&amp;gt;]* to avoid awkward-looking escape sequences. I've also tried escaping the "&amp;lt;" and "&amp;gt;" signs, but the regex still fails.&lt;/P&gt;

&lt;P&gt;Any ideas? Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 20 Jan 2016 12:52:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237737#M70630</guid>
      <dc:creator>jpanderson</dc:creator>
      <dc:date>2016-01-20T12:52:20Z</dc:date>
    </item>
    <item>
      <title>Re: Why does my regular expression work with rex, but not as a configured field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237738#M70631</link>
      <description>&lt;P&gt;Curious... is XMLField a search-time extraction or an index-time extraction?&lt;/P&gt;

&lt;P&gt;If XMLField is a search-time extraction, then it needs to be defined in props so that it runs before the new extraction.&lt;/P&gt;
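
&lt;P&gt;A rough props.conf sketch of that ordering, assuming XMLField itself comes from an inline search-time extraction. The class names and the XMLField regex are hypothetical placeholders. To my understanding, inline extractions in a stanza run in ASCII order of their class names, so the numeric prefixes are there to make XMLField run first:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[yoursourcetype]
# Hypothetical: create XMLField first (replace this regex with one that matches your raw events)
EXTRACT-01_xmlfield = "XMLField"\s*:\s*"(?&amp;lt;XMLField&amp;gt;(?:[^"\\]|\\.)*)"
# Then extract from XMLField; this class name sorts after the one above
EXTRACT-02_commandoutcome = CommandOutcome[^&amp;lt;&amp;gt;]*&amp;gt;&amp;lt;[^&amp;lt;&amp;gt;]*&amp;gt;(?&amp;lt;CommandOutcome&amp;gt;[^&amp;lt;&amp;gt;]*) in XMLField
&lt;/CODE&gt;&lt;/PRE&gt;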

&lt;P&gt;Also, in your search the capture group is named CommandOutcome2, but in your field extraction example it is named CommandOutcome, without the 2.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Jan 2016 13:11:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237738#M70631</guid>
      <dc:creator>jkat54</dc:creator>
      <dc:date>2016-01-20T13:11:23Z</dc:date>
    </item>
    <item>
      <title>Re: Why does my regular expression work with rex, but not as a configured field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237739#M70632</link>
      <description>&lt;P&gt;Hi, apologies if this is not relevant to you, but have you tried the spath command?&lt;/P&gt;

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/6.3.2/SearchReference/spath"&gt;http://docs.splunk.com/Documentation/Splunk/6.3.2/SearchReference/spath&lt;/A&gt;&lt;/P&gt;
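
&lt;P&gt;For instance, a minimal sketch against the sample in the question, assuming the XML sits in XMLField and the root element is d:SetBookingReference as shown there (the output field name CommandOutcome is just my choice):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=Events
| spath input=XMLField path="d:SetBookingReference.d:OK" output=CommandOutcome
| table CommandOutcome
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Since the root element can vary between responses, running spath input=XMLField without a path and checking which fields appear may be a safer first step.&lt;/P&gt;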

&lt;P&gt;You can also do the following in your props.conf in order to let Splunk parse the XML automatically for you:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[yoursourcetype]
KV_MODE = xml
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;More about KV_MODE:&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;KV_MODE = [none|auto|auto_escaped|multi|json|xml]
* Used for search-time field extractions only.
* Specifies the field/value extraction mode for the data.
* Set KV_MODE to one of the following:
  * none: if you want no field/value extraction to take place.
  * auto: extracts field/value pairs separated by equal signs.
  * auto_escaped: extracts fields/value pairs separated by equal signs and
                  honors \" and \\ as escaped sequences within quoted
                  values, e.g field="value with \"nested\" quotes"
  * multi: invokes the multikv search command to expand a tabular event into
           multiple events.
  * xml : automatically extracts fields from XML data.
  * json: automatically extracts fields from JSON data.
* Setting to 'none' can ensure that one or more user-created regexes are not
  overridden by automatic field/value extraction for a particular host,
  source, or source type, and also increases search performance.
* Defaults to auto.
* The 'xml' and 'json' modes will not extract any fields when used on data
  that isn't of the correct format (JSON or XML).
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 20 Jan 2016 13:19:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237739#M70632</guid>
      <dc:creator>javiergn</dc:creator>
      <dc:date>2016-01-20T13:19:17Z</dc:date>
    </item>
    <item>
      <title>Re: Why does my regular expression work with rex, but not as a configured field extraction?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237740#M70633</link>
      <description>&lt;P&gt;I gave one of the fields a "2" suffix so I can tell the two apart and see when the configured field extraction starts working.&lt;/P&gt;

&lt;P&gt;I think XMLField is an index-time extraction. The data source is JSON objects generated by a Python script, so I would expect XMLField to be created at index time, in which case the extraction should work on it. If it is actually created at search time, that might explain the problem, as I can't get any extraction to work on the field.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Jan 2016 13:44:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-does-my-regular-expression-work-with-rex-but-not-as-a/m-p/237740#M70633</guid>
      <dc:creator>jpanderson</dc:creator>
      <dc:date>2016-01-20T13:44:31Z</dc:date>
    </item>
  </channel>
</rss>

