<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic XML tags extraction at index time in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/XML-tags-extraction-at-index-time/m-p/532477#M89510</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am trying to create some fields at index time from an XML log.&lt;/P&gt;&lt;P&gt;I prepared the sourcetype definition in the props.conf with the related TRANSFORM, and in the transforms.conf I have the following:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[xmlkv_extract]
REGEX=\&amp;lt;(.*?)\&amp;gt;(.*?)\&amp;lt;
FORMAT = $1::$2
WRITE_META = true

[xmlkv_extract_new]
REGEX = &amp;lt;email&amp;gt;(.*?)&amp;lt;\/email&amp;gt;&amp;lt;ccard&amp;gt;(.*?)&amp;lt;\/ccard&amp;gt;&amp;lt;company&amp;gt;(.*?)&amp;lt;\/company&amp;gt;&amp;lt;city&amp;gt;(.*?)&amp;lt;\/city&amp;gt;
FORMAT = email::"$1" credit_card::"$2" company::"$3" city::"$4"
WRITE_META = True&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;and this is my sample event:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;&amp;lt;email&amp;gt;orci.Phasellus.dapibus@egestasSed.ca&amp;lt;/email&amp;gt;&amp;lt;ccard&amp;gt;4539599637112700&amp;lt;/ccard&amp;gt;&amp;lt;city&amp;gt;Hamilton&amp;lt;/city&amp;gt;&amp;lt;company&amp;gt;Eros Proin LLC&amp;lt;/company&amp;gt;&amp;lt;/fst&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;Now, the problem is that if I use the first transform, only the email field is extracted (by the way, I tried the regex on the regex101 site and it matched all the fields). If I use the second transform, everything is OK.&lt;/P&gt;&lt;P&gt;Is there some limitation in index-time field extraction regarding "generic" XML tag extraction?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Fausto&lt;/P&gt;</description>
    <pubDate>Wed, 09 Dec 2020 14:22:23 GMT</pubDate>
    <dc:creator>fsaporito</dc:creator>
    <dc:date>2020-12-09T14:22:23Z</dc:date>
    <item>
      <title>XML tags extraction at index time</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/XML-tags-extraction-at-index-time/m-p/532477#M89510</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am trying to create some fields at index time from an XML log.&lt;/P&gt;&lt;P&gt;I prepared the sourcetype definition in the props.conf with the related TRANSFORM, and in the transforms.conf I have the following:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[xmlkv_extract]
REGEX=\&amp;lt;(.*?)\&amp;gt;(.*?)\&amp;lt;
FORMAT = $1::$2
WRITE_META = true

[xmlkv_extract_new]
REGEX = &amp;lt;email&amp;gt;(.*?)&amp;lt;\/email&amp;gt;&amp;lt;ccard&amp;gt;(.*?)&amp;lt;\/ccard&amp;gt;&amp;lt;company&amp;gt;(.*?)&amp;lt;\/company&amp;gt;&amp;lt;city&amp;gt;(.*?)&amp;lt;\/city&amp;gt;
FORMAT = email::"$1" credit_card::"$2" company::"$3" city::"$4"
WRITE_META = True&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;and this is my sample event:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;&amp;lt;email&amp;gt;orci.Phasellus.dapibus@egestasSed.ca&amp;lt;/email&amp;gt;&amp;lt;ccard&amp;gt;4539599637112700&amp;lt;/ccard&amp;gt;&amp;lt;city&amp;gt;Hamilton&amp;lt;/city&amp;gt;&amp;lt;company&amp;gt;Eros Proin LLC&amp;lt;/company&amp;gt;&amp;lt;/fst&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;Now, the problem is that if I use the first transform, only the email field is extracted (by the way, I tried the regex on the regex101 site and it matched all the fields). If I use the second transform, everything is OK.&lt;/P&gt;&lt;P&gt;Is there some limitation in index-time field extraction regarding "generic" XML tag extraction?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Fausto&lt;/P&gt;</description>
      <pubDate>Wed, 09 Dec 2020 14:22:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/XML-tags-extraction-at-index-time/m-p/532477#M89510</guid>
      <dc:creator>fsaporito</dc:creator>
      <dc:date>2020-12-09T14:22:23Z</dc:date>
    </item>
  </channel>
</rss>

