<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How can I exclude XML file headers from being indexed? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-can-I-exclude-XML-file-headers-from-being-indexed/m-p/228099#M44447</link>
    <description>&lt;P&gt;I found the answer after testing it myself, but I wanted to post the question anyway because I did not see any existing questions on this specific issue. Here is the working TRANSFORMS entry, which uses one capture group containing all three alternatives (separated by pipes), so the ^ anchor applies to every alternative rather than only the first:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[chuckXmlHeader]
REGEX = (?m)^(&amp;lt;\?xml|&amp;lt;log&amp;gt;|&amp;lt;/log&amp;gt;)
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If you only want to chuck the XML header because you don't have any other stray events to drop, then this should work for you:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[chuckXmlHeader]
REGEX = (?m)^(&amp;lt;\?xml)
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Mon, 14 Nov 2016 17:10:13 GMT</pubDate>
    <dc:creator>jdaves</dc:creator>
    <dc:date>2016-11-14T17:10:13Z</dc:date>
    <item>
      <title>How can I exclude XML file headers from being indexed?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-can-I-exclude-XML-file-headers-from-being-indexed/m-p/228098#M44446</link>
      <description>&lt;P&gt;Hey Splunkers!&lt;/P&gt;

&lt;P&gt;I have some XML log files that I'm monitoring on a Windows 2003 server. I got my line/event breaking set up and that's all good. However, I'm getting separate events containing only the XML header and the master tag &amp;lt;log&amp;gt; that defines the beginning of the file.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;?xml version="1.0" encoding="utf-8" ?&amp;gt;
&amp;lt;log&amp;gt;
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I also have events with just &amp;lt;/log&amp;gt; in them. I want to send these events to nullQueue, but I'm not sure of the exact RegEx syntax I should use. Here is my config:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;**PROPS.CONF**
[suckylogs:xml]
TRANSFORMS-nullXmlHeader = chuckXmlHeader

**TRANSFORMS.CONF**
[chuckXmlHeader]
REGEX = (?m)^(&amp;lt;\?xml)|(&amp;lt;log&amp;gt;)|(&amp;lt;/log&amp;gt;)
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'm not sure if I should have 3 separate capture groups like I have here or one big one.&lt;/P&gt;

&lt;P&gt;Any assistance is appreciated! Thanks!&lt;/P&gt;</description>
      <pubDate>Mon, 14 Nov 2016 17:07:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-can-I-exclude-XML-file-headers-from-being-indexed/m-p/228098#M44446</guid>
      <dc:creator>jdaves</dc:creator>
      <dc:date>2016-11-14T17:07:34Z</dc:date>
    </item>
    <item>
      <title>Re: How can I exclude XML file headers from being indexed?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-can-I-exclude-XML-file-headers-from-being-indexed/m-p/228099#M44447</link>
      <description>&lt;P&gt;I found the answer after testing it myself, but I wanted to post the question anyway because I did not see any existing questions on this specific issue. Here is the working TRANSFORMS entry, which uses one capture group containing all three alternatives (separated by pipes), so the ^ anchor applies to every alternative rather than only the first:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[chuckXmlHeader]
REGEX = (?m)^(&amp;lt;\?xml|&amp;lt;log&amp;gt;|&amp;lt;/log&amp;gt;)
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;
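
&lt;P&gt;For context, here is a rough sketch of how that stanza ties back to props.conf. The sourcetype and transform class names are the ones from my question. I'm assuming both files live on the instance that actually parses the data (an indexer or heavy forwarder), so adjust to your own setup:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf (sourcetype name taken from the question)
[suckylogs:xml]
TRANSFORMS-nullXmlHeader = chuckXmlHeader

# transforms.conf
[chuckXmlHeader]
# One group after ^, so every alternative has to start a line.
REGEX = (?m)^(&amp;lt;\?xml|&amp;lt;log&amp;gt;|&amp;lt;/log&amp;gt;)
# Events that match are routed to nullQueue and never indexed.
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;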

&lt;P&gt;If you only want to chuck the XML header because you don't have any other stray events to drop, then this should work for you:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[chuckXmlHeader]
REGEX = (?m)^(&amp;lt;\?xml)
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 14 Nov 2016 17:10:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-can-I-exclude-XML-file-headers-from-being-indexed/m-p/228099#M44447</guid>
      <dc:creator>jdaves</dc:creator>
      <dc:date>2016-11-14T17:10:13Z</dc:date>
    </item>
  </channel>
</rss>

