How can I exclude XML file headers from being indexed?

jdaves
Path Finder

Hey Splunkers!

I have some XML log files that I'm monitoring on a Windows Server 2003 machine. My line/event breaking is set up and working. However, I'm getting separate events that contain only the XML header and the root tag that opens the file:

<?xml version="1.0" encoding="utf-8" ?>
<log>

I also have events with just </log> in them. I want to send these events to nullQueue, but I'm not sure of the exact regex syntax I should use. Here is my config:

**PROPS.CONF**
[suckylogs:xml]
TRANSFORMS-nullXmlHeader = chuckXmlHeader

**TRANSFORMS.CONF**
[chuckXmlHeader]
REGEX = (?m)^(<\?xml)|(<log>)|(</log>)
DEST_KEY = queue
FORMAT = nullQueue

I'm not sure if I should have 3 separate capture groups like I have here or one big one.

Any assistance is appreciated! Thanks!

1 Solution

jdaves
Path Finder

I found the answer after testing it myself, but I wanted to post this question because I did not see any existing questions on this specific issue. Here is the working TRANSFORMS entry, with one capture group containing all three alternatives (separated by pipes):

[chuckXmlHeader]
REGEX = (?m)^(<\?xml|<log>|</log>)
DEST_KEY = queue
FORMAT = nullQueue
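
For anyone wondering why the single group matters: in the pattern from the question, the ^ anchor binds only to the first alternative, so <log> and </log> can match anywhere in an event. Below is a rough sketch of that difference using Python's re module as a stand-in for Splunk's PCRE engine (an assumption, though the two should agree for a pattern this simple). The fourth sample event and the helper name dropped are made up for illustration.

```python
import re

# Pattern from the question: ^ applies only to the first alternative.
three_groups = re.compile(r"(?m)^(<\?xml)|(<log>)|(</log>)")
# Pattern from the answer: ^ applies to every alternative inside the group.
one_group = re.compile(r"(?m)^(<\?xml|<log>|</log>)")

# The first three events come from the post; the fourth is hypothetical.
events = [
    '<?xml version="1.0" encoding="utf-8" ?>',
    "<log>",
    "</log>",
    "<entry>user typed <log> in a field</entry>",
]


def dropped(pattern, event):
    """Return True if the event matches, i.e. would be routed to nullQueue."""
    return pattern.search(event) is not None


# Map each event to (matched by question's regex, matched by answer's regex).
results = {e: (dropped(three_groups, e), dropped(one_group, e)) for e in events}

for event, (question, answer) in results.items():
    print(f"{event!r}: question={question} answer={answer}")
```

Both patterns match the three header/footer events, but only the pattern from the question also matches the hypothetical fourth event, because its <log> alternative is not anchored to the start of a line.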

If you just want to chuck the XML header because you don't have any other unwanted events (such as the <log> and </log> lines), then this should work for you:

[chuckXmlHeader]
REGEX = (?m)^(<\?xml)
DEST_KEY = queue
FORMAT = nullQueue
