Solved: How to configure Splunk to read XML files correctly?

darlynna · ‎12-04-2014

I'm having trouble getting Splunk to read my XML files correctly.
Here is an example of one of my XML files:

I want Splunk to create an event for every row (the row element),
and every event should contain information about which
table it came from. I've tried several approaches,
but none of them seems to have any effect.
My latest attempt was to edit props.conf:

[xml-too_small]
DATETIME_CONFIG = CURRENT
KV_MODE = xml
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = <row>
MUST_BREAK_AFTER = <table> | </row>
TRUNCATE = 0
FIELDALIAS-rootfields = table{@name} as Table table.row{@name} as Row table.row.value{@name} as Valuename table.row.value as Value

and to add queue = parsingQueue in inputs.conf.

Thank you!
Darlynna

lguinn2 · ‎12-06-2014

You could do this

[myXML]
DATETIME_CONFIG = CURRENT
KV_MODE = xml
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = \<row>
TRUNCATE = 0

This will give you one event per row element. However, there is no way to carry the table info from the enclosing table element into each event. A couple of tips:

If you specify BREAK_ONLY_BEFORE, then you shouldn't specify any other breaking criteria.

The < can have a special meaning in regular expressions (for example, in named groups and lookbehinds). I escaped it with a \ to be safe, although I think Splunk may not require this.

Unless you have some compelling reason (which you need to explain), you should not specify the parsingQueue.
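For reference, a plain monitor input is usually all that is needed. This is a minimal sketch, assuming the files live under a hypothetical /data/xml directory and use the sourcetype from the props.conf stanza above; adjust both to your setup.

```ini
# inputs.conf
# Hypothetical path: point this at the directory that holds your XML files.
[monitor:///data/xml/*.xml]
sourcetype = myXML
# No "queue" setting here: leave it out so Splunk uses its default.
```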

If the source file name contains the name of the table, I would definitely use that. Keep the same props.conf as above, but add one more line:

TRANSFORMS-myxml=extract-table-name

and create transforms.conf like this

[extract-table-name]
SOURCE_KEY=MetaData:Source
REGEX=firstpartoffilename(\S+?)\.xml
FORMAT=table::$1
WRITE_META = true

Note that you will need to change the REGEX so that it picks up the actual name of the table from the filename. This creates an index-time field; although I usually dislike index-time fields, this is a case where it may be needed.
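Put together, the pieces above might look like the sketch below. The file naming scheme (export_TABLENAME.xml) and the REGEX built on it are hypothetical placeholders. The fields.conf stanza is not part of the answer above: it is an addition based on the understanding that Splunk expects index-time fields to be declared there so that searches treat them as indexed.

```ini
# props.conf
[myXML]
DATETIME_CONFIG = CURRENT
KV_MODE = xml
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = \<row>
TRUNCATE = 0
TRANSFORMS-myxml = extract-table-name

# transforms.conf
[extract-table-name]
SOURCE_KEY = MetaData:Source
# Hypothetical: assumes sources named like /data/xml/export_orders.xml,
# so the capture group picks up "orders".
REGEX = export_(\S+?)\.xml
FORMAT = table::$1
WRITE_META = true

# fields.conf (assumption, not in the answer above)
[table]
INDEXED = true
```

With that in place, a search such as sourcetype=myXML table::orders should return only the rows that came from the (hypothetical) orders file. These are index-time settings, so they only affect data indexed after the change.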

darlynna · ‎12-14-2014

Huge thanks!:)

oflyt · ‎12-12-2014

Thank you Iguinn 😄

somesoni2 · ‎12-04-2014

At index time, splitting each row into a separate event will be easy, but adding the table name would be tough (at least, I don't know of a way to do that yet). Any chance you can drop this requirement?

oflyt · ‎12-05-2014

The file name also contains the name of the table; maybe it would be easier to make use of that? 😜
