XML field extraction

riqbal
Communicator

I have one XML file.
I want to extract the fields/values IN BETWEEN and and throw away any of the lines before the very first and after the very last .
(In XML, the fields/values are located on each line in the form value)
Use the date in the ActionDate field and the time in the ActionTime field as the timestamp.

<Interceptor>
    <AttackCoords>-423423445345345.10742916222947</AttackCoords>
    <Outcome>Inteccccn</Outcome>
    <Infiltrators>20</Infiltrators>
    <Enforcer>Iwildwood</Enforcer>
    <ActionDate>2013-04-24</ActionDate>
    <ActionTime>00:07:00</ActionTime>
    <RecordNotes></RecordNotes>
    <NumEscaped>0</NumEscaped>
    <LaunchCoords>-80.23429525620114,24.08680387475695</LaunchCoords>
    <AttackVessel>local</AttackVessel>
</Interceptor>

Below are my props.conf and transforms.conf:
props.conf
[dreamcrusher]
BREAK_ONLY_BEFORE =
DATETIME_CONFIG =
NO_BINARY_CHECK = true
TIME_FORMAT =
TIME_PREFIX =
category = Custom
disabled = false
pulldown_type = true
PREAMBLE_REGEX = ^<\S+.*
REPORT-dream = dream

transforms.conf
[dream]
REGEX = ^<(.*?)>(\S+)<
FORMAT = $1::$2

When I check the events, there are no search-time extractions.

1 Solution

riqbal
Communicator

The regex mentioned in the post "How to extract XML field data using transforms.conf?" works for me:

REGEX = <(\w+)>([^<]+)
FORMAT = $1::$2
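
For anyone wondering why the first regex found nothing while this one works, here is a rough sketch that runs both patterns against the sample event from the question. It uses Python's `re` module as a stand-in for Splunk's regex engine, which is an assumption: the two engines are not identical, and `findall` only approximates how a `$1::$2` FORMAT keeps matching across the event. The `extract` helper and the variable names are made up for illustration.

```python
import re

# Sample event from the question, leading whitespace kept on purpose.
EVENT = """<Interceptor>
             <AttackCoords>-423423445345345.10742916222947</AttackCoords>
             <Outcome>Inteccccn</Outcome>
             <Infiltrators>20</Infiltrators>
             <Enforcer>Iwildwood</Enforcer>
             <ActionDate>2013-04-24</ActionDate>
             <ActionTime>00:07:00</ActionTime>
             <RecordNotes></RecordNotes>
             <NumEscaped>0</NumEscaped>
             <LaunchCoords>-80.23429525620114,24.08680387475695</LaunchCoords>
             <AttackVessel>local</AttackVessel>
         </Interceptor>"""

# Regex from the question: anchored with ^, so it is only tried at the
# very start of the event, where <Interceptor> is followed by a newline
# rather than a non-whitespace value.
ORIGINAL = re.compile(r"^<(.*?)>(\S+)<")

# Regex from the accepted answer: unanchored, value is anything up to
# the next "<".
WORKING = re.compile(r"<(\w+)>([^<]+)")


def extract(pattern, text):
    """Hypothetical helper: collect name/value pairs from every match.

    Whitespace-only values are dropped. In this sketch that removes the
    wrapper tag <Interceptor>, whose "value" is just a newline and
    indentation.
    """
    return {
        name: value.strip()
        for name, value in pattern.findall(text)
        if value.strip()
    }


original_fields = extract(ORIGINAL, EVENT)
working_fields = extract(WORKING, EVENT)

print("original regex:", original_fields)
print("working regex: ", working_fields)
```

In this sketch the anchored pattern yields no fields at all, while the unanchored one picks up every element that has a value. Note that the empty `<RecordNotes></RecordNotes>` element is skipped, because `[^<]+` needs at least one character.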

jkat54
SplunkTrust

You didn’t say in between what...

Have you tried the xmlkv command? How about xpath or spath? See these links on how to use those:

http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Xmlkv
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Xpath
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Spath

In props.conf on the universal forwarder:

 [sourcetypeName]
 INDEXED_EXTRACTIONS=xml

Or, if you want to extract the fields automatically at search time, use KV_MODE instead of INDEXED_EXTRACTIONS. INDEXED_EXTRACTIONS actually indexes the fields, which takes more disk space but makes all the fields available to tstats searches. On a small data source it can be great; on a large data source it can cause more problems than it's worth.

 KV_MODE=xml

Also note that INDEXED_EXTRACTIONS are applied on the first Splunk instance that sees the data (typically a forwarder, or maybe your laptop in this case).

riqbal
Communicator

Thanks for sharing this important information. In fact, it is my second step to complete the lab.

gkwl22000
New Member

When using KV_MODE=xml, do we still use the LINE_BREAKER and SHOULD_LINEMERGE settings?

jkat54
SplunkTrust

I guess they aren't needed unless you have the XML nested within another data type, like JSON with nested XML or something.

gkwl22000
New Member

Yes, it is nested XML.
