XML field extraction

riqbal
Communicator

I have one XML file.
I want to extract the fields/values in between the opening and closing tags of each record, and throw away any lines before the very first opening tag and after the very last closing tag.
(In XML, the fields/values are located on each line in the form <field>value</field>.)
The date in the ActionDate field and the time in the ActionTime field should be used as the timestamp.

<Interceptor>
             <AttackCoords>-423423445345345.10742916222947</AttackCoords>
             <Outcome>Inteccccn</Outcome>
             <Infiltrators>20</Infiltrators>
             <Enforcer>Iwildwood</Enforcer>
             <ActionDate>2013-04-24</ActionDate>
             <ActionTime>00:07:00</ActionTime>
             <RecordNotes></RecordNotes>
             <NumEscaped>0</NumEscaped>
             <LaunchCoords>-80.23429525620114,24.08680387475695</LaunchCoords>
             <AttackVessel>local</AttackVessel>
         </Interceptor>

Below are my props.conf and transforms.conf:
props.conf
[dreamcrusher]
BREAK_ONLY_BEFORE =
DATETIME_CONFIG =
NO_BINARY_CHECK = true
TIME_FORMAT =
TIME_PREFIX =
category = Custom
disabled = false
pulldown_type = true
PREAMBLE_REGEX = ^<\S+.*
REPORT-dream = dream

transforms.conf
[dream]
REGEX = ^<(.*?)>(\S+)<

FORMAT = $1::$2

When I check the events, there are no search-time extractions.

1 Solution

riqbal
Communicator

Referring to the post "How to extract XML field data using transforms.conf?":
the regex mentioned there works for me.

REGEX = <(\w+)>([^<]+)
FORMAT = $1::$2
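
For anyone wondering why the first pattern found nothing while this one works, here is a minimal sketch in plain Python, run outside Splunk. It assumes Python's re module behaves like Splunk's regex engine for these two simple patterns, and that the whole record arrives as one multi-line event. The helper name extract and the whitespace filter are illustrative choices of mine, not Splunk behavior.

```python
import re

# Sample event copied from the question, indentation preserved.
EVENT = """<Interceptor>
             <AttackCoords>-423423445345345.10742916222947</AttackCoords>
             <Outcome>Inteccccn</Outcome>
             <Infiltrators>20</Infiltrators>
             <Enforcer>Iwildwood</Enforcer>
             <ActionDate>2013-04-24</ActionDate>
             <ActionTime>00:07:00</ActionTime>
             <RecordNotes></RecordNotes>
             <NumEscaped>0</NumEscaped>
             <LaunchCoords>-80.23429525620114,24.08680387475695</LaunchCoords>
             <AttackVessel>local</AttackVessel>
         </Interceptor>"""

# Pattern from the original transforms.conf: anchored to the start of the event.
ORIGINAL = re.compile(r"^<(.*?)>(\S+)<")
# Pattern from the referenced post: no anchor, value is everything up to the next "<".
WORKING = re.compile(r"<(\w+)>([^<]+)")


def extract(pattern, text):
    """Return {name: value} for every match of pattern in text.

    Whitespace-only values are dropped (an assumption of this sketch), because
    the unanchored pattern also matches the <Interceptor> wrapper, whose
    "value" is just the line break and indentation that follow it.
    """
    return {name: value.strip()
            for name, value in pattern.findall(text)
            if value.strip()}


original_fields = extract(ORIGINAL, EVENT)
working_fields = extract(WORKING, EVENT)

print(original_fields)
for name, value in working_fields.items():
    print(f"{name} = {value}")
```

The anchored pattern can only match at the very start of the event, where <Interceptor> is followed by a line break instead of a value, so it extracts nothing. The indented child elements never sit at the start of the event. The unanchored pattern matches each child element in turn. Note that it skips empty elements such as RecordNotes, because [^<]+ needs at least one character.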

jkat54
SplunkTrust

You didn’t say in between what...

Have you tried the xmlkv command? How about xpath or spath? See these links on how to use those:

http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Xmlkv
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Xpath
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Spath

props.conf on the universal forwarder

 [sourcetypeName]
 INDEXED_EXTRACTIONS=xml

Or, if you want to extract the fields automatically at search time, use KV_MODE instead of INDEXED_EXTRACTIONS. INDEXED_EXTRACTIONS actually indexes the fields, which takes more disk space but makes all the fields available to tstats searches. On a small data source it can be great; on a large data source it can cause more problems than it's worth.

 KV_MODE=xml

Also note that INDEXED_EXTRACTIONS occurs on the first Splunk instance that sees the data (typically a forwarder, maybe your laptop in this case).
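
As a rough illustration of what an XML-aware extraction gives you compared with a regex, here is a sketch in plain Python using the standard library parser. This is not how Splunk implements xmlkv, spath or KV_MODE. It only shows the idea of one field per child element, plus combining ActionDate and ActionTime from the question into a single timestamp. The trimmed RECORD sample and the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed copy of the record from the question (hypothetical subset of its fields).
RECORD = """<Interceptor>
    <ActionDate>2013-04-24</ActionDate>
    <ActionTime>00:07:00</ActionTime>
    <RecordNotes></RecordNotes>
    <NumEscaped>0</NumEscaped>
    <AttackVessel>local</AttackVessel>
</Interceptor>"""

root = ET.fromstring(RECORD)

# One field per child element; an empty element becomes an empty string
# instead of being skipped.
fields = {child.tag: (child.text or "") for child in root}

# Combine the date and time fields into one timestamp, as the question asks.
timestamp = datetime.strptime(
    fields["ActionDate"] + " " + fields["ActionTime"],
    "%Y-%m-%d %H:%M:%S",
)

print(fields)
print(timestamp.isoformat())
```

Unlike the regex approach, a parser keeps empty elements such as RecordNotes and does not depend on how the lines are indented.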

riqbal
Communicator

Thanks for sharing this important information. In fact, it is my second step to complete the lab.

gkwl22000
New Member

When using KV_MODE=xml, do we still use the LINE_BREAKER and SHOULD_LINEMERGE settings?

jkat54
SplunkTrust

I guess they aren't needed unless you have the XML nested within another data type, such as JSON with nested XML.

gkwl22000
New Member

Yes, it is nested XML.
