xml field extraction

riqbal
Communicator

I have one XML file.
I want to extract the fields/values IN BETWEEN and and throw away any of the lines before the very first and after the very last .
(In XML, the fields/values are located on each line in the form value)
4. Use the date in the ActionDate field and the time in the ActionTime field as the timestamp.

<Interceptor>
             <AttackCoords>-423423445345345.10742916222947</AttackCoords>
             <Outcome>Inteccccn</Outcome>
             <Infiltrators>20</Infiltrators>
             <Enforcer>Iwildwood</Enforcer>
             <ActionDate>2013-04-24</ActionDate>
             <ActionTime>00:07:00</ActionTime>
             <RecordNotes></RecordNotes>
             <NumEscaped>0</NumEscaped>
             <LaunchCoords>-80.23429525620114,24.08680387475695</LaunchCoords>
             <AttackVessel>local</AttackVessel>
         </Interceptor>

Below are my props.conf and transforms.conf:
props.conf
[dreamcrusher]
BREAK_ONLY_BEFORE =
DATETIME_CONFIG =
NO_BINARY_CHECK = true
TIME_FORMAT =
TIME_PREFIX =
category = Custom
disabled = false
pulldown_type = true
PREAMBLE_REGEX = ^<\S+.*
REPORT-dream = dream

transforms.conf
[dream]
REGEX = ^<(.*?)>(\S+)<

FORMAT = $1::$2

When I check the events, there are no search-time field extractions.

riqbal
Communicator

The regex mentioned in the post "How to extract XML field data using transforms.conf?" works for me.

REGEX = <(\w+)>([^<]+)
FORMAT = $1::$2
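
For anyone wondering why the first pattern found nothing, here is a rough sketch that runs both regexes outside Splunk. It assumes Python's `re` treats these two patterns the same way Splunk's PCRE engine does, and that the whole `<Interceptor>` block arrives as a single event; the sample is shortened from the question.

```python
import re

# Sample event from the question (shortened, indentation reduced).
event = """<Interceptor>
    <AttackCoords>-423423445345345.10742916222947</AttackCoords>
    <Outcome>Inteccccn</Outcome>
    <ActionDate>2013-04-24</ActionDate>
    <ActionTime>00:07:00</ActionTime>
    <RecordNotes></RecordNotes>
    <NumEscaped>0</NumEscaped>
</Interceptor>"""

original = re.compile(r"^<(.*?)>(\S+)<")   # pattern from the question
working = re.compile(r"<(\w+)>([^<]+)")    # pattern from the solution

# The original pattern is anchored to the start of the event, where
# <Interceptor> is followed by a newline instead of a value, so it fails.
original_matches = original.findall(event)

# The working pattern is unanchored and matches every <tag>value pair.
# The <Interceptor> wrapper only captures whitespace, so it is filtered out here.
fields = {name: value for name, value in working.findall(event) if value.strip()}

print(original_matches)
print(fields)
```

Note that an empty element such as `<RecordNotes></RecordNotes>` produces no field with this regex, because `[^<]+` needs at least one character.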

jkat54
SplunkTrust

You didn’t say in between what...

Have you tried the xmlkv command? How about xpath or spath? See these links on how to use those:

http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Xmlkv
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Xpath
http://docs.splunk.com/Documentation/Splunk/7.1.2/SearchReference/Spath
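
To give a feel for what an XML-aware extraction returns compared with a regex, here is a minimal sketch using Python's standard library. It is only an illustration outside Splunk, not how xmlkv, xpath, or spath are implemented; the sample is shortened from the question, and combining ActionDate with ActionTime is shown simply because the lab asks for that timestamp.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Shortened sample event from the question.
event = """<Interceptor>
    <Outcome>Inteccccn</Outcome>
    <Infiltrators>20</Infiltrators>
    <ActionDate>2013-04-24</ActionDate>
    <ActionTime>00:07:00</ActionTime>
    <RecordNotes></RecordNotes>
</Interceptor>"""

root = ET.fromstring(event)

# One field per child element; empty elements become empty strings.
fields = {child.tag: (child.text or "") for child in root}

# Combine ActionDate and ActionTime into a single timestamp.
timestamp = datetime.strptime(
    f"{fields['ActionDate']} {fields['ActionTime']}", "%Y-%m-%d %H:%M:%S"
)

print(fields)
print(timestamp.isoformat())
```

Unlike the regex approach, a parser keeps empty elements such as RecordNotes as fields with an empty value.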

props.conf on the universal forwarder

 [sourcetypeName]
 INDEXED_EXTRACTIONS=xml

Or, if you want the fields extracted automatically at search time, use KV_MODE instead of INDEXED_EXTRACTIONS. INDEXED_EXTRACTIONS actually indexes the fields, which takes more disk space but makes all the fields available to tstats searches. On a small data source it can be great; on a large data source it can cause more problems than it's worth.

 KV_MODE=xml

Also note that INDEXED_EXTRACTIONS is applied on the first Splunk instance that sees the data (typically a forwarder, maybe your laptop in this case).

riqbal
Communicator

Thanks for sharing this important information. In fact, it is my second step in completing the lab.

gkwl22000
New Member

When using KV_MODE=xml, do we still use the LINE_BREAKER and SHOULD_LINEMERGE settings?

jkat54
SplunkTrust

I guess they aren't needed unless the XML is nested within another data type, such as JSON with nested XML or something.

gkwl22000
New Member

Yes, it is nested XML.
