Getting Data In

How to parse out fields

a212830
Champion

Hi, I have an XML-like (but not proper XML) feed that I need to parse.

A sample is below, and I need to parse out each field.

Each field will not necessarily be in each event, so I need a method that will find it, without depending upon a previous field or the location within the event itself.

Can anyone help?

Apr 22 19:54:29 138.126.78.80 <STONEGATE_LOG><TIMESTAMP>2019-04-22 15:54:28</TIMESTAMP><LOGID>9999999</LOGID><NODEID>1.2.3.4</NODEID><FACILITY>Packet Filtering</FACILITY><TYPE>Notification</TYPE><EVENT>New connection</EVENT><ACTION>Allow</ACTION><SRC>4.5.6.7</SRC><DST>X.X.X.X</DST><SERVICE>HTTP</SERVICE><PROTOCOL>2</PROTOCOL><SPORT>12345</SPORT><DPORT>99</DPORT><RULEID>60732.1</RULEID><SRCIF>5</SRCIF><COMPID>some text here</COMPID><RECEPTIONTIME>2019-04-22 15:54:29</RECEPTIONTIME><SENDERTYPE>Firewall</SENDERTYPE><SITUATION>Connection_Allowed</SITUATION><EVENTID>99999999999</EVENTID></STONEGATE_LOG>
1 Solution

harsmarvania57
Ultra Champion

Hi,

To extract XML data at search time, you can use below config on Search Head.

props.conf

[yourSourcetype]
REPORT-test = xmlkv_alt

transforms.conf

[xmlkv_alt]
FORMAT = $1::$2
REGEX = <([^>]*)>([^<]*)<\/\1>

EDIT: Please find regex extraction with sample data on https://regex101.com/r/tJVD20/1

View solution in original post

woodcock
Esteemed Legend

All these answers are missing this setting in transforms.conf:

MV_ADD = true

So the full stanza is:

[YourNameHere]
REGEX = <([^\/][^>]+)>(.*?)<\/[^>]+>
FORMAT = $1::$2
MV_ADD = true
0 Karma

harsmarvania57
Ultra Champion

This will not work because REPEAT_MATCH is only valid for Indexed-time field extraction and solution which I have provided is for search time extraction.

0 Karma

woodcock
Esteemed Legend

Quite correct; I always get MV_ADD and REPEAT_MATCH confused. I have corrected my answer.

0 Karma

a212830
Champion

Thanks. This works quite well. Is there anyway of forcing field names to be lowercase?

0 Karma

woodcock
Esteemed Legend

You will have to stack a calculated field on top of this using lower(fieldname).

0 Karma

sloshburch
Splunk Employee
Splunk Employee

I expect that a props.conf entry for calculated field would work with eval's lower()

0 Karma

harsmarvania57
Ultra Champion

Hi,

To extract XML data at search time, you can use below config on Search Head.

props.conf

[yourSourcetype]
REPORT-test = xmlkv_alt

transforms.conf

[xmlkv_alt]
FORMAT = $1::$2
REGEX = <([^>]*)>([^<]*)<\/\1>

EDIT: Please find regex extraction with sample data on https://regex101.com/r/tJVD20/1

ddrillic
Ultra Champion

Interesting, so the xml doesn't have to be well-formed, as the sample above isn't well-formed.

Amazing, because back-then, a similar solution for json was a big hit here - How can we extract a json document within an event?

We ended up with -

REPORT-extract = json_embedded


[json_embedded]
REGEX = "(\w+)"."(\S+?)"
FORMAT = $1::$2
0 Karma

harsmarvania57
Ultra Champion

Yes you can use regex for magic 😉

0 Karma

a212830
Champion

Thanks. I see them appearing on the regex site, but they don't appear as fields on the SH when I try that - are there additional steps requried?

0 Karma

harsmarvania57
Ultra Champion

If you modified config file directly then you need to restart splunk service or you can use /debug/refresh web endpoint

0 Karma

a212830
Champion

How will the fields appear? Will they automatically appear with the names?

0 Karma

harsmarvania57
Ultra Champion

Yes it will automatically appear, I have tested this config in my lab and it is working fine.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Observe and Secure All Apps with Splunk

 Join Us for Our Next Tech Talk: Observe and Secure All Apps with SplunkAs organizations continue to innovate ...

What's New in Splunk Observability - August 2025

What's New We are excited to announce the latest enhancements to Splunk Observability Cloud as well as what is ...

Introduction to Splunk AI

How are you using AI in Splunk? Whether you see AI as a threat or opportunity, AI is here to stay. Lucky for ...