Search XML data inside Text File

bansi
Path Finder

The log file fed to Splunk is a *.txt text file, but it contains XML data, as shown below:

2010-11-17 12:59:24,617 [main] DEBUG splunk - marshallObjectToXml; 
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EventLogData xmlns="http:/xyz/EventLogData">
<Data screen-name="ScottTiger">
<DataNode node-type="Contract">
<TransactionAttributes>
<entry key="CONTRACT_ID">contract1_100</entry>
<entry key="MEMBER_ID">Admin1_100</entry>
</TransactionAttributes>
</DataNode>
</Data>
</EventLogData>

I am unable to extract the value of CONTRACT_ID using XPath, rex, or xmlkv. Nothing works! I wonder whether this is because the XML is embedded inside a text file. I am also not sure how events are formed.

Any pointers or suggestions will be greatly appreciated.

1 Solution

carasso
Splunk Employee

xmlkv only seems to extract values if the event is valid XML. I recreated your problem, and line breaking is not the issue.

One ugly solution is to extract the XML with a regex, temporarily swap it in as _raw (xmlkv reads _raw), and then call xmlkv:

... | rex "(?s)(?<xml><EventLogData.*</EventLogData>)" | rename _raw as raw | rename xml as _raw | xmlkv | rename raw as _raw
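
For readers who want to see why the regex-first step helps, here is a minimal sketch of the same idea outside Splunk, in Python. It is not how xmlkv works internally; the function name extract_entries and the use of xml.etree.ElementTree are assumptions made purely for illustration. Note that Python spells named groups (?P<name>...), whereas rex uses (?<name>...).

```python
import re
import xml.etree.ElementTree as ET

# Sample event from the question: a plain-text log prefix followed by XML.
EVENT = """2010-11-17 12:59:24,617 [main] DEBUG splunk - marshallObjectToXml;
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<EventLogData xmlns="http:/xyz/EventLogData">
<Data screen-name="ScottTiger">
<DataNode node-type="Contract">
<TransactionAttributes>
<entry key="CONTRACT_ID">contract1_100</entry>
<entry key="MEMBER_ID">Admin1_100</entry>
</TransactionAttributes>
</DataNode>
</Data>
</EventLogData>"""

# (?s) lets "." match newlines, so the group spans the whole XML block.
XML_BLOCK = re.compile(r"(?s)(?P<xml><EventLogData.*</EventLogData>)")


def extract_entries(event):
    """Cut the XML out of a mixed text/XML event and return {key: value}."""
    match = XML_BLOCK.search(event)
    if match is None:
        return {}
    root = ET.fromstring(match.group("xml"))
    entries = {}
    for elem in root.iter():
        # Tags carry the namespace as "{uri}entry"; compare the local name only.
        if elem.tag.rsplit("}", 1)[-1] == "entry":
            entries[elem.get("key")] = elem.text
    return entries


print(extract_entries(EVENT))
```

Passing the whole event straight to the XML parser fails because of the leading log text, which mirrors the behaviour described above: the XML has to be isolated before it can be parsed.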

bansi
Path Finder

Basically, my question in the post above is "How do I search for and extract XML node attribute values?" In my case, I would like to extract the value for CONTRACT_ID from the XML snippet below:

<entry key="CONTRACT_ID">contract1_100</entry>

Please note that my attempts to extract the CONTRACT_ID value from the DataNode element, using XPath or rex, are not working:

<DataNode node-type="Contract">
<TransactionAttributes>
<entry key="CONTRACT_ID">contract1_100</entry>
<entry key="MEMBER_ID">Admin1_100</entry>
</TransactionAttributes>
</DataNode>

Please let me know the rex or XPath expression to extract the CONTRACT_ID value.
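
As a rough illustration of the kind of pattern being asked for, the sketch below pulls the value for a given key with a plain regular expression, in Python rather than SPL. The helper name entry_value is hypothetical. The same pattern should carry over to rex in principle, with the double quotes escaped and the group written as (?<value>...), but that has not been tested against the poster's data.

```python
import re

SNIPPET = """<DataNode node-type="Contract">
<TransactionAttributes>
<entry key="CONTRACT_ID">contract1_100</entry>
<entry key="MEMBER_ID">Admin1_100</entry>
</TransactionAttributes>
</DataNode>"""


def entry_value(text, key):
    """Return the text of <entry key="KEY">...</entry>, or None if absent."""
    # [^<]* stops at the closing tag, so no multi-line flag is needed here.
    pattern = r'<entry key="' + re.escape(key) + r'">(?P<value>[^<]*)</entry>'
    match = re.search(pattern, text)
    return match.group("value") if match else None


print(entry_value(SNIPPET, "CONTRACT_ID"))
```

Because the value sits on a single line between the opening and closing entry tags, this pattern does not depend on how the surrounding event was line-broken.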

bansi
Path Finder

Thanks for the quick response. I am new to Splunk, so I am not sure how to check what Splunk uses for line breaking. Is there a way to check?

I am also not sure whether my whole sample is being treated as a single event. Please tell me how to check that.

Please also let me know how to write a regular expression that works over both multi-line and single-line events.

Thanks for helping.

bfaber
Communicator

The first thing I would look at is the line breaking. Is your sample all being treated as a single event? If so, rex should (at least) be able to find this. Make sure your regular expression can match across lines: (?s) lets . match newlines, while (?m) only makes ^ and $ match at each line boundary.
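
The difference between the two inline flags is easy to check outside Splunk. The sketch below uses Python's re module, which follows the same convention as PCRE for (?s) and (?m); the sample text and the helper name matches are made up for illustration.

```python
import re

# A tiny multi-line "event": the element body sits on its own line.
TEXT = "<a>\nvalue\n</a>"


def matches(pattern, text=TEXT):
    """True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


plain = matches(r"<a>.*</a>")               # "." stops at the first newline
dotall = matches(r"(?s)<a>.*</a>")          # (?s): "." also matches newlines
multiline_only = matches(r"(?m)<a>.*</a>")  # (?m) does not change "."
line_anchor = matches(r"(?m)^value$")       # (?m): ^ and $ match per line
no_flag_anchor = matches(r"^value$")        # without (?m): whole string only

print(plain, dotall, multiline_only, line_anchor, no_flag_anchor)
```

So a pattern that has to span several lines with .* needs (?s), while (?m) matters only when the pattern uses ^ or $ to anchor on individual lines.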
