Splunk Search

How to write the regex to extract fields from my sample XML data?

Shan
Builder

Sample data:

             <id>WGBSTH8180T</id>
         <sytems>
         <sys_Id>14502</sys_Id>
         <name>GYS</name>
         <version>9901</version>
         <ip_address>172.11.11.212</ip_address>
         <connector>
         <connector_Id>TH818AST001A</connector_Id></connector></sytems>

I can able to get the value WGBSTH8180T with the regex like | rex field=Info "(?ms)(?.*?)"

Can anyone help me out with how to extract the values sys_Id (systems), name (systems), version (systems), ip_address (systems), and connector_Id (connector) from the data above?

I'm using regex as mentioned below, but its not working. Please help me out to write regex:

     | rex field=Info "(?ms)<sytems><sys_Id>(?<sytems>.*?)</sys_Id></sytems>"
     | rex field=Info "(?ms)<sytems><name>(?<Name>.*?)</name></sytems>"
     | rex field=Info "(?ms)<sytems><version>(?<Version>.*?)</version></sytems>"
     | rex field=Info "(?ms)<sytems><ip_address>(?<Ip_Address>.*?)</ip_address></sytems>"
     | rex field=Info "(?ms)<sytems><connector><connector_Id>(?<Connector_Ids>.*?)</sys_Id></connector_Id></sytems>"

Thanks in advance

0 Karma

ddrillic
Ultra Champion
0 Karma

jplumsdaine22
Influencer
0 Karma

bmacias84
Champion

The following regex matches in 66 or less steps.

rex field=info "\<id\>(?<id>[^\<]+)\<[^\<]+\<[^\s]+\>\s+\<sys_Id\>(?<sys_Id>[^\<]+)\<[^\<]+\<(?<name>[^\<]+)\<[^\<]+\<(?<version>[^\<]+)\<[^\<]+\<ip_address\>(?<ip_address>[^\<]+)\<[^\<]+\<[^\s]+\>\s+\<(?<connector_id>[^\<]+)"

All the answers given so far will work. Have poor or non-optimized regex statements for large dataset cause poor search performance. What work well for 10k events doesn't work well 1Billion events.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

There are three reasons the rex commands are not working:

1) The characters between "<sytems>" and the following tag are not accounted for.
2) Slashes must be escaped.
3) The last rex has closing tags in the wrong order.

These rex commands should work, although @jplumsdaine22's answer is better.

| rex field=info "<sys_Id>(?<sytems>.*?)<\/sys_Id>"
| rex field=Info "<name>(?<Name>.*?)<\/name>"
| rex field=Info "<version>(?<Version>.*?)<\/version>"
| rex field=Info "<ip_address>(?<Ip_Address>.*?)<\/ip_address>"
| rex field=Info "<connector_Id>(?<Connector_Ids>.*?)<\/connector_Id>"
---
If this reply helps you, Karma would be appreciated.
0 Karma

jplumsdaine22
Influencer

If the log files you are indexing are valid xml, just use the spath command. See the search reference

http://docs.splunk.com/Documentation/Splunk/6.3.3/SearchReference/Spath

Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Design, Compete, Win: Submit Your Best Splunk Dashboards for a .conf26 Pass

Hello Splunkers,  We’re excited to kick off a Splunk Dashboard contest! We know that dashboards are a primary ...

May 2026 Splunk Expert Sessions: Security & Observability

Level Up Your Operations: May 2026 Splunk Expert Sessions Whether you are refining your security posture or ...

Network to App: Observability Unlocked [May & June Series]

In today’s digital landscape, your environment is no longer confined to the data center. It spans complex ...