Help With Regular Expression To Extract Values Between XML Element Tags on Multi-line

bansi
Path Finder

How can I extract the values between the <Elements> tags?

<DataNode node-type="Contract">
    <TransactionAttributes>
        <entry key="CONTRACT_ID">contract2_100</entry>
    </TransactionAttributes>
    <Elements>
        <ContractId>true</ContractId>
        <DateOfBirth>true</DateOfBirth>
    </Elements>
</DataNode>
<DataNode roster-type="search" node-type="Roster">
    <TransactionAttributes>
        <entry key="TRAN_ID">001</entry>
    </TransactionAttributes>
    <Elements>
        <PhoneNo>true</PhoneNo>
        <SNumber>true</SNumber>
    </Elements>
</DataNode>

The following regular expression erroneously extracts content beyond the Elements tags. Please let me know how to restrict it so that it retrieves only the values between the tags.

rex "(?m)\<Elements>(?<abc>.*)</Elements>"

results in

<ContractId>true</ContractId><Name name-type="Name">true</Name><DateOfBirth>true</DateOfBirth></Elements></DataNode><DataNode ><TransactionAttributes><entry key="CONTRACT_ID">123</entry><entry 

whereas the expected result is only the content between the Elements tags, i.e.

<ContractId>true</ContractId><Name name-type="Name">true</Name><DateOfBirth>true</DateOfBirth>

ziegfried
Influencer

The problem is that .* is greedy, so the match ends at the last occurrence of "</Elements>". You can make it work by using the non-greedy quantifier instead: .*?

So this regex should work as expected:

rex "(?ms)\<Elements\>(?<abc>.*?)\</Elements\>"
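
The difference between the greedy and non-greedy quantifier can be checked outside Splunk. The following is a minimal sketch in Python, used here only as a stand-in for the PCRE engine behind rex; the event text is a trimmed copy of the sample from the question, and the variable names are made up for illustration. Note that Python writes named groups as (?P<abc>...), whereas rex uses (?<abc>...).

```python
import re

# Trimmed-down copy of the sample event from the question.
event = """<DataNode node-type="Contract">
    <Elements>
        <ContractId>true</ContractId>
        <DateOfBirth>true</DateOfBirth>
    </Elements>
</DataNode>
<DataNode roster-type="search" node-type="Roster">
    <Elements>
        <PhoneNo>true</PhoneNo>
        <SNumber>true</SNumber>
    </Elements>
</DataNode>"""

# (?s) lets "." match newlines, so the capture can span several lines.
greedy = re.search(r"(?ms)<Elements>(?P<abc>.*)</Elements>", event)
lazy = re.search(r"(?ms)<Elements>(?P<abc>.*?)</Elements>", event)

greedy_text = greedy.group("abc")
lazy_text = lazy.group("abc")

# Greedy runs on to the LAST </Elements>, swallowing the second DataNode.
print("PhoneNo" in greedy_text)
# Non-greedy stops at the FIRST </Elements>.
print("PhoneNo" in lazy_text)
```

The greedy capture contains a closing </Elements> tag and everything up to the second block, which mirrors the unwanted output in the question; the non-greedy capture contains only ContractId and DateOfBirth.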

To extract all matching parts of the event, add the max_match parameter to the rex command. This instructs Splunk to make the resulting field multi-valued.

rex "(?ms)\<Elements\>(?<abc>.*?)\</Elements\>" max_match=999
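
As a rough analogy for what max_match changes, the sketch below (again Python standing in for rex, with a made-up single-line event) contrasts taking only the first match with collecting every match into a list, which is approximately what a multi-valued field holds.

```python
import re

# Hypothetical single-line event containing two <Elements> blocks.
event = (
    "<Elements><ContractId>true</ContractId></Elements>"
    "<Elements><PhoneNo>true</PhoneNo></Elements>"
)

# Python spells the named group (?P<abc>...); rex uses (?<abc>...).
pattern = re.compile(r"(?ms)<Elements>(?P<abc>.*?)</Elements>")

# Without max_match, rex keeps only the first match.
first_only = pattern.search(event).group("abc")

# With max_match set high enough, every match is kept as a separate value.
all_matches = [m.group("abc") for m in pattern.finditer(event)]

print(first_only)
print(all_matches)
```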

bansi
Path Finder

Thanks. I need one more bit of help. I am stuck extracting only the "values" from the XML below


%
MALE
VA

I am expecting the regex to output the values as: %, MALE, VA

rex "(?ms)<SearchElements>(?<abc>.*?)</SearchElements>" max_match=999 | rex field=abc max_match=50 "<entry key=".*"><(?<keys>[A-Za-z]+)"| eval keys=mvjoin(keys,",") | table abc

Please take a moment to correct the regex.


ziegfried
Influencer

bansi
Path Finder

Thank you so much once again. I would greatly appreciate it if you could point me to a good regular expression website, specifically one that helps with writing fast Splunk queries.


ziegfried
Influencer

Modified the answer


bansi
Path Finder

Thank you so much. But it doesn't pick up the elements under the other node, i.e. PhoneNo and SNumber, as shown in the XML in my earlier post.
