Splunk Search

Help With Regular Expression To Extract Values Between XML Element Tags on Multi-line

bansi
Path Finder

How do I extract the values between the Elements tags?

  <DataNode node-type="Contract">
            <TransactionAttributes>
                <entry key="CONTRACT_ID">contract2_100</entry>              
            </TransactionAttributes>
            <Elements>
                <ContractId>true</ContractId>
                <DateOfBirth>true</DateOfBirth>
            </Elements>
        </DataNode>
<DataNode roster-type="search" node-type="Roster">
            <TransactionAttributes>
                <entry key="TRAN_ID">001</entry>                
            </TransactionAttributes>
            <Elements>
                <PhoneNo>true</PhoneNo>
                <SNumber>true</SNumber>
            </Elements>
</DataNode>

The following regular expression erroneously extracts values beyond the Elements tags. Please let me know how to restrict it so that it retrieves only the values between the tags.

rex "(?m)\<Elements>(?<abc>.*)</Elements>"

results in

<ContractId>true</ContractId><Name name-type="Name">true</Name><DateOfBirth>true</DateOfBirth></Elements></DataNode><DataNode ><TransactionAttributes><entry key="CONTRACT_ID">123</entry><entry 

whereas the expected result is only what lies between the Elements tags, i.e.

<ContractId>true</ContractId><Name name-type="Name">true</Name><DateOfBirth>true</DateOfBirth>

ziegfried
Influencer

The problem is that .* matches greedily, so the matched part ends at the last occurrence of "</Elements>" in the event. You can make it work by using the non-greedy quantifier instead: .*?

So this regex should work as expected:

rex "(?ms)\<Elements\>(?<abc>.*?)\</Elements\>"
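
To see the difference outside Splunk, here is a minimal sketch in Python. The sample event is a trimmed copy of the XML from the question. Python's re module is used only as a stand-in for the regex engine behind rex, and it requires the (?P<abc>...) spelling for named groups, whereas rex accepts (?<abc>...).

```python
import re

# Sample event modeled on the XML in the question (two <Elements> blocks).
event = """<DataNode node-type="Contract">
    <Elements>
        <ContractId>true</ContractId>
        <DateOfBirth>true</DateOfBirth>
    </Elements>
</DataNode>
<DataNode roster-type="search" node-type="Roster">
    <Elements>
        <PhoneNo>true</PhoneNo>
        <SNumber>true</SNumber>
    </Elements>
</DataNode>"""

# Greedy: .* runs on to the LAST </Elements> in the event.
greedy = re.search(r"(?ms)<Elements>(?P<abc>.*)</Elements>", event).group("abc")

# Non-greedy: .*? stops at the FIRST </Elements>.
lazy = re.search(r"(?ms)<Elements>(?P<abc>.*?)</Elements>", event).group("abc")

print("PhoneNo" in greedy)  # True: the greedy match spilled into the second block
print("PhoneNo" in lazy)    # False: the non-greedy match stayed in the first block
```

The (?s) flag matters here because the event spans several lines: without it, the dot does not match newline characters.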

To extract all matching parts of the event, add the max_match parameter to the rex command. This instructs Splunk to make the resulting field multi-valued.

rex "(?ms)\<Elements\>(?<abc>.*?)\</Elements\>" max_match=999
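
As a rough analogy outside Splunk, the sketch below contrasts a single match with collecting every match. It uses Python's re module rather than rex, so search and findall are stand-ins for rex without and with max_match, not Splunk calls, and the single-line sample event is made up for illustration.

```python
import re

# Made-up single-line event with two <Elements> blocks.
event = (
    "<Elements><ContractId>true</ContractId><DateOfBirth>true</DateOfBirth></Elements>"
    "<Elements><PhoneNo>true</PhoneNo><SNumber>true</SNumber></Elements>"
)

# Python spells named groups (?P<abc>...); rex accepts (?<abc>...).
pattern = re.compile(r"(?ms)<Elements>(?P<abc>.*?)</Elements>")

# One match only, comparable to rex without max_match.
first_only = pattern.search(event).group("abc")

# Every non-overlapping match, comparable to the multi-valued field
# that rex produces when max_match is raised.
all_matches = pattern.findall(event)

print(first_only)
for value in all_matches:
    print(value)
```

Here all_matches holds two strings, one per Elements block, which is the shape of the multi-valued abc field.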

bansi
Path Finder

Thanks. I need help with one more thing. I am stuck extracting only the "values" from the XML below:


%
MALE
VA

I expect the regex to output the values as: %, MALE, VA

rex "(?ms)<SearchElements>(?.?)</SearchElements>" max_match=999 | rex field=abc max_match=50 "<entry key="."><(?[A-Za-z]+)"| eval keys=mvjoin(keys,",") | table abc

Please take a moment to correct the regex


ziegfried
Influencer

bansi
Path Finder

Thank you so much once again. I would greatly appreciate it if you could point me to a good regular expression website, specifically one that would help me write fast Splunk queries.


ziegfried
Influencer

Modified the answer


bansi
Path Finder

Thank you so much. But it doesn't pick up the ones under a different tag, i.e. PhoneNo and SNumber, as shown in the XML in my earlier post.
