Splunk Search

Help With Regular Expression To Extract Values Between XML Element Tags on Multi-line

bansi
Path Finder

How can I extract the values between the Elements tags?

<DataNode node-type="Contract">
    <TransactionAttributes>
        <entry key="CONTRACT_ID">contract2_100</entry>
    </TransactionAttributes>
    <Elements>
        <ContractId>true</ContractId>
        <DateOfBirth>true</DateOfBirth>
    </Elements>
</DataNode>
<DataNode roster-type="search" node-type="Roster">
    <TransactionAttributes>
        <entry key="TRAN_ID">001</entry>
    </TransactionAttributes>
    <Elements>
        <PhoneNo>true</PhoneNo>
        <SNumber>true</SNumber>
    </Elements>
</DataNode>

The following regular expression erroneously extracts content beyond the Elements tags. Please let me know how to restrict it so that it retrieves only the values between the tags.

rex "(?m)\<Elements>(?<abc>.*)</Elements>"

This results in:

<ContractId>true</ContractId><Name name-type="Name">true</Name><DateOfBirth>true</DateOfBirth></Elements></DataNode><DataNode ><TransactionAttributes><entry key="CONTRACT_ID">123</entry><entry 

whereas the expected result is only the content between the Elements tags, i.e.

<ContractId>true</ContractId><Name name-type="Name">true</Name><DateOfBirth>true</DateOfBirth>
0 Karma

ziegfried
Influencer

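The problem is that .* matches greedily, so the match ends at the last occurrence of "</Elements>" rather than the first. You can fix this by using the non-greedy quantifier instead: .*? The s flag is also needed so that . matches newlines and the pattern can span multiple lines.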

So this regex should work as expected:

rex "(?ms)\<Elements\>(?<abc>.*?)\</Elements\>"
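The difference between the greedy and non-greedy forms can be checked outside Splunk. The following is only a sketch using Python's re module, not Splunk's rex command: Python writes named groups as (?P<abc>...) rather than (?<abc>...), and the EVENT string is a shortened copy of the XML from the question.

```python
import re

# Shortened copy of the sample event from the question (two <Elements> blocks).
EVENT = """<DataNode node-type="Contract">
    <Elements>
        <ContractId>true</ContractId>
        <DateOfBirth>true</DateOfBirth>
    </Elements>
</DataNode>
<DataNode roster-type="search" node-type="Roster">
    <Elements>
        <PhoneNo>true</PhoneNo>
        <SNumber>true</SNumber>
    </Elements>
</DataNode>"""

# (?s) lets "." match newlines, so the pattern can span several lines.
GREEDY = re.compile(r"(?s)<Elements>(?P<abc>.*)</Elements>")
LAZY = re.compile(r"(?s)<Elements>(?P<abc>.*?)</Elements>")

greedy_match = GREEDY.search(EVENT).group("abc")
lazy_match = LAZY.search(EVENT).group("abc")

# Greedy runs on to the last </Elements>, swallowing the second DataNode.
print("PhoneNo" in greedy_match)  # True
# Non-greedy stops at the first </Elements>.
print("PhoneNo" in lazy_match)    # False
```

The greedy capture contains the second DataNode because the engine backtracks from the end of the event, while the non-greedy capture ends as soon as the first closing tag is found.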

To extract all matching parts of the event, add the max_match parameter to the rex command. This instructs Splunk to make the resulting field multi-valued.

rex "(?ms)\<Elements\>(?<abc>.*?)\</Elements\>" max_match=999
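As a rough analogy for what max_match does, the sketch below collects every non-greedy match into a list, again in Python rather than SPL. The helper extract_all and its max_match argument are hypothetical names chosen for illustration; they only imitate the idea of a multi-valued field and are not part of Splunk.

```python
import re

# Two <Elements> blocks on one line, standing in for a single event.
EVENT = (
    "<Elements><ContractId>true</ContractId></Elements>"
    "<Elements><PhoneNo>true</PhoneNo></Elements>"
)

LAZY = re.compile(r"(?s)<Elements>(.*?)</Elements>")


def extract_all(text, max_match=999):
    """Hypothetical helper: return up to max_match captured values as a list."""
    values = []
    for match in LAZY.finditer(text):
        if len(values) >= max_match:
            break
        values.append(match.group(1))
    return values


first_only = extract_all(EVENT, max_match=1)  # keeps only the first match
all_values = extract_all(EVENT)               # keeps every match

print(first_only)
print(all_values)
```

With the limit set to 1 only the ContractId block is returned; with a higher limit both blocks come back, one list entry per Elements section.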

bansi
Path Finder

Thanks. I need help with one more thing. I am stuck trying to extract only the "values" from the XML below.


%
MALE
VA

I expect the regex to output the values as: %, MALE, VA

rex "(?ms)<SearchElements>(?<abc>.*?)</SearchElements>" max_match=999 | rex field=abc max_match=50 "<entry key=".*"><(?<keys>[A-Za-z]+)" | eval keys=mvjoin(keys,",") | table abc

Please take a moment to correct the regex.

0 Karma

ziegfried
Influencer
0 Karma

bansi
Path Finder

Thank you so much once again. I would greatly appreciate it if you could point me to a good regular expression website, specifically one that would help me write fast Splunk queries.

0 Karma

ziegfried
Influencer

I have modified the answer.

0 Karma

bansi
Path Finder

Thank you so much. But it doesn't pick up the Elements under a different DataNode, i.e. PhoneNo and SNumber, as shown in the XML in my earlier post.

0 Karma