Splunk Search

Regular expression to extract the third string when there are multiple tags with the same name

kirangurram
Explorer

Hi,
I need some help with a regular expression.

I have a field called "f" that holds an XML message.
I want to extract one value from an XML tag.

The tricky part is that each XML message has multiple "Val" tags with different content in them.

Example: let's say field "f" contains the values below, along with some other XML tags. I only want to get "SE" as output using a regular expression.

"SE" always stays in the third "Val" tag.

<Val>123</Val>
<Val>ABC</Val>
<Val>SE</Val>
<Val>Information</Val>

I tried the regular expressions below, but they don't meet my requirement of returning only the value in the third tag.

search | rex field=f "<Val>(?<country>\w*)<\/Val>"

The above regular expression gives only 123 as output.

search | rex field=f "<Val>(?<country>\w*)<\/Val>" max_match=0

The above regular expression gives the output below:
123
ABC
SE
Information

Need your help to get SE as an output using regular expression.

Thanks for your help in advance.


woodcock
Esteemed Legend

Like this:

|makeresults | eval _raw="<Val>123</Val>
<Val>ABC</Val>
<Val>SE</Val>
<Val>Information</Val>"

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution."

| rex max_match=3 "(?ms)<Val>(?<Val>.*?)</Val>"
| eval Val=mvindex(Val, 2)
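
The logic of this answer (collect at most three matches, then take the one at zero-based index 2) can be checked outside Splunk. The following is a minimal Python sketch of that idea, not SPL; the helper name `nth_val` and its `max_match` parameter are hypothetical and only mirror the `rex` option and `mvindex` call above.

```python
import re

# Same sample data as the makeresults event above.
raw = "<Val>123</Val>\n<Val>ABC</Val>\n<Val>SE</Val>\n<Val>Information</Val>"


def nth_val(text, n, max_match=3):
    """Return the n-th (zero-based) <Val> content among the first max_match matches."""
    # re.S lets "." cross newlines, like the (?s) flag in the rex pattern.
    values = re.findall(r"<Val>(.*?)</Val>", text, flags=re.S)[:max_match]
    # Out-of-range index yields None, similar to mvindex returning null.
    return values[n] if n < len(values) else None


print(nth_val(raw, 2))  # prints SE
```

Note that this approach is purely positional: it depends on the wanted value always being the third `<Val>` element in the field.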


kirangurram
Explorer

@woodcock, thanks a lot. Your solution works for me.

I used the query below to get per-day stats:

search | rex field=f max_match=3 "(?ms)<Val>(?<Val>.*?)</Val>"
| eval Val=mvindex(Val, 2)
| search Val=XX
| timechart span=1d count by "n" limit=0

jason_prondak
Explorer

What about not using regex at all and using xpath instead?

| makeresults
| eval raw="<Val>123</Val>
<Val>ABC</Val>
<Val>SE</Val>
<Val>Information</Val>"
| rename raw AS _raw
| xpath "/Val[3]" outfield=blah
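
One possible reason a positional XPath gives unexpected results on this sample is that the four Val elements have no common parent, so the text is not a well-formed XML document. The Python sketch below is not Splunk's xpath command; it uses the standard library's ElementTree and a hypothetical `<root>` wrapper (an assumption, not part of the original data) to show `Val[3]` selecting the third element once a single root exists.

```python
import xml.etree.ElementTree as ET

# The four Val elements from the question, with no enclosing element.
raw = "<Val>123</Val>\n<Val>ABC</Val>\n<Val>SE</Val>\n<Val>Information</Val>"


def third_val(fragment):
    """Wrap the fragment in a hypothetical <root> element and return the third Val."""
    # An XML parser needs exactly one root element, so we add one here.
    root = ET.fromstring("<root>" + fragment + "</root>")
    # XPath positions are 1-based: Val[3] is the third Val child of root.
    node = root.find("Val[3]")
    return node.text if node is not None else None


print(third_val(raw))  # prints SE
```

In a real message the Val elements may sit under different parents, in which case a position relative to one parent would not pick the third Val of the whole document.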


kirangurram
Explorer

xpath doesn't seem to work, @jason.prondak; I am getting incorrect output.


FritzWittwer
Contributor

This regex should do it:

(?s)(?:<Val>[^<]*<\/Val>[^<]*){2}<Val>(?<field>\w*)<.*$
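
This pattern skips two complete Val elements and captures the third, but `[^<]*` only allows plain text (no other tags) between consecutive Val elements. The Python sketch below illustrates that behavior under two assumptions: the named group is rewritten as `(?P<field>...)` because Python's `re` uses that syntax, and the `nested` string is made-up data showing Val elements separated by other tags.

```python
import re

# Same pattern as above, with Python's (?P<name>...) group syntax.
SKIP_TWO = re.compile(r"(?s)(?:<Val>[^<]*</Val>[^<]*){2}<Val>(?P<field>\w*)<.*$")

# Sample from the question: only newlines between the Val elements.
simple = "<Val>123</Val>\n<Val>ABC</Val>\n<Val>SE</Val>\n<Val>Information</Val>"
# Hypothetical data: each Val is wrapped in another element.
nested = "<a><Val>URO</Val></a><a><Val>LEH</Val></a><a><Val>FR</Val></a>"


def third_val(text):
    """Return the captured third Val, or None when the pattern does not match."""
    m = SKIP_TWO.search(text)
    return m.group("field") if m else None


print(third_val(simple))  # prints SE
print(third_val(nested))  # prints None
```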

kirangurram
Explorer

Thanks for your reply, @FritzWittwer. I didn't get the desired output with the above regex; I am just getting blank output.

Let me provide the entire content of field f for better clarity.

Field f has the XML content below. I just want to count stats by the country code in the <Val>FR</Val> tag.

When I use the query below, I get the following output.
index="sample" search
| rex field=f "<Val>(?<country>\w*)<\/Val>" max_match=0
| table country

Output:
URO
LEH
FR
Information

My desired output:
FR
FR

f: <?xml version="1.0" encoding="UTF-8"?><ns23:EvtMsg xmlns:ns23="http://www.dhl.com/Express/CM/GenericEventMsg/v2" xmlns:ns5="http://www.dhl.com/Express/CM/CM_GenericRequest/v2"><Hdr Id="22de623e-b8ab-4def-b28a-7be0f5b76bd0" Ver="1.038" Dtm="2019-02-21T05:59:46" CorrId="fc6fd801-ad67-47b3-ac0b-adeb0a6e5ed0"><GI SrcAppCd="ABC"><TID Src="E2E" TID="f4a82e5b-9f55-4d96-b8b0-1c1caab398d1"/></GI><Sndr AppCd="ABC" AppVer="2.000" AppNm="ABC"/><Rcp AppCd="ABC" AppNm="ABC"><GId Id="fc6fd801-ad67-47b3-ac0b-adeb0a6e5ed0" IdTp="RQMSGID"/></Rcp><Rcp AppCd="ABC" AppNm="ABC"/></Hdr><Bd><BOEvt><GI CorrId="697b8f2d-163b-43ed-a87b-1a657d2d5428"/><ShpId OrgCCd="US">1234567891</ShpId><Evt><TyCd>ABCAB</TyCd><Dtm Off="+01:00">2019-02-21T05:59:46</Dtm><RecDtm Off="+01:00">2019-02-21T05:59:46</RecDtm><SDtm Off="+01:00">2019-02-21T05:59:46</SDtm><RCd>ABCAB</RCd><RDsc>Receiver Time Window</RDsc><Rmk/><COpsFncId><OpsFncTyCd>HAL</OpsFncTyCd><OpsFncId>LEHURO</OpsFncId><OpsFncIdAddDtEl><Cd>FcCd</Cd><Val>URO</Val></OpsFncIdAddDtEl><OpsFncIdAddDtEl><Cd>SrvaCd</Cd><Val>LEH</Val></OpsFncIdAddDtEl><OpsFncIdAddDtEl><Cd>CtryCd</Cd>**<Val>FR</Val>**</OpsFncIdAddDtEl></COpsFncId><CIndvId>ABC</CIndvId><DatElGrp Cd="RcvDlvStrDtm"><DatEl><Cd>Dtm</Cd><Val Ty="DATETIME">2019-02-21T23:00:00Z</Val></DatEl><DatEl><Cd>TmOff</Cd><Val Ty="CHAR">+01:00</Val></DatEl></DatElGrp><DatElGrp Cd="RcvDlvEndDtm"><DatEl><Cd>Dtm</Cd><Val Ty="DATETIME">2019-02-22T22:59:00Z</Val></DatEl><DatEl><Cd>TmOff</Cd><Val Ty="CHAR">+01:00</Val></DatEl></DatElGrp><DatElGrp Cd="CfgDat"><DatEl><Cd>Cat</Cd><Val>Information</Val></DatEl></DatElGrp><GI CorrId="573a6e52-c795-4066-9091-58d5281edd4e"/></Evt></BOEvt></Bd></ns23:EvtMsg>
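
In this message the country code is the Val that follows `<Cd>CtryCd</Cd>`, so one option is to anchor the pattern on that label rather than on position. The Python sketch below is an illustration, not a tested Splunk search: the string `f` is an excerpt of the message above (without the ** emphasis markers), and the helper `country_code` is hypothetical. In SPL the same pattern would presumably go into `rex field=f` with the group written as `(?<country>...)`.

```python
import re

# Excerpt of field f: the COpsFncId block plus one later Val element.
f = (
    "<COpsFncId><OpsFncTyCd>HAL</OpsFncTyCd><OpsFncId>LEHURO</OpsFncId>"
    "<OpsFncIdAddDtEl><Cd>FcCd</Cd><Val>URO</Val></OpsFncIdAddDtEl>"
    "<OpsFncIdAddDtEl><Cd>SrvaCd</Cd><Val>LEH</Val></OpsFncIdAddDtEl>"
    "<OpsFncIdAddDtEl><Cd>CtryCd</Cd><Val>FR</Val></OpsFncIdAddDtEl>"
    "</COpsFncId>"
    "<DatEl><Cd>Cat</Cd><Val>Information</Val></DatEl>"
)

# Match only the Val that directly follows the CtryCd label.
COUNTRY = re.compile(r"<Cd>CtryCd</Cd>\s*<Val>(?P<country>[^<]+)</Val>")


def country_code(xml):
    """Return the country code next to <Cd>CtryCd</Cd>, or None if absent."""
    m = COUNTRY.search(xml)
    return m.group("country") if m else None


print(country_code(f))  # prints FR
```

Anchoring on the label keeps working if the order or number of Val elements changes, which a position-based match does not guarantee.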