I have an XML event like this:
_raw="2022-03-02 21:22:39.417 [MESSAGE] [default-threads - 8] [re_messages] - <?xml version="1.0" encoding="UTF-8"?><al:EnvEventDatagram xmlns:mex="http://xxxx" xmlns:bdm="http://xxxx" xmlns:al="http://xxxx" xmlns:xsi="http://www.w3.org/xxxx" xsi:schemaLocation="http://xxxx.xsd"><mex:ManagedApp><mex:IssuerId>com1</mex:IssuerId><mex:Code>abc</mex:Code><mex:DeployedUnitId>123</mex:DeployedUnitId><mex:DxmVersion>1.10</mex:DxmVersion></mex:ManagedApp><mex:ID>456</mex:ID><mex:AID>1</mex:AID><al:SvcEventDatagram><mex:MessageID>aaa</mex:MessageID><al:Alert><al:DA><al:ASQ><al:IssuerId>bbb</al:IssuerId><al:Value>ccc</al:Value></al:ASQ><al:CU><bdm:B><bdm:IssuerId>888</bdm:IssuerId><bdm:Value>ddd</bdm:Value></bdm:B></al:CU><al:YYY><al:LLL>89</al:LLL><al:BNum>28</al:BNum><al:NUM>6</al:NUM></al:YYY><al:FAUTQ><al:Value>vvv</al:Value></al:FAUTQ><al:BA><bdm:TypeQcd><bdm:IssuerId>kkk</bdm:IssuerId><bdm:Value>ABC</bdm:Value></bdm:TypeQcd><bdm:Ccyamt><bdm:MM>88</bdm:MM></bdm:Ccyamt></al:BA><al:BA><bdm:TypeQcd><bdm:IssuerId>abc</bdm:IssuerId><bdm:Value>NNN</bdm:Value></bdm:TypeQcd><bdm:Ccyamt><bdm:MM>22</bdm:MM></bdm:Ccyamt><al:ReasonQcd><al:IssuerId>vvv</al:IssuerId><al:Value>FF</al:Value></al:ReasonQcd></al:BA><al:DATypeQcd><al:Value>mmm</al:Value></al:DATypeQcd><al:OverLimitInd>ii</al:OverLimitInd><al:Qcd><al:Value>N/A</al:Value></al:Qcd></al:DA><al:QQQ><bdm:DescriptionTxt><bdm:Text>HH</bdm:Text></bdm:DescriptionTxt><bdm:StartDttm>2022-03-02</bdm:StartDttm><bdm:ATQ><bdm:IssuerId>77</bdm:IssuerId><bdm:Value>TTT</bdm:Value></bdm:ATQ><bdm:Status><bdm:TypeQcd><bdm:IssuerId>55</bdm:IssuerId><bdm:Value>PPP</bdm:Value></bdm:TypeQcd></bdm:Status><bdm:Ccyamt><bdm:MM>12</bdm:MM></bdm:Ccyamt><bdm:DebitCreditQcd><bdm:IssuerId>AAA</bdm:IssuerId><bdm:Value>GGG</bdm:Value></bdm:DebitCreditQcd><al:TED>2022-03-02</al:TED><al:ProcessDt>2022-03-02</al:ProcessDt></al:QQQ></al:Alert></al:SvcEventDatagram></al:EnvEventDatagram>"
Is there any way to get all of the <bdm:Value> values (ddd, ABC, etc.) with a regex?
I suggest using SPL's built-in XML parser, spath:
| rename _raw AS temp ``` in case you still need _raw later ```
| eval _raw = replace(temp, "^[^<]+", "") ``` only keep XML ```
| spath
| foreach *.bdm:Value
[eval bdm_values = mvappend(bdm_values, '<<FIELD>>')]
| rename temp AS _raw
| table bdm_values *.bdm:Value ``` display quick validation ```
With the sample data, the output is:
bdm_values | al:EnvEventDatagram.al:SvcEventDatagram.al:Alert.al:DA.al:BA.bdm:TypeQcd.bdm:Value | al:EnvEventDatagram.al:SvcEventDatagram.al:Alert.al:DA.al:CU.bdm:B.bdm:Value | al:EnvEventDatagram.al:SvcEventDatagram.al:Alert.al:QQQ.bdm:ATQ.bdm:Value | al:EnvEventDatagram.al:SvcEventDatagram.al:Alert.al:QQQ.bdm:DebitCreditQcd.bdm:Value | al:EnvEventDatagram.al:SvcEventDatagram.al:Alert.al:QQQ.bdm:Status.bdm:TypeQcd.bdm:Value |
ABC NNN ddd TTT GGG PPP | ABC NNN | ddd | TTT | GGG | PPP |
| rex max_match=0 field=_raw "\<bdm:Value\>(?<value>[^\<]+)\<\/bdm:Value\>"
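To try that rex without touching real data, a self-contained search along these lines may help. It is only a sketch: the event text is a trimmed, made-up stand-in for the sample in the question, not the real event. With this input, the value field should come back as a multivalue field holding ddd, ABC and NNN, in that order.

```spl
| makeresults
| eval _raw = "2022-03-02 21:22:39.417 [MESSAGE] - <al:DA><al:CU><bdm:B><bdm:Value>ddd</bdm:Value></bdm:B></al:CU><al:BA><bdm:TypeQcd><bdm:Value>ABC</bdm:Value></bdm:TypeQcd></al:BA><al:BA><bdm:TypeQcd><bdm:Value>NNN</bdm:Value></bdm:TypeQcd></al:BA></al:DA>" ``` trimmed stand-in for the real event (assumption) ```
| rex max_match=0 field=_raw "\<bdm:Value\>(?<value>[^\<]+)\<\/bdm:Value\>"
| table value
```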
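Values in that element could still contain angle brackets in well-formed XML (escaped as entities, or literally inside a CDATA section), and a pattern like [^<]+ won't handle every such case. So you'd have to either:
1) make sure there are no such cases in your data
2) account for such cases in your regex
3) use spath instead of raw regex matching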
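For option 2, one possible sketch is to capture lazily up to the closing tag instead of excluding <, and to skip an optional CDATA wrapper. It assumes the only complication is a single CDATA section around the whole value, and that each event is one line (so . never has to cross a newline). The field name value is carried over from the rex above. It has not been run against real data containing CDATA, and entity-escaped text such as &lt; would still come back undecoded.

```spl
| rex max_match=0 field=_raw "\<bdm:Value\>(?:\<!\[CDATA\[)?(?<value>.*?)(?:\]\]\>)?\<\/bdm:Value\>" ``` sketch only: optional CDATA wrapper, lazy capture up to the closing tag ```
```

If the data can contain anything more irregular than that, option 3 is probably the less fragile choice.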
Add the max_match=0 option to the rex command so that it captures every match instead of only the first:
| rex max_match=0 "\<bdm:Value>(?<bdfValue>[^\<]+)"
This puts all of the values into a multivalue field called "bdfValue". Use mvexpand to split them into separate results, or use eval with the mv* functions to work with them in place.
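As a rough illustration of both routes, the search fragment below first derives a few summary fields with mv* functions and then splits the values into one result per value. The names value_count, first_value and all_values are made up for this example; only bdfValue comes from the rex above.

```spl
| rex max_match=0 "\<bdm:Value>(?<bdfValue>[^\<]+)"
| eval value_count = mvcount(bdfValue) ``` number of captured values (illustrative field name) ```
| eval first_value = mvindex(bdfValue, 0) ``` mvindex is zero-based ```
| eval all_values = mvjoin(bdfValue, ", ") ``` flatten to a single string ```
| mvexpand bdfValue ``` one result row per captured value ```
```

The eval lines run before mvexpand on purpose: after mvexpand, bdfValue holds a single value per result, so mvcount would return 1 on every row.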