Splunk Search

How do you extract an XML tag from _raw field?

mukesh2019
Explorer

Hi all,

I'm new to Splunk and don't have much idea of regex.

I'm trying to extract the content of "faultstring" tag only if Detail="RetreiveClaims Service Response payload without Invalid Characters" out of below output .

Sample Input :-

2018-12-23 04:42:47,243 483592286   DEBUG   com.xxxx.ead.channels.services.cscden    Aep_HostName=hostname    ServiceOperation=CSCDental_RetrieveClaims     UserId=4234525__         xBroker_HostName=ABCDE|RequestDateTime=2018-12-23T04:42:40.739161|MessageTranID=xxxx-xxxx-xxx-xxx-xxxx|UserID=12345__|AppID=CSC_DENTAL|ExecGrpName=CSCDental414|TimeStamp=2018-12-23T04:42:47.151570|Message="com.xxxxx.ead.channels.integration.services.cscdental.account.xxxxxx.CSCDental_RetrieveClaims.RetrieveClaimsFailure"|Detail="RetreiveClaims Service Response payload without Invalid Characters"|DetailXML="<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<env:Body>
<env:Fault>
<faultcode>env:Client</faultcode>
<faultstring>Internal Error</faultstring>
<detail>
<device>DPQAS04 (SISC QA Bastion 4) U03DPB14</device>
<domain>ABCDEF-11500_QA_v000</domain>
<object>ABCDEFI_QA</object>
<date>2018-12-23-04:42:47 EST</date>
<tid>87917743</tid>
<server>abcdef.domain.com</server>
<port>12345</port>
<errormessage>Failed to process response headers</errormessage></detail></env:Fault></env:Body></env:Envelope>"  

I'm looking for something like :-

index=abcd_nontricare source="*.log" "abc" earliest=-1h  | head 1 | regex

Expected Output

Internal Error

0 Karma
1 Solution

kamlesh_vaghela
SplunkTrust
SplunkTrust

@mukesh2019

Can you please try this?

index=abcd_nontricare source="*.log" "abc" earliest=-1h  | head 1 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

My Sample Search:

| makeresults 
| eval _raw="2018-12-23 04:42:47,243 483592286   DEBUG   com.xxxx.ead.channels.services.cscden    Aep_HostName=hostname    ServiceOperation=CSCDental_RetrieveClaims     UserId=4234525__         xBroker_HostName=ABCDE|RequestDateTime=2018-12-23T04:42:40.739161|MessageTranID=xxxx-xxxx-xxx-xxx-xxxx|UserID=12345__|AppID=CSC_DENTAL|ExecGrpName=CSCDental414|TimeStamp=2018-12-23T04:42:47.151570|Message=\"com.xxxxx.ead.channels.integration.services.cscdental.account.xxxxxx.CSCDental_RetrieveClaims.RetrieveClaimsFailure\"|Detail=\"RetreiveClaims Service Response payload without Invalid Characters\"|DetailXML=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
 <env:Envelope xmlns:env=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:fn=\"http://www.w3.org/2005/xpath-functions\">
 <env:Body>
 <env:Fault>
 <faultcode>env:Client</faultcode>
 <faultstring>Internal Error</faultstring>
 <detail>
 <device>DPQAS04 (SISC QA Bastion 4) U03DPB14</device>
 <domain>ABCDEF-11500_QA_v000</domain>
 <object>ABCDEFI_QA</object>
 <date>2018-12-23-04:42:47 EST</date>
 <tid>87917743</tid>
 <server>abcdef.domain.com</server>
 <port>12345</port>
 <errormessage>Failed to process response headers</errormessage></detail></env:Fault></env:Body></env:Envelope>\"  
" 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

Thanks

View solution in original post

a_m_s
Explorer

If you are explicitly looking for faultstring tag only then the following regex should work.

(?.*)<\/faultstring>

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@mukesh2019

Can you please try this?

index=abcd_nontricare source="*.log" "abc" earliest=-1h  | head 1 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

My Sample Search:

| makeresults 
| eval _raw="2018-12-23 04:42:47,243 483592286   DEBUG   com.xxxx.ead.channels.services.cscden    Aep_HostName=hostname    ServiceOperation=CSCDental_RetrieveClaims     UserId=4234525__         xBroker_HostName=ABCDE|RequestDateTime=2018-12-23T04:42:40.739161|MessageTranID=xxxx-xxxx-xxx-xxx-xxxx|UserID=12345__|AppID=CSC_DENTAL|ExecGrpName=CSCDental414|TimeStamp=2018-12-23T04:42:47.151570|Message=\"com.xxxxx.ead.channels.integration.services.cscdental.account.xxxxxx.CSCDental_RetrieveClaims.RetrieveClaimsFailure\"|Detail=\"RetreiveClaims Service Response payload without Invalid Characters\"|DetailXML=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
 <env:Envelope xmlns:env=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:fn=\"http://www.w3.org/2005/xpath-functions\">
 <env:Body>
 <env:Fault>
 <faultcode>env:Client</faultcode>
 <faultstring>Internal Error</faultstring>
 <detail>
 <device>DPQAS04 (SISC QA Bastion 4) U03DPB14</device>
 <domain>ABCDEF-11500_QA_v000</domain>
 <object>ABCDEFI_QA</object>
 <date>2018-12-23-04:42:47 EST</date>
 <tid>87917743</tid>
 <server>abcdef.domain.com</server>
 <port>12345</port>
 <errormessage>Failed to process response headers</errormessage></detail></env:Fault></env:Body></env:Envelope>\"  
" 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

Thanks

mukesh2019
Explorer

Thanks a lot 🙂

0 Karma
Get Updates on the Splunk Community!

Splunk Platform | Upgrading your Splunk Deployment to Python 3.9

Splunk initially announced the removal of Python 2 during the release of Splunk Enterprise 8.0.0, aiming to ...

From Product Design to User Insights: Boosting App Developer Identity on Splunkbase

co-authored by Yiyun Zhu & Dan Hosaka Engaging with the Community at .conf24 At .conf24, we revitalized the ...

Detect and Resolve Issues in a Kubernetes Environment

We’ve gone through common problems one can encounter in a Kubernetes environment, their impacts, and the ...