Splunk Search

How do you extract an XML tag from _raw field?

mukesh2019
Explorer

Hi all,

I'm new to Splunk and don't have much idea of regex.

I'm trying to extract the content of "faultstring" tag only if Detail="RetreiveClaims Service Response payload without Invalid Characters" out of below output .

Sample Input :-

2018-12-23 04:42:47,243 483592286   DEBUG   com.xxxx.ead.channels.services.cscden    Aep_HostName=hostname    ServiceOperation=CSCDental_RetrieveClaims     UserId=4234525__         xBroker_HostName=ABCDE|RequestDateTime=2018-12-23T04:42:40.739161|MessageTranID=xxxx-xxxx-xxx-xxx-xxxx|UserID=12345__|AppID=CSC_DENTAL|ExecGrpName=CSCDental414|TimeStamp=2018-12-23T04:42:47.151570|Message="com.xxxxx.ead.channels.integration.services.cscdental.account.xxxxxx.CSCDental_RetrieveClaims.RetrieveClaimsFailure"|Detail="RetreiveClaims Service Response payload without Invalid Characters"|DetailXML="<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<env:Body>
<env:Fault>
<faultcode>env:Client</faultcode>
<faultstring>Internal Error</faultstring>
<detail>
<device>DPQAS04 (SISC QA Bastion 4) U03DPB14</device>
<domain>ABCDEF-11500_QA_v000</domain>
<object>ABCDEFI_QA</object>
<date>2018-12-23-04:42:47 EST</date>
<tid>87917743</tid>
<server>abcdef.domain.com</server>
<port>12345</port>
<errormessage>Failed to process response headers</errormessage></detail></env:Fault></env:Body></env:Envelope>"  

I'm looking for something like :-

index=abcd_nontricare source="*.log" "abc" earliest=-1h  | head 1 | regex

Expected Output

Internal Error

0 Karma
1 Solution

kamlesh_vaghela
SplunkTrust
SplunkTrust

@mukesh2019

Can you please try this?

index=abcd_nontricare source="*.log" "abc" earliest=-1h  | head 1 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

My Sample Search:

| makeresults 
| eval _raw="2018-12-23 04:42:47,243 483592286   DEBUG   com.xxxx.ead.channels.services.cscden    Aep_HostName=hostname    ServiceOperation=CSCDental_RetrieveClaims     UserId=4234525__         xBroker_HostName=ABCDE|RequestDateTime=2018-12-23T04:42:40.739161|MessageTranID=xxxx-xxxx-xxx-xxx-xxxx|UserID=12345__|AppID=CSC_DENTAL|ExecGrpName=CSCDental414|TimeStamp=2018-12-23T04:42:47.151570|Message=\"com.xxxxx.ead.channels.integration.services.cscdental.account.xxxxxx.CSCDental_RetrieveClaims.RetrieveClaimsFailure\"|Detail=\"RetreiveClaims Service Response payload without Invalid Characters\"|DetailXML=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
 <env:Envelope xmlns:env=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:fn=\"http://www.w3.org/2005/xpath-functions\">
 <env:Body>
 <env:Fault>
 <faultcode>env:Client</faultcode>
 <faultstring>Internal Error</faultstring>
 <detail>
 <device>DPQAS04 (SISC QA Bastion 4) U03DPB14</device>
 <domain>ABCDEF-11500_QA_v000</domain>
 <object>ABCDEFI_QA</object>
 <date>2018-12-23-04:42:47 EST</date>
 <tid>87917743</tid>
 <server>abcdef.domain.com</server>
 <port>12345</port>
 <errormessage>Failed to process response headers</errormessage></detail></env:Fault></env:Body></env:Envelope>\"  
" 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

Thanks

View solution in original post

a_m_s
Explorer

If you are explicitly looking for faultstring tag only then the following regex should work.

(?.*)<\/faultstring>

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@mukesh2019

Can you please try this?

index=abcd_nontricare source="*.log" "abc" earliest=-1h  | head 1 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

My Sample Search:

| makeresults 
| eval _raw="2018-12-23 04:42:47,243 483592286   DEBUG   com.xxxx.ead.channels.services.cscden    Aep_HostName=hostname    ServiceOperation=CSCDental_RetrieveClaims     UserId=4234525__         xBroker_HostName=ABCDE|RequestDateTime=2018-12-23T04:42:40.739161|MessageTranID=xxxx-xxxx-xxx-xxx-xxxx|UserID=12345__|AppID=CSC_DENTAL|ExecGrpName=CSCDental414|TimeStamp=2018-12-23T04:42:47.151570|Message=\"com.xxxxx.ead.channels.integration.services.cscdental.account.xxxxxx.CSCDental_RetrieveClaims.RetrieveClaimsFailure\"|Detail=\"RetreiveClaims Service Response payload without Invalid Characters\"|DetailXML=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
 <env:Envelope xmlns:env=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:fn=\"http://www.w3.org/2005/xpath-functions\">
 <env:Body>
 <env:Fault>
 <faultcode>env:Client</faultcode>
 <faultstring>Internal Error</faultstring>
 <detail>
 <device>DPQAS04 (SISC QA Bastion 4) U03DPB14</device>
 <domain>ABCDEF-11500_QA_v000</domain>
 <object>ABCDEFI_QA</object>
 <date>2018-12-23-04:42:47 EST</date>
 <tid>87917743</tid>
 <server>abcdef.domain.com</server>
 <port>12345</port>
 <errormessage>Failed to process response headers</errormessage></detail></env:Fault></env:Body></env:Envelope>\"  
" 
| rex field=_raw "<faultstring>(?<faultstring>.+)<\/faultstring>" 
| table faultstring

Thanks

mukesh2019
Explorer

Thanks a lot 🙂

0 Karma
Get Updates on the Splunk Community!

.conf24 | Day 0

Hello Splunk Community! My name is Chris, and I'm based in Canberra, Australia's capital, and I travelled for ...

Enhance Security Visibility with Splunk Enterprise Security 7.1 through Threat ...

(view in My Videos)Struggling with alert fatigue, lack of context, and prioritization around security ...

Troubleshooting the OpenTelemetry Collector

  In this tech talk, you’ll learn how to troubleshoot the OpenTelemetry collector - from checking the ...