Splunk Enterprise

How do we extract field header to a value that changes and assigns to new value. Working in regex101 but not in Splunk

sandeepreddy947
Path Finder

I need help in regex for key and value to be extracted from raw data, below regex working with xml_kv_extraction. While its working in regex101 but not in splunk with rex, any suggesstions.

<(?<field_header>[^>]+)>(?<field_value>[^<]+)<\/\1>https://regex101.com/r/IBsMhK/1 

eg: events with

<field_title><field_header1>field_value1</field_header1><field_header2>field_value2</field_header2></field_title>

Should appear fields as below.

field title = <field_header1>field_value1</field_header1><field_header2>field_value2</field_header2>

field_header1=field_value1

field_header2=field_value2

 

1997-10-10 15:35:13.046, CREATE_DATE="1997-10-10 13:36:22.742479", LAST_UPDATE_DATE="1997-10-10 13:36:22.74", ACTION="externalFactor", STATUS="info", DATA_STRING="<?xml version="1.0" encoding="UTF-8"?>
<externalFactor><current>parker</current><keywordp><encrypted>true</encrypted><keywordp>******</keywordp></keywordp><boriskhan>boriskhan1-CMX_PRTY</boriskhan></externalFactor>"

1997-10-10 15:35:13.046, CREATE_DATE="1997-10-10 13:03:58.388887", LAST_UPDATE_DATE="1997-10-10 13:03:58.388", ACTION="externalFactor.RESPONSE", STATUS="info", DATA_STRING="<?xml version="1.0" encoding="UTF-8"?>
<externalFactorReturn><roleName>ROLE.CustomerManager</roleName><roleName>ROLE.DataSteward</roleName><pepres>false</pepres><externalFactor>false</externalFactor><parkeristrator>true</parkeristrator><current>parker</current></externalFactorReturn>"

1997-10-10 15:35:13.046, CREATE_DATE="1997-10-10 13:03:58.384984", LAST_UPDATE_DATE="1997-10-10 13:03:58.384", ACTION="externalFactor.RESPONSE", STATUS="info", DATA_STRING="<?xml version="1.0" encoding="UTF-8"?>
<externalFactorReturn><roleName>ROLE.CustomerManager</roleName><roleName>ROLE.DataSteward</roleName><pepres>false</pepres><externalFactor>false</externalFactor><parkeristrator>true</parkeristrator><current>parker</current></externalFactorReturn>"

1997-10-10 15:35:13.046, CREATE_DATE="1997-10-10 13:03:58.384947", LAST_UPDATE_DATE="1997-10-10 13:03:58.384", ACTION="externalFactor.RESPONSE", STATUS="info", DATA_STRING="<?xml version="1.0" encoding="UTF-8"?>
<externalFactorReturn><roleName>ROLE.CustomerManager</roleName><roleName>ROLE.DataSteward</roleName><pepres>false</pepres><externalFactor>false</externalFactor><parkeristrator>true</parkeristrator><current>parker</current></externalFactorReturn>"

1997-10-10 15:35:13.046, CREATE_DATE="1997-10-10 13:03:58.378965", LAST_UPDATE_DATE="1997-10-10 13:03:58.378", ACTION="externalFactor", STATUS="info", DATA_STRING="<?xml version="1.0" encoding="UTF-8"?>
<externalFactor><current>parker</current><keywordp><encrypted>true</encrypted><keywordp>******</keywordp></keywordp><boriskhan>boriskhan1-CMX_PRTY</boriskhan></externalFactor>"

1997-10-10 15:35:13.046, CREATE_DATE="1997-10-10 13:03:58.374242", LAST_UPDATE_DATE="1997-10-10 13:03:58.373", ACTION="externalFactor", STATUS="info", DATA_STRING="<?xml version="1.0" encoding="UTF-8"?>
<externalFactor><current>parker</current><keywordp><encrypted>true</encrypted><keywordp>******</keywordp></keywordp><boriskhan>boriskhan1-CMX_PRTY</boriskhan></externalFactor>"

1997-10-10 15:35:13.046, CREATE_DATE="1997-10-10 13:03:58.374235", LAST_UPDATE_DATE="1997-10-10 13:03:58.373", ACTION="externalFactor", STATUS="info", DATA_STRING="<?xml version="1.0" encoding="UTF-8"?>
<externalFactor><current>parker</current><keywordp><encrypted>true</encrypted><keywordp>******</keywordp></keywordp><boriskhan>boriskhan1-CMX_PRTY</boriskhan></externalFactor>"

1997-10-10 15:35:13.046, CREATE_DATE="1997-10-10 13:03:58.325953", LAST_UPDATE_DATE="1997-10-10 13:03:58.325", ACTION="externalFactor.RESPONSE", STATUS="info", DATA_STRING="<?xml version="1.0" encoding="UTF-8"?>
<externalFactorReturn><roleName>ROLE.CustomerManager</roleName><roleName>ROLE.DataSteward</roleName><pepres>false</pepres><externalFactor>false</externalFactor><parkeristrator>true</parkeristrator><current>parker</current></externalFactorReturn>"

 

 @priit @PriA @yonmost @jameshgibson @bnikhil0584 

Tags (5)
0 Karma
1 Solution

sandeepreddy947
Path Finder

I've tried identifying all individual fields in events and extracted using rex.

 

| rex "\s\<externalFactor\>(?<externalFactor>.*)\<\/externalFactor\>" 
| rex  "\s\<externalFactorReturn\>(?<externalFactorReturn>.*)\<\/externalFactorReturn\>"
| rex  "\<current\>(?<current>.*)\<\/current\>" 
| rex  "\<encrypted\>(?<encrypted>.*)\<\/encrypted\>" 
| rex  "\<keywordp\>(?<keywordp>.*)\<\/keywordp\>" 
| rex  "\<pepres\>(?<pepres>.*)\<\/pepres\>" 
| rex  "\<roleName\>(?<roleName>.*)\<\/roleName\>" 
| rex  "\<boriskhan\>(?<boriskhan>.*)\<\/boriskhan\>" 
| rex  "\<sload\>(?<sload>.*)\<\/sload\>" 
| rex  "\<externalFactor\>(?<externalFactor>.*)\<\/externalFactor\>" 
| rex  "\<parkeristrator\>(?<parkeristrator>.*)\<\/parkeristrator\>" 

 

View solution in original post

sandeepreddy947
Path Finder

I've tried identifying all individual fields in events and extracted using rex.

 

| rex "\s\<externalFactor\>(?<externalFactor>.*)\<\/externalFactor\>" 
| rex  "\s\<externalFactorReturn\>(?<externalFactorReturn>.*)\<\/externalFactorReturn\>"
| rex  "\<current\>(?<current>.*)\<\/current\>" 
| rex  "\<encrypted\>(?<encrypted>.*)\<\/encrypted\>" 
| rex  "\<keywordp\>(?<keywordp>.*)\<\/keywordp\>" 
| rex  "\<pepres\>(?<pepres>.*)\<\/pepres\>" 
| rex  "\<roleName\>(?<roleName>.*)\<\/roleName\>" 
| rex  "\<boriskhan\>(?<boriskhan>.*)\<\/boriskhan\>" 
| rex  "\<sload\>(?<sload>.*)\<\/sload\>" 
| rex  "\<externalFactor\>(?<externalFactor>.*)\<\/externalFactor\>" 
| rex  "\<parkeristrator\>(?<parkeristrator>.*)\<\/parkeristrator\>" 

 

PickleRick
SplunkTrust
SplunkTrust

You need to extract special capture groups from each match called _KEY_1 and _VAL_1

 

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Where in Splunk are you using this regex?  If it's in transforms.conf then please share the full stanza.

---
If this reply helps you, Karma would be appreciated.
0 Karma

sandeepreddy947
Path Finder

I have regex in description and also url from regex101.com. its not working in Splunk when i used rex with SPL query.

index=universe sourcetype=planet
| rex field=_raw "<(?<key>[^>]+)> (?<value>[^<]+)<\/\1>"

Results i got, 

 

key

current
encrypted
keywordp
boriskhan
rolename
.
.


value

parker
true
******
role.customermanager
false
.
.
.

0 Karma

sandeepreddy947
Path Finder

Result i need from raw data is,

current=parker
encrypted=true
keywordp=******
boriskhan=boriskhan1-cmx_prty
rolename=role.customermanager
.
.
.
.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

See if this run-anywhere example query helps.

| makeresults 
| eval data="1997-10-10 15:35:13.046, CREATE_DATE=\"1997-10-10 13:36:22.742479\", LAST_UPDATE_DATE=\"1997-10-10 13:36:22.74\", ACTION=\"externalFactor\", STATUS=\"info\", DATA_STRING=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?><externalFactor><current>parker</current><keywordp><encrypted>true</encrypted><keywordp>******</keywordp></keywordp><boriskhan>boriskhan1-CMX_PRTY</boriskhan></externalFactor>\"
    1997-10-10 15:35:13.046, CREATE_DATE=\"1997-10-10 13:03:58.388887\", LAST_UPDATE_DATE=\"1997-10-10 13:03:58.388\", ACTION=\"externalFactor.RESPONSE\", STATUS=\"info\", DATA_STRING=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?><externalFactorReturn><roleName>ROLE.CustomerManager</roleName><roleName>ROLE.DataSteward</roleName><pepres>false</pepres><externalFactor>false</externalFactor><parkeristrator>true</parkeristrator><current>parker</current></externalFactorReturn>" 
| eval data=split(data,"
") 
| mvexpand data 
| eval _raw=data 
| fields - data 
``` Everything above sets up demo data.  Delete IRL ```
``` Extract keys and values ```
| rex max_match=0 "<(?<key>[^>]+)>(?<value>[^<]+)<\/\1>" 
``` Match keys and values so they stayed paired during mvexpand ```
| eval pairs=mvzip(key,value) 
| mvexpand pairs 
``` Separate key from value ```
| eval pairs=split(pairs,",") 
``` Define key=value result ```
| eval pairs=mvindex(pairs,0) . "=" . mvindex(pairs,1)
| table pairs
---
If this reply helps you, Karma would be appreciated.

sandeepreddy947
Path Finder

Its great, i have all key as in field_header and all valuesas in field_value assigned to a field called paris. So, here all key and value extracted in one field called paris. But, i'm trying to extract field headers and field values from raw data. As to search for SPL query "index=index_name sourcetype=sourcetype |table current, encrypted, keywordp,boriskhan, rolename, prepres, externalFactor, parkeristrator".

 

so, what i got is field_header as paris and field_values as current=parker,  encrypted=true,  .  .  .  . 

paris
current=parker
encrypted=true
keywordp=******
boriskhan=1-CMX_PRTY
roleName=ROLE.CustomerManager
pepres=false
externalFactor=false
parkeristrator=true
current=parker

 

What goal is, field_header_1 is current, field_header_2 is encrypted, field_header_3 is keyword as below and to get field_headers showup in Splunk interesting fields with its values.

currentencryptedkeywordpboriskhanroleNamepepresexternalFactor
parkertrue******1-CMX_PRTYROLE.CustomerManagerfalsefalse

 

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Try this query.  The transpose command "rotates" the results table into the desired format.

| makeresults 
| eval data="1997-10-10 15:35:13.046, CREATE_DATE=\"1997-10-10 13:36:22.742479\", LAST_UPDATE_DATE=\"1997-10-10 13:36:22.74\", ACTION=\"externalFactor\", STATUS=\"info\", DATA_STRING=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?><externalFactor><current>parker</current><keywordp><encrypted>true</encrypted><keywordp>******</keywordp></keywordp><boriskhan>boriskhan1-CMX_PRTY</boriskhan></externalFactor>\"
    1997-10-10 15:35:13.046, CREATE_DATE=\"1997-10-10 13:03:58.388887\", LAST_UPDATE_DATE=\"1997-10-10 13:03:58.388\", ACTION=\"externalFactor.RESPONSE\", STATUS=\"info\", DATA_STRING=\"<?xml version=\"1.0\" encoding=\"UTF-8\"?><externalFactorReturn><roleName>ROLE.CustomerManager</roleName><roleName>ROLE.DataSteward</roleName><pepres>false</pepres><externalFactor>false</externalFactor><parkeristrator>true</parkeristrator><current>parker</current></externalFactorReturn>" 
| eval data=split(data,"
") 
| mvexpand data 
| eval _raw=data 
| fields - data 
``` Everything above sets up demo data.  Delete IRL ```
``` Extract keys and values ```
| rex max_match=0 "<(?<key>[^>]+)>(?<value>[^<]+)<\/\1>" 
``` Match keys and values so they stayed paired during mvexpand ```
| eval pairs=mvzip(key,value) 
| mvexpand pairs 
``` Separate key from value ```
| eval pairs=split(pairs,",") 
``` Define key=value result ```
| eval key=mvindex(pairs,0), value=mvindex(pairs,1)
| fields key,value
| fields - _*
| transpose 0 header_field=key
---
If this reply helps you, Karma would be appreciated.

sandeepreddy947
Path Finder

But, i'm still not able to view extracted field_header properly in interesting fields. i can only view "key" and "value"  in interesting fields. I'm searching in verbose mode.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Only the key and value fields are visible because that is what the fields command does.  If you prefer different names then change "key" to "field_header" throughout the query.  Likewise for "value".

---
If this reply helps you, Karma would be appreciated.
0 Karma

sandeepreddy947
Path Finder

i'm trying to find right regex for Splunk search time extraction first, via Splunk GUI.

0 Karma
Get Updates on the Splunk Community!

Join Us for Splunk University and Get Your Bootcamp Game On!

If you know, you know! Splunk University is the vibe this summer so register today for bootcamps galore ...

.conf24 | Learning Tracks for Security, Observability, Platform, and Developers!

.conf24 is taking place at The Venetian in Las Vegas from June 11 - 14. Continue reading to learn about the ...

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...