Solved: How to get the required text from rex?

saitejagayala · 04-03-2019

Hello,
I want to extract only the required text from logs using rex.

For instance, suppose the logs contain data wrapped in tags, e.g.

<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

The search I am using is:

  index = * | rex field=Msg "<data>(?<error>.*)" | table error

The output I am getting is:

error
This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

What I need is only the text inside the <data> tag, i.e.

REQUIRED OUTPUT

 error
This consists of DB data

But the text that follows it is also displayed, which I don't need.

Can anyone help me with this?

harsmarvania57 · 04-03-2019

Hi,

Please try the regex below. It extracts the value into a new field called ext_data.

<yourBaseSearch>
| rex field=_raw "\<data\>\s?(?<ext_data>[^\<]*)"

EDIT: Updated the regex because I found a space after <data>.

FrankVl · 04-03-2019

This should probably work:

| rex field=Msg "\<data\>(?<error>[^<]+)"

https://regex101.com/r/tpYcTu/1
If your data does contain whitespace around the tags, you can strip it off with | eval error=trim(error) after the rex command (this can also be done with a more complex regex).
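
For anyone who wants to try both options on a throwaway event, here is a minimal sketch. It assumes the field is called Msg as in the question, and it uses makeresults only to fabricate one test event from the sample line (with the date tag closed as </date>). The second rex, with \s* on both sides and a closing </data>, is just one possible form of the "more complex regex"; the field name error_strict is made up for the comparison.

```spl
| makeresults
| eval Msg="<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>"
| rex field=Msg "<data>(?<error>[^<]+)"
| eval error=trim(error)
| rex field=Msg "<data>\s*(?<error_strict>[^<]*?)\s*</data>"
| table error error_strict
```

Both columns should contain the same text without surrounding spaces. The single-regex version skips the extra eval, but it only matches when the closing </data> tag is actually present in the event.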

saitejagayala · 04-03-2019

Hi @harsmarvania57
Could you explain the rex you wrote?

harsmarvania57 · 04-03-2019

Yes, I'll do my best to explain the regex:

\<data\> matches the literal text <data> in your raw data.
\s? matches zero or one whitespace character after <data>.
(?<ext_data>[^\<]*) captures every character up to the next < and stores it in a new field called ext_data.
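
To see the three parts working together, the search below is a small sketch that runs the same rex against a single fabricated event. The makeresults command and the eval that fills _raw are only there to create test data from the sample line in the question; in a real search they would be replaced by your base search.

```spl
| makeresults
| eval _raw="<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>"
| rex field=_raw "\<data\>\s?(?<ext_data>[^\<]*)"
| table ext_data
```

Because \s? consumes the space after <data> and [^\<]* stops only at the next <, the captured value starts at "This" but still ends with the space that precedes </data>.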

FrankVl · 04-03-2019

Please post your current regex also as code (like you did with the sample data). Otherwise some special characters disappear.

FrankVl · 04-03-2019

Thanks for editing your question. The reason you're getting everything after the data tag is that you use .*, which is greedy and matches as many characters as it can. Have a look at the answers for stricter regular expressions that stop at the < character.
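
The difference is easy to see side by side. A rough sketch, again on a fabricated event: makeresults creates the test data, Msg is the field from the question, and the field names greedy and bounded are made up for illustration.

```spl
| makeresults
| eval Msg="<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>"
| rex field=Msg "<data>(?<greedy>.*)"
| rex field=Msg "<data>(?<bounded>[^<]*)"
| table greedy bounded
```

Here greedy should run through to the end of the event, closing tags included, while bounded should stop just before </data>.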
