Splunk Search

How to get the required text from rex?

saitejagayala
New Member

Hello,
I want to extract only the required text from Logs using rex.

For instance, suppose the logs contain data wrapped in tags, e.g.:

<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

The search I am using is:

  index = * | rex field=Msg "<data>(?<error>.*)" | table error

The output I am getting is:

error
This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

What I need is only the data inside the <data> tag, i.e.:

REQUIRED OUTPUT

 error
This consists of DB data

However, the text that follows it is also displayed, which I don't need.

Can anyone help me with this?

0 Karma
1 Solution

harsmarvania57
SplunkTrust

Hi,

Please try the regex below; it extracts the value into a new field called ext_data.

<yourBaseSearch>
| rex field=_raw "\<data\>\s?(?<ext_data>[^\<]*)"

EDIT: Updated the regex because there is a space after <data>.

0 Karma

FrankVl
Ultra Champion

This should probably work:

| rex field=Msg "\<data\>(?<error>[^<]+)"

https://regex101.com/r/tpYcTu/1
If your data does contain whitespace around the tag contents, you can strip it with | eval error=trim(error) after the rex command (this can also be done with a more complex regex).
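The two options in this reply can be compared with a minimal Python sketch. This is an illustration only, not Splunk itself: the sample string is modelled on the event in the question, Python's strip() stands in for trim(), and the second pattern is just one possible "more complex regex", chosen here as an assumption. Note that Python writes named groups as (?P<name>...), whereas rex uses (?<name>...).

```python
import re

# Sample text modelled on the event in the question.
msg = "<data> This consists of DB data </data>"

# Option 1: capture everything up to the next "<", then trim afterwards
# (the Python counterpart of "| eval error=trim(error)").
captured = re.search(r"<data>(?P<error>[^<]+)", msg).group("error")
trimmed = captured.strip()

# Option 2 (assumed variant): let the regex skip the surrounding whitespace.
# \s* eats leading spaces, the lazy [^<]*? stops before trailing spaces,
# and \s*< anchors the match at the next tag.
tight = re.search(r"<data>\s*(?P<error>[^<]*?)\s*<", msg).group("error")

print(repr(captured))  # ' This consists of DB data '
print(repr(trimmed))   # 'This consists of DB data'
print(repr(tight))     # 'This consists of DB data'
```

Both options yield the same clean value here; trimming after the extraction keeps the regex easier to read.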

0 Karma

saitejagayala
New Member

Hi @harsmarvania57
Could you explain the rex you wrote?

0 Karma

harsmarvania57
SplunkTrust

Yes, I'll do my best to explain the regex piece by piece:

  1. \<data\> literally matches <data> in your raw data
  2. \s? matches zero or one whitespace character after <data>
  3. (?<ext_data>[^\<]*) captures every character up to (but not including) the next < and stores it in a new field called ext_data
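The three pieces can be tried out one at a time in a small Python sketch. This is only an illustration under a few assumptions: the sample string is modelled on the event in the question, and Python spells named groups (?P<name>...) where rex uses (?<name>...). The backslashes before < and > are dropped because they are not needed in Python.

```python
import re

# Sample text modelled on the event in the question.
raw = "<ID> 100034566 </ID> <data> This consists of DB data </data>"

# Piece 1: the literal opening tag.
tag = re.search(r"<data>", raw).group(0)

# Pieces 1 + 3: without \s? the leading space ends up in the capture.
without_ws = re.search(r"<data>(?P<ext_data>[^<]*)", raw).group("ext_data")

# Pieces 1 + 2 + 3: \s? consumes one optional whitespace character first.
with_ws = re.search(r"<data>\s?(?P<ext_data>[^<]*)", raw).group("ext_data")

print(repr(tag))         # '<data>'
print(repr(without_ws))  # ' This consists of DB data '
print(repr(with_ws))     # 'This consists of DB data '
```

The capture still ends with a space, because [^<]* runs right up to the < of the closing tag.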
0 Karma

FrankVl
Ultra Champion

Please also post your current regex as code (as you did with the sample data); otherwise some special characters disappear.

0 Karma

FrankVl
Ultra Champion

Thanks for editing your question. The reason you're getting everything after the <data> tag is that you use .*, which is greedy and matches any characters for as long as it can. Have a look at the other answers in this thread for stricter regular expressions that stop at the < character.
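The difference is easy to reproduce outside Splunk. The following Python sketch is an illustration only: the sample string is modelled on the event in the question, and Python writes named groups as (?P<name>...) where rex uses (?<name>...). The lazy variant at the end is not from this thread; it is shown as another common way to stop at the closing tag.

```python
import re

# Sample text modelled on the event in the question.
msg = "<data> This consists of DB data </data> <date> the date is 04-03-2019 </date>"

# Greedy: .* keeps going to the end of the string.
greedy = re.search(r"<data>(?P<error>.*)", msg).group("error")

# Negated character class: [^<]+ stops at the first "<".
bounded = re.search(r"<data>(?P<error>[^<]+)", msg).group("error")

# Lazy quantifier (alternative, not from the thread): .*? stops at the first </data>.
lazy = re.search(r"<data>(?P<error>.*?)</data>", msg).group("error")

print(repr(greedy))   # ' This consists of DB data </data> <date> the date is 04-03-2019 </date>'
print(repr(bounded))  # ' This consists of DB data '
print(repr(lazy))     # ' This consists of DB data '
```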

0 Karma