Splunk Search

How to get the required text from rex?

saitejagayala
New Member

Hello,
I want to extract only the required text from Logs using rex.

for instance,
consider in logs there is some data in tags i.e

<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </data>..........etc

The regular expression which I am using is

  index = * | rex field=Msg "<data>(?<error>.*)" | table error

The output which I am getting is

error
This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

What I need is only the data which is present in tag . i.e

REQUIRED OUTPUT

 error
This consists of DB data

But, The data which is suffix to that is also getting displayed, which I don't need.

Can anyone help me out in this?

0 Karma
1 Solution

harsmarvania57
Ultra Champion

Hi,

Please try below regex, that regex will extract output in new field called ext_data

<yourBaseSearch>
| rex field=_raw "\<data\>\s?(?<ext_data>[^\<]*)"

EDIT: Updated regex because I found space after <data>

View solution in original post

0 Karma

FrankVl
Ultra Champion

This should probably work:

| rex field=Msg "\<data\>(?<error>[^<]+)"

https://regex101.com/r/tpYcTu/1
If your data indeed contains whitespace around the tags, you can strip that off using | eval data=trim(data) after the rex command (can also be done by using a more complex regex).

0 Karma

harsmarvania57
Ultra Champion

Hi,

Please try below regex, that regex will extract output in new field called ext_data

<yourBaseSearch>
| rex field=_raw "\<data\>\s?(?<ext_data>[^\<]*)"

EDIT: Updated regex because I found space after <data>

0 Karma

saitejagayala
New Member

Hi @harsmarvania57
Can you elaborate and explain the rex which you wrote?

0 Karma

harsmarvania57
Ultra Champion

Yes, I'll try my best to explain, from regex

  1. \<data\> is literally matching <data> from your raw data
  2. \s? will find white space after <data> for zero or one time
  3. (?<ext_data>[^\<]*) will find all character before < and store that extracted data in new field called ext_data
0 Karma

FrankVl
Ultra Champion

Please post your current regex also as code (like you did with the sample data). Otherwise some special characters disappear.

0 Karma

FrankVl
Ultra Champion

Thanks for editing your question, the reason you're getting everything after the data tag, is because you use .*, which matches anything. Have a look at the answers below for more strict regular expressions that stop at the < character.

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...