Splunk Search

How to get the required text from rex?

saitejagayala
New Member

Hello,
I want to extract only the required text from my logs using rex.

For instance, suppose the logs contain data wrapped in tags, e.g.

<ID> 100034566 </ID> <data> This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

The regular expression which I am using is

  index = * | rex field=Msg "<data>(?<error>.*)" | table error

The output which I am getting is

error
This consists of DB data </data> <date> the date is 04-03-2019 </date>..........etc

What I need is only the text inside the <data> tag, i.e.

REQUIRED OUTPUT

 error
This consists of DB data

However, the text that follows it is also being displayed, which I don't need.

Can anyone help me with this?


harsmarvania57
SplunkTrust

Hi,

Please try the regex below. It extracts the text into a new field called ext_data.

<yourBaseSearch>
| rex field=_raw "\<data\>\s?(?<ext_data>[^\<]*)"

EDIT: Updated the regex because I found a space after <data>.


FrankVl
Ultra Champion

This should probably work:

| rex field=Msg "\<data\>(?<error>[^<]+)"

https://regex101.com/r/tpYcTu/1
If your data indeed contains whitespace around the tags, you can strip it using | eval error=trim(error) after the rex command (this can also be done with a more complex regex).


saitejagayala
New Member

Hi @harsmarvania57
Can you elaborate and explain the rex which you wrote?


harsmarvania57
SplunkTrust

Yes, I'll try my best to explain each part of the regex:

  1. \<data\> literally matches <data> in your raw data
  2. \s? matches zero or one whitespace character after <data>
  3. (?<ext_data>[^\<]*) matches every character up to (but not including) the next < and stores it in a new field called ext_data
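
The three parts above can also be checked outside Splunk. Here is a minimal Python sketch, assuming the sample event from the question is the whole raw text. Note that Python's re module writes named groups as (?P<name>...), whereas Splunk's rex accepts (?<name>...). The variable names are only illustrative.

```python
import re

# Sample event from the question (assumed to be the complete raw text).
raw = ("<ID> 100034566 </ID> <data> This consists of DB data </data> "
       "<date> the date is 04-03-2019 </date>")

# <data>   literal opening tag
# \s?      zero or one whitespace character after the tag
# [^<]*    every character up to, but not including, the next "<"
pattern = re.compile(r"<data>\s?(?P<ext_data>[^<]*)")

match = pattern.search(raw)
ext_data = match.group("ext_data") if match else None

# repr() makes the trailing space before </data> visible.
print(repr(ext_data))
```

The captured value still ends with a space, because [^<]* matches whitespace too. In Splunk, trim() in an eval after the rex would remove it.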

FrankVl
Ultra Champion

Please post your current regex also as code (like you did with the sample data). Otherwise some special characters disappear.


FrankVl
Ultra Champion

Thanks for editing your question. The reason you're getting everything after the data tag is that you use .*, which greedily matches everything up to the end of the line. Have a look at the other answers in this thread for stricter regular expressions that stop at the < character.
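
To see the difference between the two patterns side by side, here is a small Python sketch. It assumes the Msg field holds the sample line from the question, and it uses Python's (?P<name>...) spelling for the named group instead of the (?<name>...) form used in rex.

```python
import re

# Assumed content of the Msg field, taken from the sample in the question.
msg = ("<ID> 100034566 </ID> <data> This consists of DB data </data> "
       "<date> the date is 04-03-2019 </date>")

# Greedy .* keeps going to the end of the line, as in the original search.
greedy = re.search(r"<data>(?P<error>.*)", msg).group("error")

# A negated character class stops at the next "<".
strict = re.search(r"<data>(?P<error>[^<]+)", msg).group("error")

print(repr(greedy))          # includes </data> and the whole <date> element
print(repr(strict.strip()))  # strip() plays the role of trim() in eval
```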
