Splunk Search

How to extract a field value using Regex with Field Extractor

ezmo1982
Path Finder

Hi

I am trying to use Regex with the Field Extractor to extract the value of a particular field in a given piece of text, but am having a problem with the regex.

The text is in the format " text | message: value | more text ". So basically I need to extract the value of the field 'message' and put it into a field named raw_message. The value of the message field can be any string.

Each field/value pair in the text is separated by a pipe character, as can be seen below. I just want to extract the value of the 'message' field. All other text can be ignored. The ":" character that follows the field name can also be ignored.

Sample text below:

| source: 10.2.2.134 | message: P-235332 | host: clmm0011.syn.local

So the regex needs to extract "P-235332" into a new field named raw_message.

Can somebody help me with a Regex that would work with this?

Thanks.

0 Karma

ezmo1982
Path Finder

Apologies, I should have mentioned that the value can contain space characters, so the regex you supplied only matches the text before the first space.

Another sample text is below. In this example, the regex would need to capture "P-235332 55 clm", i.e. everything before the next pipe character.

| source: 10.2.2.134 | message: P-235332 55 clm | host: clmm0011.syn.local

Can you provide updated SPL for the above?

0 Karma

moliminous
Path Finder

Yes, for that you could use . to match any character and + to require one or more of them. The ? after the + makes the match lazy, so it stops as early as possible instead of grabbing everything to the end of the line. Then, outside of the named capture group, we give it the characters that appear right after the field value we want, which in this case are a space (\s) and a pipe character (|).

The pipe character has to be escaped because an unescaped | means alternation ("or") in regex, so we put a backslash \ before it.

The end result is this regex, which should work for you:

message:\s*(?<raw_message>.+?)\s\|
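To sanity-check the lazy pattern outside Splunk, here is a minimal sketch in Python. It assumes Python's re module, which spells named groups (?P<name>...) rather than the (?<name>...) form used in the thread, so the pattern is adapted accordingly. The helper name extract_message and the TOLERANT variant (which also accepts a message that is the last field on the line) are illustrative additions, not something from the posts above.

```python
import re

# Same pattern as in the post, with Python's (?P<name>...) spelling for the named group.
LAZY = re.compile(r"message:\s*(?P<raw_message>.+?)\s\|")

# Hypothetical variant: capture anything that is not a pipe, and accept either
# a closing pipe or the end of the text after the value.
TOLERANT = re.compile(r"message:\s*(?P<raw_message>[^|]+?)\s*(?:\||$)")

samples = [
    "| source: 10.2.2.134 | message: P-235332 | host: clmm0011.syn.local",
    "| source: 10.2.2.134 | message: P-235332 55 clm | host: clmm0011.syn.local",
]


def extract_message(text, pattern=LAZY):
    """Return the captured message value, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.group("raw_message") if match else None


for sample in samples:
    print(extract_message(sample))
# P-235332
# P-235332 55 clm

# The lazy pattern needs a " |" after the value; the tolerant one does not.
print(extract_message("| host: h1 | message: last field", LAZY))      # None
print(extract_message("| host: h1 | message: last field", TOLERANT))  # last field
```

If the message field is never the last one in your events, the simpler lazy pattern should be enough.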

moliminous
Path Finder

You will need a named capture group.

If there may or may not be a space after the colon, or you're not sure, use this. The \s* allows zero or more whitespace characters:

message:\s*(?<raw_message>\S+)

If there is always a single space after the colon, you could just use this instead:

message: (?<raw_message>\S+)

You can learn more at regex101.com or other sources.

It's essentially the same as what the other person posted, but they were missing the ? for the named capture group, and you don't really need anything before 'message' in this case.
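The difference between the two variants, and the reason \S+ cuts off a value that contains spaces, can be seen in a small Python sketch. It assumes Python's re module, so the named group is written (?P<name>...) instead of (?<name>...). The helper grab and the no_space sample are made up for illustration.

```python
import re

# \s* tolerates zero or more whitespace characters after the colon.
OPTIONAL_SPACE = re.compile(r"message:\s*(?P<raw_message>\S+)")
# A literal space requires one space after the colon.
LITERAL_SPACE = re.compile(r"message: (?P<raw_message>\S+)")


def grab(pattern, text):
    """Return the raw_message capture, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group("raw_message") if match else None


one_word = "| source: 10.2.2.134 | message: P-235332 | host: clmm0011.syn.local"
with_spaces = "| source: 10.2.2.134 | message: P-235332 55 clm | host: clmm0011.syn.local"
no_space = "| message:P-235332 | host: clmm0011.syn.local"  # hypothetical variant

print(grab(OPTIONAL_SPACE, one_word))     # P-235332
print(grab(OPTIONAL_SPACE, with_spaces))  # P-235332 (\S+ stops at the first space)
print(grab(OPTIONAL_SPACE, no_space))     # P-235332
print(grab(LITERAL_SPACE, no_space))      # None (no space after the colon)
```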

0 Karma

yuanliu
SplunkTrust

A usable regex would be "| message: (<raw_message>\S+)".
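For anyone trying this pattern as written, the following Python sketch shows what the missing ? and the unescaped leading pipe do. It uses Python's re module as a stand-in, so details may differ from Splunk's regex engine; the named group in the adjusted version uses Python's (?P<name>...) spelling.

```python
import re

text = "| source: 10.2.2.134 | message: P-235332 | host: clmm0011.syn.local"

# As posted: the leading | is unescaped, and the group has no ? before <raw_message>,
# so "<raw_message>" is treated as literal text inside an ordinary group.
as_posted = re.search(r"| message: (<raw_message>\S+)", text)

# Adjusted: pipe escaped and the group actually named.
fixed = re.search(r"\| message: (?P<raw_message>\S+)", text)

print(repr(as_posted.group(0)))    # '' (the empty left-hand alternative matches at position 0)
print(as_posted.groupdict())       # {} (no named group was defined)
print(fixed.group("raw_message"))  # P-235332
```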

0 Karma