How to extract message fields from logs with rex?

splunksogetiht
Explorer

Hi all,

I want to extract data from a log that looks like this:

2014-21-08 07:10:57,603.812  - DEBUG- (pid: 12727 tid: 13196) blablabla_message_1
2014-21-08 07:10:27,118.983  - DEBUG- blablabla_message_2

I don't want the "pid" and "tid" values, but I would like blablabla_message_1 and blablabla_message_2 to end up in the same field.

If I do this:

| rex ".+- (?<GCM_type_name>[A-Z]+)\s*- \(pid\: [0-9]+ tid\: [0-9]+\)(?<GCM_message>.+)"

I get blablabla_message_1 but not blablabla_message_2.

And if I do just this:

| rex ".+- (?<type_name>[A-Z]+)\s*- (?<message>.+)"

I get blablabla_message_2, but also "(pid: 12727 tid: 13196) blablabla_message_1" 😞

Is this possible?

1 Solution

martin_mueller
SplunkTrust

You could do something like this:

... | rex "\s-\s+(?<log_level>[A-Z]+)\s*-\s+(?:\([^)]*\))?\s*(?<message>.+)"

That'll cut out the first parenthesized group at the beginning if there is one, and proceed as normal if there isn't.
This assumes that there are no parentheses within that parenthesized group, and that the message itself cannot start with a parenthesized group that's part of the message.

If one of those assumptions doesn't hold, then I'm sure the regex could be made more complicated to allow for your reality.
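
As a quick check outside Splunk, the same pattern can be tried against both sample lines in Python. This is only a sketch: Python's `re` module needs `(?P<name>...)` for named groups rather than the `(?<name>...)` form that rex accepts, and the `extract` helper is a made-up name for illustration.

```python
import re

# Same pattern as the rex above, rewritten with Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"\s-\s+(?P<log_level>[A-Z]+)\s*-\s+(?:\([^)]*\))?\s*(?P<message>.+)"
)

def extract(line):
    """Hypothetical helper: return (log_level, message), or None if no match."""
    m = PATTERN.search(line)  # unanchored search, like rex
    return (m.group("log_level"), m.group("message")) if m else None

with_ids = "2014-21-08 07:10:57,603.812  - DEBUG- (pid: 12727 tid: 13196) blablabla_message_1"
without_ids = "2014-21-08 07:10:27,118.983  - DEBUG- blablabla_message_2"

print(extract(with_ids))     # ('DEBUG', 'blablabla_message_1')
print(extract(without_ids))  # ('DEBUG', 'blablabla_message_2')
```

Both sample lines give the same two fields, with the "(pid: ... tid: ...)" part dropped when it is present.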

martin_mueller
SplunkTrust

Sure.

\( looks for a literal opening parenthesis
[^)]* looks for zero or more characters that aren't a closing parenthesis (the ^ right after the [ negates the character class)
\) looks for a literal closing parenthesis
(?:...) wraps all of the above in a noncapturing group
? at the end makes that noncapturing group optional, appearing zero or one time
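
To see those pieces in isolation, here is a small Python sketch that applies only the optional group to the start of a string. The helper name `leading_parens` and the third input are invented for illustration; the last two lines show the "no parentheses inside the group" assumption and that a noncapturing group creates no capture.

```python
import re

# Only the optional prefix from the rex: a noncapturing, optional "(...)" group.
OPTIONAL_PARENS = re.compile(r"(?:\([^)]*\))?")

def leading_parens(text):
    """Hypothetical helper: return what the optional group consumed at the start."""
    # The pattern can match the empty string, so match() always succeeds.
    return OPTIONAL_PARENS.match(text).group(0)

print(repr(leading_parens("(pid: 12727 tid: 13196) blablabla_message_1")))
# '(pid: 12727 tid: 13196)'
print(repr(leading_parens("blablabla_message_2")))  # '' (group matched zero times)
print(repr(leading_parens("(a (b) c) message")))    # '(a (b)' (stops at the first ")")
print(OPTIONAL_PARENS.match("(x)").groups())        # () (nothing is captured)
```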

martin_mueller
SplunkTrust

For interactive explanations you can use http://regexr.com/ - just make sure not to use named capturing groups, those aren't supported there.

splunksogetiht
Explorer

Ohhhh, ok! Nice 🙂

Thank you very much, it's working and I understand regex better now 🙂

splunksogetiht
Explorer

Ok, thank you, I will try it.

But can you explain this part to me:

(?:\([^)]*\))?

What I don't understand is probably the "?" or the "^" in this sequence.
