Splunk Search

Simple regex for capturing text between strings with different end anchors

Cuyose
Builder

I've been battling this, and I'm not sure if it's a bug in Splunk or what. This is for a field extraction.

I simply need to grab all text between the following character strings and assign a field name.

Here is an example event snippet:

Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your Insurance as your coverage has already started, please refer to our Terms and conditions for cancellation policies. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf

I need to capture everything between Exception= and the first of the following: a newline (\n), ". - GUID", or a colon (:).

woodcock
Esteemed Legend

Like this:

... | rex "(?ms)Exception=(?<MyCapture>.[^\r\n:]+?)(?:[\r\n]|:|\.?\s+-\s+\w{8}-\w{4}-\w{4}-\w{4}-\w{12}|$)"
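
For anyone deconstructing the pattern, here is a minimal annotated sketch in Python rather than SPL. It assumes Python's `re` module treats these constructs the same way Splunk's PCRE does; the one syntax change is that Python spells the named group `(?P<MyCapture>...)`. The `x` (verbose) flag, the comments, and the variable names are additions for readability and are not part of the posted answer.

```python
import re

# Same pattern as the rex above, spread out with the x flag so each piece
# can carry a comment. Python needs (?P<name>...) instead of (?<name>...).
PATTERN = re.compile(
    r"""(?msx)
    Exception=
    (?P<MyCapture>
        .             # first character of the message (any character)
        [^\r\n:]+?    # then as few non-newline, non-colon characters as possible
    )
    (?:               # stop at the first of these terminators:
        [\r\n]        #   a line break
      | :             #   a colon
      | \.?\s+-\s+\w{8}-\w{4}-\w{4}-\w{4}-\w{12}   # optional period, dash, GUID-shaped token
      | $             #   end of line or end of event
    )
    """
)

# Sample event copied from the question.
event = (
    "Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your "
    "Insurance as your coverage has already started, please refer to our Terms "
    "and conditions for cancellation policies. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf"
)

captured = PATTERN.search(event).group("MyCapture")
print(captured)
```

The lazy `+?` is what makes the capture stop at the first terminator it reaches instead of the last one.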

Cuyose
Builder

This is awesome, thanks! I can use this to deconstruct the syntax for other variables. I was working from a lot of documentation on regex, and I swear I was doing things as documented and having crap luck. I really need to sit down and take an in-depth refresher on regex.

Cuyose
Builder

This seems close, but the capture still contains the GUIDs.

woodcock
Esteemed Legend

Show me non-conforming data and I can adjust.

Cuyose
Builder

Exception=BAD_EXTERNAL_DATA - VOYAGER - Los datos indicados por el sistema externo no son los esperados - aa39147e-2cdb-47d8-a167-7175eff6496a

woodcock
Esteemed Legend

You said "OR . - GUID", and this example does not have a period. I made the period optional and updated my original answer. It should work for both cases now.
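
To illustrate why the period matters, the sketch below compares a variant that requires the period with the posted one that makes it optional. The required-period variant is a reconstruction, since the pre-edit answer is not shown in the thread, and `build` is a hypothetical helper. The code is Python, so the named group is written `(?P<MyCapture>...)` instead of `(?<MyCapture>...)`.

```python
import re

GUID = r"\w{8}-\w{4}-\w{4}-\w{4}-\w{12}"

def build(period):
    """Hypothetical helper: build the pattern with a given period sub-pattern."""
    return re.compile(
        r"(?ms)Exception=(?P<MyCapture>.[^\r\n:]+?)(?:[\r\n]|:|"
        + period + r"\s+-\s+" + GUID + r"|$)"
    )

required = build(r"\.")    # assumed earlier form: the period must be present
optional = build(r"\.?")   # posted form: the period may be absent

# Non-conforming event from the thread (no period before the GUID).
event = (
    "Exception=BAD_EXTERNAL_DATA - VOYAGER - Los datos indicados por el sistema "
    "externo no son los esperados - aa39147e-2cdb-47d8-a167-7175eff6496a"
)

with_required = required.search(event).group("MyCapture")
with_optional = optional.search(event).group("MyCapture")

print(with_required)  # GUID terminator never matches, so the capture runs to $
print(with_optional)  # stops before " - <GUID>"
```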

somesoni2
Revered Legend

Try something like this

your base search| rex field=_raw "Exception=(?<Message>.+)(\n|:|\.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})"

Run-anywhere sample with all three cases:

| gentimes start=-1 | eval _raw="Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your Insurance as your coverage has already started, please refer to our Terms and conditions for cancellation policies. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf" | table _raw | append [| gentimes start=-1 | eval _raw="Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your Insurance as your coverage has already started, please refer to our Terms and conditions for cancellation policies
dfd. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf" | table _raw ]| append [| gentimes start=-1 | eval _raw="Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your Insurance as your coverage has already started, please refer to our Terms and conditions for cancellation policies: additional text for test"  | table _raw]| rex field=_raw "Exception=(?<Message>.+)(\n|:|\.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})"
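
The same three cases can also be checked outside Splunk. This is a rough Python equivalent, assuming `re` matches the way PCRE does for this pattern; the named group is rewritten as `(?P<Message>...)` and the variable names are made up for the example.

```python
import re

# somesoni2's pattern with Python named-group syntax.
PATTERN = re.compile(
    r"Exception=(?P<Message>.+)(\n|:|\.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})"
)

base = (
    "Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your "
    "Insurance as your coverage has already started, please refer to our Terms "
    "and conditions for cancellation policies"
)

cases = {
    "guid": base + ". - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf",
    "newline": base + "\ndfd. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf",
    "colon": base + ": additional text for test",
}

# Extract Message for each case; all three should end at "policies".
results = {
    name: PATTERN.search(text).group("Message") for name, text in cases.items()
}
expected = base[len("Exception="):]

for name, message in results.items():
    print(name, message == expected)
```

In each of these samples the terminator is the last thing the greedy `.+` can backtrack to, which is why the sample passes.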

Cuyose
Builder

How would this look in a field extraction transform? It does not seem to work when declared as:
(?i)Exception=(?<Message>.+)(\n|:|.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})

somesoni2
Revered Legend

I'm not sure you need transforms.conf for this. You can put it in props.conf as an EXTRACT:

[yoursourcetype]
EXTRACT-message = Exception=(?<Message>.+)(\n|:|\.\s+-\s\w{8}-\w{4}-\w{4}-\w{4}-\w{12})

Or, from Splunk Web, go to Settings > Fields > Field extractions.

Cuyose
Builder

This unfortunately does not stop at any of the end anchors; instead it assigns all text through the end of the event to "Message".
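
One plausible explanation, not confirmed in the thread, is the greedy `.+`: it consumes as much as it can before the terminators are tried, so when the real event continues past the GUID the capture ends at the last terminator on the line rather than the first. The sketch below contrasts greedy and lazy quantifiers in Python (named group written as `(?P<Message>...)`). The second line of the event is a hypothetical addition used only to reproduce the symptom.

```python
import re

GUID = r"\w{8}-\w{4}-\w{4}-\w{4}-\w{12}"
TERMINATORS = r"(\n|:|\.\s+-\s" + GUID + r")"

greedy = re.compile(r"Exception=(?P<Message>.+)" + TERMINATORS)
lazy = re.compile(r"Exception=(?P<Message>.+?)" + TERMINATORS)

# Sample event from the thread plus a hypothetical second line (an assumption).
event = (
    "Exception=12567 - INSURANCE_BOOKING - Sorry we are unable to cancel your "
    "Insurance as your coverage has already started, please refer to our Terms "
    "and conditions for cancellation policies. - aa5f6710-baa5-49c1-8efa-96c3b13a4cbf\n"
    "at com.example.Booking.cancel"
)

greedy_message = greedy.search(event).group("Message")
lazy_message = lazy.search(event).group("Message")

print(greedy_message)  # runs to the end of the first line, GUID included
print(lazy_message)    # stops at the first terminator, before ". - <GUID>"
```

If this is the cause, switching `.+` to `.+?` in the EXTRACT would be the smallest change to try; woodcock's answer already uses a lazy quantifier for the same reason.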

somesoni2
Revered Legend

Could you try this?

EXTRACT-message = Exception=(?<Message>.+)(:|(\.\s+-\s+\w{8}-\w{4}-\w{4}-\w{4}-\w{12})|[\r\n])