Splunk Search

rex does not match to the end of a string, but the same regex works in other applications

_jgpm_
Communicator

I'm on 6.4.3. I'm trying to template a text parser in Splunk that will basically delimit sentences in many different use cases. If there is a better way of doing this, please let me know.

As far as I can tell, this is a rex issue specific to Splunk. I use regex101.com to test my regexes before using them in Splunk, and that works almost 100% of the time. Here is one of the edge cases that I can't figure out.

This is the _raw:
Wi-Fi delivery to cars remains a major target application for European car makers, in spite of the regulatory challenges in Europe.
Xxxxx Xxxxxx was demonstrating its solution already available in 400,000 Xxxx vehicles shipped in Europe. Among other applications:

This is the rex expression:
| rex field=_raw "(\x{F0B7} )?(?P<encode>[A-Z].+?[.])[\x22”]?[ ](?P<decode>[A-Z].+)" |

This is encode:
Wi-Fi delivery to cars remains a major target application for European car makers, in spite of the regulatory challenges in Europe.

This is decode:
Xxxxx Xxxxxx was demonstrating its solution already available in 400,000 Xxxx vehicles shipped in Europe.

I can't get the last part, "Among other applications:", to appear in decode. I've tried adding $ and replacing the .+ with explicit characters. Almost all attempts result in encode capturing the whole _raw and decode being null.

I don't want to just drop the last bit of text, I want to capture 'em all. Can someone help me out with the regex before I pull out my hair?

Thanks.


koshyk
Super Champion

Please try the regex below:

rex field=_raw "(\x{F0B7} )?(?P<encode>[A-Z].+?[.])[\x22”]?[ ](?P<decode>[\w\W]+)"

Example with the complete value:

 | makeresults | eval key="Wi-Fi delivery to cars remains a major target application for European car makers, in spite of the regulatory challenges in Europe. Xxxxx Xxxxxx was demonstrating its solution already available in 400,000 Xxxx vehicles shipped in Europe. Among other applications:" |  rex field=key "(\x{F0B7} )?(?P<encode>[A-Z].+?[.])[\x22”]?[ ](?P<decode>[\w\W]+)" | table encode, decode
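Why the [\w\W]+ change helps can be sketched outside Splunk. The snippet below is only an approximation: it uses Python's re module rather than Splunk's PCRE engine, writes the bullet character as \uf0b7 because Python has no \x{F0B7} syntax, shortens the sample text, and assumes the real _raw has a line break before "Among other applications:" (the thread does not confirm this).

```python
import re

# Assumption: the real _raw contains a line break before the trailing fragment.
raw = (
    "Wi-Fi delivery to cars remains a major target application. "
    "Xxxxx Xxxxxx was demonstrating its solution in Europe.\n"
    "Among other applications:"
)

# Original pattern: by default `.` does not match a newline,
# so decode stops at the line break.
original = re.compile(
    r"(\uf0b7 )?(?P<encode>[A-Z].+?[.])[\x22”]?[ ](?P<decode>[A-Z].+)"
)

# Suggested pattern: [\w\W] matches any character, newline included,
# so decode runs to the end of the text.
fixed = re.compile(
    r"(\uf0b7 )?(?P<encode>[A-Z].+?[.])[\x22”]?[ ](?P<decode>[\w\W]+)"
)

m_original = original.search(raw)
m_fixed = fixed.search(raw)

print(repr(m_original.group("decode")))
print(repr(m_fixed.group("decode")))
```

Under that assumption, the first decode ends at "Europe." and the second also includes the trailing "Among other applications:" line, which matches the behavior described in the question.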


woodcock
Esteemed Legend

I suspect the problem is embedded newlines or unexpected extra whitespace, so try this:

| rex "(?ms)(\x{F0B7}\s+)?(?<encode>[A-Z][^\.]*\.)[\x22”]?\s+(?<decode>[A-Z].+)"

_jgpm_
Communicator

This worked as well. I reduced it to this:

(?ms)(\x{F0B7})?(?P<encode2>[A-Z][^\.]*\.)[\x22”]? (?P<decode2>[A-Z].+)

That also worked. Bonus points for showing me how to use inline regex flags within the expression.
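For anyone else new to inline flags, here is a rough sketch of what (?ms) does, again in Python's re module as a stand-in for Splunk's PCRE engine. The sample text is shortened and the embedded line break is an assumption, and the optional bullet group is left out for brevity.

```python
import re

# Assumption: a line break separates the last fragment from the rest.
raw = (
    "Wi-Fi delivery to cars remains a major target application. "
    "Xxxxx Xxxxxx was demonstrating its solution in Europe.\n"
    "Among other applications:"
)

body = r"(?P<encode2>[A-Z][^.]*\.)[\x22”]? (?P<decode2>[A-Z].+)"

# No flags: `.` stops at the newline.
no_flags = re.compile(body)

# Inline flags at the start of the pattern: m = multiline, s = dot matches newline.
inline = re.compile(r"(?ms)" + body)

# The same flags passed as arguments instead of written inline.
explicit = re.compile(body, re.MULTILINE | re.DOTALL)

print(repr(no_flags.search(raw).group("decode2")))
print(repr(inline.search(raw).group("decode2")))
print(inline.search(raw).groupdict() == explicit.search(raw).groupdict())
```

In this sketch it is the s flag that lets .+ run past the line break; the m flag only changes how ^ and $ behave, and this pattern uses neither. The [^.]* class already matches newlines with or without flags.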

_jgpm_
Communicator

koshyk's regex worked with the fewest changes.
