Solved: Regex Field extraction question

Bliide · ‎07-02-2014

Hello,

I am trying to extract a field and I have an error in my REGEX. The line looks like this:

6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: | UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS

I am trying to pull the 5th section out of the line. The extracted data in this log example would be 747. I have this REGEX in my field extraction:

(?i)^[^\|][^\|][^\|][^\|] (?P<FIELDNAME>\s\d+)

What have I done wrong? Is the data in the third piped section not available for an "any"? Do I need to break down that section?

Lowell · ‎07-02-2014

There are a couple of minor issues with the regex as written:

Each [^\|] matches exactly one character; there is no quantifier (+ or *) to say how many times it repeats.
It never matches the pipe delimiters themselves (it only looks for non-pipe characters).
The value of fieldname will have a leading space, because \s sits inside the capture group.
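As a rough check of the points above, here is a small Python sketch that runs the original pattern against the sample event. It assumes the named group is written (?P<FIELDNAME>...), and it uses Python's re module as a stand-in for Splunk's PCRE engine, so treat it as an illustration rather than exactly what Splunk does internally.

```python
import re

# Sample event from the question.
LINE = ("6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, "
        "status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: | "
        "UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS")

# The original pattern: each [^\|] consumes exactly one non-pipe character.
ORIGINAL = re.compile(r"(?i)^[^\|][^\|][^\|][^\|] (?P<FIELDNAME>\s\d+)")

# Four single-character classes only get as far as "6/26" ...
prefix = re.match(r"[^\|]{4}", LINE).group(0)

# ... and the next character is "/", not the literal space the pattern expects,
# so the whole pattern fails to match.
original_match = ORIGINAL.search(LINE)

print(repr(prefix))      # '6/26'
print(original_match)    # None
```

So the pattern never gets past the date, let alone to the fifth section.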

I like richgalloway's approach, but I'm not sure the [\S\s] is quite what you want. I believe that would be interpreted as a character class that includes all non-whitespace and all whitespace characters (which is pretty much everything, and could be written as simply "." as long as newlines don't matter, since "." skips newlines by default). Also, this should be anchored to the beginning of the line (^).
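To check the comparison between [\S\s] and ".", here is a minimal Python sketch. Python's re module is used as a stand-in for PCRE here, which is an assumption on my part; both treat "." as not matching a newline unless a dot-all flag is set.

```python
import re

TEXT = "a\nb"  # two characters separated by a newline

# "." skips newlines unless DOTALL is set, so ".*" cannot span the whole string.
dot_spans_newline = re.fullmatch(r".*", TEXT) is not None

# [\S\s] is "non-whitespace or whitespace", which includes the newline.
class_spans_newline = re.fullmatch(r"[\S\s]*", TEXT) is not None

# With DOTALL, "." behaves like [\S\s].
dotall_spans_newline = re.fullmatch(r".*", TEXT, re.DOTALL) is not None

print(dot_spans_newline, class_spans_newline, dotall_spans_newline)  # False True True
```

For single-line events like the one in the question, the two behave the same; the difference only shows up on multi-line events.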

Here's my suggestion (modified from richgalloway's answer):

^(?:[^|]+\|){4}\s*(?<fieldname>\d+)

Basically this means: from the start of the line, look for one or more characters that are not a pipe, followed by a single pipe. Repeat that 4 times, which puts us at the start of field 5. Then skip over any whitespace characters and capture the following digits into a field named "fieldname".
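If you want to try the pattern outside of Splunk, the following Python sketch applies it to the sample event from the question. One assumption to note: Python's re module needs the (?P<fieldname>...) spelling for named groups, whereas Splunk (PCRE) accepts (?<fieldname>...) as written above; the rest of the pattern is unchanged.

```python
import re

# Sample event from the question.
LINE = ("6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, "
        "status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: | "
        "UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS")

# Skip four "non-pipes followed by a pipe" groups, skip whitespace,
# then capture the digits that start the fifth field.
FIELD5 = re.compile(r"^(?:[^|]+\|){4}\s*(?P<fieldname>\d+)")

match = FIELD5.search(LINE)
field_value = match.group("fieldname") if match else None

print(field_value)  # 747
```

Note that [^|]+ requires at least one character per field, so an event with an empty field (two pipes in a row) before the fifth section would not match.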

Of course, delimiter-based field extractions using props.conf and transforms.conf are another option.

Lowell · ‎07-03-2014

Don't forget to mark your question resolved by selecting the check mark next to one of the answers.

Bliide · ‎07-02-2014

This worked great. I did not get a chance to try out the first two suggestions. When I looked at the answers, all three were already posted, so I of course took the one that referenced the others. My regex is weak, and I thank you all very much for the answers.

Richfez · ‎07-02-2014

There's an eval function that does this.

eval temp=split(_raw,"|") | eval FieldX=mvindex(temp,4)

The first eval splits your _raw into a multivalue field on the pipe symbol; the second then pulls out the value at index 4 and calls it FieldX. mvindex is zero-based, so index 4 is the 5th of those fields.

Obviously, rename as desired.
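For anyone who wants to see the indexing outside of Splunk, here is a rough Python equivalent of the split-and-index approach. It is only an analogy (Python's str.split and list indexing standing in for split() and mvindex()), but the zero-based indexing works the same way.

```python
# Sample event from the question.
LINE = ("6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, "
        "status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: | "
        "UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS")

parts = LINE.split("|")      # analogous to split(_raw, "|")
raw_value = parts[4]         # analogous to mvindex(temp, 4): zero-based, so the 5th field
clean_value = raw_value.strip()

print(repr(raw_value))    # ' 747 '
print(repr(clean_value))  # '747'
```

Because the split is on the bare pipe, the value keeps the spaces around it. In SPL, wrapping the result in the eval trim() function should remove them, though I have not tested that against this data.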

richgalloway · ‎07-02-2014

This worked for me.

(?:[\S\s]*|){4}\s(?<fieldname>\d+)

