Splunk Search

Regex Field extraction question

Bliide
Path Finder

Hello,

I am trying to extract a field and I have an error in my REGEX. The line looks like this:

6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: | UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS

I am trying to pull the 5th section out of the line. The extracted data in this log example would be 747. I have this REGEX in my field extraction:

(?i)^[^\|][^\|][^\|][^\|] (?P{FIELDNAME}\s\d+)

What have I done wrong? Is the data in the third piped section not available for an "any"? Do I need to break down that section?

1 Solution

Lowell
Super Champion

There are a few minor issues with the regex as written:

  1. It doesn't account for how many times a character repeats: each [^\|] matches exactly one character, because there is no quantifier.
  2. It doesn't look for both a pipe and a non-pipe character (it only looks for non-pipes)
  3. The value of fieldname will have a leading space based on the placement of \s

I like richgalloway's approach, but I'm not sure the [\S\s] is quite what you want. I believe that would be interpreted as a character class that includes all non-whitespace and all whitespace characters, which means it matches everything, pipes included (much like a plain ".", except that it also matches newlines). Also, this should be anchored to the beginning of the line (^).

Here's my suggestion (modified from richgalloway's answer):

^(?:[^|]+\|){4}\s*(?<fieldname>\d+)

Basically this means: from the start of the line, look for one or more characters that are not a pipe, followed by a single pipe. Repeat that 4 times, which puts us at the start of field 5. Then skip over any whitespace characters and capture the following digits into a field named "fieldname".
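For anyone who wants to check the pattern outside Splunk, here is a minimal sketch in Python run against the sample line from the question. Two assumptions to be aware of: Python's re module spells named groups (?P<name>...) rather than (?<name>...), and the original pattern is reproduced on the guess that the poster's {FIELDNAME} stood for <FIELDNAME>. The variable names are only illustrative.

```python
import re

# Sample event copied from the question.
line = (
    "6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, "
    "status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: "
    "| UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS"
)

# Suggested pattern: four "non-pipes then a pipe" groups, optional
# whitespace, then the digits. Python needs (?P<name>...) for named groups.
suggested = re.compile(r"^(?:[^|]+\|){4}\s*(?P<fieldname>\d+)")

# The pattern from the question (assumed to use <FIELDNAME>). Each [^\|]
# consumes exactly one character, so it never gets past the date.
original = re.compile(r"(?i)^[^\|][^\|][^\|][^\|] (?P<FIELDNAME>\s\d+)")

match = suggested.search(line)
fieldname = match.group("fieldname") if match else None
original_match = original.search(line)

print(fieldname)        # 747
print(original_match)   # None
```

The anchored pattern stops at the fifth pipe-delimited section, so it does not depend on where other numbers happen to appear in the line.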

Of course, a delimiter-based field extraction is another option, using props.conf and transforms.conf.



Lowell
Super Champion

Don't forget to mark your question resolved by selecting the check mark next to one of the answers.


Bliide
Path Finder

This worked great. I did not get a chance to try out the first two suggestions. When I looked at the answers these three were already posted so I of course took the one that referenced others. My REGEX is weak and I thank you all very much for the answers.


Richfez
SplunkTrust

There's an eval function that does this.

eval temp=split(_raw,"|") | eval FieldX=mvindex(temp,4)

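The first eval splits your _raw into a multivalue field on the pipe symbol. The second then pulls out the value at index 4, calling it FieldX. Because mvindex is zero-based, index 4 is the fifth of those values. The value keeps the spaces around it, so wrap it in trim() if you need just the number.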

Obviously, rename as desired.
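
As a rough illustration of the same split-and-index idea outside Splunk, here is a small Python sketch. It is only an analogy to split() and mvindex(), not Splunk code, and the names are made up for the example. It shows why index 4 lands on the fifth section and why the result still has spaces around it.

```python
# Sample event copied from the question.
raw = (
    "6/26/2014 13:00:10.866 | 18636 | Cmd:ALARMS device:REED31_13HLOALP_RD, "
    "status:2:FAILED: Unresolved message: ALARMS, user:OKCLOI.MSS, parms: "
    "| UisCmdRequestManagerImpl.cpp | 747 | nLOG_EXCEPTIONS"
)

# Analogous to split(_raw, "|"): one list entry per pipe-delimited section.
temp = raw.split("|")

# Analogous to mvindex(temp, 4): indexing starts at 0, so 4 is the fifth entry.
field_x = temp[4]

# The delimiter is a bare pipe, so the surrounding spaces are still there.
print(repr(field_x))     # ' 747 '
print(field_x.strip())   # 747
```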

richgalloway
SplunkTrust

This worked for me.

(?:[\S\s]*|){4}\s(?<fieldname>\d+)