Extracting fields from slightly different tab delimited logs

MScottFoley
Path Finder

I have two slightly different forms of a tab-delimited log. Both are in the same index and have the same source type. One has a leading number and the other does not. How can I extract a single field that reads column 10 if there is a leading number and column 9 if there is not?

Log with a leading number
1650556427.891  98.53.183.43  0.001  200  1560  GET  https ... DEN50-C1 PVnGZrUUkw0RcRcqs4 ...

Log without a leading number
98.53.183.43  0.001  200  1560  GET  https ...  LAX50-C4 ht6GZrUdg5tRcRcq34 ...

I can't just look for field 10, because that only works for one type of log and returns the wrong information for the other.

I wrote a regex that picks the field position based on whether there is a leading number. It does not work, because the two named subpatterns have the same name.
Splunk Error:  Regex: two named subpatterns have the same name (PCRE2_DUPNAMES not set).

(?(?=^\d+\.\d+\s)^(?:[^\t\n]*\t){10}(?P<fieldName>[^\t]+)|^(?:[^\t\n]*\t){9}(?P<fieldName>[^\t]+))

If I rename the second subpattern, the extraction saves, but only the first name shows up as fieldName, and events without a leading number get no fieldName value.

Is there a regex that can do this, or some other way that does not require changing the log? I think I could do it easily if I were able to split the two log types into different source types, but I don't think I can do that. The logs come from AWS cloud servers. The same goes for removing the leading number.

Thanks for your help.    

 

 

ITWhisperer
SplunkTrust

Rather than anchoring to the beginning of the line, anchor to the IP address. The field you want sits the same number of columns after the IP address in both log formats:

(\d+\.){3}\d+\t(?:[^\t\n]*\t){8}(?P<fieldName>[^\t]+)
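For anyone who wants to check the pattern outside Splunk, here is a minimal Python sketch. It assumes Python's `re` module behaves like Splunk's PCRE2 engine for these constructs, which should hold for this pattern but is not guaranteed in general. The sample lines are hypothetical: the real logs in the question are elided, so the column values (`c6`, `TARGET`, and so on) are placeholders.

```python
import re

# The accepted regex: find the IP address, skip the next 8 tab-delimited
# columns, then capture the column after them.
PATTERN = re.compile(r"(\d+\.){3}\d+\t(?:[^\t\n]*\t){8}(?P<fieldName>[^\t]+)")

# Hypothetical columns standing in for the elided log content.
common = ["98.53.183.43", "0.001", "200", "1560", "GET", "https",
          "c6", "c7", "c8", "TARGET", "c10"]
with_ts = "\t".join(["1650556427.891"] + common)  # leading number present
without_ts = "\t".join(common)                    # no leading number


def extract(line):
    """Return the captured field, or None if the line has no match."""
    match = PATTERN.search(line)
    return match.group("fieldName") if match else None


print(extract(with_ts))
print(extract(without_ts))
```

The leading timestamp cannot be mistaken for the IP address because it contains only one dot, while the pattern requires three dotted groups before the final number.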

MScottFoley
Path Finder

That is a great idea.   Thanks.    

VatsalJagani
SplunkTrust

@MScottFoley - you have two options:

  1. Instead of using the regex OR (the | pipe sign), use two separate regexes.
    1. This works whether you use the rex command, or EXTRACT or REPORT in props.conf.
  2. Instead of using fieldName in both places in one regex, use fieldName1 and fieldName2.
    1. Then apply the following eval:
    2. fieldName = coalesce(fieldName1, fieldName2)

Both options work whether you do this in an SPL query or in your props.conf/transforms.conf configuration.
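The second option can be sketched in Python to show the logic. This is only an illustration under assumptions: the `coalesce` helper is a hypothetical stand-in for Splunk's eval function, the combined pattern is one possible way to write the two branches with distinct names, and the sample lines use placeholder column values because the real logs are elided.

```python
import re

# Two branches with distinct group names, so no duplicate-name error.
# Branch 1 applies only when the line starts with a number like 1650556427.891.
BOTH = re.compile(
    r"^(?=\d+\.\d+\t)(?:[^\t\n]*\t){10}(?P<fieldName1>[^\t]+)"
    r"|^(?:[^\t\n]*\t){9}(?P<fieldName2>[^\t]+)"
)


def coalesce(*values):
    """Return the first value that is not None (stand-in for eval coalesce)."""
    return next((v for v in values if v is not None), None)


def extract(line):
    """Merge the two differently named captures into one result."""
    match = BOTH.search(line)
    if match is None:
        return None
    return coalesce(match.group("fieldName1"), match.group("fieldName2"))


# Hypothetical columns standing in for the elided log content.
common = ["98.53.183.43", "0.001", "200", "1560", "GET", "https",
          "c6", "c7", "c8", "TARGET", "c10"]
with_ts = "\t".join(["1650556427.891"] + common)
without_ts = "\t".join(common)

print(extract(with_ts))
print(extract(without_ts))
```

Only one of the two groups is populated for any given line, which is why merging them with a first-non-null rule yields a single field.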

 

I hope this helps!!!
