I have two slightly different forms of a tab-delimited log. Both are in the same index and have the same source type. One has a leading number, and the other does not. How can I extract a single field that reads column 10 if there is a leading number and column 9 if not?
Log with a leading number
1650556427.891 98.53.183.43 0.001 200 1560 GET https ... DEN50-C1 PVnGZrUUkw0RcRcqs4 ...
Log without a leading number
98.53.183.43 0.001 200 1560 GET https ... LAX50-C4 ht6GZrUdg5tRcRcq34 ...
I can't just look for field 10 because it will only work in one type of log and return the wrong information in the other.
I wrote a RegEx that picks the field position based on whether there is a leading number or not. The problem is that it does not work because the two subpattern names are the same.
Splunk Error: Regex: two named subpatterns have the same name (PCRE2_DUPNAMES not set).
(?(?=^\d+\.\d+\s)^(?:[^\t\n]*\t){10}(?P<fieldName>[^\t]+)|^(?:[^\t\n]*\t){9}(?P<fieldName>[^\t]+))
If I change the second field name, it saves, but only the first name shows up as fieldName, and entries without a leading number get no fieldName value.
Is there a RegEx that can do this, or some other way without changing the log? I think if I were able to split the two log types into different source types I could do it easily, but I don't think I can do that. The logs come from AWS cloud servers. For the same reason, I can't remove the leading number.
Thanks for your help.
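Rather than anchoring to the beginning of the line, anchor to the IP address. It appears in both forms of the log, and the field you want is always the same number of columns after it: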
(\d+\.){3}\d+\t(?:[^\t\n]*\t){8}(?P<fieldName>[^\t]+)
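If you want to sanity-check the pattern outside Splunk, here is a minimal Python sketch. It assumes Python's `re` module treats this particular pattern the same way Splunk's PCRE2 engine does (it uses no PCRE-only features). The sample rows are made up: the original logs are elided with "...", so the `col7`, `col8`, `col9`, `TARGET_VALUE` and `tail` columns are placeholders, and `extract_field` is a hypothetical helper name.

```python
import re

# Pattern from the answer above: anchor on the IPv4 address, skip the next
# 8 tab-delimited columns, then capture the following column.
PATTERN = re.compile(r"(\d+\.){3}\d+\t(?:[^\t\n]*\t){8}(?P<fieldName>[^\t]+)")


def extract_field(line):
    """Return the captured column, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.group("fieldName") if match else None


# Hypothetical columns shared by both log forms (placeholders, not real data).
common = [
    "98.53.183.43", "0.001", "200", "1560", "GET", "https",
    "col7", "col8", "col9", "TARGET_VALUE", "tail",
]

without_leading_number = "\t".join(common)
with_leading_number = "\t".join(["1650556427.891"] + common)

for row in (with_leading_number, without_leading_number):
    print(extract_field(row))
```

The leading timestamp has only one dot, so it cannot satisfy `(\d+\.){3}\d+`; the match starts at the IP address in both forms, which is why a single capture group is enough.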
That is a great idea. Thanks.
@MScottFoley - you have two options:
Both will work whether you are doing this with an SPL query or under your props/transforms.conf configuration.
I hope this helps!!!