Getting Data In

How to handle LINE_BREAKER regex for multiple capture groups? Specifically now that we are getting both ip4 and ip6 logs?

briancronrath
Contributor

In the past we had an easy LINE_BREAKER regex that broke on newlines where an ip4 was present ([\r\n]+)\d+.\d+.\d+.\d+

Now we have some logs with ip6 in addition to ip4 being logged, so I was hoping I can just do this via piping it out to alternate capture groups depending on which ip it matches:

([\r\n]+)(\d+.\d+.\d+.\d+|(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]).){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]).){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])))

Is there something present where splunk only expects one capture group to be here for the LINE_BREAKER regex? I'm wondering how we can handle linebreakers now that we have 2 different style of IP that can come in.

0 Karma
1 Solution

gjanders
SplunkTrust
SplunkTrust

Reading the LINE_BREAKER documentation I'm wondering if it's something to do with the parentheses around the regex match after the ([\r\n]+)

As per the props.conf documentation it says:

Example 1:  LINE_BREAKER = end(\n)begin|end2(\n)begin2|begin3

  * A line ending with 'end' followed a line beginning with 'begin' would
    match the first branch, and the first capturing group would have a match
    according to rule 1.  That particular newline would become a break
    between lines.

So I'm assuming you probably don't want to have the various (), also you could probably simplify it to match part of the IP address, unless you often have lines that look similar, normally I would match the first few parts of the IP address or similar...

Example:

([\r\n]+)\d+\.\d+\.\d+\.\d+|([\r\n]+)[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}

Perhaps?
If this works I'll convert it to an answer...

View solution in original post

gjanders
SplunkTrust
SplunkTrust

Reading the LINE_BREAKER documentation I'm wondering if it's something to do with the parentheses around the regex match after the ([\r\n]+)

As per the props.conf documentation it says:

Example 1:  LINE_BREAKER = end(\n)begin|end2(\n)begin2|begin3

  * A line ending with 'end' followed a line beginning with 'begin' would
    match the first branch, and the first capturing group would have a match
    according to rule 1.  That particular newline would become a break
    between lines.

So I'm assuming you probably don't want to have the various (), also you could probably simplify it to match part of the IP address, unless you often have lines that look similar, normally I would match the first few parts of the IP address or similar...

Example:

([\r\n]+)\d+\.\d+\.\d+\.\d+|([\r\n]+)[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}:[0-9a-fA-F]{1,4}

Perhaps?
If this works I'll convert it to an answer...

briancronrath
Contributor

Thanks gareth, feel free to convert to answer and I will mark it as solved!

gjanders
SplunkTrust
SplunkTrust
0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Event Series: Telemetry Pipeline Management

Balancing Scale and Spend: Gaining Control Over High-Volume Metrics in Splunk Observability Cloud As ...

Kick the Tires Before You Commit: A Hands-On Tour of the Splunk Observability Cloud ...

Evaluating an enterprise observability platform usually goes like this: fill out a form, get a free trial with ...

Deep insights, no barriers: Splunk Observability Cloud Free Edition

As software delivery cycles continue to accelerate, observability shouldn’t be a luxury — it should be a ...