Solved: Can you help me fix my regex to event break a multiline file?

tb5821 · ‎02-11-2019

I have a multiline file that I'm trying to get Splunk to understand. Note that I'm not editing the .conf files; I'm relying on the Add Data UI within Splunk to help.

geo {
id: 0
internal_name: "TEST"
type: LIST
zip: 7
description: "TEST"
}
geo {
id: 1
internal_name: "TEST"
type: LIST
zip: 5
description: "TEST"
}
geo {
id: 2
internal_name: "TEST"
type: LIST
zip: 1
description: "TEST"
}
geo {
id: 3
internal_name: "TEST"
type: LIST
zip: 2
description: "TEST"
}

I've got this regex working as PCRE to break things up into events, but when I use it as the line breaker regex in Splunk, it just spits out one massive event.

(^geo \{(?s).*?\})

What am I doing wrong?

chrisyounger · ‎02-11-2019

The capturing group in LINE_BREAKER should match the text to be discarded between events, not the text you want to keep. Try this setting:

LINE_BREAKER = ([\r\n]+)\s*geo\s{

All the best.

ddrillic · ‎02-11-2019

Absolutely @chrisyoungerjds, as props.conf says:

LINE_BREAKER = <regular expression>
* Specifies a regex that determines how the raw text stream is broken into
  initial events, before line merging takes place. (See the SHOULD_LINEMERGE
  setting, below)
* Defaults to ([\r\n]+), meaning data is broken into an event for each line,
  delimited by any number of carriage return or newline characters.
* The regex must contain a capturing group -- a pair of parentheses which
  defines an identified subcomponent of the match.
* Wherever the regex matches, Splunk software considers the start of the first
  capturing group to be the end of the previous event, and considers the end
  of the first capturing group to be the start of the next event.
* The contents of the first capturing group are discarded, and will not be
  present in any event. You are telling Splunk software that this text comes
  between lines.
* NOTE: You get a significant boost to processing speed when you use
  LINE_BREAKER to delimit multi-line events (as opposed to using
  SHOULD_LINEMERGE to reassemble individual lines into multi-line events).
  * When using LINE_BREAKER to delimit events, SHOULD_LINEMERGE should be set
    to false, to ensure no further combination of delimited events occurs.
  * Using LINE_BREAKER to delimit events is discussed in more detail in the
    documentation. Search the documentation for "configure event line breaking" for details.
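
The first-capturing-group rule quoted above can be tried outside Splunk with a rough sketch. This is only a simplified Python model of the spec text, not Splunk's actual implementation; the helper name break_events and the SAMPLE string (rebuilt from the data in the question) are assumptions for illustration, and the brace is escaped as \{ here purely for readability.

```python
import re

def break_events(raw, line_breaker):
    """Simplified model of LINE_BREAKER as the quoted spec describes it.

    Each match ends the current event at the start of group 1 and starts
    the next event at the end of group 1; the text in group 1 is dropped.
    """
    events, start = [], 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    events.append(raw[start:])
    # Drop whitespace-only fragments so the result is easier to read.
    return [e for e in events if e.strip()]

# Sample rebuilt from the question: four geo blocks, one field per line.
SAMPLE = "".join(
    f'geo {{\nid: {i}\ninternal_name: "TEST"\ntype: LIST\nzip: {z}\ndescription: "TEST"\n}}\n'
    for i, z in enumerate([7, 5, 1, 2])
)

# Group 1 holds only the newline(s) between events, so only those are dropped.
events = break_events(SAMPLE, r"([\r\n]+)\s*geo\s\{")
print(len(events))                 # 4
print(events[0].splitlines()[0])   # geo {

# Group 1 holds the whole block, so under this model every block is discarded.
discarded = break_events(SAMPLE, r"(?s)(geo \{.*?\})")
print(discarded)                   # []
```

The second call shows why a group that wraps the event body is the wrong shape for this setting. It does not try to reproduce the exact "one massive event" result from the question, which also depends on details of Splunk's regex handling that this model ignores.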

tb5821 · ‎02-11-2019

Thanks, this worked out, and I think it's better than 'break only before'. One more question: the line that says zip: 0 actually has multiple zip: values all on that one line per event. I wrote another regex which should extract all those values, but it only gets the first! Thoughts?

woodcock · ‎02-11-2019

Click Accept to close this question and ask another one.

chrisyounger · ‎02-11-2019

Hi tb5821. You should accept the answer to this question and create a new question with the relevant details. That way we can help you better 🙂

MuS · ‎02-11-2019

Hi tb5821,

Try these settings in the advanced settings of the Add Data UI:

[ __auto__learned__ ]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
BREAK_ONLY_BEFORE=geo \{
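
As a rough illustration of what the settings above ask for, here is a minimal Python sketch of line merging with a "break only before" pattern. It is an assumption-laden model, not Splunk's code: the function name merge_lines is made up for this example, and other line-merging settings (for instance date-based breaking) are ignored.

```python
import re

def merge_lines(lines, break_only_before):
    """Simplified model of SHOULD_LINEMERGE=true with BREAK_ONLY_BEFORE:
    a new event starts whenever a line matches the pattern."""
    pattern = re.compile(break_only_before)
    events = []
    for line in lines:
        if pattern.search(line) or not events:
            events.append([line])      # matching line opens a new event
        else:
            events[-1].append(line)    # other lines join the current event
    return ["\n".join(e) for e in events]

# Shortened version of the sample from the question.
lines = [
    "geo {", "id: 0", 'internal_name: "TEST"', "}",
    "geo {", "id: 1", 'internal_name: "TEST"', "}",
]

events = merge_lines(lines, r"geo \{")
print(len(events))   # 2
print(events[1])
```

Unlike LINE_BREAKER, nothing is discarded in this model: every input line ends up in exactly one event.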

I reckon the ^ in your regex does not work.

cheers, MuS

tb5821 · ‎02-11-2019

Thanks, that seems to work better, but now I'm getting "exceeded 256 lines" for some of my messages. Is there an advanced setting to increase that limit?

MuS · ‎02-11-2019

Yep, you can use:

MAX_EVENTS = <integer>
* Specifies the maximum number of input lines to add to any event.
* Splunk software breaks after the specified number of lines are read.
* Defaults to 256 (lines).

Only raise it if you are sure that the breaking is correct and a single event really is over 256 lines.
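
One way to check that before touching MAX_EVENTS is to count the lines per event offline. The sketch below is a hedged example in Python, assuming every event starts at a line matching geo \{; the helper names lines_per_event and oversized and the test data are hypothetical.

```python
import re

DEFAULT_MAX_EVENTS = 256  # default quoted from the spec above

def lines_per_event(lines, start_pattern=r"geo \{"):
    """Count the lines in each event, where a new event starts at
    every line matching start_pattern."""
    counts = []
    for line in lines:
        if re.search(start_pattern, line) or not counts:
            counts.append(1)
        else:
            counts[-1] += 1
    return counts

def oversized(counts, limit=DEFAULT_MAX_EVENTS):
    """Return (event index, line count) for events longer than limit."""
    return [(i, n) for i, n in enumerate(counts) if n > limit]

# Hypothetical data: one short event and one with 300 body lines.
lines = ["geo {", "id: 0", "}"]
lines += ["geo {"] + [f"zip: {n}" for n in range(300)] + ["}"]

counts = lines_per_event(lines)
print(counts)              # [3, 302]
print(oversized(counts))   # [(1, 302)]
```

To run it on a real file, the lines list could come from something like open(path).read().splitlines(), where path is whatever sample file you uploaded. If the largest count is far above what one event should contain, the break pattern is the more likely culprit than the limit.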
