Getting Data In

BREAK_ONLY_BEFORE failed, setting TIME_FORMAT solved the problem

yahooku
Explorer

Hi, I've been trying to split separate events that Splunk wrongly merges into one:

10:42:08  Checkpoint Completed:  duration was 0 seconds.
10:42:08  Checkpoint loguniq 4227, logpos 0x4ca7018, timestamp: 0x7f8d03be
10:42:08  Maximum server connections 1414 

An obvious thing to do is to use the BREAK_ONLY_BEFORE attribute, or is it? Here's what I tried in /local/props.conf:

[host::some_host_name]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = ^/d/d:/d/d:/d/d

Surprisingly, this didn't work. Needless to say, I tried countless variations of BREAK_ONLY_BEFORE and tried other attributes. Finally I tried the TIME_FORMAT attribute:

[host::some_host_name]
SHOULD_LINEMERGE = True
TIME_FORMAT = %H:%M:%S

...and it worked like a charm. Can someone explain why this worked while the former didn't? And what should the proper BREAK_ONLY_BEFORE attribute look like for this to work? I didn't find anything satisfying on the forums.

1 Solution

kristian_kolb
Ultra Champion

Well,

First of all, I would not recommend using [host::your_host] configuration stanzas in your props.conf file, since the rules would then apply to all events coming from this host, regardless of the format of the event/timestamp. It's much more logical to use the [your_sourcetype] style of configuration, since rules are then applied based on the type of data coming in, rather than on where it originated.

Secondly, why use SHOULD_LINEMERGE=true if the events are single-line? This may be one of the reasons for your problems: Splunk tries to find a full timestamp (including date), and has to merge several lines to find some characters it thinks fit.

Thirdly, though this may seem a bit redundant, your regex for BREAK_ONLY_BEFORE has forward slashes (/d) rather than backslashes (\d).
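To illustrate the third point, here is a minimal sketch in Python. It uses Python's re module as a stand-in for Splunk's regex engine, which is an assumption made purely for illustration; the sample lines are the ones from the question. It compares the pattern as posted with the intended one, ^\d\d:\d\d:\d\d.

```python
import re

# Sample lines taken from the question.
sample_lines = [
    "10:42:08  Checkpoint Completed:  duration was 0 seconds.",
    "10:42:08  Checkpoint loguniq 4227, logpos 0x4ca7018, timestamp: 0x7f8d03be",
    "10:42:08  Maximum server connections 1414",
]

# Pattern as posted: "/d" means a literal slash followed by the letter "d".
posted = re.compile(r"^/d/d:/d/d:/d/d")
# Intended pattern: "\d" is the digit character class.
intended = re.compile(r"^\d\d:\d\d:\d\d")

posted_hits = [bool(posted.search(line)) for line in sample_lines]
intended_hits = [bool(intended.search(line)) for line in sample_lines]

print(posted_hits)    # [False, False, False]
print(intended_hits)  # [True, True, True]
```

The forward-slash version never matches a line that starts with a time, so a break rule built on it would have nothing to trigger on.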

My suggestion is that you use the following instead:

[your_sourcetype]
SHOULD_LINEMERGE=false
MAX_TIMESTAMP_LOOKAHEAD=8
TIME_FORMAT=%H:%M:%S

UPDATE:

If you also have multiline messages, you could (and should) still avoid line merging:

[your_sourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\d\d:\d\d:\d\d\s
MAX_TIMESTAMP_LOOKAHEAD = 8
TIME_FORMAT = %H:%M:%S
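As a rough illustration of what that LINE_BREAKER does, here is a small Python sketch. It is a toy model, not Splunk's implementation: it assumes that the text matched by the first capturing group is discarded, that everything before it ends one event and everything after it starts the next. The function name break_events and the continuation line in the sample are hypothetical, added only to show a multiline event.

```python
import re

# Same regex as in the stanza above: newline(s) followed by HH:MM:SS and whitespace.
LINE_BREAKER = re.compile(r"([\r\n]+)\d\d:\d\d:\d\d\s")

def break_events(raw):
    """Split raw text into events, dropping only the first capture group."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])  # text before the newline(s)
        start = match.end(1)                      # next event starts at the timestamp
    events.append(raw[start:])
    return events

raw = (
    "10:42:08  Checkpoint Completed:  duration was 0 seconds.\n"
    "10:42:08  Checkpoint loguniq 4227, logpos 0x4ca7018, timestamp: 0x7f8d03be\n"
    "  continuation line without a timestamp (hypothetical)\n"
    "10:42:08  Maximum server connections 1414"
)

events = break_events(raw)
for event in events:
    print(repr(event))
```

In this sketch the continuation line stays attached to the second event, because the newline before it is not followed by a time, while every line that starts with HH:MM:SS begins a new event.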

Hope this helps,

Kristian

yahooku
Explorer

Ok, thanks for clearing this up.


kristian_kolb
Ultra Champion

Updated answer above. And while Ayn has a point, it is still a fact that LINE_BREAKER is more efficient than the combination of SHOULD_LINEMERGE and BREAK_ONLY... directives.

/k

Ayn
Legend

As Splunk by default breaks events when it encounters a valid timestamp (as defined by the BREAK_ONLY_BEFORE_DATE configuration parameter), improper line breaking is very often a symptom of improper timestamp parsing. So, configuring timestamp parsing correctly is a much better option than messing with other breaking directives - you get valid timestamps AND valid event breaking.
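The idea of timestamp-driven breaking can be sketched in Python as follows. This is a simplified model under stated assumptions, not how Splunk is actually implemented: it treats a line as the start of a new event only if its first 8 characters parse with the format %H:%M:%S, and merges every other line into the previous event. The names starts_with_timestamp and merge_lines, and the continuation line in the sample, are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S"  # same format string as the TIME_FORMAT setting in the thread
LOOKAHEAD = 8             # only inspect the first 8 characters of each line

def starts_with_timestamp(line):
    """Return True if the first LOOKAHEAD characters parse as a time."""
    try:
        datetime.strptime(line[:LOOKAHEAD], TIME_FORMAT)
        return True
    except ValueError:
        return False

def merge_lines(lines):
    """Start a new event at each timestamped line; append other lines to the last event."""
    events = []
    for line in lines:
        if starts_with_timestamp(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

lines = [
    "10:42:08  Checkpoint Completed:  duration was 0 seconds.",
    "10:42:08  Checkpoint loguniq 4227, logpos 0x4ca7018, timestamp: 0x7f8d03be",
    "  continuation line without a timestamp (hypothetical)",
    "10:42:08  Maximum server connections 1414",
]

merged = merge_lines(lines)
print(len(merged))  # 3
```

If the timestamp format were wrong and no line parsed, this model would merge everything into one event, which mirrors the symptom described in the question.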

yahooku
Explorer

Thanks for the quick answer. You are right about replacing the host with a sourcetype identifier. I have only one source from this host, but it is still not a good thing to do.

About SHOULD_LINEMERGE: not all events are single-line, only those that were being merged together.

And about the regex: sorry for the mistake, I just copied the result of some desperate attempt to make this work. I'm sure I also tried the right regex; I checked it in a text editor when BREAK_ONLY_BEFORE didn't seem to work.
