Solved: Re: Why are my events splitting by second and not ...

willthames · ‎04-15-2011

My props.conf has:

[server]
MAX_TIMESTAMP_LOOKAHEAD = 0
SHOULD_LINEMERGE = true
#BREAK_ONLY_BEFORE_DATE = true
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3}
TIME_FORMAT = %Y-%m-%d %H:%M:%S[,.]%3N

The events suffering from this problem look like this:

2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:32.990: <java method> WARNING <warning message>

This is presented in Splunk as one event, not five.

While these events are single line, other events in the same log might be multiline.

Why aren't these logs being processed correctly? I'd expect BREAK_ONLY_BEFORE to produce one event per line.

dwaddle · ‎04-19-2011

It looks like your TIME_FORMAT is part of the issue here. Unix strptime (and apparently the splunk-enhanced strptime) do not support regex style syntax. Your TIME_FORMAT will have to be either one or the other of:

TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
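
To illustrate the point about strptime-style formats, here is a minimal sketch using Python's datetime.strptime as a stand-in. It is not Splunk's parser, and Python's %f is used in place of Splunk's %3N (an assumption made only so the example runs). It shows that the brackets in "[,.]" are matched as literal text rather than as a character class, so the original timestamp fails to parse with the regex-style format.

```python
from datetime import datetime

# Python stand-ins for the TIME_FORMAT values discussed above.
# %f replaces Splunk's %3N here; that substitution is an assumption.
REGEX_STYLE = "%Y-%m-%d %H:%M:%S[,.]%f"  # brackets are literal text, not a character class
DOT_STYLE = "%Y-%m-%d %H:%M:%S.%f"
COMMA_STYLE = "%Y-%m-%d %H:%M:%S,%f"


def try_parse(text, fmt):
    """Return the parsed datetime, or None if the text does not match the format."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


stamp = "2011-04-15 13:34:31.966"  # timestamp taken from the sample events

dot_result = try_parse(stamp, DOT_STYLE)      # parses
comma_result = try_parse(stamp, COMMA_STYLE)  # None: the separator is a dot, not a comma
regex_result = try_parse(stamp, REGEX_STYLE)  # None: "[,.]" is expected literally
```

If a log really mixes both separators, one format string cannot cover both in this model, which is why only one of the two variants can be chosen.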

Also, setting MAX_TIMESTAMP_LOOKAHEAD to 0 could have negative performance impacts. See http://www.splunk.com/base/Documentation/4.2.1/Data/Configuretimestamprecognition, specifically this comment:

If set to 0 or -1, the length constraint for timestamp recognition is effectively disabled. This can have negative performance implications which scale with the length of input lines (or with event size when LINE_BREAKER is redefined for event splitting).

I used this config, and was able to parse your data into distinct multi-line events:

[sa22432]
MAX_TIMESTAMP_LOOKAHEAD = 24
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N

[monitor:///tmp/sa22432/]
sourcetype=sa22432


Paolo_Prigione · ‎04-19-2011

MAX_TIMESTAMP_LOOKAHEAD gives the last character position before which the timestamp has to be found, not its starting point.
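
As a rough illustration of that reading, the sketch below models a positive lookahead as a window over the first N characters of a line, so the whole timestamp must end inside the window. The function name timestamp_within and the regex are hypothetical helpers, not Splunk internals, and the sketch does not try to reproduce what Splunk does with 0 or -1.

```python
import re

# Pattern for the timestamp at the start of the sample events (23 characters long).
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3}")


def timestamp_within(line, lookahead):
    """Return the timestamp only if it fits entirely in the first `lookahead` characters.

    Models positive lookahead values only; hypothetical helper for illustration.
    """
    match = TIMESTAMP.search(line[:lookahead])
    return match.group(0) if match else None


line = "2011-04-15 13:34:31.966: <java method> WARNING <warning message>"

found_24 = timestamp_within(line, 24)  # window covers all 23 timestamp characters
found_10 = timestamp_within(line, 10)  # window ends after the date, so nothing is found
```

Under this model, a value of 24 comfortably covers the 23-character timestamp, while any window shorter than 23 characters cuts it off.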

willthames · ‎04-19-2011

Ok, I thought I was being performant with MAX_TIMESTAMP_LOOKAHEAD - i.e. as I'm only interested in a timestamp at the start of the line, I need look no further, so zero should be fine, but I guess I got that wrong!

Changing that value seems to have worked.

ualbanytech · ‎04-15-2011

Did you try this?
SHOULD_LINEMERGE = false

What is producing these logs? If I had to guess I would say Weblogic?

Also, what problems did you encounter which led you to define custom handling of this log data?

If this is weblogic, I've had pretty good luck by just defining the log's sourcetype as "log4j".

ualbanytech · ‎04-19-2011

Here's the list of pretrained source types for Splunk 4.1.6 (maybe also look at weblogic_stdout ?)

http://www.splunk.com/base/Documentation/4.1.6/Admin/Listofpretrainedsourcetypes

ualbanytech · ‎04-19-2011

SHOULD_LINEMERGE = false

I believe this means don't merge lines. However, there is a dependency on the sourcetype.

So, for example, when I have log4j set for Tomcat web app logs, I get each timestamp line as its own event unless the timestamp line is followed by something that is not a timestamp, such as a Java stack trace. In those cases, the timestamp line and all following lines up to but not including the next timestamp are bundled into a single event.
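
The merging behaviour described above can be sketched roughly as follows. This is only a toy model of "start a new event at each line that begins with a timestamp", not Splunk's line-merging code. The function merge_lines and the stack-trace line are hypothetical; the timestamped lines follow the sample from the original question.

```python
import re

# A line that starts with this pattern begins a new event.
TIMESTAMP_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3}")


def merge_lines(lines):
    """Group lines into events, starting a new event at each timestamped line."""
    events = []
    for line in lines:
        if TIMESTAMP_START.match(line) or not events:
            events.append(line)          # new event
        else:
            events[-1] += "\n" + line    # continuation, e.g. a stack trace line
    return events


single_line_log = [
    "2011-04-15 13:34:31.966: <java method> WARNING <warning message>",
    "2011-04-15 13:34:31.966: <java method> WARNING <warning message>",
    "2011-04-15 13:34:32.990: <java method> WARNING <warning message>",
]
multi_line_log = [
    "2011-04-15 13:34:31.966: <java method> WARNING <warning message>",
    "    at com.example.Foo.bar(Foo.java:42)",  # hypothetical stack-trace line
    "2011-04-15 13:34:32.990: <java method> WARNING <warning message>",
]

single_events = merge_lines(single_line_log)  # one event per line
multi_events = merge_lines(multi_line_log)    # stack-trace line joins the first event
```

In this model the single-line warnings each become their own event, while the stack-trace line is attached to the timestamped line before it.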

willthames · ‎04-19-2011

It's jboss rather than weblogic but you're right, it's still log4j. Might investigate what the default log4j sourcetype actually looks like.

Perhaps I misunderstand SHOULD_LINEMERGE, but I do want to match multiline events in the same log file. It's just that these particular log lines aren't multiline, and Splunk isn't picking that up.
