My props.conf has:
[server]
MAX_TIMESTAMP_LOOKAHEAD = 0
SHOULD_LINEMERGE = true
#BREAK_ONLY_BEFORE_DATE = true
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3}
TIME_FORMAT = %Y-%m-%d %H:%M:%S[,.]%3N
The events affected by this problem look like this:
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:32.990: <java method> WARNING <warning message>
This is presented in Splunk as one event, not five.
While these events are single line, other events in the same log might be multiline.
Why aren't these logs being processed correctly? I'd expect BREAK_ONLY_BEFORE to produce one event per line.
It looks like your TIME_FORMAT is part of the issue here. Unix strptime (and apparently the Splunk-enhanced strptime) does not support regex-style syntax. Your TIME_FORMAT will have to be one or the other of:
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
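As a rough analogy only, the same point can be checked with Python's strptime, which also treats bracket syntax in a format string as literal characters. This is not Splunk's implementation: Python has no %3N, so %f stands in for the subsecond field, and the helper name try_parse is made up for this sketch.

```python
from datetime import datetime

# Timestamp taken from the sample events in the question.
sample = "2011-04-15 13:34:31.966"

def try_parse(fmt):
    """Return the parsed datetime, or None if the format does not match."""
    try:
        return datetime.strptime(sample, fmt)
    except ValueError:
        return None

# The bracket syntax is read as the literal characters "[,.]", so no match.
regex_style = try_parse("%Y-%m-%d %H:%M:%S[,.]%f")
# A literal "." separator matches the sample.
dot_style = try_parse("%Y-%m-%d %H:%M:%S.%f")
# A literal "," separator does not match this particular sample.
comma_style = try_parse("%Y-%m-%d %H:%M:%S,%f")

print(regex_style, dot_style, comma_style)
```

Only the format with the literal dot parses the sample, which is why one separator has to be chosen per sourcetype.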
Also, setting MAX_TIMESTAMP_LOOKAHEAD to 0 could have a negative performance impact. See http://www.splunk.com/base/Documentation/4.2.1/Data/Configuretimestamprecognition, specifically this comment:
If set to 0 or -1, the length constraint for timestamp recognition is effectively disabled. This can have negative performance implications which scale with the length of input lines (or with event size when LINE_BREAKER is redefined for event splitting).
I used this config (the first stanza in props.conf, the monitor stanza in inputs.conf) and was able to parse your data into distinct multi-line events:
[sa22432]
MAX_TIMESTAMP_LOOKAHEAD = 24
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
[monitor:///tmp/sa22432/]
sourcetype=sa22432
MAX_TIMESTAMP_LOOKAHEAD gives the last character position before which the timestamp has to be found, not its starting point.
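To make that concrete, here is a small Python sketch that measures how far into a sample line the timestamp extends. It only models a positive lookahead window as "the timestamp must end within the first N characters"; it does not reproduce Splunk's handling of 0 or -1, and the function name timestamp_within is hypothetical.

```python
import re

# Sample line and timestamp pattern taken from the question.
line = "2011-04-15 13:34:31.966: <java method> WARNING <warning message>"
timestamp_re = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3}")

def timestamp_within(text, lookahead):
    """True if the whole timestamp ends within the first `lookahead` characters."""
    match = timestamp_re.match(text)
    return match is not None and match.end() <= lookahead

# Number of characters the timestamp occupies at the start of the line.
timestamp_length = timestamp_re.match(line).end()

print(timestamp_length)
print(timestamp_within(line, 24))  # window used in the working config
print(timestamp_within(line, 10))  # window that ends right after the date
```

The timestamp here is 23 characters long, so a window of 24 covers it with one character to spare, while a window of 10 would stop after the date portion.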
OK, I thought I was being efficient with MAX_TIMESTAMP_LOOKAHEAD: since I'm only interested in a timestamp at the start of the line, it need look no further, so zero should be fine. I guess I got that wrong!
Changing that value seems to have worked.
Did you try this?
SHOULD_LINEMERGE = false
What is producing these logs? If I had to guess I would say Weblogic?
Also, what problems did you encounter which led you to define custom handling of this log data?
If this is weblogic, I've had pretty good luck by just defining the log's sourcetype as "log4j".
Here's the list of pretrained source types for Splunk 4.1.6 (maybe also look at weblogic_stdout ?)
http://www.splunk.com/base/Documentation/4.1.6/Admin/Listofpretrainedsourcetypes
I believe SHOULD_LINEMERGE = false means don't merge lines. However, there is a dependency on the sourcetype.
So, for example, when I have log4j set for Tomcat web app logs, each timestamp line becomes a single event unless it is followed by something that is not a timestamp line, such as a Java stack trace. In those cases, the timestamp line and all following lines up to but not including the next timestamp line are bundled into a single event.
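The grouping described above can be sketched in a few lines of Python. This is only an illustration of the break-before-timestamp idea, not how Splunk implements line merging; the function merge_lines is hypothetical, and the ERROR line and stack trace in the sample input are invented to show the multiline case.

```python
import re

# Timestamp pattern from the BREAK_ONLY_BEFORE setting in the question.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3}")

def merge_lines(lines):
    """Group lines into events, starting a new event at each timestamp line."""
    events = []
    for line in lines:
        if TIMESTAMP.match(line) or not events:
            # A timestamp line (or the very first line) opens a new event.
            events.append([line])
        else:
            # Anything else is appended to the event currently being built.
            events[-1].append(line)
    return ["\n".join(event) for event in events]

# Illustrative input: two single-line events around one event with a stack trace.
log = [
    "2011-04-15 13:34:31.966: <java method> WARNING <warning message>",
    "2011-04-15 13:34:32.990: <java method> ERROR <error message>",
    "java.lang.IllegalStateException: <message>",
    "    at <java method>",
    "2011-04-15 13:34:33.120: <java method> WARNING <warning message>",
]

events = merge_lines(log)
print(len(events))
```

With this input the five lines become three events, and the stack trace stays attached to the ERROR line that precedes it.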
It's JBoss rather than WebLogic, but you're right, it's still log4j. I might investigate what the default log4j sourcetype actually looks like.
Perhaps I misunderstand SHOULD_LINEMERGE, but I do want to match multiline events in the same log file. It's just that these particular log lines aren't multiline, and Splunk isn't picking that up.