Why are my events splitting by second and not by each timestamp?

willthames
Path Finder

My props.conf has:

[server]
MAX_TIMESTAMP_LOOKAHEAD = 0
SHOULD_LINEMERGE = true
#BREAK_ONLY_BEFORE_DATE = true
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3}
TIME_FORMAT = %Y-%m-%d %H:%M:%S[,.]%3N

The events affected by this problem look like this:

2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:31.966: <java method> WARNING <warning message>
2011-04-15 13:34:32.990: <java method> WARNING <warning message>

This is presented in Splunk as one event, not five.

While these events are single line, other events in the same log might be multiline.

Why aren't these logs being processed correctly? I'd expect BREAK_ONLY_BEFORE to produce one event per line.

1 Solution

dwaddle
SplunkTrust

It looks like your TIME_FORMAT is part of the issue here. Neither Unix strptime nor, apparently, the Splunk-enhanced strptime supports regex-style syntax such as [,.]. Your TIME_FORMAT will have to be one or the other of:

TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N

Also, setting MAX_TIMESTAMP_LOOKAHEAD to 0 could have a negative performance impact. See http://www.splunk.com/base/Documentation/4.2.1/Data/Configuretimestamprecognition, specifically this comment:

"If set to 0 or -1, the length constraint for timestamp recognition is effectively disabled. This can have negative performance implications which scale with the length of input lines (or with event size when LINE_BREAKER is redefined for event splitting)."

I used this config, and was able to parse your data into distinct multi-line events:

[sa22432]
MAX_TIMESTAMP_LOOKAHEAD = 24
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N

[monitor:///tmp/sa22432/]
sourcetype=sa22432

Paolo_Prigione
Builder

MAX_TIMESTAMP_LOOKAHEAD gives the last character position by which the timestamp has to be found, not its starting point.
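
As a rough illustration of that point, assuming the timestamp layout from the sample events in the question: "2011-04-15 13:34:31.966" is 23 characters long, so the lookahead has to be at least 23 for the whole timestamp to fall inside the window. The sketch below reuses the [server] stanza name from the question and the value 24 from the accepted answer. It is an example under those assumptions, not a tested configuration.

```ini
# props.conf (sketch)
[server]
# The timestamp "2011-04-15 13:34:31.966" occupies characters 1-23 of each
# line; 24 leaves one character of margin. A value of 0 disables the length
# limit rather than restricting the search to the start of the line.
MAX_TIMESTAMP_LOOKAHEAD = 24
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
```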

willthames
Path Finder

OK, I thought I was being performant with MAX_TIMESTAMP_LOOKAHEAD: since I'm only interested in a timestamp at the start of the line, I assumed Splunk need look no further, so zero should be fine. I guess I got that wrong!

Changing that value seems to have worked.

ualbanytech
Path Finder

Did you try this?
SHOULD_LINEMERGE = false

What is producing these logs? If I had to guess, I would say WebLogic.

Also, what problems did you encounter which led you to define custom handling of this log data?

If this is WebLogic, I've had pretty good luck just defining the log's sourcetype as "log4j".

ualbanytech
Path Finder

Here's the list of pretrained source types for Splunk 4.1.6 (maybe also look at weblogic_stdout?):

http://www.splunk.com/base/Documentation/4.1.6/Admin/Listofpretrainedsourcetypes

ualbanytech
Path Finder

SHOULD_LINEMERGE = false

I believe this means "don't merge lines". However, there is a dependency on the sourcetype.

For example, when I have log4j set for Tomcat web app logs, each timestamped line becomes a single event unless it is followed by something that does not start with a timestamp, such as a Java stack trace. In those cases, the timestamped line and all following lines, up to but not including the next timestamp, are bundled into a single event.

willthames
Path Finder

It's JBoss rather than WebLogic, but you're right, it's still log4j. I might investigate what the default log4j sourcetype actually looks like.

Perhaps I misunderstand SHOULD_LINEMERGE, but I do want to match multiline events in the same log file. It's just that these particular log lines aren't multiline, and Splunk isn't picking that up.
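
One commonly used alternative for logs that mix single-line and multiline events is to turn line merging off and let LINE_BREAKER split only where a new line begins with a timestamp. The following is a minimal sketch under stated assumptions: it reuses the [server] stanza name and the timestamp regex from the question, it assumes the "." millisecond separator shown in the sample events, and it has not been tested against this data.

```ini
# props.conf (sketch)
[server]
# Do not re-merge lines after breaking; LINE_BREAKER alone decides event boundaries.
SHOULD_LINEMERGE = false
# Break on newlines only when the next line starts with a timestamp.
# The text matched by the first capture group (the newlines) is discarded;
# the lookahead keeps the timestamp as part of the following event.
LINE_BREAKER = ([\r\n]+)(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[,.]\d{3})
MAX_TIMESTAMP_LOOKAHEAD = 24
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
```

With a breaker like this, continuation lines such as a Java stack trace stay attached to the preceding timestamped line, while consecutive timestamped lines become separate events.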