Splunk not breaking events on line break properly

Engager

Ok, I'm at my wits' end here. I have an application log which produces events of the format:

DEBUG | 2012-02-16 11:01:30,683 [http-10.0.0.1-8443-Processor6] SystemFile  - field1=value1 timestamp=2012-02-16 11:01:30.679 CST   field2=value2   field3=value3   field4=value4   field5= field6=value6   field7=A field value with spaces in it  field8=
DEBUG | 2012-02-16 11:01:32,457 [http-10.0.0.1-8443-Processor10] SystemFile  - field1=value1    timestamp=2012-02-16 11:01:32,450 CST   field2=value2   field3= field4=value4   field5= field6=value6   field7=Another field with spaces in it  field8=value8

These are basically tab-delimited name/value pairs, with a newline at the end of each line. I've verified the line breaks and tabs in a hex editor, and all events are written via the same log4j config. I thought I had it all being parsed just fine, but the index-time parsing is not always splitting the events on newlines, and I end up with two (or three, or four, or five) log lines in one event. They have different timestamps, so it's not a case of identical events being rolled up into one (the two events above are a sanitized example of a pair that got merged together). I would suspect the cause is that the first one ends with an equals sign (no value), but plenty of events in the same log look identical and get split properly. I'm stumped.

My props.conf for the log source looks like:

[MySourceType]
LINE_BREAKER = ([\r\n]+)
REPORT-tab-kv-manual = tab-kv-manual
KV_MODE = NONE
TIME_PREFIX = DEBUG
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
MAX_TIMESTAMP_LOOKAHEAD = 30

And my transforms.conf looks like:

[tab-kv-manual]
REGEX = (\t|- )([^=]+)=([^\t\n]*)
FORMAT = $2::$3
REPEAT_MATCH = true

Any suggestions?

Re: Splunk not breaking events on line break properly

Ultra Champion

I've been there as well, and while it looks like your LINE_BREAKER regex is correct, I think I remember that being a bit more explicit solved the issue:

LINE_BREAKER = ([\r\n]+)[A-Z]+\s+\|\s+\d+

Also, your TIME_PREFIX is wrong: it matches only DEBUG and leaves the " | " separator in front of the timestamp. It should be:

TIME_PREFIX = ^[A-Z]+\s+\|\s+

Hope this helps,

Kristian

Re: Splunk not breaking events on line break properly

Builder

Did you ever figure this out? Having the same issue. Testing the explicit line breaker currently.

Re: Splunk not breaking events on line break properly

SplunkTrust

What is your data format? Also, include "SHOULD_LINEMERGE = false" in props.conf along with LINE_BREAKER.
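
For reference, the suggestions in this thread combine into something like the stanza below. Treat it as a sketch, not a confirmed fix: nobody in the thread reported back on whether it resolved the problem. The stanza name and the TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD, KV_MODE, and REPORT lines are copied unchanged from the original post, and the # comment lines are added here for explanation only.

[MySourceType]
# Keep each line as its own event instead of re-merging lines after breaking.
SHOULD_LINEMERGE = false
# Break only on newlines that are followed by a log level, a pipe, and a digit.
# Only the text in the first capture group (the newlines) is discarded.
LINE_BREAKER = ([\r\n]+)[A-Z]+\s+\|\s+\d+
# Skip the log level and the pipe so timestamp parsing starts at the date.
TIME_PREFIX = ^[A-Z]+\s+\|\s+
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
MAX_TIMESTAMP_LOOKAHEAD = 30
KV_MODE = NONE
REPORT-tab-kv-manual = tab-kv-manual

SHOULD_LINEMERGE defaults to true, in which case Splunk tries to re-merge the lines that LINE_BREAKER produces. That is one plausible way several log lines can end up in a single event, especially if timestamp recognition is unreliable because of the TIME_PREFIX setting.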