Running Splunk Enterprise 6.5.6.
I am parsing incoming events of sourcetype weblogic_stdout and am having trouble with the LineBreakingProcessor when it truncates large events. The problematic events come in pairs with nearly the same timestamp: the first contains a Java stack trace, and the second contains a unique error identifier along with a different Java stack trace. Given the size of the stack trace, the first event is truncated, but LineBreakingProcessor doesn't seem to see the second event at all. The event after this one, however, is indexed by Splunk.
My first thought was that the second event had been truncated away along with the first, but the reported "line length >=" value corresponded to the number of characters in the first event alone, so it seems the second event is completely ignored/dropped.
I searched the splunkd.log to find the longest line length, and tried setting TRUNCATE to something a little higher than that, but when I restarted the Search Peer, I found that the largest line length had approximately doubled, which I found very puzzling.
It's also possible that my LINE_BREAKER regex isn't working as designed, but I've been through it several times (and now have some additional eyes looking at it). The props.conf contains the following active lines:
[weblogic_stdout]
DATETIME_CONFIG = /etc/apps/spste/weblogic_stdout.xml
# ===> The following works to extract the dates while leaving the text in the event
LINE_BREAKER = ([\r\n]+)([?\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\,\d{3}]?\s|\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\s|#{0,4}<\w{3}\s\d{1,2}\,\s\d{4}\s\d{1,2}:\d{2}:\d{2}\s[AP]M\s\w{3,}>\s|\w{3}\s\d{1,2}\,\s\d{4}\s\d{1,2}:\d{2}:\d{2}\s[AP]M\s|\d{2}:\d{2}:\d{2}[.,]\d{3}\s|[DEBUG]\s\d{8}\s\d{2}:\d{2}:\d{2}\,\d{3}\s|[INFO]\s\d{8}\s\d{2}:\d{2}:\d{2}\,\d{3}\s|[ERROR]\s\d{8}\s\d{2}:\d{2}:\d{2}\,\d{3}\s|[WARN]\s\d{8}\s\d{2}:\d{2}:\d{2}\,\d{3}\s)
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
TRUNCATE = 80000
The regex alternate pattern that should match is the one with [ERROR] in it. An example start line is:
[ERROR] 20171205 09:58:35.277 [ other stuff...]
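One way to check the [ERROR] alternative in isolation, outside Splunk, is a small Python sketch like the one below. It is only an approximation: Python's re module is not the PCRE engine Splunk uses, although the constructs involved here (character classes, \s, \d, quantifiers) behave the same in both. The variant names and the two "escaped" patterns are hypothetical, added for comparison; they are not taken from the actual props.conf, and it is not known whether the brackets in the real file are escaped.

```python
import re

# Example start line from the post, preceded by the newline that the first
# capture group of LINE_BREAKER, ([\r\n]+), has to consume.
SAMPLE = "\n[ERROR] 20171205 09:58:35.277 [ other stuff...]"

PREFIX = r"([\r\n]+)"

# Variants of the [ERROR] alternative. Only "as_posted" is copied from the
# post; the other two are hypothetical rewrites used for comparison.
VARIANTS = {
    "as_posted": r"[ERROR]\s\d{8}\s\d{2}:\d{2}:\d{2}\,\d{3}\s",
    "escaped_brackets": r"\[ERROR\]\s\d{8}\s\d{2}:\d{2}:\d{2}\,\d{3}\s",
    "escaped_dot_or_comma": r"\[ERROR\]\s\d{8}\s\d{2}:\d{2}:\d{2}[.,]\d{3}\s",
}


def breaks_before(alternative, text=SAMPLE):
    """Return True if newline(s) followed directly by the alternative match."""
    return re.search(PREFIX + "(" + alternative + ")", text) is not None


results = {name: breaks_before(pattern) for name, pattern in VARIANTS.items()}
print(results)
```

In this sketch only the last variant matches the example line. As posted, [ERROR] is a character class that matches a single E, R, or O rather than the literal text, and the alternative expects a comma before the milliseconds while the example line has a period (09:58:35.277). Whether either point applies to the real configuration depends on what the file on disk actually contains.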
The weblogic_stdout.xml file defines the matching pattern, but shouldn't come into play, since the XML file is only for date extraction, right?