While searching through logs generated by our Java application server, we noticed a new behavior that did not previously exist (potentially caused by a recent upgrade). In February, Java exceptions more than 100 lines long were correctly indexed in Splunk as a single event. Now, in that same log, a Java exception is indexed as 100 separate single-line events, which makes the data very difficult to analyze. Is there a way to modify our Splunk config so that it groups these log entries correctly?
Is it possible to use BREAK_ONLY_BEFORE_DATE and MAX_EVENTS =
I'm looking for a way to truncate JBoss error logs (full of Java exceptions).
Is there a way to re-break old raw data into events after I have modified props.conf? Or do I have to live forever with data that was broken into events the wrong way?
You need to adjust the line-breaking behavior in props.conf. You'll want line merging enabled (SHOULD_LINEMERGE = true), and you may need to raise MAX_EVENTS, the maximum number of lines Splunk will add to a single event.
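As a rough starting point, a stanza along these lines might do it. The sourcetype name `jboss_server_log` and the MAX_EVENTS value are assumptions for illustration only; substitute whatever sourcetype your log is actually assigned and a limit that fits your longest stack traces.

```ini
# Hypothetical sourcetype name; replace with the one your log uses.
[jboss_server_log]
# Merge consecutive lines into multiline events.
SHOULD_LINEMERGE = true
# Start a new event only at a line that contains a date.
BREAK_ONLY_BEFORE_DATE = true
# Illustrative value; the documented default is 256 lines.
MAX_EVENTS = 1000
```

Both SHOULD_LINEMERGE and BREAK_ONLY_BEFORE_DATE default to true according to the excerpt below, so spelling them out mainly guards against another stanza overriding them. Note that event breaking happens at index time, so a change like this should affect only data indexed after it is made.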
.. doc citation follows ...
http://www.splunk.com/base/Documentation/Latest/Admin/Propsconf
LINE_BREAKER = <regular expression>
* Specifies a regex that determines how the raw text stream is broken into initial events,
before line merging takes place. (See the SHOULD_LINEMERGE attribute, below)
* Defaults to ([\r\n]+), meaning data is broken into an event for each line, delimited by \r
or \n.
* The regex must contain a matching group.
* Wherever the regex matches, Splunk considers the start of the first matching group to be the
end of the previous event, and considers the end of the first matching group to be the start
of the next event.
* The contents of the first matching group are ignored as event text.
* NOTE: You get a significant boost to processing speed when you use LINE_BREAKER to delimit
multiline events (as opposed to using SHOULD_LINEMERGE to reassemble individual lines into
multiline events).
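To make the LINE_BREAKER approach concrete, here is a minimal sketch. It assumes every event in the log starts with a timestamp of the form `2010-02-17 ` (four-digit year, month, day, then whitespace), which may not match your log format; the sourcetype name is hypothetical.

```ini
# Hypothetical sourcetype name.
[jboss_server_log]
# Skip the line-merging pass entirely; LINE_BREAKER does all the work.
SHOULD_LINEMERGE = false
# The first group (the newlines) is discarded as the event boundary.
# The date that follows the group becomes the start of the next event.
# Assumes events begin with a yyyy-mm-dd timestamp.
LINE_BREAKER = ([\r\n]+)\d{4}-\d{2}-\d{2}\s
```

With this pattern, newlines that are not followed by a date (such as those inside a stack trace) do not match, so the trace stays in one event.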
LINE_BREAKER_LOOKBEHIND = <integer>
* When there is leftover data from a previous raw chunk, LINE_BREAKER_LOOKBEHIND indicates the
number of characters before the end of the raw chunk (with the next chunk concatenated) that
Splunk applies the LINE_BREAKER regex. You may want to increase this value from its default
if you are dealing with especially large or multiline events.
* Defaults to 100 (characters).
SHOULD_LINEMERGE = [true|false]
* When set to true, Splunk combines several lines of data into a single multiline event, based
on the following configuration attributes.
* Defaults to true.
BREAK_ONLY_BEFORE_DATE = [true|false]
* When set to true, Splunk creates a new event only if it encounters a new line with a date.
* Defaults to true.
BREAK_ONLY_BEFORE = <regular expression>
* When set, Splunk creates a new event only if it encounters a new line that matches the
regular expression.
* Defaults to empty.
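If date detection alone is not reliable for your log, BREAK_ONLY_BEFORE lets you state the event-start pattern explicitly. A sketch, assuming each event begins with a `yyyy-mm-dd hh:mm:ss` timestamp at the start of the line (an assumption about the log format, as is the sourcetype name):

```ini
# Hypothetical sourcetype name.
[jboss_server_log]
SHOULD_LINEMERGE = true
# Start a new event only at lines that begin with a timestamp
# such as "2010-02-17 09:15:42". Adjust to your log's format.
BREAK_ONLY_BEFORE = ^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}
```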
MUST_BREAK_AFTER = <regular expression>
* When set and the regular expression matches the current line, Splunk creates a new event for
the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
MUST_NOT_BREAK_AFTER = <regular expression>
* When set and the current line matches the regular expression, Splunk does not break on any
subsequent lines until the MUST_BREAK_AFTER expression matches.
* Defaults to empty.
MUST_NOT_BREAK_BEFORE = <regular expression>
* When set and the current line matches the regular expression, Splunk does not break the
last event before the current line.
* Defaults to empty.
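For Java stack traces specifically, MUST_NOT_BREAK_BEFORE might be used to keep continuation lines attached even if one of them happens to contain something date-like. This is only a sketch: the pattern assumes the usual stack trace layout (indented `at ...` frames, `Caused by:` lines, and `... N more` lines), and the sourcetype name is hypothetical.

```ini
# Hypothetical sourcetype name.
[jboss_server_log]
SHOULD_LINEMERGE = true
# Never start a new event at a line that looks like part of a
# Java stack trace: "    at com.example.Foo(Foo.java:12)",
# "Caused by: ...", or "    ... 23 more".
MUST_NOT_BREAK_BEFORE = ^(\s+at\s|Caused by:|\s+\.\.\.\s\d+\smore)
```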
MAX_EVENTS = <integer>
* Specifies the maximum number of input lines to add to any event.
* Splunk breaks after the specified number of lines are read.
* Defaults to 256 (lines).
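For the question about limiting very long exceptions, MAX_EVENTS could serve as a cap on lines per event. A sketch with an arbitrary limit of 50 lines and a hypothetical sourcetype name:

```ini
# Hypothetical sourcetype name.
[jboss_server_log]
SHOULD_LINEMERGE = true
# Illustrative cap: break after 50 lines have been read into an event.
MAX_EVENTS = 50
```

Going by the description above, Splunk breaks after the limit is reached rather than discarding anything, so the remainder of a longer stack trace would presumably continue in a following event instead of being dropped.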