Solved: Help parsing and breaking up multi-line event in props.conf

a212830 · ‎08-02-2014

Hi,

I'm having problems parsing the following lines, hoping someone can help me.

Here's my props:

ANNOTATE_PUNCT = false
KV_MODE = auto
LINE_BREAKER = ([\r\n]+)WARN|INFO|ERROR|DEBUG\d{4}-\d{2}-\d{2}
MAX_TIMESTAMP_LOOKAHEAD = 50
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
TIME_PREFIX = ^WARN| |^INFO |^ERROR 
TRUNCATE = 999999

Sample:

WARN | 2014-06-19 20:37:30,275 | localhost-startStop-1 | TypeConverterDelegate.java | 263 | PropertyEditor [com.sun.beans.editors.EnumEditor] found through deprecated global PropertyEditorManager fallback - consider using a more isolated form of registration, e.g. on the BeanWrapper/BeanFactory!
 WARN | 2014-06-19 20:37:30,285 | localhost-startStop-1 | TypeConverterDelegate.java | 263 | PropertyEditor [com.sun.beans.editors.EnumEditor] found through deprecated global PropertyEditorManager fallback - consider using a more isolated form of registration, e.g. on the BeanWrapper/BeanFactory!
 WARN | 2014-06-19 20:37:30,293 | localhost-startStop-1 | TypeConverterDelegate.java | 263 | PropertyEditor [com.sun.beans.editors.EnumEditor] found through deprecated global PropertyEditorManager fallback - consider using a more isolated form of registration, e.g. on the BeanWrapper/BeanFactory!
 WARN | 2014-06-19 20:37:30,300 | localhost-startStop-1 | TypeConverterDelegate.java | 263 | PropertyEditor [com.sun.beans.editors.EnumEditor] found through deprecated global PropertyEditorManager fallback - consider using a more isolated form of registration, e.g. on the BeanWrapper/BeanFactory!

martin_mueller · ‎08-02-2014

Well, if you're getting one big event then my note number five is your biggest issue. Your LINE_BREAKER = ([\r\n]+)WARN|INFO|ERROR|DEBUG\d{4}-\d{2}-\d{2} doesn't match the sample data. Try this expression based on note number two instead if you don't want to rely on "break on timestamp":

LINE_BREAKER = ([\r\n]+)\s*[A-Z]+\s*\|\s*\d{4}-\d{2}-\d{2}

That allows for any number of spaces before and after the log level, any capital-letter log level, as well as the pipe symbol and spaces before the date.
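To see what that expression does before deploying it, here is a minimal sketch in Python. It assumes Python's re module behaves like Splunk's PCRE for this pattern, and it mimics the documented LINE_BREAKER behaviour: the text matched by the first capturing group is thrown away, and everything after it starts the next event. The helper name break_events is made up for illustration, the messages are shortened, and the continuation line is a hypothetical stand-in for the 2 or 3 line events mentioned in the thread.

```python
import re

# Pattern from the reply above; Python's re is used as a stand-in for PCRE (assumption).
LINE_BREAKER = re.compile(r"([\r\n]+)\s*[A-Z]+\s*\|\s*\d{4}-\d{2}-\d{2}")


def break_events(raw: str) -> list[str]:
    """Hypothetical helper: split raw text wherever LINE_BREAKER matches,
    discarding only the text captured by group 1 (the newline run)."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])  # text before the newline run
        start = match.end(1)                      # next event begins after it
    events.append(raw[start:])
    return events


# Abbreviated sample; the indented continuation line is invented for illustration.
sample = (
    "WARN | 2014-06-19 20:37:30,275 | localhost-startStop-1 | first message\n"
    " WARN | 2014-06-19 20:37:30,285 | localhost-startStop-1 | second message\n"
    "    continuation line of the second message\n"
    " INFO | 2014-06-19 20:37:30,293 | localhost-startStop-1 | third message"
)

events = break_events(sample)
for event in events:
    print(repr(event))
```

The continuation line stays attached to the second event because it has no capital-letter log level followed by a pipe and a date. The leading space on the later events is kept as well, since only the newline run sits inside the capturing group.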

a212830 · ‎08-05-2014

Worked. Thanks!

a212830 · ‎08-02-2014

The problem is the line-breaking - sorry, forgot to include that. I'm getting one big event.

a212830 · ‎08-02-2014

Thanks for the response. Yes, some (not all) of the lines appear to have a space at the beginning. There are multi-line events in the file; I can include them if needed. They aren't huge, 2 or 3 lines each. As for using the default settings, we have a policy that doesn't allow it. Professional Services doesn't recommend it, and if something changes in the log, we can't really go back and tell what the settings were beforehand. My understanding is that it also puts extra burden on the indexer.

martin_mueller · ‎08-02-2014

Fifth, your LINE_BREAKER regex doesn't allow for any spaces or pipes between the loglevel and the date. The loglevel list also needs parentheses around it, because the pipe symbol (OR) binds less tightly than character concatenation. As written, the expression reads as four separate alternatives: ([\r\n]+)WARN, INFO, ERROR, or DEBUG\d{4}-\d{2}-\d{2}.
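A quick way to check the precedence point is to run both forms against a test string. This is a sketch in Python, assuming its re module treats alternation the same way PCRE does here. The grouped pattern is a hypothetical variant that adds the parentheses plus the spacing and pipe from the sample; it is not the expression posted elsewhere in the thread.

```python
import re

# The original setting: alternation splits the whole pattern into four alternatives.
original = re.compile(r"([\r\n]+)WARN|INFO|ERROR|DEBUG\d{4}-\d{2}-\d{2}")

# Hypothetical grouped variant: the level list is wrapped in a non-capturing
# group, and the spaces and pipe seen in the sample data are allowed for.
grouped = re.compile(
    r"([\r\n]+)\s*(?:WARN|INFO|ERROR|DEBUG)\s*\|\s*\d{4}-\d{2}-\d{2}"
)

warn_text = "first event\n WARN | 2014-06-19 20:37:30,285 | message"
info_text = "first event\n INFO | 2014-06-19 20:37:30,293 | message"

# No alternative fits the WARN line: the newline is followed by a space, not "WARN".
m_orig_warn = original.search(warn_text)

# On the INFO line the bare "INFO" alternative matches on its own,
# so capture group 1 (the newline run) never takes part in the match.
m_orig_info = original.search(info_text)

# The grouped variant captures the newline run for both lines.
m_grouped_warn = grouped.search(warn_text)
m_grouped_info = grouped.search(info_text)

print(m_orig_warn)
print(m_orig_info.group(0), m_orig_info.group(1))
print(repr(m_grouped_warn.group(1)), repr(m_grouped_info.group(1)))
```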

martin_mueller · ‎08-02-2014

How are your problems manifesting themselves?

Some stuff I noticed:
First, it appears some of your logs have a space in front of the loglevel. Is that actually the case or just a copy-and-paste error?
Second, the TIME_PREFIX regex is "starts with warn", "space", "starts with info", or "starts with error" - is that intentional? I'd go with ^\s*[A-Z]+\s*\|\s* instead to be robust against small differences.
Third, your sample data doesn't appear to have any multi-line events but you mentioned those in the title?
Fourth, throwing your sample data into Splunk with all default settings looks okay.
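The TIME_PREFIX suggestion in the second note can be tried outside Splunk too. This is a rough sketch in Python, not how Splunk itself extracts timestamps. The function name extract_timestamp is hypothetical, and Python's %f is used as a stand-in for Splunk's %3N, since Python's strptime has no %3N directive.

```python
import re
from datetime import datetime
from typing import Optional

# Suggested TIME_PREFIX from the second note.
TIME_PREFIX = re.compile(r"^\s*[A-Z]+\s*\|\s*")

# Python equivalent of TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N (assumption: %f replaces %3N).
PY_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# "2014-06-19 20:37:30,275" is 23 characters long.
TIMESTAMP_LENGTH = 23


def extract_timestamp(event: str) -> Optional[datetime]:
    """Hypothetical helper: skip the prefix, then parse the timestamp after it."""
    prefix = TIME_PREFIX.match(event)
    if prefix is None:
        return None
    candidate = event[prefix.end():prefix.end() + TIMESTAMP_LENGTH]
    try:
        return datetime.strptime(candidate, PY_TIME_FORMAT)
    except ValueError:
        return None


# Works with or without the leading space seen in the sample data.
print(extract_timestamp("WARN | 2014-06-19 20:37:30,275 | localhost-startStop-1"))
print(extract_timestamp(" WARN | 2014-06-19 20:37:30,285 | localhost-startStop-1"))
print(extract_timestamp("no log level on this line"))
```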
