Trouble reading log lines with large JSON or multiline Java exceptions from slf4j

mriley_cpmi
Explorer

My question is similar to others around extracting new fields, but the answers I've tried to date haven't worked.

When I click Extract New Fields, the Select Sample Event screen ends up selecting around 20 actual log lines and treats them as a single sample event instead of around 20 separate events (one per log line).

The log message at the end of a log line sometimes contains a very large JSON string or a typical multi-line Java exception.

The log pattern is as follows:

time:stamp LOGTYPE  [java-thread-id-1234][JavaClass:LineNumber] Log message goes here. Usually is a short message. Sometimes includes *very* large single-line JSON strings. Sometimes includes a multi-line Java exception.

A practical example would be as follows:

08:33:09,372 INFO  [http-bio-8080-exec-4687][ServicesController:125] JSON returned={"succeeded":true,"data":{"example1":[],"example2":"","example3":null},"message":""}

Or:

09:47:13,215 INFO  [http-bio-8080-exec-4678][ServicesController:125] Example log message goes here.

When I set up the forwarder, the Source Type was set to Automatic, not log4j. We're using slf4j for our logger. Does Splunk understand slf4j? I'm assuming it does, but if it doesn't, do I need to find an app that adds support for slf4j?

Bottom line, is it possible to extract these fields including the large JSON strings and multi-line Java exceptions?

martin_mueller
SplunkTrust

Try props.conf settings something like this:

[your_sourcetype]
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 30
TIME_FORMAT = %H:%M:%S,%3N
TRUNCATE = set this large enough to fit your biggest events in bytes plus safety margin
MAX_EVENTS = set this large enough to fit your biggest events in lines plus safety margin
EXTRACT-slf4j = (?s)^\S++\s++(?<log_level>[A-Z]++)[^\[]*+\[\s*+(?<thread_id>[^\]\s]++)\s*+\]\[\s*+(?<java_class>[^:]++):(?<line_number>\d++)\s*+\]\s*+(?<message>.*+)
EXTRACT-json_message = (?s)JSON\s*+returned=(?<json_message>.*+)

This should get your timestamping and event breaking in order, as well as basic field extractions. The JSON part is a bit trickier: I think Splunk doesn't like events that are only partially JSON for INDEXED_EXTRACTIONS = json. If that's true, you can always parse the extracted field at search time instead: base search | spath input=json_message
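
To sanity-check the two extractions outside Splunk, here is a minimal Python sketch that applies them to the sample line from the question. It is not Splunk's own engine: the patterns are translated for Python's re module (named groups written as (?P<name>...), possessive quantifiers swapped for greedy ones, which is an assumption that holds for these samples), the helper name extract_fields is hypothetical, and json.loads only stands in for what spath does at search time.

```python
import json
import re

# Python translation of the two EXTRACT patterns. Assumptions: (?P<name>...)
# replaces (?<name>...), and possessive quantifiers (++, *+) become greedy
# ones so the sketch also runs on Python versions before 3.11.
SLF4J_LINE = re.compile(
    r"^\S+\s+(?P<log_level>[A-Z]+)[^\[]*\[\s*(?P<thread_id>[^\]\s]+)\s*\]"
    r"\[\s*(?P<java_class>[^:]+):(?P<line_number>\d+)\s*\]\s*(?P<message>.*)",
    re.DOTALL,  # same role as the leading (?s): '.' also matches newlines
)
JSON_MESSAGE = re.compile(r"JSON\s*returned=(?P<json_message>.*)", re.DOTALL)


def extract_fields(event: str) -> dict:
    """Return the extracted fields for one event, or {} if it does not match."""
    match = SLF4J_LINE.match(event)
    if match is None:
        return {}
    fields = match.groupdict()
    json_match = JSON_MESSAGE.search(event)
    if json_match:
        fields["json_message"] = json_match.group("json_message")
    return fields


sample = (
    "08:33:09,372 INFO  [http-bio-8080-exec-4687][ServicesController:125] "
    'JSON returned={"succeeded":true,"data":{"example1":[],"example2":"",'
    '"example3":null},"message":""}'
)
fields = extract_fields(sample)
payload = json.loads(fields["json_message"])  # rough stand-in for spath
print(fields["log_level"], fields["thread_id"], fields["java_class"], fields["line_number"])
print(payload["succeeded"])
```

If a real event carries text after the JSON string, json_message will include it as well, because the pattern captures everything to the end of the event.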

mriley_cpmi
Explorer

I have created the new source type in my splunk/etc/system/local/props.conf file, applied it to my data input and restarted the Splunk service. Do I need to do anything to the already indexed data so it uses the new source type?

mriley_cpmi
Explorer

After clearing out my affected indexes and re-importing my log data, I was able to cleanly extract all my target fields. Thanks!
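
For anyone re-importing with the settings above, one way to pick TRUNCATE and MAX_EVENTS is to measure the largest event in an existing log file first. The Python sketch below is only an approximation of Splunk's line merging: it assumes an event starts on any line beginning with an HH:MM:SS,mmm timestamp and that every other line (such as a stack-trace frame) belongs to the previous event. The function names and the exception text in the sample are hypothetical. Size is measured in bytes because that is the unit the props.conf spec gives for TRUNCATE.

```python
import re

# Assumption: a new event starts on a line that begins with HH:MM:SS,mmm,
# mirroring TIME_PREFIX = ^ and TIME_FORMAT = %H:%M:%S,%3N.
EVENT_START = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}\s")


def group_events(lines):
    """Merge continuation lines (e.g. stack-trace frames) into the preceding event."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if EVENT_START.match(line) or not events:
            events.append([line])
        else:
            events[-1].append(line)
    return events


def measure(events):
    """Return (size of the largest event in bytes, length of the longest event in lines)."""
    max_bytes = max(len("\n".join(ev).encode("utf-8")) for ev in events)
    max_lines = max(len(ev) for ev in events)
    return max_bytes, max_lines


# Hypothetical sample; the exception text is invented for illustration.
sample_log = [
    "09:47:13,215 INFO  [http-bio-8080-exec-4678][ServicesController:125] Example log message goes here.",
    "09:47:14,001 ERROR [http-bio-8080-exec-4678][ServicesController:130] Request failed",
    "java.lang.IllegalStateException: example failure",
    "\tat com.example.ServicesController.handle(ServicesController.java:130)",
]
events = group_events(sample_log)
max_bytes, max_lines = measure(events)
print(len(events), max_bytes, max_lines)
```

Because group_events accepts any iterable of lines, an open file object can be passed in place of sample_log. Add a safety margin to both numbers before putting them into props.conf.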
