Getting Data In

Mixed text/JSON log events are splitting on a date in the JSON data

Path Finder

I have logs coming in that are either straight text (single line) or text with a JSON string as well.

I have no issues with the straight text, but if there is additional JSON, the event breaks on an attribute with a date.

If the JSON has no additional date, it appears to be OK.

Sample log event with JSON
2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 [Thread-3-ESWriterBolt] DEBUG BaseBolt - {
"attribute1": 243,
"attribute2": "Standard",
"attribute3": "2018-11-28T13:11:45.3720",
"attribute4": "Y"
}

Everything up to attribute2 reads fine. However, attribute3 starts a new event, timestamped with the date value there, which runs to the end of the JSON or until another date field appears.

The current props.conf for this log type just parses a few fields and also includes TRUNCATE = 0 for no truncation of these events.

What else do I need to set up in props.conf to make this work?

Thanks!

SplunkTrust

Try the configuration below on the indexer or heavy forwarder, whichever receives the data first from the universal forwarder.

props.conf

[yoursourcetype]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
TIME_FORMAT=%Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD=28
LINE_BREAKER=([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\W\d{4}
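
As a quick sanity check of the LINE_BREAKER above, here is a minimal Python sketch that applies the same regex to the sample event from the question. It only approximates Splunk's behavior (the text matched by the first capture group is dropped and the next event starts right after it); it is not Splunk's actual parser. The helper `split_events` and the second, single-line event are made up for illustration.

```python
import re

# Same pattern as the LINE_BREAKER setting. In Splunk, the text matched by the
# first capture group is discarded and the next event begins right after it.
LINE_BREAKER = re.compile(
    r"([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\W\d{4}"
)

def split_events(raw):
    """Hypothetical helper: cut raw text at each match, dropping only group 1."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    events.append(raw[start:])
    return events

# First event: the sample from the question (attribute3 value quoted).
# Second event: an invented single-line message, for illustration only.
SAMPLE = (
    '2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 '
    '[Thread-3-ESWriterBolt] DEBUG BaseBolt - {\n'
    '"attribute1": 243,\n'
    '"attribute2": "Standard",\n'
    '"attribute3": "2018-11-28T13:11:45.3720",\n'
    '"attribute4": "Y"\n'
    '}\n'
    '2018-11-28T11:25:33.101+0000 STDIO [INFO] plain single-line message'
)

events = split_events(SAMPLE)
for number, event in enumerate(events, start=1):
    print(f"--- event {number} ---")
    print(event)
```

The attribute3 line never triggers a break because the newline before it is followed by a quote character, not by the leading timestamp pattern.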

Ultra Champion

SHOULD_LINEMERGE must be set to false when you use LINE_BREAKER.

Other than that, this should do the trick. The reason for this behavior: by default, Splunk automatically detects timestamps and assumes each one marks the start of a new event. That works fine for single-line events, or for events with one timestamp on their first line, but not for events like yours, which contain a second timestamp on a later line.

In general, it is better to define a specific LINE_BREAKER, set SHOULD_LINEMERGE to false, and define explicit timestamp configuration as well (TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD). This not only makes parsing more reliable, it also greatly improves performance, because Splunk doesn't have to apply all of its auto-detection magic.
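
To illustrate why the explicit timestamp settings fit this data, here is a rough Python analogy. It assumes the timestamp sits at the very start of the event, and it uses Python's `%f` in place of Splunk's `%3N`. Splunk's timestamp processor is not `strptime`, so treat this as a sketch of the idea only. The names `PY_TIME_FORMAT` and `leading_timestamp` are invented for the example.

```python
from datetime import datetime

# Python stand-in for TIME_FORMAT=%Y-%m-%dT%H:%M:%S.%3N%z
# (%f replaces Splunk's %3N subsecond specifier; this is an approximation).
PY_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

# Same value as the MAX_TIMESTAMP_LOOKAHEAD setting in the accepted answer.
MAX_TIMESTAMP_LOOKAHEAD = 28

def leading_timestamp(event):
    """Parse only the first MAX_TIMESTAMP_LOOKAHEAD characters of an event."""
    return datetime.strptime(event[:MAX_TIMESTAMP_LOOKAHEAD], PY_TIME_FORMAT)

EVENT_START = (
    "2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 "
    "[Thread-3-ESWriterBolt] DEBUG BaseBolt - {"
)

ts = leading_timestamp(EVENT_START)
print(ts.isoformat())
```

The leading timestamp is exactly 28 characters long, so a lookahead of 28 covers it and stops before the second date on the same line or any date inside the JSON.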

SplunkTrust

Thanks @FrankVI, I've updated the original answer. I didn't notice this because I was testing with only one event.

Path Finder

Thanks to you both for your help. I had tried a LINE_BREAKER previously, but it looks like my regex wasn't quite correct. First indications in the development lab are that this is working.

SplunkTrust

Hi,

Can you please post your props.conf for the above data?

Path Finder

It really isn't much for the log file type:

[storm]
EXTRACT-Storm_Class_MessageType = ^[^ \n]* (?P[^ ]+)\s+[(?P\w+)
TRUNCATE = 0

The extraction is to pull some data out of the text part of the message, which is working fine.
