Mixed text/JSON log events are splitting on a date in the JSON data

tlabue
Path Finder

I have logs coming in that are either straight text (single line) or text with a JSON string as well.

I have no issues with the straight text, but if there is additional JSON, the event breaks on an attribute with a date.

If the JSON has no additional date, it appears to be OK.

Sample log event with JSON
2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 [Thread-3-ESWriterBolt] DEBUG BaseBolt - {
"attribute1": 243,
"attribute2": "Standard",
"attribute3": "2018-11-28T13:11:45.3720",
"attribute4": "Y"
}

Everything up to attribute2 reads fine. However, attribute3 starts a new event, timestamped with the date value found there, and that event runs to the end of the JSON or until another date field appears.

The current props.conf for this log type just parses a few fields and also includes TRUNCATE = 0 for no truncation of these events.

What additional settings do I need to set up in props.conf to make this work?

Thanks!

1 Solution

harsmarvania57
SplunkTrust

You can try the configuration below on the indexer or the heavy forwarder, whichever the data reaches first after the universal forwarder.

props.conf

[yoursourcetype]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
TIME_FORMAT=%Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD=28
LINE_BREAKER=([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\W\d{4}
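
One rough way to sanity-check the LINE_BREAKER pattern outside Splunk is to replay it in Python. This is only a sketch, assuming Python's re module handles this particular pattern the same way Splunk's regex engine does. The helper split_events and the second "plain text line" event are made up for illustration. As in Splunk, the text matched by the first capture group is discarded and the next event starts right after it.

```python
import re

# Same pattern as the LINE_BREAKER setting above.
LINE_BREAKER = re.compile(
    r"([\r\n]+)\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\W\d{4}"
)


def split_events(raw):
    """Hypothetical helper: split raw text the way LINE_BREAKER would."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        # The event ends where capture group 1 begins; the group itself
        # (the newlines) is dropped, and the next event starts after it.
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    tail = raw[start:].rstrip("\r\n")
    if tail:
        events.append(tail)
    return events


# First event is the sample from the question; the second is invented.
SAMPLE = (
    "2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 "
    "[Thread-3-ESWriterBolt] DEBUG BaseBolt - {\n"
    '"attribute1": 243,\n'
    '"attribute2": "Standard",\n'
    '"attribute3": "2018-11-28T13:11:45.3720",\n'
    '"attribute4": "Y"\n'
    "}\n"
    "2018-11-28T11:25:33.101+0000 STDIO [INFO] 2018-11-28 11:25:33 "
    "[Thread-3-ESWriterBolt] DEBUG BaseBolt - plain text line\n"
)

events = split_events(SAMPLE)
for number, event in enumerate(events, start=1):
    print(f"--- event {number} ---")
    print(event)
```

The date inside attribute3 does not trigger a break because it is not directly preceded by a newline, and it also lacks the three-digit milliseconds plus offset that the pattern requires.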

FrankVl
Ultra Champion

SHOULD_LINEMERGE must be set to false when you use LINE_BREAKER.

Other than that, this should do the trick. The reason for this behavior: by default, Splunk automatically detects timestamps and assumes that each one marks the place to break events. That works fine for single-line events, or for events that have one timestamp on their first line. For events like these, with a second timestamp further down, it doesn't behave the way you want.

In general, it is better to define a specific LINE_BREAKER, set SHOULD_LINEMERGE to false, and define explicit timestamp configuration as well (TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD). This not only makes parsing more reliable, it also greatly improves performance, as Splunk doesn't have to apply all of its auto-detection magic.
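
To illustrate why an explicit lookahead helps, here is a small Python sketch of the idea, not of Splunk's actual implementation. It assumes the timestamp sits at the very start of the event and that Python's %f is a close enough stand-in for Splunk's %3N. The function name event_time is made up. Only the first 28 characters are examined, so the later date in attribute3 can never be picked up as the event time.

```python
from datetime import datetime, timezone

# Mirrors MAX_TIMESTAMP_LOOKAHEAD=28: "2018-11-28T11:25:32.876+0000" is 28 chars.
MAX_TIMESTAMP_LOOKAHEAD = 28

# Python approximation of TIME_FORMAT=%Y-%m-%dT%H:%M:%S.%3N%z
# (%f stands in for Splunk's %3N; this mapping is an assumption).
PY_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def event_time(event):
    """Hypothetical helper: parse only the leading timestamp of an event."""
    return datetime.strptime(event[:MAX_TIMESTAMP_LOOKAHEAD], PY_TIME_FORMAT)


event = (
    "2018-11-28T11:25:32.876+0000 STDIO [INFO] 2018-11-28 11:25:32 "
    "[Thread-3-ESWriterBolt] DEBUG BaseBolt - {\n"
    '"attribute3": "2018-11-28T13:11:45.3720",\n'
    "}"
)

ts = event_time(event)
print(ts.isoformat())
```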

harsmarvania57
SplunkTrust

Thanks @FrankVl, I've updated the original answer. I didn't notice this because I was testing with only one event.

tlabue
Path Finder

Thanks to you both for your help. I had tried a LINE_BREAKER previously, but it looks like my regex wasn't quite correct. First indications in the development lab are that this is working.

harsmarvania57
SplunkTrust

Hi,

Can you please post your props.conf for the above data?

tlabue
Path Finder

There really isn't much in it for this log file type:

[storm]
EXTRACT-Storm_Class_MessageType = ^[^ \n]* (?P[^ ]+)\s+[(?P\w+)
TRUNCATE = 0

The extraction is to pull some data out of the text part of the message, which is working fine.
