I am trying to solve a problem where a particular JSON data feed/source has intermittent line break failures. In a 24 hour period, there are about 100K events parsed correctly (i.e., the line breaks are applied to the correct location) and about 40K events are parsed incorrectly (i.e., the 40K source/application events appear as 170 clumped events in Splunk). Hopefully that makes sense. My question: What are possible causes for this behavior?
This Q&A mentions an open support case that may be a genuine bug.
https://answers.splunk.com/answers/374017/why-are-events-getting-merged-around-midnight-afte.html
Other than that, the problem is usually that the user has redefined LINE_BREAKER but left SHOULD_LINEMERGE = true (incorrect) instead of setting SHOULD_LINEMERGE = false (correct).
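As a minimal sketch, a props.conf stanza for a newline-delimited JSON feed might look like the following. The sourcetype name is a hypothetical placeholder, and the LINE_BREAKER pattern is an illustrative assumption about the feed's format, not taken from the question:

```ini
# props.conf on the indexer (or heavy forwarder) that parses this feed.
# [my_json_feed] is a hypothetical sourcetype name.
[my_json_feed]
# The first capture group in LINE_BREAKER is consumed as the event
# boundary; here each event is assumed to end at a newline.
LINE_BREAKER = ([\r\n]+)
# With LINE_BREAKER redefined, line merging must be disabled, or
# Splunk will re-glue the broken lines into clumped events.
SHOULD_LINEMERGE = false
```

If the events can legitimately span multiple physical lines, the LINE_BREAKER regex would instead need to match the boundary between two JSON objects rather than a bare newline.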
We set SHOULD_LINEMERGE = false and that resolved the issue. Thanks!
Check for errors/warnings in the _internal logs on the indexers for the sourcetype you're seeing the issue with.
index=_internal sourcetype=splunkd component=LineBreakingProcessor (log_level=ERROR OR log_level=WARN)
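Since the linked Q&A describes merging that clusters around midnight, one illustrative variation is to bucket those warnings over time to see whether they follow the same pattern (component and field names here are assumptions based on typical splunkd logging, so verify them against your own _internal data):

```
index=_internal sourcetype=splunkd component=LineBreakingProcessor (log_level=ERROR OR log_level=WARN)
| timechart span=1h count by host
```

A spike at a consistent hour on specific indexers would point at a time-based trigger (e.g., log rotation on the source) rather than a random parsing failure.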
Thanks for this troubleshooting step.