We write to a log file that Splunk indexes. Most lines are indexed fine, but sometimes Splunk indexes a line before it has finished being written.
This is due to the way the process generates the log line. Is there a way to tell Splunk not to index a line until the next line is seen?
Hi,
I know this is an old question, but an answer would have saved my day had it been posted earlier :)
The answer is to add time_before_close = 60 (or another integer number of seconds) to the input's stanza in inputs.conf; after that, all events were indexed correctly!
https://answers.splunk.com/answers/103132/events-are-broken-in-the-middle-of-the-line.html
https://answers.splunk.com/answers/492950/the-app-is-indexing-event-before-the-tmg-has-write.html
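For anyone landing here later, a minimal sketch of what that setting could look like in inputs.conf. The monitored path, sourcetype name, and index below are made-up placeholders, not taken from the original poster's setup, and 60 seconds is just the value mentioned above; pick one longer than the slowest write you expect.

```ini
# inputs.conf (sketch; path, sourcetype and index are hypothetical)
[monitor:///var/log/jmeter/results.log]
sourcetype = jmeter_log
index = main
# Keep the file open and wait this many seconds after reaching EOF
# before treating the data read so far as complete.
time_before_close = 60
```

The trade-off is latency: the last lines of a quiet file may reach the index later, since Splunk waits longer before closing the file. A restart of the forwarder or indexer doing the monitoring is normally needed for inputs.conf changes to take effect.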
We are using JMeter, which starts writing a line, then appends more to it in stages until the line is complete. Splunk indexes it partially.
I just thought there might be a way to stop Splunk from indexing a line until it sees the start of the next line, say the date.
Have you tried MUST_NOT_BREAK_BEFORE set to a newline or carriage return or both?
How is the JMeter input configured? Are you sure there is no "\r" or "\n" hidden in the slow log line? See http://docs.splunk.com/Documentation/Splunk/6.2.0/Data/Indexmulti-lineevents for info on event breaking.
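To make the event-breaking suggestions concrete, here is a rough props.conf sketch that only starts a new event when a line begins with a timestamp. The sourcetype name is a placeholder I made up. Note that these settings control how lines are merged into events at parse time; I am not certain they help when the input has already picked up a half-written line, so treat this as something to try rather than a confirmed fix.

```ini
# props.conf (sketch; the sourcetype name is hypothetical)
[jmeter_log]
# Allow several raw lines to be merged into one event.
SHOULD_LINEMERGE = true
# Only begin a new event when a line starts with a recognizable date.
BREAK_ONLY_BEFORE_DATE = true
```

MUST_NOT_BREAK_BEFORE takes a regular expression and is evaluated in the same line-merging step, so it could be added to this stanza if the date rule alone is not enough.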
If you are indexing a log file written by a process that buffers its output, then you will always have this problem (in my experience). I had a couple of processes that did that, and I had to force each one to flush its buffer as it went, even when the buffer was only partly full. One example was a curl job that collected output and wrote it to a file that Splunk indexed. I had to invoke curl with the flag that tells it not to buffer the output (-N / --no-buffer). Without it, lines were split all over the place, because the output was written 4096 bytes at a time.
Is this the type of thing you are seeing?