MUST_BREAK_AFTER and time_before_close? We are using Splunk 6.2, by the way.
We are writing application (.NET Framework) logs to a file on a Windows system. Occasionally some events are split at seemingly random locations. When the log file containing a split event is imported on another Splunk instance, the event is not split. Neither the log file size, the location of the events in the log file, nor the size of the events is consistent. Further investigation led me to this question [1]. The problem explained there seems to be similar, though in our case events are not split periodically and we do not use or set the time_before_close parameter. Basically, due to buffering etc., it is possible that only part of an event is written to the file before the remaining data is added, which may be causing the problem.
So I created a simple test setup to check the behavior of Splunk when writing partial events to a file:

inputs.conf

[monitor://C:\Logs]
disabled = false
host = MyHost
index = test
whitelist = ^.*\.log$
sourcetype = mySourcetype

props.conf

[mySourcetype]
CHARSET = CP1252
BREAK_ONLY_BEFORE_DATE = false
BREAK_ONLY_BEFORE = \[Start\]
MAX_EVENTS = 1000000000
TIME_PREFIX = ([\r\n]|\s)timestamp="
MAX_TIMESTAMP_LOOKAHEAD = 64

Then I created the file C:/Logs/MyLog.log with the following content (a partial event) while running a real-time search:

[Start]
Test

After a few seconds a new event was reported. So I assume that because the file has not been written to for 3 seconds (time_before_close defaults to 3), Splunk determines this must be a complete event, even though the documentation for BREAK_ONLY_BEFORE states that Splunk creates a new event only if it encounters a new line that matches the regular expression [2]. There is no new line matching the regular expression at the end of the file, not even a newline character, only EOF.
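The partial-write test described above could be reproduced with a small script instead of a text editor. This is only a sketch: the function name, the pause length, and the content of the second write are made up for illustration, and only the "[Start]" / "Test" lines and the CP1252 encoding come from the setup above.

```python
import os
import time


def write_event_in_two_steps(path, first_part, rest, pause_seconds):
    """Append first_part, force it to disk, wait, then append rest.

    The pause simulates an application whose buffer is flushed in the
    middle of an event.
    """
    # newline="" writes the text exactly as given (no "\n" -> "\r\n" translation).
    with open(path, "a", encoding="cp1252", newline="") as log:
        log.write(first_part)
        log.flush()
        os.fsync(log.fileno())  # make the partial event visible to a file monitor
        time.sleep(pause_seconds)
        log.write(rest)


# Hypothetical usage: pause longer than the 3 second default of
# time_before_close, then complete the event.
# write_event_in_two_steps(r"C:\Logs\MyLog.log",
#                          "[Start]\nTest", "\nMore\n[End]\n", 10)
```

If the observation above holds, a real-time search should show "[Start] / Test" as its own event during the pause, and the remainder as a second event afterwards.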
Increasing the time_before_close parameter could fix the problem, except that the last event would be delayed for the specified time, as there won't be a new event after the last one. Therefore we will also need to use MUST_BREAK_AFTER to ensure the last event can be reported immediately:
inputs.conf

[monitor://D:\LogFiles\Test]
disabled = false
host = MyHost
index = test
whitelist = ^.*\.log$
sourcetype = mySourcetype
time_before_close = 300

props.conf

[mySourcetype]
CHARSET = CP1252
BREAK_ONLY_BEFORE_DATE = false
BREAK_ONLY_BEFORE = \[Start\]
MUST_BREAK_AFTER = \[End\]
MAX_EVENTS = 1000000000
TIME_PREFIX = ([\r\n]|\s)timestamp="
MAX_TIMESTAMP_LOOKAHEAD = 64
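To make the intended effect of the two break rules concrete, here is a rough Python model of the segmentation I expect. It is not how Splunk implements line merging; the function name and sample lines are hypothetical, and it assumes the markers are meant literally (i.e. the regexes are \[Start\] and \[End\]).

```python
import re

START = re.compile(r"\[Start\]")  # stands in for BREAK_ONLY_BEFORE
END = re.compile(r"\[End\]")      # stands in for MUST_BREAK_AFTER


def split_events(lines):
    """Group lines into events; return (complete_events, pending_lines)."""
    events, current = [], []
    for line in lines:
        # A line matching START begins a new event, closing any open one.
        if START.search(line) and current:
            events.append(current)
            current = []
        current.append(line)
        # A line matching END closes the event immediately, without
        # waiting for the next START line or for a timeout.
        if END.search(line):
            events.append(current)
            current = []
    # Whatever is left has seen neither a following START nor an END.
    return events, current


complete, pending = split_events(["[Start]", "A", "[End]", "[Start]", "Test"])
print(complete)  # [['[Start]', 'A', '[End]']]
print(pending)   # ['[Start]', 'Test']
```

In this model the pending lines are the partial event: with a long time_before_close they would stay open until the rest arrives, while an event that ends in [End] is emitted right away.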
There is an excellent wiki that tells more than most people can grok:
http://wiki.splunk.com/Community:HowIndexingWorks
Thanks a lot - that's exactly what I was looking for.