I just solved my own variation of this problem...
Key discovery: empirically, at least in v6.5.0, the helpful behavior of time_before_close applies only to the tailing processor and not to the batch reader. (see also https://answers.splunk.com/answers/109779/when-is-the-batchreader-used-and-when-is-the-tailingprocessor-used.html)
During peak periods, one particular log file of mine grows so quickly that my UF would decide to read it using the batch reader, and each hour I saw hundreds of log messages from that file split into two Splunk events each, along with contemporaneous "INFO TailReader - Batch input finished reading file" messages in splunkd.log.
I don't want rsyslog working extra hard to write individual syslog messages to the filesystem atomically, so my solution was to stop using the batch reader:
# Set very high so we never use the batch reader.
min_batch_size_bytes = 1000000000000
# Set higher than the number of simultaneously active log files to
# avoid splitting events.
max_fd = 10000
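For context, a complete stanza might look like the following. Note that the stanza name and file placement are my assumption based on the limits.conf spec for 6.x; verify against the documentation for your Splunk version before deploying:

```
# limits.conf on the universal forwarder
[inputproc]
# Set very high so we never use the batch reader.
min_batch_size_bytes = 1000000000000
# Set higher than the number of simultaneously active log files to
# avoid splitting events.
max_fd = 10000
```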
It's best to keep the field names in the top row of the CSV; Splunk will automatically pick them up as the field names.
Settings -> Add Data -> Monitor -> Files & Directories (file.csv)
After adding the data, you can see the following in props.conf:
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Structured
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true
You can then search directly with the field names [ _time, IP, lOC ]
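Conceptually, INDEXED_EXTRACTIONS = csv behaves like Python's csv.DictReader: the first row supplies the field names, and every subsequent row becomes an event whose values are addressable by those names. A minimal sketch (the sample data is hypothetical, reusing the field names mentioned above):

```python
import csv
import io

# Hypothetical CSV content with a header row, like file.csv above.
sample = "_time,IP,lOC\n2024-01-01T00:00:00,10.0.0.1,us-east\n"

# The first row becomes the field names; each later row becomes a record.
rows = list(csv.DictReader(io.StringIO(sample)))
print(rows[0]["IP"])  # -> 10.0.0.1
```

This is why a header row matters: without it, Splunk (like DictReader) has no names to assign and you lose the per-field search behavior.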