I have a monitored file input for a .tsv file that gets updated via a SQL query every hour. However, the data only shows up in the index intermittently; I haven't been able to determine the frequency, but it isn't hourly as it should be. If I restart the forwarder, I see the TailingProcessor add a watch, but the file is subsequently handled by the BatchReader, as shown in the log snippet below. For other file inputs using the [monitor://...] stanza, I don't see any log entries related to the BatchReader. Any ideas why this one is being treated differently? The universal forwarder is version 6.2.1.
# grep metrics5.tsv /opt/splunkforwarder/var/log/splunk/splunkd.log
04-27-2015 09:46:13.653 -0400 INFO TailingProcessor - Parsing configuration stanza: monitor:///data/log/hadoop_job_metrics/metrics5.tsv.
04-27-2015 09:46:13.653 -0400 INFO TailingProcessor - Adding watch on path: /data/log/hadoop_job_metrics/metrics5.tsv.
04-27-2015 09:46:13.660 -0400 INFO BatchReader - Removed from queue file='/data/log/hadoop_job_metrics/metrics5.tsv'.
04-27-2015 10:01:47.734 -0400 INFO BatchReader - Removed from queue file='/data/log/hadoop_job_metrics/metrics5.tsv'.
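To pin down how often the BatchReader actually picks the file up, the gaps between its log entries can be measured. The following is only a rough sketch: the function name `batch_reader_gaps` and the regular expression are my own, and they assume splunkd.log lines keep the format shown in the snippet above.

```python
import re
from datetime import datetime

# Assumed line format, taken from the snippet above, e.g.:
# 04-27-2015 10:01:47.734 -0400 INFO BatchReader - Removed from queue file='...'.
LINE_RE = re.compile(
    r"^(\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"INFO\s+BatchReader - .*file='([^']+)'"
)

def batch_reader_gaps(lines, path):
    """Return the seconds between consecutive BatchReader entries for `path`."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2) == path:
            # %f accepts the three-digit milliseconds, %z the -0400 offset.
            stamps.append(
                datetime.strptime(match.group(1), "%m-%d-%Y %H:%M:%S.%f %z")
            )
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]

# The two BatchReader lines from the snippet above.
sample = [
    "04-27-2015 09:46:13.660 -0400 INFO BatchReader - Removed from queue "
    "file='/data/log/hadoop_job_metrics/metrics5.tsv'.",
    "04-27-2015 10:01:47.734 -0400 INFO BatchReader - Removed from queue "
    "file='/data/log/hadoop_job_metrics/metrics5.tsv'.",
]
gaps = batch_reader_gaps(sample, "/data/log/hadoop_job_metrics/metrics5.tsv")
print(gaps)
```

For the two entries above, the gap works out to roughly 934 seconds (about 15.5 minutes). Passing the lines of /opt/splunkforwarder/var/log/splunk/splunkd.log instead of `sample` would give the gaps over a longer period.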
The API also indicates it is being read in batch mode:
https://localhost:8089/services/admin/inputstatus/TailingProcessor%3AFileStatus
/data/log/hadoop_job_metrics/metrics5.tsv
file position 23783042
file size 23783042
percent 100.00
type done reading (batch)
inputs.conf:
[monitor:///data/log/hadoop_job_metrics/metrics5.tsv]
disabled = false
sourcetype = hadoop_job_metrics_v2
index = main
crcSalt = <SOURCE>
props.conf:
[hadoop_job_metrics_v2]
FIELD_DELIMITER = tab
FIELD_NAMES = JOB_ID,JOB_STATUS,JOB_FAILED_MAP_ATTEMPTS,JOB_FAILED_REDUCE_ATTEMPTS,JOB_FILE_BYTES_WRITTEN,JOB_FINISHED_MAP_TASKS,JOB_FINISHED_REDUCE_TASKS,JOB_PRIORITY,JOB_TOTAL_LAUNCHED_MAPS,JOB_TOTAL_LAUNCED_REDUCES,JOB_CPU_MILLISECONDS,MAP_CPU_MILLISECONDS,RED_CPU_MILLISECONDS,JOB_MAPRFS_BYTES_READ,MAP_MAPRFS_BYTES_READ,RED_MAPRFS_BYTES_READ,JOB_MAPRFS_BYTES_WRITTEN,MAP_MAPRFS_BYTES_WRITTEN,RED_MAPRFS_BYTES_WRITTEN,JOB_PHYSICAL_MEMORY_BYTES,MAP_PHYSICAL_MEMORY_BYTES,RED_PHYSICAL_MEMORY_BYTES,JOB_VIRTUAL_MEMORY_BYTES,MAP_VIRTUAL_MEMORY_BYTES,RED_VIRTUAL_MEMORY_BYTES,JOB_NAME,PARENT_JOB_ID,USER_SUBMITTED,TIME_SUBMITTED,TIME_STARTED,TIME_FINISHED,CLUSTER_ID,CREATED
HEADER_FIELD_DELIMITER = tab
INDEXED_EXTRACTIONS = tsv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = CREATED
category = Structured
description = Tab-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true