Splunk is monitoring the access log files using the stanza below:
[monitor:///opt/logging/prodops_httpd]
blacklist = (\.snapshot|\.gz$)
disabled = 0
followTail = 0
host_regex = /opt/logging/prodops_httpd/(.*)/.*\.log
whitelist = (access|error)\.log$
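To see how the three patterns in this stanza interact, here is a minimal sketch in bash. It uses bash extended regular expressions as a stand-in for Splunk's PCRE matching, an assumption that only holds for simple patterns like these. The function names is_monitored and extract_host are hypothetical and not part of Splunk. The sample path is the one from the splunkd.log message in this post.

```shell
#!/bin/bash
# Patterns copied verbatim from the monitor stanza.
BLACKLIST='(\.snapshot|\.gz$)'
WHITELIST='(access|error)\.log$'
HOST_REGEX='/opt/logging/prodops_httpd/(.*)/.*\.log'

# Hypothetical helper: succeeds when a path passes both filters.
# The blacklist is checked first because it takes precedence over the whitelist.
is_monitored() {
    local path=$1
    [[ $path =~ $BLACKLIST ]] && return 1
    [[ $path =~ $WHITELIST ]]
}

# Hypothetical helper: print the first capture group of host_regex,
# which is the value Splunk would assign to the host field.
extract_host() {
    local path=$1
    [[ $path =~ $HOST_REGEX ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

for f in \
    /opt/logging/prodops_httpd/ny-web-02.na.rtdom.net/access.log \
    /opt/logging/prodops_httpd/ny-web-02.na.rtdom.net/access.log.gz
do
    if is_monitored "$f"; then
        echo "monitored: $f (host=$(extract_host "$f"))"
    else
        echo "skipped:   $f"
    fi
done
```

With these patterns, a rotated and compressed file is skipped both because it matches the blacklist and because it no longer ends in .log.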
The log files are rotated every night using a logrotate configuration and script like the ones below:
/etc/logrotate.d/httpd:
/etc/httpd/logs/*log {
missingok
notifempty
sharedscripts
postrotate
/sbin/service httpd reload > /dev/null 2>/dev/null || true
/usr/local/bin/httpdlogrotate.sh
endscript
}
/usr/local/bin/httpdlogrotate.sh:
#!/bin/bash
LOGDIR=/etc/httpd/logs
LogDate=$(date +%Y-%m-%d)
for i in $(find $LOGDIR -name "*log.1")
do
FILENAME=$(echo $i|awk -F \/ '{print $NF}' | sed 's/\.1$//')
mv $i $LOGDIR/archive/$FILENAME.$LogDate
gzip $LOGDIR/archive/$FILENAME.$LogDate
done
find $LOGDIR/archive/ -type f -name "*gz" -ctime +4 -exec /bin/rm {} \;
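The renaming step in the script strips the directory and the trailing .1 from each rotated file, then appends the date. Here is a minimal sketch of that naming logic. It uses basename in place of the awk and sed pipeline, a swapped-in approach that gives the same result for names ending in .1. The helper archive_name is hypothetical, and the sample file name access_log.1 is assumed for illustration.

```shell
#!/bin/bash
# Hypothetical helper mirroring the script's naming logic:
# drop the directory and the ".1" suffix, then append the date.
archive_name() {
    local path=$1 date=$2
    printf '%s.%s\n' "$(basename "$path" .1)" "$date"
}

# Example: name the archive for an assumed rotated file, using today's date.
archive_name /etc/httpd/logs/access_log.1 "$(date +%Y-%m-%d)"
```

The script then compresses the result with gzip, so the archived file carries an additional .gz suffix.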
Intermittently, log files are indexed multiple times.
The splunkd.log file shows messages like the one below:
02-25-2014 09:42:29.085 -0500 WARN TailingProcessor - Access error while handling path: Failed to get file size from prevFd for fstate where file='/opt/logging/prodops_httpd/ny-web-02.na.rtdom.net/access.log'
The above message indicates that Splunk called 'fstat' on a file and the call failed. In other words, Splunk opened the file and still holds it open, yet it can no longer determine the file's size. That is not normal.
This usually points to a platform-level problem, such as an NFS problem, file system corruption, or a kernel bug. For NFS, we do not recommend soft mounts.
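If the monitored directory sits on NFS, one way to check for soft mounts is to inspect the mount options. Here is a minimal sketch for Linux, assuming the mount table is readable at /proc/mounts. The helper name list_soft_nfs_mounts is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read mount entries in /proc/mounts format on stdin and
# print the mount point of every NFS file system mounted with the "soft" option.
list_soft_nfs_mounts() {
    awk '($3 == "nfs" || $3 == "nfs4") {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "soft") { print $2; break }
    }'
}

# Check the live mount table when it is available (Linux).
if [ -r /proc/mounts ]; then
    list_soft_nfs_mounts < /proc/mounts
fi
```

Any mount point printed here that contains the monitored path would be a candidate for remounting with the hard option.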
The issue was resolved after a new index was installed.