I have a cluster with 2 indexers (replication factor 2) running Splunk version 7.1.2 on Windows Server 2012.
I often get the following error:
Indexer Clustering: too many bucket replication errors to target peer.
In splunkd.log on both indexers I found errors like these:
ERROR TimeInvertedIndex - Failed to rename from="C:\Splunk\var\lib\splunk\audit\db\hot_v1_278\.rawSize_tmp" to="C:\Splunk\var\lib\splunk\audit\db\hot_v1_278\.rawSize": Access is denied.
ERROR LMApplyResponse - failed to rename C:\Splunk\var\lib\splunk\fishbucket\rawdata\1322324208 -> C:\Splunk\var\lib\splunk\fishbucket\rawdata\1322324208.old [1,1,1] (Access is denied.)
ERROR HotBucketRoller - Unable to rename from='C:\Splunk\var\lib\splunk\_internaldb\db\hot_v1_572' to='C:\Splunk\var\lib\splunk\_internaldb\db\db_1570774736_1570503894_572_F0C749EE-B861-4598-B107-5358365E79D8' because The system cannot find the file specified.
and so on.
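(For reference, I pulled these with a CLI search along these lines; the component list is just the ones I happened to see, and C:\Splunk is my install path:)
"C:\Splunk\bin\splunk.exe" search "index=_internal sourcetype=splunkd log_level=ERROR (component=TimeInvertedIndex OR component=LMApplyResponse OR component=HotBucketRoller)"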
And there are NO buckets in fixup state.
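(I checked this on the Master Node under Settings > Indexer Clustering > Bucket Status, and also with the CLI on the master, something like:)
"C:\Splunk\bin\splunk.exe" show cluster-status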
Splunk runs under a local OS Administrator account (the servers are not joined to AD).
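(For completeness, the service logon account can be double-checked like this; Splunkd is the default service name, adjust if yours differs:)
sc qc Splunkd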
I checked file permissions on the fishbucket\rawdata\*.old files and found that no user or group has any access to them, not even SYSTEM!
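(Plain icacls confirms it, e.g.:)
icacls.exe "C:\Splunk\var\lib\splunk\fishbucket\rawdata\1322324208.old"
The output listed the file with no ACEs under it at all.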
My steps:
1. Removed the fishbucket\rawdata\1322324208.old file
2. Ran icacls.exe to repair the Splunk directory permissions (re-enable ACL inheritance across the tree, then reset all ACLs to the inherited defaults):
icacls.exe "C:\Splunk" /inheritance:e /T
icacls.exe "C:\Splunk" /T /Q /reset
3. Initiated a rolling restart from the Master Node; the indexers went into maintenance mode and restarted.
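(I kicked this off from the Master Node UI; as far as I know the CLI equivalent on the master is roughly:)
"C:\Splunk\bin\splunk.exe" enable maintenance-mode
"C:\Splunk\bin\splunk.exe" rolling-restart cluster-peers
"C:\Splunk\bin\splunk.exe" disable maintenance-mode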
This triggered a huge number of fixup tasks once the restarts completed, and it fixed everything for a while, but today the same errors showed up on other buckets.
Why does Splunk keep creating files with broken access rights? What can I do in this situation? Should I reinstall Splunk?
Note: I know that running Splunk on Windows is a pain, and I can't move to Linux servers. I have two other indexer clusters running on Windows and they're doing fine.