I have a cluster with 2 indexers (RF=2) running Splunk 7.1.2 on Windows Server 2012.
I often get the following error:
Indexer Clustering: too many bucket replication errors to target peer.
In splunkd.log on both indexers I found similar errors:
ERROR TimeInvertedIndex - Failed to rename from="C:\Splunk\var\lib\splunk\audit\db\hot_v1_278\.rawSize_tmp" to="C:\Splunk\var\lib\splunk\audit\db\hot_v1_278\.rawSize": Access is denied.
ERROR LMApplyResponse - failed to rename C:\Splunk\var\lib\splunk\fishbucket\rawdata\1322324208-C:\Splunk\var\lib\splunk\fishbucket\rawdata\1322324208.old [1,1,1] (Access is denied.)
ERROR HotBucketRoller - Unable to rename from='C:\Splunk\var\lib\splunk\_internaldb\db\hot_v1_572' to='C:\Splunk\var\lib\splunk\_internaldb\db\db_1570774736_1570503894_572_F0C749EE-B861-4598-B107-5358365E79D8' because The system cannot find the file specified.
and so on. Notably, there are NO buckets in fixup state.
Splunk runs under a local OS Administrator account (no Active Directory).
I checked the file permissions on the fishbucket\rawdata\*.old files and found that no user or group has any access to them, not even SYSTEM!
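For reference, this is how the ACLs can be inspected with icacls (paths taken from the errors above; the wildcard is my addition):

```shell
:: Show the ACL on the orphaned .old files under the fishbucket
:: (run from an elevated cmd prompt on each indexer)
icacls "C:\Splunk\var\lib\splunk\fishbucket\rawdata\*.old"

:: On a healthy file you would expect entries such as:
::   NT AUTHORITY\SYSTEM:(F)
::   BUILTIN\Administrators:(F)
:: On the broken files here, no ACEs were listed at all.
```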
1. Removed fishbucket\rawdata\1322324208.old file
2. Executed icacls.exe commands to reset the Splunk directory permissions:
icacls.exe "C:\Splunk" /inheritance:e /T
icacls.exe "C:\Splunk" /T /Q /reset
3. Initiated a rolling restart on the Master Node. The indexers entered maintenance mode and restarted.
This caused a huge number of fixup tasks after the restarts succeeded and fixed all issues for a while, but today I got the same errors on other buckets.
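One variant I have not tried yet (so treat it as an assumption, not a verified fix) would be to give the account that runs splunkd an explicit, inheritable full-control ACE instead of only resetting inheritance, so newly created files pick up a sane DACL:

```shell
:: Hypothetical: grant the local account running splunkd (here: Administrator)
:: explicit, inheritable Full control on the whole Splunk tree.
:: (OI) = object inherit, (CI) = container inherit, F = full access
icacls "C:\Splunk" /grant "Administrator:(OI)(CI)F" /T
```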
Why does Splunk still create files with broken access rights? What can I do in this situation? Should I reinstall Splunk?
Note: I know that running Splunk on Windows is a pain, and I can't move to Linux servers. I have 2 other indexer clusters running on Windows and they work fine.