Hello,
I have set up a distributed search environment. On my master node I'm getting the error that the replication factor and search factor are not met, caused by the following error on my indexers: "Error: slave has more than 10 number of continous replication failures. Look at splunkd.log for details on why this is happening."
Does anyone know how to solve this problem? I have checked the connectivity between the indexers, and they can reach each other. What else could it be? Any ideas?
Best regards.
06.12.19 16:03:19,216
12-06-2019 16:03:19.216 +0100 INFO CMRepJob - job=CMReplicationErrorJob bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 failingGuid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 srcGuid=42C2AF4B-A702-44F4-B450-1D0156387DE8 tgtGuid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 succeeded
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 INFO CMSlave - bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 src=42C2AF4B-A702-44F4-B450-1D0156387DE8 tgt=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 failing=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 queued replication error job
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 INFO CMReplicationRegistry - Finished replication: bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 src=42C2AF4B-A702-44F4-B450-1D0156387DE8 target=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 WARN BucketReplicator - Failed to replicate warm bucket bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 to guid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 host=172.28.128.18 s2sport=9887. Connection failed
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - Discarding replication data as QueueRef=guid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 host=172.28.128.18 s2sport=9887 bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 is deleted
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 WARN BucketReplicator - Connection failed
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 ERROR TcpOutputFd - Connection to host=172.28.128.18:9887 failed
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 WARN TcpOutputFd - Connect to 172.28.128.18:9887 failed. No route to host
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - event=localReplicationFinished type=warm bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - event=finishBucketReplication bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 [et=1575639605 lt=1575639665 type=2]
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - Replicating warm bucket bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 node=guid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 host=172.28.128.18 s2sport=9887 bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - Starting replication of bucket bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 to 172.28.128.18:9887;
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
06.12.19 16:03:19,214
12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - event=startBucketReplication bucket bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8
host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
Can you provide the splunkd logs?
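I have found the solution. The indexers in a cluster communicate with each other over a dedicated replication port (9887 in my case, the s2sport shown in the log above). I had forgotten to add a firewall rule for this port, which is why the log shows "No route to host" even though the indexers could otherwise reach each other. After I added a rule for this port, replication worked.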
But anyway, thank you for your interest.
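For anyone who hits the same error: a general reachability check between the indexers does not prove that the replication port itself is open. Below is a minimal Python sketch of a TCP check against that port, to be run from one indexer against its peers. The function name `can_connect` is just an illustration, and the address and port are the ones from the log above, so adjust them to your own setup.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as exc:
        # Covers "No route to host", "Connection refused" and timeouts.
        print(f"{host}:{port} unreachable: {exc}")
        return False


if __name__ == "__main__":
    # Peer address and replication port taken from the log above (assumptions
    # for any other environment).
    peers = ["172.28.128.18"]
    replication_port = 9887
    for peer in peers:
        status = "open" if can_connect(peer, replication_port) else "blocked or closed"
        print(f"{peer}:{replication_port} is {status}")
```

If the check fails with "No route to host" while the peer is otherwise reachable, a firewall on the target indexer is a likely cause, as it was here.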