Monitoring Splunk

Error: slave has more than 10 number of continous replication failures. Look at splunkd.log for details on why this is happening.

splunk_user_99
Engager

Hello,

I have created a distributed search environment. On my master node I'm getting the error that the replication factor and search factor are not met, because of the following error on my indexers: "Error: slave has more than 10 number of continous replication failures. Look at splunkd.log for details on why this is happening."

So, does anyone know how to solve this problem? I have checked the connectivity between the indexers, and they can reach each other. What else could it be? Any ideas?

Best regards.


splunk_user_99
Engager

06.12.19 16:03:19,216

12-06-2019 16:03:19.216 +0100 INFO CMRepJob - job=CMReplicationErrorJob bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 failingGuid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 srcGuid=42C2AF4B-A702-44F4-B450-1D0156387DE8 tgtGuid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 succeeded

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 INFO CMSlave - bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 src=42C2AF4B-A702-44F4-B450-1D0156387DE8 tgt=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 failing=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 queued replication error job

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 INFO CMReplicationRegistry - Finished replication: bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 src=42C2AF4B-A702-44F4-B450-1D0156387DE8 target=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 WARN BucketReplicator - Failed to replicate warm bucket bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 to guid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 host=172.28.128.18 s2sport=9887. Connection failed

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - Discarding replication data as QueueRef=guid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 host=172.28.128.18 s2sport=9887 bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 is deleted

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 WARN BucketReplicator - Connection failed

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 ERROR TcpOutputFd - Connection to host=172.28.128.18:9887 failed

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 WARN TcpOutputFd - Connect to 172.28.128.18:9887 failed. No route to host

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - event=localReplicationFinished type=warm bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - event=finishBucketReplication bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 [et=1575639605 lt=1575639665 type=2]

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - Replicating warm bucket bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 node=guid=4CAFFC9C-BD32-437D-B603-DFA3BD15DC50 host=172.28.128.18 s2sport=9887 bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - Starting replication of bucket bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8 to 172.28.128.18:9887;

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd

06.12.19 16:03:19,214   

12-06-2019 16:03:19.214 +0100 INFO BucketReplicator - event=startBucketReplication bucket bid=_audit~91~42C2AF4B-A702-44F4-B450-1D0156387DE8

host = indexer1
source = /opt/splunk/var/log/splunk/splunkd.log
sourcetype = splunkd
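
The events above appear to be listed newest first, so the sequence reads bottom-up: replication of the bucket starts, the TCP connect to 172.28.128.18:9887 fails with "No route to host", and the replication error job is queued. On many Linux systems that errno is also what a firewall reject produces, not only a missing route, though that is an assumption about the environment. A minimal sketch for pulling the failing endpoints out of splunkd.log follows; the regex is tailored to the TcpOutputFd message format seen in this thread and the function name is hypothetical.

```python
import re

# Matches lines such as:
#   ... WARN TcpOutputFd - Connect to 172.28.128.18:9887 failed. No route to host
# The pattern is an assumption based on the log lines quoted in this thread.
CONNECT_FAIL = re.compile(
    r"TcpOutputFd - Connect to (?P<host>[\d.]+):(?P<port>\d+) failed\. (?P<reason>.+)$"
)


def failed_endpoints(lines):
    """Return a dict mapping (host, port) to the set of failure reasons seen."""
    found = {}
    for line in lines:
        match = CONNECT_FAIL.search(line.strip())
        if match:
            key = (match.group("host"), int(match.group("port")))
            found.setdefault(key, set()).add(match.group("reason"))
    return found


# Example usage (path taken from the events above):
# with open("/opt/splunk/var/log/splunk/splunkd.log") as fh:
#     for endpoint, reasons in failed_endpoints(fh).items():
#         print(endpoint, sorted(reasons))
```

Grouping by host and port makes it easy to see whether every failure points at the same peer and the same port, which is the pattern in the events above.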

uagrawal_splunk
Splunk Employee

Can you provide the splunkd logs?


splunk_user_99
Engager

I have found the solution. There is a replication port for the communication between the indexers in a cluster. I had forgotten to set a firewall rule for this port. After I added a rule for it, replication worked.
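
For anyone hitting the same error: a general connectivity check (ping, or a test against another port) can succeed while the replication port itself is still blocked, so it helps to test that exact port from each peer. Below is a minimal sketch of such a check. The host 172.28.128.18 and port 9887 come from the log lines earlier in this thread; the function name and the timeout are illustrative assumptions, and your replication port may differ.

```python
import socket
from typing import Tuple


def can_connect(host: str, port: int, timeout: float = 3.0) -> Tuple[bool, str]:
    """Attempt a plain TCP connection and report (ok, detail)."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the with-block closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts, and "No route to host".
        return False, f"{type(exc).__name__}: {exc}"


# Example usage, run from the source indexer against the target peer:
# ok, detail = can_connect("172.28.128.18", 9887)
# print("replication port reachable" if ok else f"blocked or down: {detail}")
```

A successful connect only shows that the port is open at the TCP level. It says nothing about whether the listener is actually the peer's replication port, so treat it as a quick firewall check and not as a full health test.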


splunk_user_99
Engager

But anyway, thank you for your interest.
