Why are we receiving search and replication factor errors while we are building a new indexer cluster?

sathwik067
Explorer

Hello all,

We are trying to build a new indexer cluster with a new cluster master. We installed Splunk on all the servers and integrated the indexers with the cluster master. After completing the process, we are getting search factor and replication factor errors with the warning message below. We checked port connectivity between all the indexers and the cluster master, and everything is connected, but we are still getting this warning. We tried cleaning up the eventdata as suggested in one of the posts, but that did not work either. Please let me know if anyone has faced this type of issue and resolved it; that would be very helpful. Let me know if you need any more info.

We have search factor = 2 and replication factor = 2 with three indexers.

Search peer abcd.com has the following message: Too many bucket replication errors to target peer=xx.xx.xx.xx:8080. Will stop streaming data from hot buckets to this target while errors persist. Check for network connectivity from the cluster peer reporting this issue to the replication port of target peer. If this condition persists, you can temporarily put that peer in manual detention.

Thanks.

sathwik067
Explorer

Hello all,

The problem was that the MTU on the 1 Gb bonded network interface was set to 9000 on our new indexers. We changed it to 1500, and that fixed the search factor and replication factor errors.

Thanks.

DarshanBK
Explorer

@sathwik067 How do I check the MTU setting?
Is it something that needs to be done on the Splunk end or the network end?

sathwik067
Explorer

Hi @DarshanBK, if you are running Linux, your Linux team can make this change; there is nothing to change on the Splunk end. To see the MTU value, you can run the "ifconfig" command. Its output includes the MTU of each network interface along with other interface details such as addresses and packet counters.
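
As a complement to "ifconfig", here is a minimal sketch that lists the MTU of every interface by reading sysfs. It assumes a Linux host; the function name list_mtu and its optional directory argument are illustrative and not part of Splunk or any standard tool.

```shell
#!/bin/sh
# List the MTU of every network interface by reading sysfs (Linux only).
# The optional first argument overrides the sysfs directory; it exists only
# so the function can be tried against a scratch directory.
list_mtu() {
    net_root="${1:-/sys/class/net}"
    for dev in "$net_root"/*; do
        # Skip entries that do not expose a readable mtu file.
        [ -r "$dev/mtu" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/mtu")"
    done
}
```

Calling list_mtu with no arguments prints one "interface mtu" pair per line. On systems without ifconfig, "ip link show" reports the same value, and "ip link set dev <interface> mtu 1500" changes it until the next reboot or network restart; a persistent change belongs in the distribution's network configuration.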

richgalloway
SplunkTrust

Have you checked the connectivity among the individual indexers?  Replication is direct from indexer to indexer - not via the CM - so it's critical for an indexer to be able to connect to all other indexers and not just the CM.
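
A rough sketch of that peer-to-peer check, meant to be run on each indexer in turn. It assumes bash and the coreutils timeout command are installed; the function names and the peer addresses in the usage note are placeholders, and 8080 is the replication port quoted in this thread.

```shell
#!/bin/sh
# Return 0 if a TCP connection to host:port opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be available.
check_port() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Check one port on every peer given after it and report each result.
check_peers() {
    port="$1"; shift
    for peer in "$@"; do
        if check_port "$peer" "$port"; then
            echo "$peer:$port reachable"
        else
            echo "$peer:$port NOT reachable"
        fi
    done
}
```

For example, "check_peers 8080 10.0.0.2 10.0.0.3" (hypothetical addresses) run from the first indexer reports on the other two. A "reachable" result only shows that the port accepts connections; it does not prove that large replication transfers get through.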

---
If this reply helps you, Karma would be appreciated.

sathwik067
Explorer

Hello,

Thanks for the response. Yes, we have checked the connectivity between the indexers as well and the ports are connected between the indexers.

isoutamo
SplunkTrust
What error messages did you find in splunkd.log on all those servers?

sathwik067
Explorer

Hello,

Thanks for the response. Below are some of the errors we are seeing on the indexers:

"ERROR TcpInputProc - Error encountered for connection from src=xx.xx.xx.xx:xxxx. Read Timeout Timed out after 600 seconds."
"BucketReplicator - Failed to replicate warm bucket bid=_internal~xx to guid=ABCD host=xx.xx.xx.xx s2sport=8080. Read timed out after 180 secs."
 
Missing enough suitable candidates to create searchable copy in order to meet replication policy. Missing={ default:1 
Waiting 'target_wait_time' before search factor fixup
Cannot fix search count as the bucket hasn't rolled yet. 

 

Search peer abcd.com has the following message: Too many bucket replication errors to target peer=xx.xx.xx.xx:8080. Will stop streaming data from hot buckets to this target while errors persist. Check for network connectivity from the cluster peer reporting this issue to the replication port of target peer. If this condition persists, you can temporarily put that peer in manual detention
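
For anyone collecting the same evidence, here is a small sketch that pulls messages like the splunkd.log entries quoted above out of the log. The component names come from those entries. The default path assumes SPLUNK_HOME is set, with /opt/splunk as a fallback guess, and the function name is illustrative.

```shell
#!/bin/sh
# Print ERROR and WARN lines from the two components seen in this thread.
# Pass a log path as the first argument to override the assumed default.
replication_errors() {
    log="${1:-${SPLUNK_HOME:-/opt/splunk}/var/log/splunk/splunkd.log}"
    grep -E 'BucketReplicator|TcpInputProc' "$log" | grep -E 'ERROR|WARN'
}
```

Running replication_errors on every indexer and comparing the "host=" and "src=" values shows which peer pairs are failing, which helps separate a single bad host from a network-wide problem.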

isoutamo
SplunkTrust
What do you get if you try this from one peer to another?
curl -v telnet://<peer name/ip>:8080

sathwik067
Explorer

It connects:
curl -v telnet://xx.xx.xx.xx:8080
* About to connect() to xx.xx.xx.xx port 8080 (#0)
* Trying xx.xx.xx.xx...
* Connected to xx.xx.xx.xx (xx.xx.xx.xx) port 8080 (#0)
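
A successful connect like the one above exchanges only small TCP handshake packets, so it can pass even when large packets are being dropped, which is consistent with the MTU setting identified in the accepted solution. The sketch below probes the path with full-size packets. It assumes the Linux iputils ping (for the -M do "don't fragment" option) and that ICMP is allowed between the peers; the function names are illustrative.

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented IPv4 packet of a given
# MTU: the MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Send 3 pings of that size with fragmentation prohibited.
# Failure at MTU 9000 combined with success at MTU 1500 suggests that some
# hop between the peers does not pass jumbo frames.
probe_mtu() {
    host="$1"; mtu="$2"
    ping -M do -c 3 -s "$(icmp_payload_for_mtu "$mtu")" "$host"
}
```

For example, "probe_mtu xx.xx.xx.xx 9000" followed by "probe_mtu xx.xx.xx.xx 1500", run from one indexer against another. If ICMP is filtered, both probes fail and the test is inconclusive.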
