Solved: Re: We are building a new indexer cluster and gett...

sathwik067 · ‎11-16-2020

Hello all,

We are trying to build a new indexer cluster with a new cluster master. We installed Splunk on all the servers and joined the indexers to the cluster master. After completing the setup, we are getting search factor and replication factor errors along with the warning message below. We checked port connectivity between all the indexers and the cluster master, and everything connects, but we are still getting this warning. We tried cleaning the eventdata as suggested in one of the posts, but that did not work either. If anyone has faced and resolved this type of issue, please let me know; that would be very helpful. Let me know if you need any more information.

We have search factor and replication factor = 2 with three indexers.

Search peer abcd.com has the following message: Too many bucket replication errors to target peer=xx.xx.xx.xx:8080. Will stop streaming data from hot buckets to this target while errors persist. Check for network connectivity from the cluster peer reporting this issue to the replication port of target peer. If this condition persists, you can temporarily put that peer in manual detention.

Thanks.

sathwik067 · ‎12-04-2020

Hello all,

The problem was that the MTU on the 1 Gb bonded network interface was set to 9000 on our new indexers. We changed it to 1500, and that fixed the search factor and replication factor errors.

Thanks.
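
A likely reason the port checks passed while replication timed out (an interpretation, not something confirmed in this thread): a TCP connect exchanges only small packets, whereas bucket replication sends full-size frames, and those are dropped if any device on the path does not carry 9000-byte frames. A minimal sketch for testing this from one indexer to another, assuming Linux with iputils `ping` and IPv4; the function names and the peer address are placeholders:

```shell
# Largest ICMP payload that fits in one IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Send three pings of exactly that size with fragmentation prohibited
# (-M do sets the Don't Fragment bit). If this fails while a plain
# ping succeeds, the path does not carry frames of that MTU end to end.
probe_path_mtu() {
    host="$1"
    mtu="$2"
    ping -M do -c 3 -W 2 -s "$(icmp_payload_for_mtu "$mtu")" "$host"
}

# Example (hypothetical peer address):
#   probe_path_mtu 10.0.0.12 9000   # jumbo frames
#   probe_path_mtu 10.0.0.12 1500   # standard Ethernet MTU
```

If the 9000-byte probe fails and the 1500-byte probe succeeds, either every device on the path needs jumbo frames enabled or the interface MTU should be lowered to 1500.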

DarshanBK · ‎05-22-2023

@sathwik067 How do I check the MTU setting?
Is it something that needs to be done on the Splunk end or the network end?

sathwik067 · ‎05-23-2023

Hi @DarshanBK, if you are running Linux, your Linux team can make this change; there is nothing you can do on the Splunk end. To see the MTU value, you can run the "ifconfig" command. Its output shows the MTU of each network interface along with other information such as addresses and traffic counters.
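
On current Linux distributions `ifconfig` is often not installed; `ip link show` reports the same MTU value, and the kernel also exposes it under `/sys/class/net/<interface>/mtu`. A small sketch that lists the MTU of every interface, assuming a Linux host; the function name and its optional directory argument are made up for illustration:

```shell
# Print "<interface> <mtu>" for every network interface.
# Reads the values the kernel exposes in sysfs; the optional first
# argument overrides the directory (useful for testing).
list_mtus() {
    root="${1:-/sys/class/net}"
    for dir in "$root"/*; do
        # Skip entries without a readable mtu file (or an unmatched glob).
        [ -r "$dir/mtu" ] || continue
        printf '%s %s\n' "$(basename "$dir")" "$(cat "$dir/mtu")"
    done
}

list_mtus
```

Changing the value, for example with `ip link set dev bond0 mtu 1500` (`bond0` is a placeholder name), requires root and does not persist across reboots unless the distribution's network configuration is updated as well.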

richgalloway · ‎11-16-2020

Have you checked the connectivity among the individual indexers? Replication is direct from indexer to indexer - not via the CM - so it's critical for an indexer to be able to connect to all other indexers and not just the CM.

---
If this reply helps you, Karma would be appreciated.

sathwik067 · ‎11-16-2020

Hello,

Thanks for the response. Yes, we have checked the connectivity between the indexers as well and the ports are connected between the indexers.

isoutamo · ‎11-16-2020

What error messages do you find in splunkd.log on those servers?

sathwik067 · ‎11-16-2020

Hello,

Thanks for the response. Below are some of the errors we are seeing on the indexers:

"ERROR TcpInputProc - Error encountered for connection from src=xx.xx.xx.xx:xxxx. Read Timeout Timed out after 600 seconds."

"BucketReplicator - Failed to replicate warm bucket bid=_internal~xx to guid=ABCD host=xx.xx.xx.xx s2sport=8080. Read timed out after 180 secs."

Missing enough suitable candidates to create searchable copy in order to meet replication policy. Missing={ default:1

Waiting 'target_wait_time' before search factor fixup
Cannot fix search count as the bucket hasn't rolled yet.

Search peer abcd.com has the following message: Too many bucket replication errors to target peer=xx.xx.xx.xx:8080. Will stop streaming data from hot buckets to this target while errors persist. Check for network connectivity from the cluster peer reporting this issue to the replication port of target peer. If this condition persists, you can temporarily put that peer in manual detention

isoutamo · ‎11-18-2020

What do you get if you try this from one peer to another?
curl -v telnet://<peer name/ip>:8080
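
Since replication runs directly between peers, the same check is worth running from each indexer against every other indexer. A rough sketch, assuming bash and coreutils `timeout` are available; the function names and host names are hypothetical, and 8080 is the replication port used in this thread:

```shell
# Print every peer except this host (one per line).
other_peers() {
    self="$1"; shift
    for peer in "$@"; do
        [ "$peer" = "$self" ] || printf '%s\n' "$peer"
    done
}

# Return 0 if a TCP connection to host:port opens within 5 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
can_connect() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Check the replication port on every other peer and report the result.
check_replication_port() {
    port="$1"; self="$2"; shift 2
    for peer in $(other_peers "$self" "$@"); do
        if can_connect "$peer" "$port"; then
            echo "$peer:$port reachable"
        else
            echo "$peer:$port NOT reachable"
        fi
    done
}

# Example, run on idx1 (hypothetical host names):
#   check_replication_port 8080 idx1 idx1 idx2 idx3
```

A successful connect only shows that the port is open; it exchanges small packets and will not reveal problems that affect full-size frames.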

sathwik067 · ‎11-18-2020

It connects:
curl -v telnet://xx.xx.xx.xx:8080
* About to connect() to xx.xx.xx.xx port 8080 (#0)
* Trying xx.xx.xx.xx...
* Connected to xx.xx.xx.xx (xx.xx.xx.xx) port 8080 (#0)

Why are we receiving search and replication factor errors while we are building a new indexer cluster?

indexer

Linux