Peer unable to join cluster after 10 mins

Itzmeaj
Explorer

Hello all,

I am at a bit of a loss about what to do at this point. I had an indexer fail, and now that it is healthy I cannot get it to rejoin the cluster reliably. After a reboot or a Splunk restart, it joins and begins syncing buckets for about 10 minutes. After that it throws errors and gets stuck in a batch adding state on the indexer management page. I get this error on the master:

Failed to register with cluster master reason: failed method=POST path=/services/cluster/master/peers/?output_mode=json master=:8089 rv=0 gotConnectionError=0 gotUnexpectedStatusCode=1 actual_response_code=502 expected_response_code=2xx status_line="Bad request" socket_error="No error" remote_error= [ event=ReaddPeer status=retrying AddPeerRequest: { _id= active_bundle_id=9884CA425F0224F22F37BE784337C463 add_type=ReAdd base_generation_id=0 batch_serialno=1 batch_size=4 forwarderdata_rcv_port=9997 forwarderdata_use_ssl=0 last_complete_generation_id=0 latest_bundle_id=9884CA425F0224F22F37BE784337C463 mgmt_port=8089 name=6A1C1358-0C02-4D60-B58B-EA903E3D0991 register_forwarder_address= register_replication_address= register_search_address= replication_port=8080 replication_use_ssl=0 replications= server_name= site=default splunk_version=8.0.7 splunkd_build_number=1c4f3bbe1aea status=Up } ].

The indexer in question displays the same error, along with this one. I’m not sure if it’s related.

 

ERROR HTTPClientRequest: caught exception while parsing HTTP Reply: string value too long value size = 531110, Maxvaluesize = 524288
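
For reference, the two numbers in that error can be compared directly. This is only a rough Python sketch: the helper name and the regex are made up for illustration and are not part of Splunk. It shows how far the reply value exceeds the limit reported in the message.

```python
import re

# Hypothetical helper (not a Splunk API): extract the two sizes
# from the HTTPClientRequest error text quoted above.
def parse_value_too_long(line):
    m = re.search(r"value size = (\d+), Maxvaluesize = (\d+)", line)
    if m is None:
        return None
    size, limit = int(m.group(1)), int(m.group(2))
    return {"size": size, "limit": limit, "over_by": size - limit}

line = ("ERROR HTTPClientRequest: caught exception while parsing HTTP Reply: "
        "string value too long value size = 531110, Maxvaluesize = 524288")
info = parse_value_too_long(line)
print(info)
```

The reported limit, 524288 bytes, is exactly 512 KiB, and the value in the reply is 6822 bytes over it.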

 

I should mention that I reinstalled Splunk over the currently installed version as part of fixing that indexer.

Itzmeaj
Explorer

I was able to fix this issue by rebuilding the server; there was some corruption in the filesystem.

codebuilder
Influencer

After you reinstalled Splunk, did you update the pass4SymmKey values in server.conf under [general] and [clustering]? Those have to match in order to join the cluster. Also, you'll need to enter the plain-text version; don't copy the hashed pass4SymmKey from another node.
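
A quick way to see which form each key is in, as a minimal Python sketch. The sample server.conf text, the secret values, and the function name are all invented for illustration. It also assumes that hashed values start with `$1$` or `$7$`, and that the file is simple enough for Python's configparser to read, which may not hold for every server.conf.

```python
import configparser

# Invented sample; a real server.conf will contain many more settings.
SAMPLE = """
[general]
pass4SymmKey = $7$exampleHashedValue

[clustering]
pass4SymmKey = myPlainTextSecret
"""

def key_status(conf_text):
    """Report whether pass4SymmKey looks hashed, plain, or missing per stanza."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the mixed-case key name as written
    parser.read_string(conf_text)
    status = {}
    for stanza in ("general", "clustering"):
        value = parser.get(stanza, "pass4SymmKey", fallback=None)
        if value is None:
            status[stanza] = "missing"
        elif value.startswith(("$1$", "$7$")):  # assumed hashed-value prefixes
            status[stanza] = "hashed"
        else:
            status[stanza] = "plain"
    return status

print(key_status(SAMPLE))
```

A key reported as "plain" has not yet been rewritten by Splunk. A key reported as "hashed" was either hashed on this node or pasted in from somewhere else, and the check cannot tell which.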


Itzmeaj
Explorer

Thank you for the suggestion. I did verify this, but it ended up being a different issue that I was able to resolve.
