Deployment Architecture

SHC member node status flickering on the 'Indexer Clustering' page

rahul_bhatia
New Member

Hello,

We are running Splunk version 7.1.3.

We have 2 SHCs connected to our indexers. For one of the SHCs, the SHC members keep flickering between 'Up' and 'Down' status on the 'Indexer Clustering' page.

A previous post suggested increasing 'generation_poll_interval' from 5 to 60 seconds. In our case, 'generation_poll_interval' is at its default of 5 for members of both SHCs, yet the flickering status affects members of only one SHC and not the other.

Any further inputs on this behavior would be appreciated.

Thanks

nickhills
Ultra Champion

You must be seeing errors in _internal for the SHC members which are at fault.
Can you post some of the messages you see?
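For anyone following along, here is a rough sketch of one way to summarise such messages once they have been exported to a text file. It assumes each line contains a `LEVEL Component - message` segment, and the function name `tally` is made up for illustration. It is not a Splunk API, just plain Python over exported lines.

```python
import re
from collections import Counter

# Assumed line shape: "... LEVEL Component - message".
# Only ERROR and WARN lines are counted.
LINE_RE = re.compile(r"\b(ERROR|WARN)\s+(\S+)\s+-\s+(.*)")


def tally(lines):
    """Count ERROR/WARN lines per (level, component) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            level, component = match.group(1), match.group(2)
            counts[(level, component)] += 1
    return counts
```

Running `tally(open(path))` on an export from each SHC member and comparing the counts may show which component is complaining only on the affected cluster.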

If my comment helps, please give it a thumbs up!

codebuilder
SplunkTrust

Based on the information you supplied, I suspect that you are running into a split-brain situation.
A search head cluster should include no fewer than 3 members.
The members "decide" who should be captain based on "votes", and a captain needs a majority of them (quorum).
With only two members, both must be reachable to form a majority, so any hiccup between them makes it nearly impossible to agree on a leader and can lead to the situation you describe.
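To make the vote arithmetic concrete, here is a minimal sketch of generic majority-quorum math. It is not Splunk's actual election code, and the function names `majority` and `can_elect` are invented for illustration.

```python
def majority(n_members: int) -> int:
    """Smallest number of votes that is a strict majority of n_members."""
    return n_members // 2 + 1


def can_elect(n_members: int, n_reachable: int) -> bool:
    """True if the reachable members can form a majority of the full cluster."""
    return n_reachable >= majority(n_members)


for n in (2, 3, 5):
    tolerated = n - majority(n)
    print(f"{n} members: majority={majority(n)}, failures tolerated={tolerated}")
```

Under this arithmetic a two-member cluster tolerates zero failures, while a three-member cluster tolerates one.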

----
An upvote would be appreciated and Accept Solution if it helps!

nickhills
Ultra Champion

I initially read it that way too, but I think the question means 2 separate SH clusters of x nodes.
Given the minimums you correctly state, that means at least 6 search head members, split across 2 SHCs.
At least that's my assumption.

codebuilder
SplunkTrust

Yes, that is correct. Though it is technically possible to cluster two nodes, it is not good practice and leads to these types of issues. You need at least 3 nodes per SHC. Otherwise, you'll continue to have split-brain issues.

codebuilder
SplunkTrust

For the record, split-brain is not unique to Splunk. You'll encounter it in any type of clustering with only two nodes. Two nodes can't establish quorum successfully (more often than not).

rahul_bhatia
New Member

Hi Nick,

So I am seeing the following message for one of the search peers:

ERROR DistributedPeerManagerHeartbeat - Status 502 while sending public key to cluster search peer
WARN DistributedPeerManagerHeartbeat - Send failure while pushing PK to search peer, Connect Timeout

Apparently, the SHC member nodes cannot connect to just this one search peer on port 8089. This appears to be what is causing the fluctuations in the status.
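For reference, a quick way to confirm this kind of reachability problem from each SHC member is a plain TCP connect test against port 8089. The sketch below is only that: it does not speak TLS or authenticate to splunkd, and the function names are made up for illustration.

```python
import socket


def can_connect(host: str, port: int = 8089, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_peers(peers, port: int = 8089, timeout: float = 3.0):
    """Return the peers that did not accept a TCP connection."""
    return [peer for peer in peers if not can_connect(peer, port, timeout)]
```

Calling `unreachable_peers([...])` with your own list of indexer hostnames on each SHC member should narrow down which peer, if any, is not reachable from that member.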

I will get this rectified and see if this alleviates the problem.

Thanks!

nickhills
Ultra Champion

Sounds promising. Good luck!

nickhills
Ultra Champion

If my answer helped, please consider accepting and/or upvoting so that other members of the community can see it was useful.

rahul_bhatia
New Member

As an update, there is a communication issue between the SHC nodes and just one of our 46 indexers.

This seems to be causing the fluctuation in the status.

Thanks for your responses. This has been marked as 'Accepted'.
