Increase retry frequency for Search Heads to connect with Search Peers

twinspop
Influencer

We experience heavy search loads a few times a day. This can cause the Search Peers to drop from the Search Heads, resulting in the dreaded yellow triangles on dashboard panels. "Unable to distribute to peer..."

After the heavy load is gone, Search Heads often don't reconnect to the Peers, or at least not in a timely manner. However, if I go into the settings and disable and re-enable the Peer, it always reconnects immediately.

Is there a way to increase the frequency of the retries? (Or enable it at all, because seriously, sometimes I've waited hours only to disable/enable and have it work immediately.) I found this in the spec file, but...

checkTimedOutServersFrequency = <integer, in seconds>
* This setting is no longer supported, and will be ignored

bosburn_splunk
Splunk Employee

I'd first concentrate on WHY you are having heavy search issues on the peers. If you have enterprise support, I'd open a ticket and attach a diag so support can help you figure out what's going on.

twinspop
Influencer

Yes, we have enterprise support. One of our products' apps (big corp, many products, each with its own app) has scheduled all of their summary searches at the top of the hour. Hundreds of them. We're working on resolving this, but it has not happened yet. Still, heavy loads will happen on occasion. I'd like SHs to reconnect ASAP, not hours down the road.

twinspop
Influencer

I don't think it will. It's like the SHs are not even trying to reconnect. If I disable/enable the Peer in settings, it connects immediately. Setting longer timeouts on the connection process won't help if it's not trying. Thanks, tho.

linu1988
Champion

connectionTimeout =
* Amount of time in seconds to use as a timeout during search peer connection establishment

This stanza controls the connection timeout for a remote peer and the send/receive timeout:
[replicationSettings]
connectionTimeout = 10
sendRcvTimeout = 60

Will this not help?
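
For reference, here is a minimal sketch of how these timeouts might be laid out in distsearch.conf on the search head. The file location, the statusTimeout, sendTimeout, and receiveTimeout lines, and every value shown are illustrative assumptions, not recommendations; check the distsearch.conf.spec that ships with your version before using any of them.

```
# Assumed location: $SPLUNK_HOME/etc/system/local/distsearch.conf on the search head

[distributedSearch]
# Timeout, in seconds, for establishing a connection to a search peer
# (value is illustrative)
connectionTimeout = 10
# Assumed companion settings for peer status checks and for sending and
# receiving data; values are illustrative
statusTimeout = 10
sendTimeout = 30
receiveTimeout = 600

[replicationSettings]
# Values copied from the stanza quoted above
connectionTimeout = 10
sendRcvTimeout = 60
```

Going by the spec descriptions, these settings only bound how long a single connection attempt may take. They would not by themselves make the search head retry a dropped peer more often.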
