Why is the splunkd.log reporting lots of "DistributedPeerManager - Unable to distribute to peer named...because peer has status = "Down"."?

lisaac
Path Finder

I have a very busy search head that complains:

DistributedPeerManager - Unable to distribute to peer named slxxxxxxxxx:9089 at uri https://xxxxxxxx037:9089 because peer has status = "Down" 

The messages start in splunkd.log at 22:08:10.971 and stop at 22:09:46.994, and the message is reported about 60 times within that short period. A telnet from the SH to the indexer on 9089 shows no connectivity issues.

This has happened off and on for all indexers configured in distributed search. I am wondering if there is a setting that could be adjusted to prevent these messages from occurring, or a conf value that could be adjusted to improve performance under high load. The SH has 10 vCPUs and 32 GB of RAM, and there is a high load average on both the SH and the indexers (lots of searches).

There appears to be no negative impact from the messages, since searches are working and users are not reporting any issues.
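One way to cross-check this (beyond telnet) is to ask the search head itself how it currently sees each peer via its REST API. A minimal sketch, assuming the default management port 8089 and admin credentials; adjust for your environment:

# Run on the search head: list configured search peers and the status recorded for each
curl -k -u admin:changeme \
    "https://localhost:8089/services/search/distributed/peers?output_mode=json" \
    | python -m json.tool | grep -i '"status"'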


muebel
SplunkTrust

Hi lisaac, given how busy the hosts involved in the search are, it seems reasonable that momentary periods of high latency could generate these messages. There are various timeout settings described in http://docs.splunk.com/Documentation/Splunk/6.3.0/Admin/Distsearchconf that you can raise so the search head tolerates that latency, for instance:

# this stanza controls the timing settings for connecting to a remote peer and
# the send timeout
[replicationSettings]
connectionTimeout = 10
sendRcvTimeout = 60
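
To put those settings in context, here is a rough sketch of what an adjusted distsearch.conf on the search head might look like. Since the status = "Down" state comes from the search head's status check against each peer, statusTimeout in [distributedSearch] (also described in the linked spec) is often the more relevant knob; the values below are illustrative only, not recommendations, and you would typically restart splunkd after editing:

# $SPLUNK_HOME/etc/system/local/distsearch.conf on the search head
# Illustrative values only.

[distributedSearch]
# timeout, in seconds, when gathering a peer's basic status info;
# raising it makes a busy peer less likely to be flagged as "Down" during a latency spike
statusTimeout = 30

[replicationSettings]
# timeouts, in seconds, for connecting to a peer and for sending/receiving the knowledge bundle
connectionTimeout = 20
sendRcvTimeout = 120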

stevepraz
Path Finder

Did this start happening after a recent 6.3 upgrade? What platform are you running?

I've seen this message recently too, following my 6.3 upgrade and some new app installs.
