Deployment Architecture

Splunk Distributed search peer not working as expected. There are multiple errors logs

umeshagarwal008
Explorer

Hi All,

We have 4 search head (non clustered) and 16 search peers (non clustered) . Each search head points to all 16 search peers.

Recently one of our search head was getting freeze and no search was working. So we tried disabling and enabling the search peers the problem was still the same. So while testing we disabled first three search peers and the search started working.

But now though the search is working but when we try to enable any one or all three disabled search peers the search head again gets freeze and no search works.

I have tried restarting the search head and peers but no improvement.
Deleted and added the search peers in config file from server still no improvement.

Below are the errors logs i have noted on search head for those peers. All the disabled peers have similar error logs:

01-02-2019 02:12:06.146 +0100 WARN DistributedPeerManager - Unable to distribute to peer named
at uri= using the uri-scheme=https because peer has status="Down". Please verify uri-scheme,
connectivity to the search peer, that the search peer is up, and an adequate level of system resources are available.
See the Troubleshooting Manual for more information.

01-02-2019 02:11:35.352 +0100 WARN DistributedPeer - Peer:
Unable to get server info from services/server/info due to:
Connect Timeout; exceeded 10000 milliseconds

01-02-2019 02:10:24.314 +0100 INFO StatusMgr - destHost=, destIp=, destPort=9997,
eventType=connect_fail, publisher=tcpout, sourcePort=8089, statusee=TcpOutputProcessor

01-02-2019 01:38:01.074 +0100 WARN DistributedBundleReplicationManager - replicateDelta: failed for peer=,
uri=,
cur_time=1546386051, cur_checksum=1546386051, prev_time=1546381229, prev_checksum=4121658182606070965,
delta=/opt/splunk/var/run/-1546381229-1546386051.delta

01-02-2019 01:38:01.074 +0100 ERROR DistributedBundleReplicationManager - Reading reply to upload: rv=-2,
Receive from= timed out; exceeded 60sec,
as per=distsearch.conf/[replicationSettings]/sendRcvTimeout

01-02-2019 01:36:04.709 +0100 WARN DistributedPeerManager - Unable to distribute to peer named at
uri because replication was unsuccessful.
replicationStatus Failed failure info: failed_because_BUNDLE_DATA_TRANSMIT_FAILURE

11-07-2018 10:51:07.007 +0100 WARN DistributedPeer - Peer: Unable to get bundle list

11-27-2018 20:12:16.688 +0100 WARN DistributedPeer - Peer:
Unable to get server info from /services/server/info due to: No route to host

Any kind of help will be really helpful.

  • Umesh
0 Karma
1 Solution

umeshagarwal008
Explorer

Sorry for being late on this. I was able to solve this by copying the bundles from working indexer to non working indexer.

View solution in original post

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...