Splunk Indexer services admin bundle errors

govindashwin8
Loves-to-Learn Lots

I have created a new cluster with a master, a search head, indexers, and forwarders, similar to the architecture described here: https://docs.splunk.com/Documentation/Splunk/8.1.0/Indexer/Clusterdeploymentoverview

I am unable to search for anything on the search head, although the indexers are receiving messages from the forwarders.
I see the following message on the search head:

unable to distribute to peer named index- ...

I have verified that the indexer can accept connections from the search head, and I can see that both are up on the master (status up, indexes displayed, buckets displayed).

However, when I look at the search peers on the master and the search head, I see this message:

REST interface to peer is not responding. Check var/log/splunk/splunkd_access.log on the peer.

In the access logs on the indexer I can see warnings for requests from both the master and the search head:

GET /services/admin/bundles/search- ... 404
GET /services/admin/bundles/master- ... 404

Thanks.


isoutamo
SplunkTrust
Hi
What does this command return on your search head (run it as your splunk user)?
splunk show config server|egrep '^(clustering|master_uri|mode)'
r. Ismo

govindashwin8
Loves-to-Learn Lots

@isoutamo  

splunk show config server | egrep '^(clustering|master_uri|mode)'
master_uri=https://<master-ip>:8089
mode=searchhead
master_uri=https://<master-ip>:8089
master_uri=https://<license-server-ip>


isoutamo
SplunkTrust
Thanks, those seem reasonable.
How about splunkd.log on the CM or the peers: are there any issues, e.g. with authentication?

govindashwin8
Loves-to-Learn Lots

@isoutamo 

On the peer/indexer I see these messages related to the master, although they do not appear continuously.

11-11-2020 02:34:54.913 +0000 ERROR CMSlave - heartbeat failure (reason: failed method=POST path=/services/cluster/master/peers/02697231-3BE7-4C09-8FB1-408B223A9466/?output_mode=json master=10.236.16.204:8089 rv=0 gotConnectionError=1 gotUnexpectedStatusCode=0 actual_response_code=502 expected_response_code=2xx status_line="Error connecting: Connection refused" socket_error="Connection refused" remote_error=)
11-11-2020 02:34:57.922 +0000 WARN  CMMasterProxy - Master is down! Make sure pass4SymmKey is matching if master is running.
11-11-2020 02:35:06.644 +0000 WARN  CMMessages - got genid thats invalid or out of range, setting to INVALID_GENID, jn=18446744073709552000
11-11-2020 09:04:53.119 +0000 ERROR CMSlave - heartbeat failure (reason: failed method=POST path=/services/cluster/master/peers/02697231-3BE7-4C09-8FB1-408B223A9466/?output_mode=json master=10.236.16.204:8089 rv=0 gotConnectionError=1 gotUnexpectedStatusCode=0 actual_response_code=502 expected_response_code=2xx status_line="Error connecting: Connection refused" socket_error="Connection refused" remote_error=)
11-11-2020 09:04:56.130 +0000 WARN  CMMasterProxy - Master is down! Make sure pass4SymmKey is matching if master is running.
11-11-2020 09:05:07.773 +0000 WARN  CMMessages - got genid thats invalid or out of range, setting to INVALID_GENID, jn=18446744073709552000
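
For anyone reading these lines later, here is a minimal sketch (assuming Python 3 is available) that splits one heartbeat-failure line from the log above into its key=value fields. The helper name `parse_fields` and the regular expression are illustrative only; they are not part of Splunk.

```python
import re

# One heartbeat-failure line copied verbatim from the peer's splunkd.log above.
LINE = (
    '11-11-2020 02:34:54.913 +0000 ERROR CMSlave - heartbeat failure '
    '(reason: failed method=POST '
    'path=/services/cluster/master/peers/02697231-3BE7-4C09-8FB1-408B223A9466/?output_mode=json '
    'master=10.236.16.204:8089 rv=0 gotConnectionError=1 gotUnexpectedStatusCode=0 '
    'actual_response_code=502 expected_response_code=2xx '
    'status_line="Error connecting: Connection refused" '
    'socket_error="Connection refused" remote_error=)'
)

# A field is key=value, where the value is either double-quoted or a
# (possibly empty) run of characters that are neither whitespace nor ")".
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|[^\s)]*)')


def parse_fields(line):
    """Return the key=value pairs found in a log line as a dict of strings."""
    return {key: value.strip('"') for key, value in FIELD_RE.findall(line)}


fields = parse_fields(LINE)
for key in ("master", "gotConnectionError", "actual_response_code", "socket_error"):
    print(key, "=", fields[key])
```

In this line, `gotConnectionError=1` together with `socket_error="Connection refused"` suggests that the TCP connection to the master on port 8089 was refused at that moment (for example, while the master was restarting). That would be a different symptom from a pass4SymmKey mismatch, even though the following WARN line mentions the key.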

  

On the master I see this related to the peer:

11-13-2020 06:17:10.991 +0000 WARN  GetBundleListTransaction - Server index-hio-1001264746-7-1061744060[https://10.236.18.39:8089/services/admin/bundles/master-1001264746-1-1061363192] does not support bundle version listing. Probably an older version. Giving up due to error code 404.
11-13-2020 06:17:10.991 +0000 WARN  DistributedPeer - Peer:https://10.236.18.39:8089 Unable to get bundle list
11-13-2020 06:18:10.994 +0000 WARN  GetBundleListTransaction - Server index-hio-1001264746-7-1061744060[https://10.236.18.39:8089/services/admin/bundles/master-1001264746-1-1061363192] does not support bundle version listing. Probably an older version. Giving up due to error code 404.
11-13-2020 06:18:10.994 +0000 WARN  DistributedPeer - Peer:https://10.236.18.39:8089 Unable to get bundle list
11-13-2020 06:19:10.990 +0000 WARN  GetBundleListTransaction - Server index-hio-1001264746-7-1061744060[https://10.236.18.39:8089/services/admin/bundles/master-1001264746-1-1061363192] does not support bundle version listing. Probably an older version. Giving up due to error code 404.
11-13-2020 06:19:10.990 +0000 WARN  DistributedPeer - Peer:https://10.236.18.39:8089 Unable to get bundle list
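
To see which peers are rejecting the bundle listing and how often, the warnings above can be tallied per peer. This is a rough sketch, assuming Python 3; the function name `count_bundle_list_failures` and the pattern are my own and only match the message format shown in this thread.

```python
import re
from collections import Counter

# Three lines copied from the master's splunkd.log above.
LOG = """\
11-13-2020 06:17:10.991 +0000 WARN  GetBundleListTransaction - Server index-hio-1001264746-7-1061744060[https://10.236.18.39:8089/services/admin/bundles/master-1001264746-1-1061363192] does not support bundle version listing. Probably an older version. Giving up due to error code 404.
11-13-2020 06:17:10.991 +0000 WARN  DistributedPeer - Peer:https://10.236.18.39:8089 Unable to get bundle list
11-13-2020 06:18:10.994 +0000 WARN  GetBundleListTransaction - Server index-hio-1001264746-7-1061744060[https://10.236.18.39:8089/services/admin/bundles/master-1001264746-1-1061363192] does not support bundle version listing. Probably an older version. Giving up due to error code 404.
"""

# Captures the peer name, the URL inside the square brackets, and the error code.
PATTERN = re.compile(
    r"GetBundleListTransaction - Server (?P<peer>\S+?)\[(?P<url>[^\]]+)\]"
    r".*error code (?P<code>\d+)"
)


def count_bundle_list_failures(text):
    """Count GetBundleListTransaction warnings per (peer name, error code)."""
    counts = Counter()
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[(match.group("peer"), match.group("code"))] += 1
    return counts


for (peer, code), n in count_bundle_list_failures(LOG).items():
    print(f"{peer}: {n} warning(s) with error code {code}")
```

If every peer shows up with the same code, the cause is more likely something common to all indexers than a fault on a single node.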

 

Thanks.


isoutamo
SplunkTrust

As this log says:

11-11-2020 02:34:54.913 +0000 ERROR CMSlave - heartbeat failure (reason: failed method=POST path=/services/cluster/master/peers/02697231-3BE7-4C09-8FB1-408B223A9466/?output_mode=json master=10.236.16.204:8089 rv=0 gotConnectionError=1 gotUnexpectedStatusCode=0 actual_response_code=502 expected_response_code=2xx status_line="Error connecting: Connection refused" socket_error="Connection refused" remote_error=)
11-11-2020 02:34:57.922 +0000 WARN  CMMasterProxy - Master is down! Make sure pass4SymmKey is matching if master is running.

Have you checked that the pass4SymmKey values match on all nodes in the clustering stanza?

r. Ismo 


govindashwin8
Loves-to-Learn Lots

Yes @isoutamo, the pass4SymmKeys are the same across all nodes. I see these errors on the search node and indexers:

Search head

Server index-hio-1001264746-6-1061744054[https://10.236.18.41:8089/services/admin/bundles/search-1001264746-3-1062710635] does not support bundle version listing. Probably an older version. Giving up due to error code 404.
11-14-2020 02:58:43.205 +0000 WARN  DistributedPeer - Peer:https://10.236.18.41:8089 Unable to get bundle list

Master

Error [00000100] Instance name "index-hio-1001264746-6-1061744054" REST interface to peer is not responding. Check var/log/splunk/splunkd_access.log on the peer. Last Connect Time:2020-11-14T03:02:48.000+00:00; Failed 11 out of 11 times.

 

11-14-2020 03:04:43.161 +0000 WARN  DistributedPeerManager - Unable to distribute to peer named index-hio-1001264746-6-1061744054 at uri=10.236.18.41:8089 using the uri-scheme=https because peer has status=Down. Verify uri-scheme, connectivity to the search peer, that the search peer is up, and that an adequate level of system resources are available. See the Troubleshooting Manual for more information.
11-14-2020 03:04:48.169 +0000 WARN  GetBundleListTransaction - Server index-hio-1001264746-6-1061744054[https://10.236.18.41:8089/services/admin/bundles/master-1001264746-1-1061363192] does not support bundle version listing. Probably an older version. Giving up due to error code 404.

 


isoutamo
SplunkTrust

Hi

I suppose this is not your production environment? If it is, contact Splunk Support to get this fixed ASAP.

Check that no firewall or iptables rule is blocking cluster traffic.
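
A quick way to test basic reachability of the management port is a plain TCP connect from each node to the others. This is a minimal sketch, assuming Python 3 is available on the node; `can_connect` and `check_nodes` are hypothetical helper names, and port 8089 is the management port seen in the logs in this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_nodes(hosts, port=8089, timeout=3.0):
    """Map each host to whether its port accepted a TCP connection."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

For example, `check_nodes(["10.236.16.204", "10.236.18.39", "10.236.18.41"])` run on the search head would cover the master and the two peers that appear in the logs above. A successful connect only shows that the port is reachable from the node where the script runs. It says nothing about authentication or about other ports the cluster uses, such as the replication port configured on the peers.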

If not, I would try the following:

  1. Remove the cluster configuration on the SH side
  2. Stop all instances
  3. Set pass4SymmKey on every node to a new plaintext value
  4. Start the CM
  5. Start the peers one by one, checking that each one can connect to the CM and that the cluster is working at the end
  6. Start the SH
  7. Add the CM as a new cluster to the SH

r. Ismo


govindashwin8
Loves-to-Learn Lots

Thanks @isoutamo . Unfortunately that didn't help.

I am still stuck with these errors in my indexer logs:

GET /services/admin/bundles/master ---  404

And this on my master:

Upload bundle="/data/splunk/var/run/master-1001264746-1-1061363192-1605502167.bundle" to peer name=index-hio-1001264746-5-1061744048 uri=---------------- failed; http_status=400 http_description="BaseException"

And in the distributed search panel I see a similar message for all indexers:

Error [00000100] Instance name "index-hio-1001264746-2-1061744028" REST interface to peer is not responding.


athorat
Communicator

How was this resolved? 

We are seeing this on the CM a lot, and both search performance and indexer performance are affected.

A support ticket has been open for a month, with no resolution yet.

Problem replicating config (bundle) to search peer ' <Peer-Name>:8089 ', Upload bundle="/opt/splunk/var/run/Cluster-Master.XXX.XXX-1628153272.bundle" to peer name=<Peer-name>: uri=https://<Peer-name>:8089 failed; http_status=400 http_description="BaseException".


ketilolav
Explorer

Hi @govindashwin8 , 

What version are you running? I'm having bundle issues with indexers running Splunk 8.0, but not with indexers running 8.2.1. My search heads are running 8.2.1.

 
