CM redundancy and cluster/manager/ha_active_status response

llopreiato
Explorer

Hello,

we're implementing a distributed clustered infrastructure with CM redundancy, with a load balancer between the indexers/search heads and the CMs.

The LB can send requests to the ha_active_status endpoint, but it needs a static response string to "understand" which CM is active.

The team working on the LB says the ha_active_status response is not static because it contains a timestamp, so the status check doesn't work.

Any workarounds?

Thanks in advance

Luca


livehybrid
Champion

Hi @llopreiato,

Configure the load balancer health check to rely on the HTTP status code returned by the ha_active_status endpoint, rather than parsing the response body.

The active Cluster Manager (CM) will return an HTTP 200 status code, while the standby CM(s) will return an HTTP 503 status code. This difference in status codes is the standard method for load balancers to determine the active node in this scenario.

The endpoint path is /services/cluster/manager/ha_active_status on the management port (default 8089).

    1. Active CM: https://<active-cm-host>:8089/services/cluster/manager/ha_active_status -> HTTP 200
    2. Standby CM: https://<standby-cm-host>:8089/services/cluster/manager/ha_active_status -> HTTP 503

Most load balancers can be configured to check for a specific HTTP status code as the health check condition.
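As a rough illustration of what a status-code-based health check does, here is a minimal Python sketch that probes the endpoint and maps the 200/503 codes described above to a role. The function names, the timeout, and the `verify_tls` switch are my own assumptions for illustration, and the sketch assumes the endpoint can be reached without credentials from wherever the probe runs; adjust it to your environment.

```python
import ssl
import urllib.error
import urllib.request

HA_STATUS_PATH = "/services/cluster/manager/ha_active_status"


def classify(status_code: int) -> str:
    """Map the endpoint's HTTP status code to a CM role (200 = active, 503 = standby)."""
    if status_code == 200:
        return "active"
    if status_code == 503:
        return "standby"
    return "unknown"  # any other code, e.g. an auth or proxy error


def probe(host: str, port: int = 8089, timeout: float = 5.0, verify_tls: bool = True) -> str:
    """Query one CM and return 'active', 'standby', 'unknown' or 'unreachable'."""
    url = f"https://{host}:{port}{HA_STATUS_PATH}"
    context = ssl.create_default_context()
    if not verify_tls:
        # Assumption: only needed if the management port uses a self-signed certificate.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx responses; the status code is still available.
        return classify(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

Calling `probe("cm1.example.com")` for each CM (hostname hypothetical) should report exactly one of them as active. The response body is never read, so the timestamp in it does not matter.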

See the documentation on CM redundancy and health checks: https://docs.splunk.com/Documentation/Splunk/latest/Indexer/CMredundancy#:~:text=The%20active%20mana...

🌟 Did this answer help you? If so, please consider:

  • Adding kudos to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing.


llopreiato
Explorer

The LB team finally found a way to make it work with status codes.

I accepted the first answer as the solution.

Thank you

Luca


llopreiato
Explorer

The LB team keeps saying it's not possible to check the HTTP status.

Is there any way to make the response static?

Luca


livehybrid
Champion

Hi @llopreiato 

Unfortunately, no, it isn't. The only supported way is via the status code. I can't really think of many other options either. You could put something like haproxy/nginx on the CM server to proxy the requests and modify the output, but that obviously wouldn't be a supported approach (and it's outside my area of expertise these days, sorry!).
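To make the proxy idea concrete, here is a minimal Python sketch of a shim that translates the status code into a fixed response body an LB could match with a receive string. This is not a Splunk feature and not a supported setup. The local CM URL, the listening port 8090, the `ACTIVE`/`STANDBY` strings, and the disabled certificate verification are all assumptions chosen for illustration.

```python
import ssl
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the shim runs on the CM host itself and talks to the local management port.
CM_URL = "https://127.0.0.1:8089/services/cluster/manager/ha_active_status"


def static_body(status_code: int) -> bytes:
    """Return a timestamp-free body: ACTIVE only when the CM answered 200."""
    return b"ACTIVE\n" if status_code == 200 else b"STANDBY\n"


def fetch_status(url: str = CM_URL, timeout: float = 3.0) -> int:
    """Return the CM's HTTP status code, or 0 if it could not be reached."""
    context = ssl.create_default_context()
    context.check_hostname = False       # assumption: self-signed certificate
    context.verify_mode = ssl.CERT_NONE  # on the local management port
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


class StaticHealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        upstream = fetch_status()
        body = static_body(upstream)
        # Mirror the active/standby distinction in the status code as well.
        self.send_response(200 if upstream == 200 else 503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8090) -> None:
    """Serve the static health response until interrupted (port is hypothetical)."""
    HTTPServer(("0.0.0.0", port), StaticHealthHandler).serve_forever()
```

With `serve()` running on each CM, the LB's Health Check URI would point at the shim's port and the receive string would be `ACTIVE`. The shim is served over plain HTTP here and becomes another component to monitor, so treat it as a stopgap.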

 


llopreiato
Explorer

Hi @livehybrid,

thanks for your answer. Unfortunately, the configuration you suggest is not available on the LB.

The LB configuration requires:
1. a Health Check URI
2. a Health Check Receive String

Luca


livehybrid
Champion

Hi @llopreiato 

Unfortunately, I think you might need to speak to the LB vendor to see if they can make it work based on the HTTP status code. This is fairly common behaviour for LB health checks, so I'd be surprised if it's not possible with your LB.

