Solved: CM redundancy and cluster/manager/ha_active_status...

llopreiato

Hello,

We're implementing a distributed clustered infrastructure with CM redundancy. A load balancer sits between the indexers/search heads and the CMs.

The LB can send requests to the ha_active_status endpoint, but it needs a static response to "understand" which CM is active.

The team working on the LB says the ha_active_status response is not static because it contains a timestamp, so the status check doesn't work.

Any workarounds?

Thanks in advance

Luca

livehybrid

Hi @llopreiato,

Configure the load balancer health check to rely on the HTTP status code returned by the ha_active_status endpoint, rather than parsing the response body.

The active Cluster Manager (CM) will return an HTTP 200 status code, while the standby CM(s) will return an HTTP 503 status code. This difference in status codes is the standard method for load balancers to determine the active node in this scenario.

The endpoint path is /services/cluster/manager/ha_active_status on the management port (default 8089).

Active CM: https://<active-cm-host>:8089/services/cluster/manager/ha_active_status -> HTTP 200
Standby CM: https://<standby-cm-host>:8089/services/cluster/manager/ha_active_status -> HTTP 503

Most load balancers can be configured to check for a specific HTTP status code as the health check condition.
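
To illustrate the status-code check described above, here is a minimal Python sketch of what such a health check does. It is not a Splunk tool or an official example: the function names `role_from_status` and `probe` are hypothetical, and the `verify_tls` switch assumes the management port may be using a self-signed certificate. Whether the endpoint needs credentials in a given deployment is not covered here.

```python
import ssl
import urllib.error
import urllib.request

# Endpoint path quoted in the answer above.
HA_PATH = "/services/cluster/manager/ha_active_status"


def role_from_status(status: int) -> str:
    """Map the HTTP status code to a CM role, ignoring the response body."""
    if status == 200:
        return "active"
    if status == 503:
        return "standby"
    return "unknown"


def probe(host: str, port: int = 8089, timeout: float = 5.0,
          verify_tls: bool = True) -> str:
    """Query one CM and classify it by status code only (hypothetical helper)."""
    url = f"https://{host}:{port}{HA_PATH}"
    ctx = ssl.create_default_context()
    if not verify_tls:
        # Assumption: self-signed certificate on the management port.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return role_from_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx responses; the code is still available.
        return role_from_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

Calling `probe("cm1.example.com", verify_tls=False)` (a made-up hostname) would return "active", "standby", "unknown", or "unreachable". The timestamp in the body never matters because the body is never read.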

See the documentation on CM redundancy and health checks: https://docs.splunk.com/Documentation/Splunk/latest/Indexer/CMredundancy#:~:text=The%20active%20mana...


llopreiato

Finally the LB team found a way to make it work with status codes.

I accepted the first answer as the solution.

Thank you

Luca

llopreiato

The LB team keeps saying it's not possible to check the HTTP status.

Is there any way to make the response static?

Luca

livehybrid

Hi @llopreiato

Unfortunately no, it isn't. The only supported way is via the status code, and I can't really think of many other options either. You could put something like haproxy/nginx on the CM server to proxy the requests and modify the output, but obviously that wouldn't be a supported approach (and it's outside my area of expertise these days, sorry!)
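
As a rough illustration of the proxy idea, here is a minimal sketch that uses a small Python shim in place of haproxy/nginx. It is unsupported and untested against a real CM. The listening port 8090, the body strings "ACTIVE" and "STANDBY", and the names `upstream_status`, `static_body`, `HealthShim`, and `serve` are all assumptions made for this example. It also assumes the shim runs on the CM host and that the management port uses a self-signed certificate.

```python
import ssl
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the shim runs on the CM itself and queries the local management port.
UPSTREAM_URL = "https://127.0.0.1:8089/services/cluster/manager/ha_active_status"


def upstream_status(url: str = UPSTREAM_URL, timeout: float = 3.0) -> int:
    """Return the CM's HTTP status code, or 0 if it cannot be reached."""
    ctx = ssl.create_default_context()
    # Assumption: self-signed certificate, so verification is off for this loopback call.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 from a standby CM
    except (urllib.error.URLError, OSError):
        return 0


def static_body(status: int) -> bytes:
    """Translate the status code into a fixed body with no timestamp."""
    return b"ACTIVE\n" if status == 200 else b"STANDBY\n"


class HealthShim(BaseHTTPRequestHandler):
    def do_GET(self):
        status = upstream_status()
        body = static_body(status)
        # Keep the status code meaningful for LBs that can read it.
        self.send_response(200 if status == 200 else 503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet; real use would log somewhere


def serve(port: int = 8090) -> None:
    """Start the shim on a hypothetical port; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), HealthShim).serve_forever()
```

With `serve()` running on each CM, the LB's health check URI would point at the shim's port and the receive string would be "ACTIVE". Anything other than a 200 from the CM, including an unreachable CM, is reported as "STANDBY", so the LB fails closed.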


llopreiato

Hi @livehybrid,

Thanks for your answer. Unfortunately, the configuration you suggest is not available on the LB.

The LB configuration requires:
1. a Health Check URI
2. a Health Check Receive String

Luca

livehybrid

Hi @llopreiato

Unfortunately, I think you might need to speak to the vendor of the LB to see if they can work out how to make it work based on the HTTP status code. This is fairly common behaviour for LB health checks, so I'd be surprised if it's not possible with your LB.


