What are my options to track the captain switch over time in a search head cluster?

Splunk Employee

I have a 6-node search head cluster deployment. What are my options for tracking captain changes over time?


Splunk Employee

Here are two options that will work for you:

Option 1: Search splunkd.log for the message "SHPoolingMgr - Making node the captain". Only the node that becomes captain logs this message, so each hit marks a captain change.

index=_internal (host=SH1 OR host=SH2 OR host=SH3 OR host=SH4 OR host=SH5 OR host=SH6) sourcetype=splunkd "Making node the captain" | table host _raw

Option 2: Search metrics.log on a node to check whether it has been the captain. Look for events like this one:

08-31-2016 18:27:07.094 +0000 INFO Metrics - group=captainstability, stable_follower_pct=0, stable_captain_pct=100, num_polled_captain=155, num_polled_follower=0, num_polled_candidate=0, upgrades_to_captain=0, downgrades_from_captain=0, captain_changes=0

These values change every poll period, so a node that reports stable_captain_pct > 0 during a given time frame was captain at some point in that time frame.

A search like:

index=_internal (host=SH1 OR host=SH2 OR host=SH3 OR host=SH4 OR host=SH5 OR host=SH6) sourcetype=metrics stable_captain_pct > 0 | timechart count by host
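
If you would rather check raw metrics.log lines outside of Splunk (for example from a copy of the file), the same test can be sketched in Python. This is only a rough illustration: the function names `parse_captainstability` and `was_captain` are made up for this example, and it assumes every captainstability event uses the comma-separated key=value layout shown in the sample event above.

```python
import re

# Matches key=value pairs such as "stable_captain_pct=100" in a metrics.log event.
KV_PATTERN = re.compile(r"(\w+)=([^,\s]+)")

# The sample event quoted earlier in this answer.
SAMPLE_EVENT = (
    "08-31-2016 18:27:07.094 +0000 INFO Metrics - group=captainstability, "
    "stable_follower_pct=0, stable_captain_pct=100, num_polled_captain=155, "
    "num_polled_follower=0, num_polled_candidate=0, upgrades_to_captain=0, "
    "downgrades_from_captain=0, captain_changes=0"
)


def _to_number(value):
    """Convert a field value to float when possible, otherwise keep the string."""
    try:
        return float(value)
    except ValueError:
        return value


def parse_captainstability(line):
    """Return the fields of a group=captainstability event as a dict.

    Returns None when the line is not a captainstability event.
    """
    fields = dict(KV_PATTERN.findall(line))
    if fields.get("group") != "captainstability":
        return None
    return {key: _to_number(value) for key, value in fields.items()}


def was_captain(line):
    """True when the event reports stable_captain_pct > 0 for its poll period."""
    fields = parse_captainstability(line)
    if fields is None:
        return False
    pct = fields.get("stable_captain_pct", 0.0)
    return isinstance(pct, float) and pct > 0
```

Calling `was_captain(SAMPLE_EVENT)` applies the same stable_captain_pct > 0 filter as the search above to a single line.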


Explorer

You could also script it using:
/opt/splunk/bin/splunk show shcluster-status
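
A rough sketch of what such a script could look like in Python is below. Treat it as a starting point, not a tested tool: it assumes the command prints a "Captain:" section made of "key : value" lines followed by another section header such as "Members:", and that the CLI session is already authenticated. The function names are made up for this example, so check the parsing against the real output on your version.

```python
import subprocess

# Path taken from the command above; adjust it for your installation.
SPLUNK_BIN = "/opt/splunk/bin/splunk"


def parse_captain(status_text):
    """Extract the "key : value" pairs listed under the "Captain:" section.

    Assumption: the section ends at the next header line (a line ending in
    ":" that is not itself a "key : value" pair), for example "Members:".
    """
    captain = {}
    in_captain = False
    for raw_line in status_text.splitlines():
        line = raw_line.strip()
        if line == "Captain:":
            in_captain = True
            continue
        if not in_captain:
            continue
        if line.endswith(":") and " : " not in line:
            break  # reached the next section header
        if " : " in line:
            key, _, value = line.partition(" : ")
            captain[key.strip()] = value.strip()
    return captain


def current_captain():
    """Run the CLI command and return the parsed captain fields."""
    result = subprocess.run(
        [SPLUNK_BIN, "show", "shcluster-status"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_captain(result.stdout)
```

Running `current_captain()` on a schedule (cron, for instance) and writing the returned label with a timestamp to a file would give you a history of captain changes.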


SplunkTrust

Use the distributed management console (DMC). I have it enabled on our deployer.

In the DMC, go to search -> search head clustering -> status and configuration.

Look for the captain election activity panel and the captain selection details panel to its right.

Explorer

Heyo! Stumbled across this thread and thought I'd offer up an easier alternative than using the internal or metrics logging, if you've got privileges to hit the REST endpoints.

You can also use this to output the captain info via SPL:

| rest /services/shcluster/status splunk_server=local

This outputs the captain information in the captain.* fields. We use `splunk_server=local` to avoid querying other SHC / IDX members for captain info (which will throw errors), since we only need information from the SH we're running it on.
See: https://docs.splunk.com/Documentation/Splunk/latest/RESTREF/RESTcluster#shcluster.2Fstatus
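
If you want to hit the same endpoint from outside of SPL, here is a minimal Python sketch. It assumes the management port accepts basic auth, that `output_mode=json` is passed, and that the JSON response nests the captain fields under entry[0].content.captain (mirroring the captain.* fields that `| rest` shows). The function names are made up for this example, so verify the layout against your own response first.

```python
import base64
import json
import urllib.request


def build_status_request(base_url, username, password):
    """Build a GET request for shcluster/status with JSON output and basic auth."""
    url = base_url.rstrip("/") + "/services/shcluster/status?output_mode=json"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def captain_from_payload(payload):
    """Return the captain fields from a decoded response (assumed layout)."""
    entries = payload.get("entry", [])
    if not entries:
        return {}
    return entries[0].get("content", {}).get("captain", {})


def fetch_captain(base_url, username, password):
    """Query one search head's management port and return its captain fields."""
    request = build_status_request(base_url, username, password)
    with urllib.request.urlopen(request) as response:
        return captain_from_payload(json.load(response))
```

Something like `fetch_captain("https://SH1:8089", user, password)` would then return the captain fields as a dict. Note that `urlopen` verifies TLS certificates by default, so a search head that still uses a self-signed certificate will need its CA added to the trust store before this works.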
