Search head cluster is breaking when mgmt_uri has a question mark. How do I get the correct mgmt_uri?

wegscd · ‎01-07-2016

We lost one node out of a three-node search head cluster, so we switched to static captaincy.

Sometime along the line, it appears that scheduled searches stopped working. Usually restarting one of the search heads got things going again, but right now the shcluster is in a mess.

The one thing that always seems to accompany the trouble is mgmt_uri showing up as '?' in 'show shcluster-status'.

Right now I have static captaincy transferred to node adculsplunkp6. Running show shcluster-status there shows:

 Captain:
                  dynamic_captain : 0
                  elected_captain : Thu Jan  7 09:52:39 2016
                               id : F0214F20-327E-4591-ACC7-A03929CF829F
                 initialized_flag : 1
                            label : adculsplunkp6
                 maintenance_mode : 0
                         mgmt_uri : ?
            min_peers_joined_flag : 1
             rolling_restart_flag : 0
               service_ready_flag : 1

 Members: 
    adculsplunkp6
                            label : adculsplunkp6
                         mgmt_uri : ?
                   mgmt_uri_alias : https://xx.xx.xx.xxx:8089
                           status : Up
    adculsplunkp2
                            label : adculsplunkp2
                         mgmt_uri : ?
                   mgmt_uri_alias : https://xx.xx.xx.xx:8089
                           status : Up

The other (non-captain) node still shows a different captain and no members:

Captain:
                  dynamic_captain : 0
                  elected_captain : Thu Jan  7 10:01:16 2016
                               id : F0214F20-327E-4591-ACC7-A03929CF829F
                 initialized_flag : 1
                            label : adculsplunkp2
                 maintenance_mode : 0
                         mgmt_uri : ?
            min_peers_joined_flag : 1
             rolling_restart_flag : 0
               service_ready_flag : 1

 Members:

How do I get the correct mgmt_uris in there so things start behaving again?

gaurav_splunk · ‎08-24-2017

This issue has been fixed in 6.4.7 and 6.3.11, so feel free to upgrade your environment.

risgupta_splunk · ‎02-28-2017

The issue here is that, with static captaincy, we read the mgmt_uri from memory. When the node was restarted, the value was lost and we did not re-read it from disk/config. Hence the "?" in the show shcluster-status output.
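To spot this condition quickly on each node, the status text can be checked for the "?" placeholder. The following is only a minimal sketch: it assumes the output has the same layout as the listing posted in this thread, and the function names `parse_shcluster_status` and `unknown_mgmt_uris` are made up for illustration, not part of Splunk.

```python
import re

# Trimmed sample in the layout shown earlier in this thread.
SAMPLE = """
 Captain:
                  dynamic_captain : 0
                            label : adculsplunkp6
                         mgmt_uri : ?

 Members:
    adculsplunkp6
                            label : adculsplunkp6
                         mgmt_uri : ?
                   mgmt_uri_alias : https://xx.xx.xx.xxx:8089
                           status : Up
    adculsplunkp2
                            label : adculsplunkp2
                         mgmt_uri : ?
                   mgmt_uri_alias : https://xx.xx.xx.xx:8089
                           status : Up
"""

def parse_shcluster_status(text):
    """Parse status text into {'captain': {...}, 'members': {name: {...}}}."""
    result = {"captain": {}, "members": {}}
    section = None   # "captain" or "members"
    current = None   # dict that key/value lines are written into
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "Captain:":
            section, current = "captain", result["captain"]
            continue
        if line == "Members:":
            section, current = "members", None
            continue
        match = re.match(r"^(\w+)\s+:\s*(.*)$", line)
        if match:
            if current is not None:
                current[match.group(1)] = match.group(2).strip()
        elif section == "members":
            # A bare line under "Members:" starts a new member entry.
            current = result["members"].setdefault(line, {})
    return result

def unknown_mgmt_uris(status):
    """Return the entries whose mgmt_uri is the '?' placeholder."""
    bad = []
    if status["captain"].get("mgmt_uri") == "?":
        bad.append("captain")
    for name, fields in status["members"].items():
        if fields.get("mgmt_uri") == "?":
            bad.append(name)
    return bad

if __name__ == "__main__":
    print(unknown_mgmt_uris(parse_shcluster_status(SAMPLE)))
```

Feeding it the saved output of the status command from each search head gives a quick list of the entries that have lost their mgmt_uri.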

wegscd · ‎01-08-2016

I am going back to static because whenever I call Splunk support with my 2-node cluster, they tell me that I am in an unsupported configuration. That third node is gone, and they don't support 2-node clusters.

The saved searches issue was caused by a 6.3.0 bug; apparently they started tracking the number of running searches across the cluster and had a bug in it, so eventually the cluster figured everyone was over quota and stopped running scheduled searches. Details at https://answers.splunk.com/answers/329518/why-do-scheduled-searches-randomly-stop-running-in.html

nvanderwalt_spl · ‎01-07-2016

So you don't really need to go to static if you have 2/3 of the nodes available. Were you doing it as a preventative measure in case you lost another node? If so, it would only cover you if you lost the non-captain.

Did you configure both remaining nodes to use the same static captain? Did you use fully qualified domain names?

You can go back to dynamic captaincy by bootstrapping one of the members (preferably the old static captain), then convert the others.

See http://docs.splunk.com/Documentation/Splunk/6.3.0/DistSearch/Staticcaptain
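As a rough illustration of that sequence, the sketch below only builds the CLI command lines, it does not run anything. The `edit shcluster-config -election true -mgmt_uri ...` and `bootstrap shcluster-captain -servers_list ... -auth ...` forms are as I recall them from the page linked above, so please verify them there for your version. The host names, the `admin:changeme` credentials, and the helper name `revert_to_dynamic_commands` are placeholders invented for the example.

```python
def revert_to_dynamic_commands(members, bootstrap_member, auth="admin:changeme"):
    """Return (node, argv) pairs in the order they would be run.

    members          -- management URIs of all remaining members (FQDNs)
    bootstrap_member -- the member to bootstrap, e.g. the old static captain
    auth             -- placeholder credentials, replace with real ones
    """
    if bootstrap_member not in members:
        raise ValueError("bootstrap_member must be one of members")

    # Re-enable elections on the non-captain members first, old captain last.
    ordered = [m for m in members if m != bootstrap_member] + [bootstrap_member]
    steps = []
    for uri in ordered:
        steps.append((uri, ["splunk", "edit", "shcluster-config",
                            "-election", "true", "-mgmt_uri", uri]))

    # Then bootstrap a captain on the chosen member, listing every member.
    steps.append((bootstrap_member, ["splunk", "bootstrap", "shcluster-captain",
                                     "-servers_list", ",".join(members),
                                     "-auth", auth]))
    return steps

if __name__ == "__main__":
    # Hypothetical host names; substitute your own fully qualified names.
    nodes = ["https://sh1.example.com:8089", "https://sh2.example.com:8089"]
    for node, argv in revert_to_dynamic_commands(nodes, nodes[0]):
        print(node, "->", " ".join(argv))
```

Each command is meant to be run on the node printed to the left of the arrow.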

One last thing, are all your saved searches failing, or only some? If it is only some, it could be because you now have fewer cores available for processing, which decreases the number of searches you can run concurrently.
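To put rough numbers on the core count point, here is a small sketch of the usual limits.conf arithmetic, using the default values as I understand them (`max_searches_per_cpu = 1`, `base_max_searches = 6`, `max_searches_perc = 50`). The 16-core member size, the rounding down, and the idea that the cluster-wide quota is simply the sum over members are assumptions made for illustration; check your own limits.conf and version.

```python
def max_concurrent_searches(cpu_cores, max_searches_per_cpu=1, base_max_searches=6):
    """Historical search limit for one search head."""
    return max_searches_per_cpu * cpu_cores + base_max_searches

def scheduler_quota(cpu_cores, max_searches_perc=50):
    """Share of the limit available to scheduled searches (rounded down)."""
    return max_concurrent_searches(cpu_cores) * max_searches_perc // 100

def cluster_scheduler_quota(cores_per_member):
    """Assumed cluster-wide quota: the sum of the per-member quotas."""
    return sum(scheduler_quota(cores) for cores in cores_per_member)

if __name__ == "__main__":
    # Hypothetical 16-core members: three nodes versus two.
    print(cluster_scheduler_quota([16, 16, 16]))
    print(cluster_scheduler_quota([16, 16]))
```

Under these assumptions, losing one of three equal members removes a third of the scheduled search capacity, which on its own could cause some, but not all, scheduled searches to be skipped.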

wegscd · ‎01-08-2016

I used the same static captain on both, and specified IP address (not my choice, the guy that set up the cluster did it that way).
