We have tried doing a rolling restart, but all members are stuck in their current state.
We are also seeing errors like this in the mongod.log file:
2016-01-15T02:01:54.263Z E REPL [conn201] replSet replSetInitiate failed; CannotInitializeNodeWithData 'test.machine.com:8191' has data already, cannot initiate set.
2016-01-15T02:01:54.263Z W REPL [ReplicationExecutor] 'test.machine.com:8191' has data already, cannot initiate set.
Verify that EACH member can reach all of the other members (you can use telnet host port to do that). We have seen several cases where just one member had wrong routing, and because of that KVStore could not elect a captain. One common cause is wrong records in /etc/hosts.
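If telnet is not installed, the same check can be scripted with bash's /dev/tcp pseudo-device. This is a minimal sketch; the MEMBERS list below is hypothetical — replace it with your actual SHC member host:port pairs, and run it from every member:

```shell
#!/bin/bash
# Hypothetical member list -- substitute your real SHC members and ports.
MEMBERS="sh1.example.com:8089 sh2.example.com:8089 sh3.example.com:8089"

for m in $MEMBERS; do
    host=${m%:*}
    port=${m#*:}
    # bash's /dev/tcp redirection attempts a TCP connection, like telnet
    if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "$m reachable"
    else
        echo "$m UNREACHABLE"
    fi
done
```

Any member reported UNREACHABLE by some (but not all) of its peers is a good candidate for a routing or /etc/hosts problem.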
KVStore depends on the SHC: if the SHC is in a bad state, KVStore will not work properly. So first verify that the SHC is available, that all members are listed, and that the SHC has an elected captain:
splunk show shcluster-status
Once you know which member is the SHC captain, it is best to start investigating the KVStore cluster from the captain.
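On the captain, the Splunk CLI can also report the KVStore cluster state directly. This is a CLI fragment, not a runnable script — it requires a running Splunk instance:

```shell
# Run on the SHC captain; shows each member's KVStore replication
# status (e.g. ready, startingUp, failed).
splunk show kvstore-status
```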
Then check two log files under $SPLUNK_HOME/var/log/splunk: mongod.log and splunkd.log. The first (mongod.log) can tell you why mongod could not start, or contain a crash log. The second (splunkd.log) can tell you why splunk could not launch mongod, or why it changed the KVStore status to failed.
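A minimal sketch for pulling the relevant lines out of mongod.log, assuming a default $SPLUNK_HOME of /opt/splunk (adjust the path for your install). In the mongod log format quoted above, the single letter after the timestamp is the severity (E = error, W = warning) and REPL is the replication component:

```shell
#!/bin/sh
# Path assumes a default Splunk install location; adjust as needed.
LOG="${SPLUNK_HOME:-/opt/splunk}/var/log/splunk/mongod.log"

if [ -f "$LOG" ]; then
    # Keep only replication errors/warnings, like the two lines quoted above
    grep -E ' (E|W) REPL ' "$LOG" | tail -n 20
else
    echo "log not found: $LOG"
fi
```

The same grep pattern, with REPL dropped, also works for a first pass over other mongod components (NETWORK, STORAGE, and so on).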