Hi all.
We are running an environment with an SHC (search head cluster).
After upgrading to v6.6.1, the SH started showing the following message:
"KV Store changed status to failed. Failed to establish communication with KVStore. See splunkd.log for details."
Any ideas what may be the problem?
regards,
GMA
Exactly the same issue here, and we realised that the problem is with mongod, not Splunk itself.
To solve this you have to keep an odd number of search head cluster members. We had 4 SHC members, so we reduced them to 3 by running
splunk remove shcluster-member
on one of the nodes.
Then stop Splunk on all of them and run the command below:
./splunk clean kvstore --local
Start Splunk again and the error should be gone.
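Putting it together, a rough sketch of the sequence we ran (paths assume a default $SPLUNK_HOME install and running as the Splunk user; adjust for your environment):

# On the member you are removing from the cluster:
$SPLUNK_HOME/bin/splunk remove shcluster-member

# Then on every remaining member, with splunkd stopped:
$SPLUNK_HOME/bin/splunk stop
$SPLUNK_HOME/bin/splunk clean kvstore --local
$SPLUNK_HOME/bin/splunk start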
For more details on why MongoDB requires an odd number of nodes, see the link below.
Hope this helps.
I'd be careful using this with Enterprise apps like ITSI and ES; it can erase all your relevant data. It's important to back up your KV Store and your Enterprise apps routinely, so in this scenario you could clean the store, restore the KV items, and then restore the JSON files backed up from your Enterprise app.
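For what it's worth, on pre-7.1 releases the usual KV Store backup is a cold copy of the kvstore directory while Splunk is stopped; this is only a sketch, and the destination path is just an example:

$SPLUNK_HOME/bin/splunk stop
# Copy the KV Store data directory somewhere safe (example destination):
cp -rp $SPLUNK_HOME/var/lib/splunk/kvstore /backups/kvstore-$(hostname)-$(date +%F)
$SPLUNK_HOME/bin/splunk start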
Can you also check the mongod.log?
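It sits next to splunkd.log, assuming a default install:

tail -f $SPLUNK_HOME/var/log/splunk/mongod.log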
Thanks for the help guys, and sorry for the delay in replying. It turns out we forgot to remove a few search heads from the master node after a migration and upgrade, so things were a little "muddled up". Once they were removed we did a rolling restart of the search head cluster and things seem to have returned to normal.
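For anyone else doing the same, the rolling restart can be kicked off from the captain with the standard command (run as the Splunk user):

$SPLUNK_HOME/bin/splunk rolling-restart shcluster-members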
Hi guys,
The exact same problem over here on 6.6.1. We have an SH cluster setup and I've managed to clear the issue from one search head by restarting the splunkd service, only to have it move over to another search head in the cluster! Rinse and repeat; the issue will only ever be on one search head at a time. I haven't observed any downstream impact, but the error is annoying!
Oscar
It gets moved to another search head because the issue resides on the search head captain. We are facing the same issue as well.
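If you want to confirm which member currently holds the captain role, you can run this from any member:

$SPLUNK_HOME/bin/splunk show shcluster-status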
What does the splunkd log say in detail?
06-29-2017 18:00:07.014 +0000 ERROR KVStoreBulletinBoardManager - KV Store changed status to failed. Failed to establish communication with KVStore. See splunkd.log for details.
06-29-2017 18:06:04.242 +0000 WARN SHPConfReplicationBookmarkFromKVStore - writeSHCStateToStateStore: failed to write SHC state to KVstore:
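In case it helps, the KV store's own view of its replica set can also be checked from the CLI:

$SPLUNK_HOME/bin/splunk show kvstore-status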