Getting Data In

KV Store initialization has not been completed yet in SHC

Mfmahdi
Explorer

Dear all,

The KV Store on our search head cluster was previously working fine. However, we are now unexpectedly encountering the error: "KV Store initialization has not been completed yet", and the KV Store status shows as "starting".

I attempted a rolling restart across the search heads, but the issue persists.
Kindly provide your support to resolve this issue.

 @gcusello 

@woodcock 

Thank you in advance.

[Screenshots attached: Capture out2.PNG, Capture out1.PNG]


livehybrid
Super Champion

Hi @Mfmahdi 

Please do not tag/call out specific users on here - there are plenty of people monitoring for new questions, and the users you have tagged have day jobs and other priorities, so you risk your question being missed.

To troubleshoot the KV Store initialization issue, start by checking the KV Store status from a search head and then examining the logs on the search head cluster members for specific errors:

| rest /services/kvstore/status
| fields splunk_server, current*

Then check on each SHC member:

ps -ef | grep mongod

# Check mongod logs for errors
tail -n 200 $SPLUNK_HOME/var/log/splunk/mongod.log

# Check splunkd logs for KV Store related errors
grep KVStore $SPLUNK_HOME/var/log/splunk/splunkd.log | tail -n 200

 

  1. Verify mongod Process: Ensure the mongod process, which underlies the KV Store, is running on the search head members. Use the ps command or your operating system's equivalent. If it's not running, investigate why using the logs.
  2. Check Cluster Health: Ensure the search head cluster itself is healthy using the Monitoring Console or the CLI command splunk show shcluster-status run from the captain (see the example commands after this list). KV Store issues can sometimes be symptomatic of underlying cluster communication problems. From your screenshot it looks like the KV Store is stuck in the starting state, so hopefully the logs shine some light on the issue.
  3. Check Resources: Verify sufficient disk space, memory, and CPU resources on the search head cluster members, particularly on the node currently acting as the KV Store primary.

  4. Focus on the error messages found in mongod.log and splunkd.log as they usually pinpoint the root cause (e.g., permissions, disk space, configuration errors, corrupted files).

  5. If the logs indicate corruption or persistent startup failures that restarts don't resolve, you may need to consider more advanced recovery steps, potentially involving Splunk Support.
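
For points 2-4, something like the following can help (a minimal sketch, assuming Linux hosts and the default $SPLUNK_HOME layout):

# Run on the SHC captain: overall cluster and member status
$SPLUNK_HOME/bin/splunk show shcluster-status

# Run on each member: KV Store status as Splunk sees it
$SPLUNK_HOME/bin/splunk show kvstore-status

# Quick resource checks (disk space and free memory)
df -h $SPLUNK_HOME
free -m

# Narrow the mongod log down to likely failure lines
grep -iE "error|fatal|assert" $SPLUNK_HOME/var/log/splunk/mongod.log | tail -n 50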

Useful docs which might help:

Splunk Docs: Troubleshoot the KV Store

Splunk Docs: About the KV Store

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing



Mfmahdi
Explorer

Hi @livehybrid,

Thank you for pointing that out to me.

Regarding your suggested solution: it was very helpful. I checked mongod.log using the following command:

tail -n 200 $SPLUNK_HOME/var/log/splunk/mongod.log

The output clearly showed the issue:

2025-03-27T10:16:32.087Z W  NETWORK  [main] Server certificate has no compatible Subject Alternative Name. This may prevent TLS clients from connecting
2025-03-27T10:16:32.087Z F  NETWORK  [main] The provided SSL certificate is expired or not yet valid.
2025-03-27T10:16:32.087Z F  -        [main] Fatal Assertion 28652 at src/mongo/util/net/ssl_manager_openssl.cpp 1182
2025-03-27T10:16:32.087Z F  -        [main] 
***aborting after fassert() failure
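
(As a quick side note, not part of the log output above: the certificate's expiry date can be confirmed directly with a standard openssl one-liner against the default Splunk certificate path.)

openssl x509 -enddate -noout -in $SPLUNK_HOME/etc/auth/server.pem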

It turned out that the server SSL certificate had expired. Here are the steps I took to resolve the issue:

1- Backed up the existing certificate:

cp $SPLUNK_HOME/etc/auth/server.pem $SPLUNK_HOME/etc/auth/server.pem.bak

2- Generated a new self-signed certificate:

splunk createssl server-cert -d $SPLUNK_HOME/etc/auth -n server

(This creates a new server.pem valid for 2 years.) 

3- Restarted Splunk:

./splunk restart

4- Verified KV Store status: 

splunk show kvstore-status

 

Note for Search Head Cluster:

Since we’re running an SH cluster, I made sure to:

  • Copy the new server.pem to all search head members (see the example below).

  • Restart Splunk on each node.
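
A minimal sketch of that distribution step (the hostname sh2 and the /opt/splunk path are placeholders for your own environment):

# Copy the regenerated certificate from this member to another member
scp $SPLUNK_HOME/etc/auth/server.pem splunk@sh2:/opt/splunk/etc/auth/server.pem

# Restart Splunk on that member so mongod picks up the new certificate
ssh splunk@sh2 "/opt/splunk/bin/splunk restart"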

These steps fully resolved the issue, and the KV Store is now functioning as expected.

 
