Installation

KVstore migration failed on some indexers after upgrade to 9.0.2?

jpillai
Path Finder

We recently upgraded our cluster from splunk 8.1.0.1 to splunk 9.0.2 and the KVstore on SH cluster were manually upgraded to WiredTiger.

We could see that cluster manager and peer nodes were automatically upgraded to WiredTiger mostly, however some indexer peers failed in this. Please find the related error messages from the mongodb.log below. 

 

Its not clear why exactly this happened. Is there a manual way to recover and migrate?

---------------
2023-01-12T08:44:16.890Z I STORAGE [initandlisten] exception in initAndListen: Location28662: Cannot start server. Detected data files in /usr/local/akamai/splunk/var/lib/splunk/kvstore/mongo created by the 'mmapv1' storage engine, but the specified storage engine was 'wiredTiger'., terminating
2023-01-12T08:44:16.890Z I REPL [initandlisten] Stepping down the ReplicationCoordinator for shutdown, waitTime: 10000ms
2023-01-12T08:44:16.890Z I NETWORK [initandlisten] shutdown: going to close listening sockets...
2023-01-12T08:44:16.890Z I NETWORK [initandlisten] Shutting down the global connection pool
2023-01-12T08:44:16.890Z I - [initandlisten] Killing all operations for shutdown
2023-01-12T08:44:16.890Z I NETWORK [initandlisten] Shutting down the ReplicaSetMonitor
2023-01-12T08:44:16.890Z I CONTROL [initandlisten] Shutting down free monitoring
2023-01-12T08:44:16.890Z I FTDC [initandlisten] Shutting down full-time data capture
2023-01-12T08:44:16.890Z I STORAGE [initandlisten] Shutting down the HealthLog
2023-01-12T08:44:16.890Z I - [initandlisten] Dropping the scope cache for shutdown
2023-01-12T08:44:16.890Z I CONTROL [initandlisten] now exiting
2023-01-12T08:44:16.890Z I CONTROL [initandlisten] shutting down with code:100
---------------

Cluster details – splunk multisite
-------------------------
4 SH (site1)
4 SH (site2)
11 IDX (site1)
11 IDX (site2)
Master1(Site1)
Master2(Standby at site2)

Labels (3)
0 Karma

Gregski11
Contributor

please correct me if I am wrong, but I thought per Splunks documentation recommendation they recommend we run the KV Store on Search Heads and not on Peers aka Indexers, I know this does not solve your problem, but maybe it doesn't have to, maybe just run the KV Store on the Search Heads   

0 Karma

jpillai
Path Finder

Hey, as far as I understand -  we could choose to disable KVstore on non-SH machines, but there is nothing really stopping us from running it on other machines. They will be single instance KVstores and is enabled by default on all instances.

In my case, it worked fine on most indexer peers and its just 3 out of 20 indexers that has this issue.

0 Karma

Gregski11
Contributor

"In my case, it worked fine on most indexer peers and its just 3 out of 20 indexers that has this issue." 

Ah yes, that sounds like Splunk, ha ha 

here's what I would do:

stop Splunk
delete everything from the splunkd.log
start Splunk
open up the newly populated splunkd.log

start at the top and take your time going through it line by line just quickly glancing what it is doing until it gets to the KVStore part, my money is on a certificate issue 

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...