Hello.
Another great problem.
I tested the upgrade on a clean install, from 9.1.0 (empty, default configuration) to 9.4.4, and everything worked fine.
KVSTORE went from v4 to v7 and is running, after going through its own step-by-step upgrade.
Now I did the same upgrade on a production instance, and there's no way to upgrade KVSTORE 🤤
==> BEFORE THE UPDATE 9.1.x
This member:
backupRestoreStatus : Ready
date : xxx
dateSec : 1761289316.9
disabled : 0
featureCompatibilityVersion : 4.2
guid : xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
oplogEndTimestamp : xxx
oplogEndTimestampSec : xxx
oplogStartTimestamp : xxx
oplogStartTimestampSec : xxx
port : 8191
replicaSet : xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
replicationStatus : KV store captain
standalone : 1
status : ready
storageEngine : wiredTiger
KV store members:
127.0.0.1:8191
configVersion : 1
electionDate : xxx
electionDateSec : xxx
hostAndPort : 127.0.0.1:8191
optimeDate : xxx
optimeDateSec : xxx
replicationStatus : KV store captain
serverVersion : 4.2.17
uptime : 347
==> AFTER THE UPDATE TO 9.4.4
Splunk Enterprise 9.4 and higher no longer support KV store server version 4.2. Upgrade to KV store server version 7.0 for continued support and security, and to comply with Splunk Support Policy. See https://docs.splunk.com/Documentation/Splunk/latest/Admin/MigrateKVstore in the Admin manual to plan your upgrade.
This member:
backupRestoreStatus : Ready
disabled : 0
featureCompatibilityVersion : An error occurred during the last operation ('getParameter', domain: '15', code: '13053'): No suitable servers found: `serverSelectionTimeoutMS` expired: [Failed to read 4 bytes: socket error or timeout]
guid : xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
port : 8191
standalone : 1
status : starting
storageEngine : wiredTiger
versionUpgradeInProgress : 0
Thanks.
Maybe... I say maybe... I found a trick to make the instance start with a KVSTORE that starts and upgrades from v4 to v5 > v6 > v7.
I have many personal etc directories full of my configs, from many instances, and I need them.
But the problem seems related to the mongodb/Splunk certificates, so I regenerated all of them from scratch. The mongodb log showed:
NETWORK [listener] connection accepted from 127.0.0.1:34164 #517 (1 connection now open)
NETWORK [conn517] SSL peer certificate validation failed: certificate has expired
NETWORK [ReplicaSetMonitor-TaskExecutor] SSL peer certificate validation failed: certificate has expired
NETWORK [conn517] Error receiving request from client: SSLHandshakeFailed: SSL peer certificate validation failed: certificate has expired. Ending connection from 127.0.0.1:34164 (connection id: 517)
NETWORK [conn517] end connection 127.0.0.1:34164 (0 connections now open)
So I tried:
/opt/splunk/bin/splunk stop
cd /opt/splunk/etc/auth
rm -rf $(ls -1 | grep -Ev '^splunk\.secret$')
cd /opt
tar -xf splunk-9.4.5-8fb2a6c586a5-linux-amd64.tgz
/opt/splunk/bin/splunk restart --accept-license --answer-yes
/opt/splunk/bin/splunk show kvstore-status --verbose
Doing this on an instance stuck at
[Failed to read 4 bytes: socket error or timeout]
I was finally able to start the instance, and KVSTORE started and upgraded from v4 to v7.
Strange... very strange... 🤔
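Since the log complains about an expired certificate, it may be worth checking the certificates before wiping etc/auth. Here is a minimal sketch using openssl; cert_status is a made-up helper name (not a Splunk command), and the paths in the usage note are only the usual defaults, so adjust them to your install.

```shell
#!/bin/sh
# cert_status is a hypothetical helper name, not a Splunk command.
# Usage: cert_status <pem-file> [seconds-of-look-ahead]
# Prints one of: missing, unparseable, expiring, ok
cert_status() {
    cert="$1"
    window="${2:-0}"    # 0 means "is it expired right now?"

    if [ ! -r "$cert" ]; then
        echo "missing"
        return 2
    fi
    # First make sure the file contains a certificate at all.
    if ! openssl x509 -noout -in "$cert" >/dev/null 2>&1; then
        echo "unparseable"
        return 2
    fi
    # -checkend exits 0 when the certificate is still valid after $window seconds.
    if openssl x509 -noout -checkend "$window" -in "$cert" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "expiring"
    fi
}
```

For example, cert_status /opt/splunk/etc/auth/server.pem 2592000 would report whether that file expires within the next 30 days, assuming the certificate lives at that path on your instance.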
No way to get rid of the error
featureCompatibilityVersion : An error occurred during the last operation ('getParameter', domain: '15', code: '13053'): No suitable servers found: `serverSelectionTimeoutMS` expired: [Failed to read 4 bytes: socket error or timeout]
after starting 9.4.4 / 9.4.5 over a 9.1.9 🤔
At this point I can only think of mongodb problems with starting and upgrading, maybe hardware compatibility (AVX is reported as supported 😐).
Every attempt to start the 9.4.x instance leaves the kvstore/mongodb stuck at
[Failed to read 4 bytes: socket error or timeout]
I'm staying on version 9.3.6; I can't get past it 😔
After playing with KVSTORE for a while, I found another way to get KVSTORE v7 to start 🤔
This time the mongodb error was different:
Failed to connect to target host: 127.0.0.1:8191
---
This member:
backupRestoreStatus : Ready
disabled : 0
featureCompatibilityVersion : An error occurred during the last operation ('getParameter', domain: '15', code: '13053'): No suitable servers found: `serverSelectionTimeoutMS` expired: [Failed to connect to target host: 127.0.0.1:8191]
guid : xxx
port : 8191
standalone : 1
status : failed
storageEngine : wiredTiger
versionUpgradeInProgress : 0
But I got rid of this one by:
- stopping the Splunk instance
- running "splunk clean kvstore --local" (or "mv $SPLUNK_HOME/var/lib/splunk/kvstore/mongo/ $SPLUNK_HOME/var/lib/splunk/kvstore/mongo.old")
- restarting the instance
Now:
This member:
backupRestoreStatus : Ready
date : x
dateSec : x
disabled : 0
featureCompatibilityVersion : 7.0
guid : xxx
oplogEndTimestamp : x
oplogEndTimestampSec : x
oplogStartTimestamp : x
oplogStartTimestampSec : x
port : 8191
replicaSet : xxx
replicationStatus : KV store captain
standalone : 1
status : ready
storageEngine : wiredTiger
versionUpgradeInProgress : 0
KV store members:
127.0.0.1:8191
configVersion : 1
electionDate : x
electionDateSec : x
hostAndPort : 127.0.0.1:8191
optimeDate : x
optimeDateSec : x
replicationStatus : KV store captain
serverVersion : 7.0.14
uptime : 22
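Keep in mind that cleaning or moving the mongo directory leaves you with an empty KV store, so whatever collections were in there are gone unless you keep the old directory. A rough sketch of the "mv" variant that refuses to overwrite an earlier copy; set_aside_kvstore is a hypothetical helper name, and Splunk should be stopped before calling it.

```shell
#!/bin/sh
# set_aside_kvstore is a hypothetical helper, not a Splunk command.
# It renames the given KV store data directory to <dir>.old.<timestamp>
# and prints the new path. Stop Splunk before running it.
set_aside_kvstore() {
    mongo_dir="${1%/}"    # drop a trailing slash if present

    if [ ! -d "$mongo_dir" ]; then
        echo "no such directory: $mongo_dir" >&2
        return 1
    fi

    backup="$mongo_dir.old.$(date +%Y%m%d%H%M%S)"
    if [ -e "$backup" ]; then
        echo "backup already exists: $backup" >&2
        return 1
    fi

    mv "$mongo_dir" "$backup" && echo "$backup"
}
```

Called as set_aside_kvstore "$SPLUNK_HOME/var/lib/splunk/kvstore/mongo", this keeps the old data around in case you need to go back.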
I got rid of the issue, but only with a temporary trick:
- on the original 9.1.x instance, completely delete $SPLUNK_HOME/etc/auth
- upgrade the original 9.1.x instance to 9.4.4
- fix all the encrypted SSL passwords and pass4SymmKey values in server.conf (and in other configs if necessary)
But that's not a good way to reach the goal! 🤔
Maybe something changed between versions, I really don't know. Or I have some cert that isn't OK for KVSTORE! 🤷 As far as I know, many people have reported these errors when upgrading from 9.x to 9.4.x.
Has anybody else here had the same problem?
How do I do a clean upgrade from 9.1.x to 9.4.x? I also tried 9.1 > 9.2 > 9.3 > 9.4, no way!
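Deleting etc/auth is destructive, so here is a sketch that archives the directory first. It follows the variant from the earlier post that keeps splunk.secret (without it, the encrypted values in the configs can no longer be decrypted and have to be re-entered). reset_auth_dir and the archive path are my own illustrative names, not anything Splunk provides, and it assumes a find that supports -mindepth/-maxdepth (GNU or BSD).

```shell
#!/bin/sh
# reset_auth_dir is a hypothetical helper, not a Splunk command.
# Usage: reset_auth_dir <auth-dir> <archive.tgz>
# Archives the whole auth directory, then removes everything in it
# except splunk.secret. Stop Splunk before running it.
reset_auth_dir() {
    auth_dir="${1%/}"
    archive="$2"

    if [ ! -d "$auth_dir" ]; then
        echo "no such directory: $auth_dir" >&2
        return 1
    fi

    # Archive first; do not delete anything if the backup fails.
    tar -czf "$archive" -C "$(dirname "$auth_dir")" "$(basename "$auth_dir")" || return 1

    # Remove top-level entries (files and subdirectories) except splunk.secret.
    find "$auth_dir" -mindepth 1 -maxdepth 1 ! -name splunk.secret -exec rm -rf {} +
}
```

For example: reset_auth_dir /opt/splunk/etc/auth /tmp/auth-backup.tgz, then start Splunk so it can regenerate its default certificates.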
FYI
I also tried installing a clean 9.4.4 instance and running it. All fine.
After copying my old etc configs:
This member:
backupRestoreStatus : Ready
disabled : 0
featureCompatibilityVersion : An error occurred during the last operation ('getParameter', domain: '15', code: '13053'): No suitable servers found: `serverSelectionTimeoutMS` expired: [Failed to read 4 bytes: socket error or timeout]
guid : xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
port : 8191
standalone : 1
status : starting
storageEngine : wiredTiger
versionUpgradeInProgress : 0
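To compare the status blocks in this thread without reading them line by line, a small filter can pull out a single field. This is only a sketch: kv_field is a made-up name, and it assumes the "key : value" layout shown in the outputs above.

```shell
#!/bin/sh
# kv_field is a hypothetical helper, not a Splunk command.
# Reads "splunk show kvstore-status" style text on stdin and prints the
# value of the first line whose key matches exactly.
kv_field() {
    awk -v key="$1" -F' : ' '
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)    # trim whitespace around the key
        }
        k == key {
            sub(/^[^:]*: /, "")               # drop "key : ", keep the whole value
            print
            exit
        }
    '
}
```

For example, /opt/splunk/bin/splunk show kvstore-status --verbose | kv_field status prints just the status value, and the same works for serverVersion or featureCompatibilityVersion.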