After upgrading to Splunk Enterprise 6.5.0, the KV Store will not start. On my indexers I see:
10/5/2016, 5:44:56 AM:
Search peer indexer01.domain.local has the following message: Failed to start KV Store process. See mongod.log and splunkd.log for details.
In splunkd.log I find:
10-05-2016 05:44:56.087 +0000 ERROR MongodRunner - mongod exited abnormally (exit code 14, status: exited with code 14) - look at mongod.log to investigate.
Looking in the mongod.log I find:
2016-10-05T05:44:56.753Z W CONTROL No SSL certificate validation can be performed since no CA file has been provided; please specify an sslCAFile parameter
2016-10-05T05:44:56.761Z F NETWORK The provided SSL certificate is expired or not yet valid.
2016-10-05T05:44:56.761Z I - Fatal Assertion 28652
2016-10-05T05:44:56.761Z I -
***aborting after fassert() failure
How can this be resolved?
This can happen if the cert used by Splunkd to talk to Mongod has expired. Verify your certs are valid. For example, to validate the expiration date for server.pem you can run:
From $SPLUNK_HOME/etc/auth/
openssl x509 -enddate -noout -in ./server.pem
Results:
notAfter=Dec 10 14:17:25 2015 GMT
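If you maintain more than one certificate under $SPLUNK_HOME/etc/auth, a quick loop like the one below (a minimal sketch for a POSIX shell; adjust the path for your install) reports the expiry of every .pem file at once:
cd $SPLUNK_HOME/etc/auth   # assumes the default auth directory
for f in *.pem; do printf '%s: ' "$f"; openssl x509 -enddate -noout -in "$f" 2>/dev/null || echo "not an x509 certificate"; done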
In the example above, the notAfter date has already passed, so the cert is expired. If you want to create a new cert, you can look at splunk createssl:
$SPLUNK_HOME/bin/splunk help createssl
An example:
$SPLUNK_HOME/bin/splunk createssl server-cert -d $SPLUNK_HOME/etc/auth -n server -c cn.domain.com -l 2048
Simply adjust for your environment requirements/settings. Once the new cert is in place, you can test to confirm it is valid:
From $SPLUNK_HOME/etc/auth/
openssl x509 -enddate -noout -in ./server.pem
Results:
notAfter=Aug 22 15:30:45 2019 GMT
If it is now valid, restart Splunk and validate if KVStore is running:
ps -ax | grep mongod
26108 ? Ssl 62:11 mongod --dbpath=/opt/splunk/var/lib/splunk/kvstore/mongo --port=8191 --timeStampFormat=iso8601-utc --smallfiles --oplogSize=200 --keyFile=/opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key --setParameter=enableLocalhostAuthBypass=0 --replSet=50D25A40-7DD2-4017-A223-732705AD4A96 --sslAllowInvalidHostnames --sslMode=preferSSL --sslPEMKeyFile=/opt/splunk/etc/auth/server.pem --sslPEMKeyPassword=xxxxxxxx --nounixsocket
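On Windows, an equivalent check (assuming a standard install, where the KV Store process runs as mongod.exe) would be:
REM assumes the KV Store process is named mongod.exe
tasklist /FI "IMAGENAME eq mongod.exe"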
And also:
$SPLUNK_HOME/bin/splunk _internal call /services/server/info |grep -i kvstore
<s:key name="kvStoreStatus">ready</s:key>
A third way to check is to run the following search from the command line on the instance where you have the Distributed Management Console / Monitoring Console configured.
$SPLUNK_HOME/bin/splunk search "| rest /services/server/info splunk_server=* | fields splunk_server, kvStoreStatus"
splunk_server kvStoreStatus
------------------------- -------------
indexer01.domain.com ready
indexer02.domain.com ready
indexer03.domain.com ready
indexer04.domain.com ready
indexer05.domain.com ready
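One more thing worth checking after replacing the certificate: splunkd typically launches mongod with the server certificate configured under [sslConfig] in server.conf (you can see the path in the --sslPEMKeyFile argument above), so if you use a custom certificate location, make sure that stanza points at the new file and that the password matches. An illustrative stanza with placeholder values:
[sslConfig]
# serverCert and sslPassword are standard server.conf settings; the values below are placeholders
serverCert = $SPLUNK_HOME/etc/auth/server.pem
sslPassword = <your certificate password>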
Be careful.
I had this issue recently because of an expired certificate on an ITSI server, but after creating a new one and restarting Splunk, all my objects in ITSI were gone (from services to entities), as if it were a fresh ITSI installation.
This was on Splunk v7.2.4.2 + ITSI v4.1.2
I corrected it by generating a new certificate:
/opt/splunk/bin/splunk createssl server-cert -l 3072 -d /opt/splunk/etc/auth/ -n server -c {your name here}
restart Splunk
went from:
... [main] The provided SSL certificate is expired or not yet valid.
... [main] Fatal Assertion 28652 at src/mongo/util/net/ssl_manager.cpp 1120
to
... Successfully authenticated as principal __system on local
I ran into the issue with the KV Store not starting after upgrading my indexers from 6.4.7 to 6.6.4. Shout out to @sgarvin55 for his answer. In my server.conf on my indexers I had the following setting:
sslVersions = tls1.2, -ssl3, -ssl2, -ssl1
I removed that setting and restarted Splunk. KVstore started normally.
It appears the default is now tls1.2 for Splunk 6.6.3 and 6.6.4.
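If you want to confirm which sslVersions value is actually in effect (and which configuration file sets it), one option is btool:
$SPLUNK_HOME/bin/splunk btool server list sslConfig --debug | grep -i sslVersions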
I had the same on a Windows server with default certs: I just stopped Splunk, moved the default server.pem aside, and started Splunk. This forces regeneration of the default certificates, which solved the problem here.
I tried stopping Splunk, moved the default server.pem out, and started Splunk. It created a new server.pem, but the KV Store still shows as failed. Running Splunk version 8.1 on Windows Server 2016. Any ideas to resolve the KV Store failure?
I wanted to add another gotcha I figured out after running through this answer. In versions prior to 6.5.2 (specifically 6.5.0 and 6.5.1, as tested) the following setting worked, but it now fails in 6.5.2. Customers had it in their server.conf after upgrade:
[sslConfig]
sslVersions = "tls"
Customers who upgraded to 6.5.2 found that the KV Store wouldn't start, but splunkd.log didn't show any syntax issues. Only mongod.log indicated something was wrong ("unknown protocol").
splunkd.log
INFO loader - Server supporting SSL versions SSL3,TLS1.0,TLS1.1,TLS1.2
mongod.log
E NETWORK [conn893] SSL: error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
D NETWORK [conn893] SocketException: remote: 127.0.0.1:37578 error: 9001 socket exception [CONNECT_ERROR]
Removing the double quotes from "tls" fixed the above issue. So the correct syntax is now:
[sslConfig]
sslVersions = tls
A bug has been created - SPL-138443
What server.conf are you referring to? I'm not seeing sslVersions = "tls" in the server.conf in /opt/splunk/etc/system/default or /opt/splunk/etc/system/local.
I have the same issue after the update, but I get these errors in mongod.log:
2017-02-06T06:39:59.168Z E NETWORK [conn180] SSL: error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
2017-02-06T06:39:59.168Z I NETWORK [conn180] end connection 127.0.0.1:33592 (0 connections now open)
2017-02-06T06:39:59.175Z I NETWORK [initandlisten] connection accepted from 127.0.0.1:33593 #181 (1 connection now open)
2017-02-06T06:39:59.175Z E NETWORK [conn181] SSL: error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
2017-02-06T06:39:59.175Z I NETWORK [conn181] end connection 127.0.0.1:33593 (0 connections now open)
If you think it's not the expired cert issue, check the permissions of the KV Store key file as below. I had the same problem. (Copied from ndoshi's comment below for better visibility.)
To be precise, the file is /opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key. Doing a chmod 400 splunk.key did the trick.
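For reference, a minimal sequence to check and fix the permissions (assuming Splunk runs as the splunk user; adjust the user/group and path for your environment):
ls -l /opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key
chown splunk:splunk /opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key   # assumes Splunk runs as the splunk user
chmod 400 /opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key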
It worked for me, thanks
In my case, it's neither an expired certificate nor a permissions issue. I'm still getting the error; is there any other workaround?
I'd open a case with support. This thread was the first thing they linked me.
I am facing the same error on a freshly installed Splunk 6.5.0 on Windows 10 x64. My install is a standalone trial for exploring new features and does not have a commercial Enterprise license or a domain environment.
Any idea whether this error has major implications or is safe to ignore in this case?
Appreciate the help.
Regards, Mitesh.
I have followed along up to checking my cert, and it is still valid:
[root@splunk auth]# openssl x509 -enddate -noout -in ./server.pem
notAfter=Apr 6 10:27:22 2023 GMT
Both checks failed:
/opt/splunk/bin/splunk _internal call /services/server/info |grep -i kvstore
<s:key name="kvStoreStatus">failed</s:key>
/opt/splunk/bin/splunk search "| rest /services/server/info splunk_server=* | fields splunk_server, kvStoreStatus"
splunk_server kvStoreStatus
-------------------------------- -------------
splunk.****.com failed
Any help is greatly appreciated ....
This is it! Finally found the right solution. Thanks so much!
This was exactly my issue start to finish. If you're on Windows, openssl is in the Splunk bin folder.
thanks for this!
Another option is to change the name of the current server.pem file in $SPLUNK_HOME/etc/auth and restart Splunk. Splunk will generate a new certificate upon start-up.
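For example, on Linux (illustrative commands; rename rather than delete so you keep a backup of the old certificate):
$SPLUNK_HOME/bin/splunk stop
mv $SPLUNK_HOME/etc/auth/server.pem $SPLUNK_HOME/etc/auth/server.pem.bak   # keep the old cert as a backup
$SPLUNK_HOME/bin/splunk start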
This worked for me on version 8.x recently. Thanks.