After upgrading to 6.5.0, KV Store will not start

jcrabb_splunk
Splunk Employee

After upgrading to Splunk Enterprise 6.5.0, the KV Store will not start. On my indexers I see:

10/5/2016, 5:44:56 AM:

Search peer indexer01.domain.local has the following message: Failed to start KV Store process. See mongod.log and splunkd.log for details.

In splunkd.log I find:

10-05-2016 05:44:56.087 +0000 ERROR MongodRunner - mongod exited abnormally (exit code 14, status: exited with code 14) - look at mongod.log to investigate.

Looking in the mongod.log I find:

2016-10-05T05:44:56.753Z W CONTROL  No SSL certificate validation can be performed since no CA file has been provided; please specify an sslCAFile parameter
2016-10-05T05:44:56.761Z F NETWORK  The provided SSL certificate is expired or not yet valid.
2016-10-05T05:44:56.761Z I -        Fatal Assertion 28652
2016-10-05T05:44:56.761Z I -
***aborting after fassert() failure

How can this be resolved?

Jacob
Sr. Technical Support Engineer

jcrabb_splunk
Splunk Employee

This can happen if the cert used by Splunkd to talk to Mongod has expired. Verify your certs are valid. For example, to validate the expiration date for server.pem you can run:

From $SPLUNK_HOME/etc/auth/

openssl x509 -enddate -noout -in ./server.pem

Results:

notAfter=Dec 10 14:17:25 2015 GMT

In the example above, the cert is expired. If you want to create a new cert you can look at splunk createssl:

$SPLUNK_HOME/bin/splunk help createssl

An example:

$SPLUNK_HOME/bin/splunk createssl server-cert -d $SPLUNK_HOME/etc/auth -n server -c cn.domain.com -l 2048

Simply adjust for your environment requirements/settings. Once the new cert is in place, you can test to confirm it is valid:

From $SPLUNK_HOME/etc/auth/

openssl x509 -enddate -noout -in ./server.pem

Results:

notAfter=Aug 22 15:30:45 2019 GMT
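For a scripted version of this check, openssl's -checkend option sets its exit status according to whether the certificate expires within the given number of seconds. The following is only a minimal sketch and not part of Splunk; the function name cert_status is made up for illustration.

```shell
#!/bin/sh
# Sketch only: cert_status is a hypothetical helper, not a Splunk command.
# Prints "missing", "valid", or "expired" for the PEM file given as $1.
# A file that openssl cannot parse is also reported as "expired".
cert_status() {
    if [ ! -r "$1" ]; then
        echo "missing"
    # -checkend 0 exits 0 if the cert is still valid right now, non-zero otherwise
    elif openssl x509 -checkend 0 -noout -in "$1" >/dev/null 2>&1; then
        echo "valid"
    else
        echo "expired"
    fi
}

# Example call (path assumes the server.pem location used above):
# cert_status "$SPLUNK_HOME/etc/auth/server.pem"
```

Passing a larger value, for example -checkend 2592000, would flag a cert that expires within the next 30 days.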

If it is now valid, restart Splunk and verify that the KV Store is running:

ps -ax | grep mongod

26108 ?        Ssl   62:11 mongod --dbpath=/opt/splunk/var/lib/splunk/kvstore/mongo --port=8191 --timeStampFormat=iso8601-utc --smallfiles --oplogSize=200 --keyFile=/opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key --setParameter=enableLocalhostAuthBypass=0 --replSet=50D25A40-7DD2-4017-A223-732705AD4A96 --sslAllowInvalidHostnames --sslMode=preferSSL --sslPEMKeyFile=/opt/splunk/etc/auth/server.pem --sslPEMKeyPassword=xxxxxxxx --nounixsocket

And also:

$SPLUNK_HOME/bin/splunk _internal call /services/server/info |grep -i kvstore

<s:key name="kvStoreStatus">ready</s:key>
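If you want just the status value, for example to use in a monitoring script, the XML line can be reduced with sed. This is a sketch under the assumption that the output keeps the single-line form shown above; kvstore_status is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: kvstore_status is a hypothetical helper name.
# Reads server/info XML on stdin and prints the kvStoreStatus value
# (for example "ready" or "failed"). Prints nothing if the key is absent.
kvstore_status() {
    sed -n 's/.*<s:key name="kvStoreStatus">\([^<]*\)<\/s:key>.*/\1/p'
}

# Example call, reusing the command shown above:
# "$SPLUNK_HOME/bin/splunk" _internal call /services/server/info | kvstore_status
```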

A third way to check is to run the following search from the command line on the instance where the "Distributed Management Console / Monitoring Console" is configured:

$SPLUNK_HOME/bin/splunk search "| rest /services/server/info splunk_server=* | fields splunk_server, kvStoreStatus"

          splunk_server       kvStoreStatus
    ------------------------- -------------
    indexer01.domain.com      ready
    indexer02.domain.com      ready
    indexer03.domain.com      ready
    indexer04.domain.com      ready
    indexer05.domain.com      ready
Jacob
Sr. Technical Support Engineer

alemarzuTM
Explorer

Be careful.

I had this issue recently because of an expired certificate on an ITSI server, but after creating a new one and restarting Splunk, all my ITSI objects were gone (from services to entities), as on a fresh ITSI installation.

This was on Splunk v7.2.4.2 + ITSI v4.1.2

Michael
Contributor

I corrected it by generating a new key:
/opt/splunk/bin/splunk createssl server-cert 3072 -d /opt/splunk/etc/auth/ -n server -c {your name here}

Then I restarted Splunk.

mongod.log went from:

tail mongod.log

... [main] The provided SSL certificate is expired or not yet valid.
... [main] Fatal Assertion 28652 at src/mongo/util/net/ssl_manager.cpp 1120

to

tail mongod.log

... Successfully authenticated as principal __system on local

ryanhast
Explorer

I ran into the issue of the KV Store not starting after upgrading my indexers from 6.4.7 to 6.6.4. Shout out to @sgarvin55 for his answer. In server.conf on my indexers I had the following setting:
sslVersions = tls1.2, -ssl3, -ssl2, -ssl1

I removed that setting and restarted Splunk. KVstore started normally.
It appears the default is now tls1.2 for Splunk 6.6.3 and 6.6.4.

claudio_manig
Communicator

I had the same issue on a Windows server with default certs. I stopped Splunk, moved the default server.pem aside, and started Splunk again. This forces regeneration of the default certs, which solved the problem here.

sgarvin55
Splunk Employee

I wanted to add another gotcha I figured out after working through this Answers post. The following setting worked in versions prior to 6.5.2 (tested on 6.5.0 and 6.5.1) but now fails in 6.5.2. Customers had this in their server.conf after the upgrade:

[sslConfig]
sslVersions = "tls"

Customers who upgraded to 6.5.2 found that the KV Store wouldn't start, but splunkd.log didn't show any syntax issues. Only mongod.log indicated that something was wrong: "unknown protocol".

splunkd.log
INFO loader - Server supporting SSL versions SSL3,TLS1.0,TLS1.1,TLS1.2

mongod.log
E NETWORK [conn893] SSL: error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
D NETWORK [conn893] SocketException: remote: 127.0.0.1:37578 error: 9001 socket exception [CONNECT_ERROR]

Removing the double quotes from "tls" fixed the above issue. So the correct syntax is now:

[sslConfig]
sslVersions = tls
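To find where a quoted value is set, a grep over the candidate server.conf files is usually enough. The following is a minimal sketch; find_quoted_sslversions is a hypothetical helper name, and the paths in the example call are the common system locations, which may not be the only ones in your deployment.

```shell
#!/bin/sh
# Sketch only: find_quoted_sslversions is a hypothetical helper name.
# Prints "file:line:text" for every sslVersions line whose value starts
# with a double quote. Passing /dev/null as an extra file makes grep
# print the file name even when only one file is given.
find_quoted_sslversions() {
    grep -n -E '^[[:space:]]*sslVersions[[:space:]]*=[[:space:]]*"' "$@" /dev/null
}

# Example call (paths are assumptions about where server.conf may live):
# find_quoted_sslversions "$SPLUNK_HOME"/etc/system/local/server.conf \
#                         "$SPLUNK_HOME"/etc/apps/*/local/server.conf
```

To see which file supplies the effective value, "$SPLUNK_HOME/bin/splunk cmd btool server list sslConfig --debug" should list each setting together with the file it comes from.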

A bug has been created - SPL-138443

bport15
Path Finder

What server.conf are you referring to? I'm not seeing this "sslVersion = "tls"" in the server.conf in /opt/splunk/etc/system/default or /opt/splunk/etc/system/local.

vinchakov_a
Path Finder

I have the same issue after the update, but I get these errors in mongod.log:
2017-02-06T06:39:59.168Z E NETWORK [conn180] SSL: error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
2017-02-06T06:39:59.168Z I NETWORK [conn180] end connection 127.0.0.1:33592 (0 connections now open)
2017-02-06T06:39:59.175Z I NETWORK [initandlisten] connection accepted from 127.0.0.1:33593 #181 (1 connection now open)
2017-02-06T06:39:59.175Z E NETWORK [conn181] SSL: error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol
2017-02-06T06:39:59.175Z I NETWORK [conn181] end connection 127.0.0.1:33593 (0 connections now open)

sylim_splunk
Splunk Employee

If you don't think an expired cert is the cause, check the permissions of the KV Store key file, as below. I had the same issue.
(Copied from ndoshi's comment below for better visibility.)
To be precise, the file is /opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key. Doing a chmod 400 splunk.key did the trick.
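A quick way to check the mode before changing anything is sketched here. It assumes, based only on the fix described above, that 400 is the mode to aim for; key_mode_ok is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: key_mode_ok is a hypothetical helper name.
# Succeeds when the regular file given as $1 has permission bits of exactly 400.
key_mode_ok() {
    # find's -perm with a bare octal mode matches only that exact mode;
    # -prune keeps find from descending if $1 happens to be a directory.
    [ -n "$(find "$1" -prune -type f -perm 400 2>/dev/null)" ]
}

# Example call (path taken from the post above):
# key_mode_ok /opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key || echo "mode is not 400"
```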

fab73
Path Finder

It worked for me, thanks

damode
Motivator

In my case, it's neither an expired certificate nor the permission issue. I'm still getting the error. Is there any other workaround?

tmontney
Builder

I'd open a case with support. This thread was the first thing they linked me.

miteshp250283
Path Finder

I am facing the same error on a freshly installed Splunk 6.5.0 on Windows 10 x64. My install is a standalone trial for exploring new features and does not have a commercial Enterprise license or a domain environment.

Any idea whether this error has major implications or is safe to ignore in this case?

Appreciate the help.

Regards, Mitesh.

peterkn
Explorer

I followed the steps up to checking my cert, and it is still valid:

[root@splunk auth]# openssl x509 -enddate -noout -in ./server.pem
notAfter=Apr  6 10:27:22 2023 GMT

Both checks failed:

/opt/splunk/bin/splunk _internal call /services/server/info |grep -i kvstore
<s:key name="kvStoreStatus">failed</s:key>

/opt/splunk/bin/splunk search "| rest /services/server/info splunk_server=* | fields splunk_server, kvStoreStatus"
         splunk_server           kvStoreStatus
-------------------------------- -------------
splunk.****.com failed

Any help is greatly appreciated ....

damode
Motivator

This is it! Finally found the right solution. Thanks so much!

tmontney
Builder

This was exactly my issue start to finish. If you're on Windows, openssl is in the splunk/bin folder.

chrisfrigo
Path Finder

thanks for this!

jmaple_splunk
Splunk Employee

Another option is to change the name of the current server.pem file in $SPLUNK_HOME/etc/auth and restart Splunk. Splunk will generate a new certificate upon start-up.
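A sketch of that rename step follows, assuming the default server.pem is the one in use. Given the report earlier in this thread of ITSI objects disappearing after a certificate change, it keeps the old file as a timestamped backup rather than deleting it. rotate_server_pem is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: rotate_server_pem is a hypothetical helper name.
# $1 is the Splunk install directory. Renames etc/auth/server.pem to a
# timestamped backup in the same directory and prints the backup path.
rotate_server_pem() {
    auth_dir="$1/etc/auth"
    if [ ! -f "$auth_dir/server.pem" ]; then
        echo "no server.pem in $auth_dir" >&2
        return 1
    fi
    backup="$auth_dir/server.pem.bak.$(date +%Y%m%d%H%M%S)"
    mv "$auth_dir/server.pem" "$backup" && echo "$backup"
}

# Example sequence (run as the user that owns the Splunk install):
# "$SPLUNK_HOME/bin/splunk" stop
# rotate_server_pem "$SPLUNK_HOME"
# "$SPLUNK_HOME/bin/splunk" start
```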

matthew_hess
Engager

Thanks, worked like a champ!

benwilinski
New Member

This is the fix. Thank you.
