6.2.1 : Search head cluster getting stuck on restart

Lucas_K
Motivator

I've noticed this on 2 different versions of 6.2 now.

When I do an apply bundle from a deploy instance, the very first search head in the cluster to shut down doesn't do so cleanly and requires manual intervention.

It appears that mongodb doesn't stop for some reason.

From mongod.log

2015-01-16T03:46:33.024Z [initandlisten] exception in initAndListen: 10310 Unable to lock file: /opt/splunk/var/lib/splunk/kvstore/mongo/mongod.lock. Is a mongod instance already running?, terminating
2015-01-16T03:46:33.024Z [initandlisten] dbexit:
2015-01-16T03:46:33.024Z [initandlisten] shutdown: going to close listening sockets...
2015-01-16T03:46:33.024Z [initandlisten] shutdown: going to flush diaglog...
2015-01-16T03:46:33.024Z [initandlisten] shutdown: going to close sockets...
2015-01-16T03:46:33.024Z [initandlisten] shutdown: waiting for fs preallocator...
2015-01-16T03:46:33.024Z [initandlisten] shutdown: lock for final commit...
2015-01-16T03:46:33.025Z [initandlisten] shutdown: final commit...
2015-01-16T03:46:33.025Z [initandlisten] shutdown: closing all files...
2015-01-16T03:46:33.025Z [initandlisten] closeAllFiles() finished
2015-01-16T03:46:33.025Z [initandlisten] dbexit: really exiting now

This would seem to indicate that it has stopped, but it hasn't.
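To spot this in a larger mongod.log, here is a minimal Python sketch that pulls out the lock errors. The regex is written against the exact message shown above and the function name `find_lock_errors` is my own, so treat both as assumptions that may need adjusting for other mongod versions.

```python
import re
from typing import Iterable, List, Tuple

# Pattern based only on the mongod.log line quoted above (error 10310).
LOCK_ERROR = re.compile(
    r"^(\S+) \[initandlisten\] exception in initAndListen: (\d+) "
    r"Unable to lock file: (.+?)\. Is a mongod instance already running\?"
)


def find_lock_errors(lines: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (timestamp, error_code, lock_path) for each lock failure found."""
    hits = []
    for line in lines:
        match = LOCK_ERROR.match(line)
        if match:
            hits.append((match.group(1), int(match.group(2)), match.group(3)))
    return hits


if __name__ == "__main__":
    # Log location is an assumption; point this at your own mongod.log.
    with open("/opt/splunk/var/log/splunk/mongod.log") as log:
        for timestamp, code, path in find_lock_errors(log):
            print(f"{timestamp}: error {code}, lock file {path}")
```

As far as I can tell, the shutdown lines that follow the exception belong to the new mongod that failed to start, not to the old one that still holds the lock.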

Checking for the lock file I see that it is still there and contains a valid process id for a still "running" mongo process.

Killing it and manually restarting splunk resolves this.
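For anyone who wants to script the check before killing anything, here is a minimal Python sketch of what I did by hand: read the PID from the lock file and see whether that process is still alive. The function names are my own, the path is the one from the log above, and it assumes a Unix-like host and that the lock file contains only the PID. It is not a Splunk-provided tool, and it does not confirm that the PID actually belongs to mongod.

```python
import os
from pathlib import Path
from typing import Optional

# Path taken from the mongod.log excerpt above; adjust for your install.
LOCK_FILE = Path("/opt/splunk/var/lib/splunk/kvstore/mongo/mongod.lock")


def read_lock_pid(lock_file: Path) -> Optional[int]:
    """Return the PID stored in the lock file, or None if it is missing or empty."""
    try:
        text = lock_file.read_text().strip()
    except FileNotFoundError:
        return None
    if not text.isdigit():
        return None
    pid = int(text)
    return pid if pid > 0 else None


def pid_is_running(pid: int) -> bool:
    """Probe the process with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def stale_mongod_pid(lock_file: Path = LOCK_FILE) -> Optional[int]:
    """Return the PID from the lock file if that process is still alive, else None."""
    pid = read_lock_pid(lock_file)
    if pid is not None and pid_is_running(pid):
        return pid
    return None


if __name__ == "__main__":
    pid = stale_mongod_pid()
    if pid is None:
        print("No live process found for the lock file.")
    else:
        print(f"PID {pid} from {LOCK_FILE} is still running.")
```

If it reports a live PID after splunkd has gone down, that is the process I end up killing before restarting.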

0 Karma

mzorzi
Splunk Employee

Make sure that only Apps compatible with SHC are installed.

Any outcome from the ticket?

0 Karma

Lucas_K
Motivator

Yeah the devs have been able to replicate the issue.

There is a fix that will be included in a later release, version and date yet to be determined.

I've been offered the chance to test it prior to release, as it's holding up a significant volume of other architecture work, so hopefully it fixes my issue.

0 Karma

fabiocaldas
Contributor

Did you try v6.2.2 to see if it was fixed there? I'm also having the same issue.

0 Karma

Lucas_K
Motivator

Support has told me it should be in the next release in less than a month.

0 Karma

gbacs
Explorer

Have there been any side effects of killing the mongodb process manually (search results not found, etc.) when the search heads come back up? I am in the process of implementing search head clustering.

0 Karma

Lucas_K
Motivator

I can replicate this pretty consistently now.

Replication factor of 3 with 5 members. At least 2 of these members will get stuck when an apply-bundle is performed.

I'm guessing a ticket will need to be logged. 😞

0 Karma