My Splunk instance crashed and it won't restart, with this error in splunkd.log:
01-30-2013 18:29:05.094 +0000 WARN loader - Sleep 10 sec, waiting for config lock.
01-30-2013 18:29:15.094 +0000 WARN loader - Sleep 10 sec, waiting for config lock.
01-30-2013 18:29:25.094 +0000 WARN loader - Sleep 10 sec, waiting for config lock.
01-30-2013 18:29:35.094 +0000 WARN loader - Sleep 10 sec, waiting for config lock.
01-30-2013 18:29:45.094 +0000 WARN loader - Sleep 10 sec, waiting for config lock.
01-30-2013 18:29:55.094 +0000 WARN loader - Sleep 10 sec, waiting for config lock.
01-30-2013 18:30:05.094 +0000 FATAL loader - Timed out waiting for config lock; see splunkd_stderr.log for details. Exiting.
It's probably a locked file; to remove it, run
./splunk clean locks
or delete the lock files (*.pid or *.lock) in $SPLUNK_HOME\var\run\splunk\ by hand, then check splunkd.log.
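Before deleting anything by hand, it can help to see which lock files are actually present. The following is a minimal sketch for Linux/Unix shells, not a Splunk tool: list_lock_files is a hypothetical helper name, and it assumes the lock files sit directly in the directory you pass in (on Windows the same directory would be inspected with Explorer or dir instead).

```shell
# Hypothetical helper (not a Splunk command): list *.pid and *.lock files
# that sit directly inside the given directory, one path per line.
list_lock_files() {
    dir="$1"
    # Nothing to list if the directory does not exist.
    [ -d "$dir" ] || return 1
    find "$dir" -maxdepth 1 -type f \( -name '*.pid' -o -name '*.lock' \) | sort
}

# Example call (assumes SPLUNK_HOME is set, e.g. to /opt/splunk):
# list_lock_files "$SPLUNK_HOME/var/run/splunk"
```

The function only lists files; removing them is left as a deliberate manual step.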
In our case (6.2 Enterprise),
tail -f splunkd.log
suggested there was a lock on mongo.db. Running
rm /opt/splunk/var/mongo/mongod.lock
did the job and allowed the httpd process to start.
This "lock" is actually a file called $SPLUNK_HOME/var/run/splunk/conf-mutator.pid, which really means something like "conf-modifying-program.pid".
Starting in 5.0, splunk.exe commands that modify the conf files determine whether splunkd is running or not, and ask running splunkd to modify the files for them if it is up. If it is not up, the splunk.exe command line program makes the changes itself. This file exists to synchronize the two possibilities, so we don't have files being edited by two programs at once.
The trouble comes in when something crashes and leaves a stale conf-mutator.pid file lying around. If there is no program running anymore with that process number, it should cause no trouble (though in very early versions of 5.0.x we would detect threads with that number). Alternatively you can have a problem where that process number is running, but it's not splunk.exe or splunkd.exe. This latter problem is fixed in Splunk 6.1.3 and later.
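The two failure cases described here (no process with that number, or a non-Splunk process with that number) can be checked by hand on Linux/Unix. This is a rough sketch under stated assumptions, not how splunkd itself does the check: pid_is_alive and pid_file_status are hypothetical helper names, the file is assumed to hold the process number as its first token, and matching "splunk" in the command name is only a heuristic.

```shell
# Hypothetical helpers, not part of Splunk.

# Succeeds if a process with the given PID appears to exist.
pid_is_alive() {
    kill -0 "$1" 2>/dev/null && return 0      # signalable by us
    ps -p "$1" >/dev/null 2>&1 && return 0    # exists but owned by someone else
    [ -d "/proc/$1" ]                         # Linux fallback when ps lacks -p
}

# Prints one of: missing, unreadable, stale,
# held-by-splunk, or "held-by-other: <command>".
pid_file_status() {
    file="$1"
    [ -f "$file" ] || { echo "missing"; return 0; }
    # Assumption: the first whitespace-delimited token is the PID.
    pid=$(awk 'NR==1 {print $1}' "$file")
    case "$pid" in
        ''|*[!0-9]*) echo "unreadable"; return 0 ;;
    esac
    if ! pid_is_alive "$pid"; then
        echo "stale"
        return 0
    fi
    # Look up the command name; fall back to /proc on Linux.
    cmd=$(ps -p "$pid" -o comm= 2>/dev/null || true)
    if [ -z "$cmd" ] && [ -r "/proc/$pid/comm" ]; then
        cmd=$(cat "/proc/$pid/comm")
    fi
    case "$cmd" in
        *splunk*) echo "held-by-splunk" ;;
        *)        echo "held-by-other: $cmd" ;;
    esac
}

# Example call (assumes SPLUNK_HOME is set):
# pid_file_status "$SPLUNK_HOME/var/run/splunk/conf-mutator.pid"
```

A result of stale or held-by-other corresponds to the leftover-file cases above; held-by-splunk suggests a Splunk program really does hold the lock.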
I had a different error message, but the solution was the same:
Operation "ospath_fopen" failed in /opt/splunk/p4/splunk/branches/5.0.2/src/libzero/conf-mutator-locking.c:254, conf_mutator_lock(); Permission denied
Operation "ospath_fopen" failed in /opt/splunk/p4/splunk/branches/5.0.2/src/libzero/conf-mutator-locking.c:319, conf_mutator_unlock(); Permission denied
Very belatedly, the so-called "config lock" is $SPLUNK_HOME/var/run/splunk/conf-mutator.pid.
Simple version: 6.2.x+ should only show this behavior when it is correct for it to occur.
This message can occur correctly if you somehow try to run two splunkd programs at once, or if you have a splunk program running which is changing conf files while trying to start splunkd.
In versions before 6.1.3, the checking for running-pid was sloppy and would consider non-splunk programs to be valid owners of the pid.
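If you are on such an old version, or if you see this on a new version and believe the behavior is working incorrectly, check whether the process number recorded in conf-mutator.pid belongs to a running Splunk program.
If not, it is safe to delete conf-mutator.pid.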
Works on Linux and Windows.
You can use the command ./splunk clean locks as an alternative to step 2.
Yep, I had one of those in the folder; once removed, it started.