Hello,
We have an issue: we uninstalled a previous Splunk instance on our Linux server and are now trying to reinstall Splunk (version 7.3.3). Starting Splunk for the first time with "splunk start" shows no error at first, and we can reach the login web page, but when we log in we receive a 500 server error.
When we check the status of Splunk with "splunk status", we get the following message:
Error encountered, failed to start pid_check.sh to validate splunkd.pid. errno=11
Failed to determine if splunkd 5292 was running.
Can't run "btool web list settings --no-log": Resource temporarily unavailable
Do you know how to solve the problem, or how to investigate further?
Here are some logs from splunkd.log:
08-24-2020 14:57:23.353 +0200 WARN ProcessTracker - executable=splunk-optimize failed to start reason='': Resource temporarily unavailable
08-24-2020 15:00:00.003 +0200 INFO ExecProcessor - setting reschedule_ms=3599997, for command=python /data/splunk/etc/apps/splunk_instrumentation/bin/instrumentation.py
08-24-2020 15:08:17.884 +0200 WARN Thread - webui: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 62 threads active
((same logs on repeat...))
08-24-2020 15:10:18.355 +0200 INFO ProcessTracker - Start process: type=SplunkOptimize idx=_internal procId=199 latency_sec=118.000
08-24-2020 15:10:18.357 +0200 INFO ProcessTracker - Start process: type=SplunkOptimize idx=_audit procId=200 latency_sec=116.000
08-24-2020 15:10:18.357 +0200 WARN ProcessTracker - executable=splunk-optimize failed to start reason='': Resource temporarily unavailable
08-24-2020 15:10:19.355 +0200 INFO ProcessTracker - Start process: type=SplunkOptimize idx=_introspection procId=201 latency_sec=116.999
08-24-2020 15:17:00.654 +0200 ERROR SearchProcessRunner - preforked search=0/3 on process=0/3 caught exception. completed_searches=0, process_started_ago=0.029, search_started_ago=0.028, search_ended_ago=0.000, total_usage_time=0.028
08-24-2020 15:17:00.654 +0200 ERROR SearchProcessRunner - preforked process=0/3 died on exception: Main Thread: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 4 threads active
08-24-2020 15:23:44.592 +0200 ERROR SearchProcessRunner - Error reading from preforked process=0/5: Connection reset by peer
08-24-2020 15:23:44.607 +0200 ERROR SearchProcessRunner - preforked search=0/4 on process=0/4 caught exception. completed_searches=0, process_started_ago=0.029, search_started_ago=0.027, search_ended_ago=0.000, total_usage_time=0.027
08-24-2020 15:23:44.607 +0200 ERROR SearchProcessRunner - preforked process=0/4 died on exception: Main Thread: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 4 threads active
It seems like there is a problem with the threads, maybe? We have never encountered that before.
Thank you very much!
We managed to fix the issue by changing the ulimits in /etc/security/limits.conf:
* hard core 0
* hard maxlogins 10
* soft nofile 65535
* hard nofile 65535
* soft nproc 20480
* hard nproc 20480
* soft fsize unlimited
* hard fsize unlimited
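Then we stopped Splunk, exited our SSH session, reconnected via SSH (the new limits only apply to new login sessions), and restarted Splunk. It now works like a charm, with no more issues. So the problem was that our ulimits were not properly configured, which fits the "Resource temporarily unavailable" errors we saw whenever Splunk tried to start threads and processes.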
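For anyone who wants to confirm that the new limits are really in effect after reconnecting, here is a minimal sketch that reads the soft limits from /proc/<pid>/limits on Linux and compares them with the values above. The function names soft_limit and check_limit are made up for this example, and the thresholds are simply the nproc and nofile values from our limits.conf, not official Splunk requirements.

```shell
#!/bin/sh
# Print the soft limit for one row of a /proc/<pid>/limits file.
# $1 = row name as it appears in the file, $2 = path to the limits file
soft_limit() {
    awk -v name="$1" 'index($0, name) == 1 { n = split(name, parts, " "); print $(n + 1) }' "$2"
}

# Compare a limit value with a required minimum.
# Prints "ok", "too low", or "unknown" (empty or non-numeric value).
check_limit() {
    case "$1" in
        unlimited) echo "ok" ;;
        ''|*[!0-9]*) echo "unknown" ;;
        *) if [ "$1" -ge "$2" ]; then echo "ok"; else echo "too low"; fi ;;
    esac
}

# Inspect the current session by default; pass a PID (for example the
# splunkd PID reported by "splunk status") to inspect that process instead.
limits_file="/proc/${1:-self}/limits"
for entry in "Max processes:20480" "Max open files:65535"; do
    name=${entry%%:*}
    minimum=${entry##*:}
    value=$(soft_limit "$name" "$limits_file" 2>/dev/null) || value=""
    echo "$name: ${value:-?} (want >= $minimum) -> $(check_limit "$value" "$minimum")"
done
```

Checking the running splunkd process rather than only the shell is useful because a process keeps the limits it was started with, so a Splunk instance started before the change will still show the old values until it is restarted from a new session.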
Hi
When you say "uninstalled", what does that actually mean, and what was the old version? Did you also remove all Splunk-related files that are not part of the rpm, etc.?
Basically, this could mean that you only removed the rpm but left other files behind, e.g. SPLUNK_HOME/var/.../splunk.pid, so when you try to install again the installer gets confused because some files are left but not all.
r. Ismo
Splunk 7.3.3 was installed on the machine; we had to uninstall it and reinstall it (same version). And yes, we made sure to stop Splunk, uninstall it with rpm, and remove everything from $SPLUNK_HOME afterwards, so there were no files left...
Weird.
This has worked for me without any issues:
systemctl stop splunk OR su - splunk -c "/opt/splunk/bin/splunk stop -f"
yum remove -y splunk && rm -fr /opt/splunk
yum install -y ./splunk......
Did you stop the running Splunk processes first? And was there nothing left under /opt/splunk, especially var/run...?
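Both points can be checked quickly before reinstalling. This is only a rough sketch: check_leftovers is a made-up helper name, /opt/splunk is assumed as the install directory as in the commands above, and pgrep has to be available on the box.

```shell
#!/bin/sh
# Report leftovers that could confuse a fresh install.
# $1 = install directory to check
check_leftovers() {
    dir=$1
    status=0
    # Any splunkd process still alive?
    if pgrep -x splunkd >/dev/null 2>&1; then
        echo "splunkd is still running"
        status=1
    fi
    # Anything left on disk (including var/run with old pid files)?
    if [ -e "$dir" ]; then
        echo "leftover files under $dir"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "clean"
    fi
    return "$status"
}

# Default to /opt/splunk; pass another path if Splunk lives elsewhere.
check_leftovers "${1:-/opt/splunk}" || echo "clean up before reinstalling"
```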
r. Ismo