Splunk server error and failed to start pid_check.sh to validate splunkd.pid

performancemoni
Path Finder

Hello,

We are having an issue: we uninstalled a previous Splunk instance on our Linux server and are now trying to re-install Splunk (version 7.3.3). After starting Splunk for the first time with "splunk start", there is no apparent error and we can reach the login web page, but when we log in we receive a 500 server error.

When we check the status of Splunk with "splunk status", we get the following message:

Error encountered, failed to start pid_check.sh to validate splunkd.pid. errno=11
Failed to determine if splunkd 5292 was running.
Can't run "btool web list settings --no-log": Resource temporarily unavailable

Do you know how to solve this problem, or how to investigate further?

Here are some logs from splunkd.log:

08-24-2020 14:57:23.353 +0200 WARN  ProcessTracker - executable=splunk-optimize failed to start reason='': Resource temporarily unavailable
08-24-2020 15:00:00.003 +0200 INFO  ExecProcessor - setting reschedule_ms=3599997, for command=python /data/splunk/etc/apps/splunk_instrumentation/bin/instrumentation.py
08-24-2020 15:08:17.884 +0200 WARN  Thread - webui: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 62 threads active
((same logs on repeat...))
08-24-2020 15:10:18.355 +0200 INFO  ProcessTracker - Start process: type=SplunkOptimize idx=_internal procId=199 latency_sec=118.000
08-24-2020 15:10:18.357 +0200 INFO  ProcessTracker - Start process: type=SplunkOptimize idx=_audit procId=200 latency_sec=116.000
08-24-2020 15:10:18.357 +0200 WARN  ProcessTracker - executable=splunk-optimize failed to start reason='': Resource temporarily unavailable
08-24-2020 15:10:19.355 +0200 INFO  ProcessTracker - Start process: type=SplunkOptimize idx=_introspection procId=201 latency_sec=116.999
08-24-2020 15:17:00.654 +0200 ERROR SearchProcessRunner - preforked search=0/3 on process=0/3 caught exception.  completed_searches=0, process_started_ago=0.029, search_started_ago=0.028, search_ended_ago=0.000, total_usage_time=0.028
08-24-2020 15:17:00.654 +0200 ERROR SearchProcessRunner - preforked process=0/3 died on exception: Main Thread: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 4 threads active
08-24-2020 15:23:44.592 +0200 ERROR SearchProcessRunner - Error reading from preforked process=0/5: Connection reset by peer
08-24-2020 15:23:44.607 +0200 ERROR SearchProcessRunner - preforked search=0/4 on process=0/4 caught exception.  completed_searches=0, process_started_ago=0.029, search_started_ago=0.027, search_ended_ago=0.000, total_usage_time=0.027
08-24-2020 15:23:44.607 +0200 ERROR SearchProcessRunner - preforked process=0/4 died on exception: Main Thread: about to throw a ThreadException: pthread_create: Resource temporarily unavailable; 4 threads active

It looks like there may be a problem with thread creation. We have never encountered this before.

Thank you very much!

1 Solution

performancemoni
Path Finder

We managed to fix the issue by changing the ulimits in /etc/security/limits.conf:

* hard core 0
* hard maxlogins 10
* soft nofile 65535
* hard nofile 65535
* soft nproc 20480
* hard nproc 20480
* soft fsize unlimited
* hard fsize unlimited

Then we stopped Splunk, exited our SSH session, reconnected via SSH, and restarted Splunk. It now works like a charm, with no more issues. So the problem was that our ulimits were not properly configured.
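
For anyone landing here with the same symptoms: on Linux the nproc limit counts threads as well as processes for a user, which fits the "pthread_create: Resource temporarily unavailable" lines in the log above. A minimal sketch for checking which limits a session actually has is below. The helper names soft_limit and check_limit are made up for illustration, and the thresholds are simply the values from the limits.conf above, not official Splunk minimums.

```shell
#!/bin/sh
# soft_limit and check_limit are hypothetical helpers, not Splunk commands.

# Print the soft value of one row from a /proc/<pid>/limits style file.
soft_limit() {
    # $1 = limits file, $2 = row label, e.g. "Max processes"
    sed -n "s/^$2  *\([^ ][^ ]*\)  *.*/\1/p" "$1"
}

# Report whether a soft limit meets the minimum used in this thread.
check_limit() {
    # $1 = name, $2 = current soft value, $3 = required minimum
    if [ "$2" = "unlimited" ] || [ "$2" -ge "$3" ]; then
        echo "OK $1=$2"
    else
        echo "LOW $1=$2 (want >= $3)"
    fi
}

# Limits of the current session (Linux only, since it reads /proc).
limits_file=/proc/self/limits
if [ -r "$limits_file" ]; then
    check_limit nproc  "$(soft_limit "$limits_file" 'Max processes')" 20480
    check_limit nofile "$(soft_limit "$limits_file" 'Max open files')" 65535
fi
```

Run it in a fresh login session as the user that starts Splunk. To inspect an already running splunkd instead, set limits_file to /proc/<pid>/limits for that process.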

isoutamo
SplunkTrust

Hi

When you say "uninstalled", what does that actually mean, and what was the old version? Did you also remove all Splunk-related files that are not part of the rpm, etc.?

Basically, this could mean that you only removed the rpm but left other files behind, e.g. SPLUNK_HOME/var/.../splunk.pid, so when you try to install again the installer gets confused because some files are left but not all?

r. Ismo

performancemoni
Path Finder

Splunk 7.3.3 was installed on the machine; we had to uninstall it and re-install the same version. And yes, we made sure to stop Splunk, uninstall it with rpm, and remove everything from $SPLUNK_HOME afterwards, so no files were left...

isoutamo
SplunkTrust

Weird.

The following has worked for me without any issues:

systemctl stop splunk   # or: su - splunk -c "/opt/splunk/bin/splunk stop -f"
yum remove -y splunk && rm -fr /opt/splunk
yum install -y ./splunk......

Did you stop the running Splunk processes first? And was there nothing left under /opt/splunk, especially var/run...?
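
A quick check for a leftover PID file before reinstalling can be sketched as below. This is only an illustration: pid_file_state is a made-up helper, and the PID file location ($SPLUNK_HOME/var/run/splunk/splunkd.pid, with the main PID on the first line) is an assumption that you should verify on your own system.

```shell
#!/bin/sh
# pid_file_state is a hypothetical helper, not part of Splunk.
# Prints "missing", "running <pid>" or "stale <pid>" for a PID file.
pid_file_state() {
    [ -f "$1" ] || { echo "missing"; return 0; }
    pid=$(head -n 1 "$1")
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    # Run as root or as the Splunk user, otherwise a live process owned by
    # another user is reported as stale.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running $pid"
    else
        echo "stale $pid"
    fi
}

# Assumed install location: /opt/splunk, unless SPLUNK_HOME is set.
splunk_home=${SPLUNK_HOME:-/opt/splunk}
pid_file_state "$splunk_home/var/run/splunk/splunkd.pid"
```

A "stale" result means the file names a PID that is no longer running, i.e. the kind of leftover mentioned above.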

r. Ismo
