Hi,
I downloaded Splunk version 7.3.0 (build 657388c7a488) and installed it from the deb file onto a clean install of Debian 10.1.
I then followed the "Configure systemd on a clean install" instructions (https://docs.splunk.com/Documentation/Splunk/7.3.1/Admin/RunSplunkassystemdservice).
However, running
sudo $SPLUNK_HOME/bin/splunk start
yields the following (the result is the same if I "su -" to root instead of using sudo):
Splunk> Needle. Haystack. Found.
Checking prerequisites...
Checking http port [8000]: open
Checking mgmt port [8089]: open
Checking appserver port [127.0.0.1:8065]: open
Checking kvstore port [8191]: open
Checking configuration... Done.
Checking critical directories... Done
Checking indexes...
Validated: _audit _internal _introspection _telemetry _thefishbucket history main summary
Done
Checking filesystem compatibility... Done
Checking conf files for problems...
Done
Checking default conf files for edits...
Validating installed files against hashes from '/opt/splunk/splunk-7.3.0-657388c7a488-linux-2.6-x86_64-manifest'
All installed files intact.
Done
All preliminary checks passed.
Starting splunk server daemon (splunkd)...
Job for Splunkd.service failed because the control process exited with error code.
See "systemctl status Splunkd.service" and "journalctl -xe" for details.
Systemd manages the Splunk service. Use 'systemctl start Splunkd' to start the service. Root permission is required. Login as root user or use sudo.
# systemctl status Splunkd.service
● Splunkd.service - Systemd service file for Splunk, generated by 'splunk enable boot-start'
Loaded: loaded (/etc/systemd/system/Splunkd.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2019-09-16 19:05:05 BST; 1min 4s ago
Process: 1655 ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd (code=killed, signal=TERM)
Process: 1656 ExecStartPost=/bin/bash -c chown -R 1001:1001 /sys/fs/cgroup/cpu/init.scope/system.slice/Splunkd.service (code=exited, status=1/FAILURE)
Main PID: 1655 (code=killed, signal=TERM)
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Service RestartSec=100ms expired, scheduling restart.
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Scheduled restart job, restart counter is at 5.
Sep 16 19:05:05 spl systemd[1]: Stopped Systemd service file for Splunk, generated by 'splunk enable boot-start'.
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Start request repeated too quickly.
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Failed with result 'exit-code'.
Sep 16 19:05:05 spl systemd[1]: Failed to start Systemd service file for Splunk, generated by 'splunk enable boot-start'.
--
-- The process' exit code is 'killed' and its exit status is 15.
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit Splunkd.service has entered the 'failed' state with result 'exit-code'.
Sep 16 19:05:05 spl systemd[1]: Failed to start Systemd service file for Splunk, generated by 'splunk enable boot-start'.
-- Subject: A start job for unit Splunkd.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit Splunkd.service has finished with a failure.
--
-- The job identifier is 2899 and the job result is failed.
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Service RestartSec=100ms expired, scheduling restart.
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Scheduled restart job, restart counter is at 5.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Automatic restarting of the unit Splunkd.service has been scheduled, as the result for
-- the configured Restart= setting for the unit.
Sep 16 19:05:05 spl systemd[1]: Stopped Systemd service file for Splunk, generated by 'splunk enable boot-start'.
-- Subject: A stop job for unit Splunkd.service has finished
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A stop job for unit Splunkd.service has finished.
--
-- The job identifier is 2975 and the job result is done.
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Start request repeated too quickly.
Sep 16 19:05:05 spl systemd[1]: Splunkd.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- The unit Splunkd.service has entered the 'failed' state with result 'exit-code'.
Sep 16 19:05:05 spl systemd[1]: Failed to start Systemd service file for Splunk, generated by 'splunk enable boot-start'.
-- Subject: A start job for unit Splunkd.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit Splunkd.service has finished with a failure.
--
-- The job identifier is 2975 and the job result is failed.
FYI, found my own answer...
It seems the Splunk systemd installer script is a bit dumb (for lack of a better word).
Apparently the Splunk developers don't see fit to work out the correct cgroup location for a given system.
"splunk enable boot-start -systemd-managed" could check for the cgroup location the Splunk developers chose and raise an error (or let you enter the path manually) if it can't find it. Instead, the script just installs anyway and leaves you to figure out why the service fails.
I can't say I'm impressed by either:
(a) The behaviour of "splunk enable boot-start -systemd-managed"
(b) The poor error handling by Splunk if the cgroup path is incorrect (I mean seriously, all it had to do was throw an error saying cgroup not found!)
I apologise for the tone of the message, but frankly this problem took up far too many hours of my time yesterday.
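For anyone hitting the same wall, the check complained about above can be approximated by hand. This is only a rough sketch, not anything Splunk ships: the function name check_cgroup_paths is made up for illustration, and it assumes the unit file contains ExecStartPost lines of the "chown -R ... <path>/%n" form shown in the systemctl status output.

```shell
#!/bin/sh
# check_cgroup_paths UNIT_FILE   (hypothetical helper, not part of Splunk)
# For every ExecStartPost line that runs chown, print the parent directory
# of the chown target, prefixed with "ok" or "missing".
# The per-service directory (the %n part) only exists while the service is
# running, so the parent slice directory is what gets checked.
check_cgroup_paths() {
    grep '^ExecStartPost=.*chown' "$1" |
        sed -e 's/"[[:space:]]*$//' -e 's/.*[[:space:]]//' |
        while IFS= read -r target; do
            parent=${target%/*}
            if [ -d "$parent" ]; then
                echo "ok      $parent"
            else
                echo "missing $parent"
            fi
        done
}
```

Calling check_cgroup_paths /etc/systemd/system/Splunkd.service lists each directory the unit file expects; any line starting with "missing" points at a path that needs correcting in the unit file.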
This post led me to the /etc/systemd/system/SplunkForwarder.service file. However, on version 9.1.0.1 this line:
ExecStartPost=/bin/bash -c "chown -R <userid>:<groupid> /sys/fs/cgroup/system.slice/%n"
already exists. Since all objects under /sys/fs/cgroup are owned by root:root, I added the user `splunkfwd` to the `adm` group and then rebooted. After that I could use systemd to start/stop splunkforwarder.
This could be SELinux blocking the service; check this answer
Summary of the issue:
Splunk 6.0.0 - Splunk 7.2.1 defaults to using init.d when enabling boot start
Splunk 7.2.2 - Splunk 7.2.9 defaults to using systemd when enabling boot start
Splunk 7.3.0 - Splunk 8.x defaults to using init.d when enabling boot start
systemd defaults to prompting for root credentials upon stop/start/restart of Splunk
Here is a simple fix if you have encountered this issue and prefer to use the traditional init.d scripts vs systemd.
Splunk Enterprise/Heavy Forwarder example (note: replace the splunk user below with the account you run splunk as):
sudo /opt/splunk/bin/splunk disable boot-start
sudo /opt/splunk/bin/splunk enable boot-start -user splunk -systemd-managed 0
Splunk Universal Forwarder example (note: replace the splunk user below with the account you run splunk as):
sudo /opt/splunkforwarder/bin/splunk disable boot-start
sudo /opt/splunkforwarder/bin/splunk enable boot-start -user splunk -systemd-managed 0
I guess Splunk 9.x defaults to systemd again. Is there any way to revert to init.d?
FYI, found my own answer...
What was the solution? What should others do to avoid or fix this problem?
@kundeng @richgalloway
The workaround:
Scroll almost all the way to the bottom of https://docs.splunk.com/Documentation/Splunk/7.3.1/Admin/RunSplunkassystemdservice. Find the little blue comment box that talks about cgroups and think about what it means for your installation: the cgroup path Splunk chose is probably not the location on your system, so go and edit the systemd unit file that Splunk installed.
The solution:
As I made clear, the real solution is....
Splunk needs to write better software that (a) doesn't make hardcoded assumptions about the locations of files on systems and (b) has better error handling that provides useful messages instead of failing without a reason
What was the solution?
For me it was removing /init.scope from these lines in Splunkd.service:
ExecStartPost=/bin/bash -c "chown -R : /sys/fs/cgroup/cpu/init.scope/system.slice/%n"
ExecStartPost=/bin/bash -c "chown -R : /sys/fs/cgroup/memory/init.scope/system.slice/%n"
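If you would rather script the /init.scope removal than edit by hand, something like the following should do it. Treat it as a sketch under assumptions: strip_init_scope is a made-up helper name, and it only touches ExecStartPost lines that run chown, leaving a backup copy with a .bak suffix next to the unit file.

```shell
#!/bin/sh
# strip_init_scope UNIT_FILE   (hypothetical helper, not part of Splunk)
# Remove the "/init.scope" path component from ExecStartPost chown lines.
# The original file is first copied to UNIT_FILE.bak; the edited text is
# then written back over the original, so its ownership and mode are kept.
strip_init_scope() {
    unit_file=$1
    cp -p "$unit_file" "$unit_file.bak" &&
        sed '/^ExecStartPost=.*chown/ s#/init\.scope/#/#' \
            "$unit_file.bak" > "$unit_file"
}
```

Run it as root against /etc/systemd/system/Splunkd.service, then run systemctl daemon-reload so systemd picks up the edited unit file before you start the service again.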
Hi,
Do you know the impact of that ?
It worked for me as well, but I'd like to better understand what I'm really doing.
Thanks!
Ema
One of the error messages says "Use 'systemctl start Splunkd' to start the service", but I don't see where you tried that.
@richgalloway
Yup, I tried that too.
Ultimately I found the solution, buried in an obscure comment deep in the manual.
Here is what fixed it for me -
Original /etc/systemd/system/Splunkd.service (working under Ubuntu 20.04 LTS):
ExecStartPost=/bin/bash -c "chown -R <userid>:<groupid> /sys/fs/cgroup/cpu/system.slice/%n"
ExecStartPost=/bin/bash -c "chown -R <userid>:<groupid> /sys/fs/cgroup/memory/system.slice/%n"
On Ubuntu 22.04 LTS, the cgroup path does not include cpu or memory, so I modified /etc/systemd/system/Splunkd.service as follows:
ExecStartPost=/bin/bash -c "chown -R <userid>:<groupid> /sys/fs/cgroup/system.slice/%n"
I ran as root and used: sudo find /sys/fs/cgroup -name "*Splunk*" -print
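The two layouts above can be told apart with a small script. This is a sketch, not an official method: cgroup_chown_paths is a made-up helper name, and treating the presence of a cgroup.controllers file at the cgroup root as the sign of the single-hierarchy layout (no cpu or memory in the path) is my assumption about what differs between the two Ubuntu releases.

```shell
#!/bin/sh
# cgroup_chown_paths CGROUP_ROOT UNIT_NAME   (hypothetical helper)
# Print the directories the ExecStartPost chown lines would need to target.
# Assumption: a cgroup.controllers file at the root means the unified
# layout (no cpu/memory component); otherwise the per-controller layout
# with separate cpu and memory directories is assumed.
cgroup_chown_paths() {
    root=$1
    unit=$2
    if [ -f "$root/cgroup.controllers" ]; then
        echo "$root/system.slice/$unit"
    else
        echo "$root/cpu/system.slice/$unit"
        echo "$root/memory/system.slice/$unit"
    fi
}
```

For example, cgroup_chown_paths /sys/fs/cgroup Splunkd.service prints the path or paths to compare against the ExecStartPost lines in the unit file. The find command above remains the more direct check while the service is running.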
Please post your solution as an answer and accept it to help future readers.
Will do. Thanks for dropping by !