Based on other discussions around systemd, I created the unit file below. This seemed to work... until the latest RHEL patch my security dept deployed. Now if Splunk is restarted internally (via the REST endpoint, or via a deployment server app push), the service dies. The restart never completes.
The problem appears to be with RemainAfterExit. If I use RemainAfterExit=yes, the problem goes away, but systemd loses track of Splunk: it has no idea what the status of the splunk service is. That's better than a dead Splunk service, but could still be problematic. Any ideas on a proper systemd unit file for Splunk? Will we ever get an official version that gets put in place by enable boot-start?
[Unit]
Description=Splunk Forwarder
After=network.target
Wants=network.target
[Service]
Type=forking
RemainAfterExit=no
User=root
LimitNOFILE=12000
ExecStart=/app/splunkforwarder/bin/splunk restart --accept-license --answer-yes --no-prompt
ExecStop=/app/splunkforwarder/bin/splunk stop
RestartSec=20
Restart=on-failure
PIDFile=/app/splunkforwarder/var/run/splunk/splunkd.pid
[Install]
WantedBy=multi-user.target
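For context, the symptom looks like this on my hosts (unit name and paths are from my environment, adjust to taste):
# start under systemd, then let splunk restart itself (same effect as a DS app push)
sudo systemctl start splunk.service
sudo /app/splunkforwarder/bin/splunk restart
# splunkd comes back up, but the unit no longer shows as running
systemctl status splunk.service
ps -ef | grep [s]plunkd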
Splunk ships a systemd service file as part of the workload management (WLM) config. But in order for it to work with a non-root user, you will possibly also need to modify polkit configs. My changes are below -- we run as userid splunk and the Splunk daemon keeps the default name Splunkd (set in $SPLUNK_HOME/etc/splunk-launch.conf):
// place in /etc/polkit-1/rules.d/80-splunk-service.rules
// polkitd should reload automatically
polkit.addRule(function(action, subject) {
    // var debug = true;
    if (action.id == "org.freedesktop.systemd1.manage-units" &&
        action.lookup("unit") == "Splunkd.service" &&
        subject.user == "splunk") {
        return polkit.Result.YES;
    }
});
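With that rule in place, the splunk account can bounce the service straight through systemd (quick check; assumes the unit really is named Splunkd.service):
# run these as the splunk user -- no sudo needed once the polkit rule is loaded
systemctl restart Splunkd.service
systemctl status Splunkd.service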
Hi twinspop, we have a problem with the automatically generated Splunkd.service. My unit file is below:
#This unit file replaces the traditional start-up script for systemd
#configurations, and is used when enabling boot-start for Splunk on
#systemd-based Linux distributions.
[Unit]
Description=Systemd service file for Splunk, generated by 'splunk enable boot-start'
After=network.target
[Service]
Type=simple
Restart=always
ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd
LimitNOFILE=65536
SuccessExitStatus=51 52
RestartPreventExitStatus=51
RestartForceExitStatus=52
User=splunk
Delegate=true
KillMode=mixed
KillSignal=SIGINT
TimeoutStopSec=10min
CPUShares=1024
MemoryLimit=100G
PermissionsStartOnly=true
ExecStartPost=/bin/bash -c "chown -R 1001:1001 /sys/fs/cgroup/cpu/init.scope/system.slice/%n"
ExecStartPost=/bin/bash -c "chown -R 1001:1001 /sys/fs/cgroup/memory/init.scope/system.slice/%n"
[Install]
WantedBy=multi-user.target
Can you help me?
@star_gh: Maybe, but we would need more info on what the problem is. Fair warning: I'm no systemd expert.
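To start with, could you post the output of something like this (assuming the generated unit is Splunkd.service and $SPLUNK_HOME is /opt/splunk, as in your paste):
systemctl status Splunkd.service
journalctl -u Splunkd.service --since "1 hour ago" --no-pager
/opt/splunk/bin/splunk status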
This is very helpful, thanks for sharing this.
Side note: you really should not run Splunk as the root user.
Yeah, I know. All my indexers, SHs, DSs, etc. run as userid splunk. Forwarders are tougher: log files all over the place, with varying ownership and perms. Every one of the 400 different siloed groups here has their own idea of what logs should look like and where they should be stored. It's messy... but really that's just an excuse. Bottom line is, you're right. One day soon I'll decide to tackle the job of converting 8000+ forwarders to non-root.
You have my sincerest condolences.
The community consensus on using systemd unit files with Splunk is: don't, especially with index clusters.
https://answers.splunk.com/answers/59662/is-there-a-systemd-unit-file-for-splunk.html works for most cases, but because systemd is still a moving target and Splunk has no systemd-specific support built in, I don't recommend using systemd at all. Install the SysV compatibility package and stick with the SysV init scripts.
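On RHEL that route is roughly this, run as root (user name is an assumption, adjust for your install):
# have Splunk write its classic init script instead of a systemd unit
/opt/splunk/bin/splunk enable boot-start -user splunk
# confirm the SysV service is registered
chkconfig --list splunk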
Not an option here. I am required to use systemd.
Systemd does not work with Indexers. It may seem to be working, but it will fail you.
This is for the UF only! Not guaranteed to always work.
RemainAfterExit defaults to no, and it should stay that way. Otherwise it may cause other problems when you use automation tools like Chef, Puppet, etc.
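You can confirm the value that is actually in effect on your unit with (unit name assumed):
systemctl show -p RemainAfterExit splunk.service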
When you add PIDFile=/app/splunkforwarder/var/run/splunk/splunkd.pid, systemd should pick up the PID from that file if it tries to start Splunk when it is already running. This makes sure the unit stays active even after Splunk restarts itself via an external call.
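You can compare what systemd is tracking against what Splunk wrote (unit name assumed, paths from your unit file):
systemctl show -p MainPID splunk.service
cat /app/splunkforwarder/var/run/splunk/splunkd.pid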
The missing piece in your unit file is that it needs Restart=always, not Restart=on-failure. The reason is that on-failure only restarts the service when the exit status is non-zero; when Splunk is restarted outside of systemd (e.g. with splunk restart), it stops the old process gracefully, so systemd treats the unit as cleanly stopped and will not start it again.
One of the problems I have observed systemd has with indexers or HFs is that they run kvstore, and that process may not get killed in some cases when Splunk restarts or crashes. If you have PIDFile=/app/splunkforwarder/var/run/splunk/splunkd.pid in the unit file, systemd will think Splunk is still alive, because the PID of the kvstore process remains in the PID file.
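A quick way to check whether the PID left behind in the file is really splunkd or a leftover kvstore (mongod) process (paths from the unit above):
# splunkd.pid may list more than one PID; look at the first one
ps -p "$(head -1 /app/splunkforwarder/var/run/splunk/splunkd.pid)" -o pid,comm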
There is a whole list of problems that still need to be addressed before Splunk and systemd can make peace. In the meantime, do what you gotta do and deal with the issues as they come up.
There is a lot of information on systemd in the systemd.service man page.
Hmm @yorokobi -- Are you still implementing the same solution from 2012?