<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Systemd unit with pid tracking for Splunk in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277566#M190406</link>
    <description>&lt;P&gt;You should click &lt;CODE&gt;Accept&lt;/CODE&gt; on this answer to close the question.&lt;/P&gt;</description>
    <pubDate>Sat, 25 Mar 2017 12:18:56 GMT</pubDate>
    <dc:creator>woodcock</dc:creator>
    <dc:date>2017-03-25T12:18:56Z</dc:date>
    <item>
      <title>Systemd unit with pid tracking for Splunk</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277562#M190402</link>
      <description>&lt;P&gt;With a simple systemd unit file you can tell systemd how to start and stop a Splunk instance. However, if the instance is restarted outside of systemd (for example by a cluster bundle push or a plain &lt;CODE&gt;/opt/splunk/bin/splunk restart&lt;/CODE&gt;), it falls out of systemd's management: &lt;CODE&gt;systemctl status splunk&lt;/CODE&gt; no longer returns up-to-date information on the process.&lt;/P&gt;

&lt;P&gt;This can lead to issues with management software, such as the built-in systemd watchdog or Chef/Puppet, falsely believing the core splunkd process is down.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jun 2016 16:22:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277562#M190402</guid>
      <dc:creator>mwirth</dc:creator>
      <dc:date>2016-06-01T16:22:14Z</dc:date>
    </item>
    <item>
      <title>Re: Systemd unit with pid tracking for Splunk</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277563#M190403</link>
      <description>&lt;PRE&gt;&lt;CODE&gt;[Unit]
Description=Splunk
After=network.service
Wants=network.service

[Service]
Type=forking
User=splunk
Group=splunk
TimeoutSec=200
RemainAfterExit=yes
PIDFile=/opt/splunk/var/run/splunk/conf-mutator.pid
ExecStart=/opt/splunk/bin/splunk start --answer-yes --no-prompt --accept-license
ExecStop=/opt/splunk/bin/splunk stop
ExecReload=/opt/splunk/bin/splunk restart
StandardOutput=null
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;EDIT:&lt;BR /&gt;
At first I thought this unit file, with RemainAfterExit and PIDFile set, resolved the problem. After further testing and reading the systemd documentation, I've found it to be ineffective.&lt;BR /&gt;
Because of the way systemd handles process execution (systemctl-&amp;gt;cgroup-&amp;gt;process), restarting the Splunk service without using systemctl commands drops the process out of management whether or not you set the PID file.&lt;/P&gt;

&lt;P&gt;Right now I only see two options when running Splunk through systemd unit files:&lt;/P&gt;

&lt;P&gt;1) Run the unit file with RemainAfterExit=yes. This forces systemd to mark the process as active even after the tracked splunkd PID has exited. Unfortunately, this also means that if Splunk crashes the process is still marked as healthy.&lt;/P&gt;

&lt;P&gt;2) Run the unit file without RemainAfterExit=yes (it defaults to no). This means that if systemd sees the root splunkd process exit (even if it restarts soon after), it marks the service as down. This of course doesn't play nicely with watchdog/Puppet/Chef etc.&lt;/P&gt;
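
&lt;P&gt;As a rough sketch of how the two options above could be switched without editing the main unit, a systemd drop-in can override the single setting. The drop-in file name and the unit name &lt;CODE&gt;splunk.service&lt;/CODE&gt; are assumptions for illustration, not taken from a tested setup:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# /etc/systemd/system/splunk.service.d/remain.conf  (hypothetical drop-in path)
[Service]
# Option 1: keep the unit "active" after the tracked PID exits (this hides crashes)
RemainAfterExit=yes
# Option 2: comment out the line above and use this instead (restarts show as down)
#RemainAfterExit=no
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;After changing a drop-in, run &lt;CODE&gt;systemctl daemon-reload&lt;/CODE&gt; so that systemd picks it up.&lt;/P&gt;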

&lt;P&gt;To my understanding, for this to be resolved either systemd or Splunk would have to make significant codebase changes.&lt;/P&gt;

&lt;P&gt;Even the sysvinit compat layer (the default on RHEL7 installs where &lt;CODE&gt;splunk enable boot-start&lt;/CODE&gt; is run) has the same issue: when the splunkd process restarts, stops, or crashes, systemd loses track of the process state and marks it as "active (exited)" (it seems to be using RemainAfterExit=yes, like my unit file). I'm stumped.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jun 2016 16:24:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277563#M190403</guid>
      <dc:creator>mwirth</dc:creator>
      <dc:date>2016-06-01T16:24:52Z</dc:date>
    </item>
    <item>
      <title>Re: Systemd unit with pid tracking for Splunk</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277564#M190404</link>
      <description>&lt;P&gt;In a cluster setup it is actually even worse: without RemainAfterExit=yes the servers will just shut down and never come up again when you trigger a rolling restart (manually or as part of a bundle push). The exit code is clean (0), with no error or warning from splunkd. Systemd thinks the shutdown was intentional (active, exited).&lt;/P&gt;

&lt;P&gt;We are running Splunk 6.3.3.&lt;/P&gt;</description>
      <pubDate>Tue, 06 Sep 2016 10:52:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277564#M190404</guid>
      <dc:creator>dschregenberger</dc:creator>
      <dc:date>2016-09-06T10:52:02Z</dc:date>
    </item>
    <item>
      <title>Re: Systemd unit with pid tracking for Splunk</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277565#M190405</link>
      <description>&lt;P&gt;I was struggling with the same issue you have. We use Puppet, and &lt;CODE&gt;systemctl is-active splunk.service&lt;/CODE&gt; returned active when RemainAfterExit=yes was set, even if Splunk had crashed, so Puppet did not restart it.&lt;/P&gt;

&lt;P&gt;The solution seems to be as follows. Do not set RemainAfterExit=yes, so the unit actually becomes inactive when Splunk restarts after a rolling restart. To prevent Puppet from messing things up when it tries to restart the process, add this to the unit file: &lt;CODE&gt;PIDFile=/opt/splunk/var/run/splunk/splunkd.pid&lt;/CODE&gt;. When Puppet issues &lt;CODE&gt;systemctl restart splunk.service&lt;/CODE&gt;, systemd then starts tracking the newly created PID of the already running process from the splunkd.pid file, without trying to actually restart it.&lt;/P&gt;

&lt;P&gt;The other thing I added is &lt;CODE&gt;Restart=on-failure&lt;/CODE&gt;. This causes Splunk to be started again when the tracked PID exits with a non-zero exit status (e.g. pkill splunk or a crash).&lt;/P&gt;

&lt;P&gt;Here is my Unit file:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[Unit]
Description=Splunk indexer service
Wants=network.target
After=network.target
Requires=thp-disable.service

[Service]
Type=forking

Restart=on-failure
ExecStart=/opt/splunk/bin/splunk start
ExecStop=/opt/splunk/bin/splunk stop
ExecReload=/opt/splunk/bin/splunk restart
StandardOutput=syslog
LimitNOFILE=65535
LimitNPROC=16384
TimeoutSec=300
PIDFile=/opt/splunk/var/run/splunk/splunkd.pid

[Install]
WantedBy=multi-user.target
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 15 Nov 2016 17:26:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277565#M190405</guid>
      <dc:creator>rabbidroid</dc:creator>
      <dc:date>2016-11-15T17:26:44Z</dc:date>
    </item>
    <item>
      <title>Re: Systemd unit with pid tracking for Splunk</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277566#M190406</link>
      <description>&lt;P&gt;You should click &lt;CODE&gt;Accept&lt;/CODE&gt; on this answer to close the question.&lt;/P&gt;</description>
      <pubDate>Sat, 25 Mar 2017 12:18:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277566#M190406</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-03-25T12:18:56Z</dc:date>
    </item>
    <item>
      <title>Re: Systemd unit with pid tracking for Splunk</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277567#M190407</link>
      <description>&lt;P&gt;Instead of having systemd monitor Splunk's PID file in forking mode, you can launch Splunk directly under systemd in simple mode, so that it runs as a service directly under systemd. You must be a bit careful though: once you use this, you should only start and stop Splunk via systemctl commands. Otherwise, if there is a restart from, say, the UI, Splunk will start itself while systemd also tries to start it again, and you may run into a race condition.&lt;/P&gt;

&lt;P&gt;However, I would recommend reconfiguring boot-start. That way Splunk knows it is running under systemd; more specifically, if you issue &lt;CODE&gt;splunk restart&lt;/CODE&gt;, Splunk will know it is running under systemd and internally call systemctl start.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[Service]
Type=simple
Restart=always
ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd
LimitNOFILE=65536
SuccessExitStatus=51 52
RestartPreventExitStatus=51
RestartForceExitStatus=52
User=
Delegate=true
MemoryLimit=100G
CPUShares=1024
PermissionsStartOnly=true
ExecStartPost=/bin/bash -c "chown -R : /sys/fs/cgroup/cpu/system.slice/%n"
ExecStartPost=/bin/bash -c "chown -R : /sys/fs/cgroup/memory/system.slice/%n"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 29 Sep 2020 23:07:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Systemd-unit-with-pid-tracking-for-Splunk/m-p/277567#M190407</guid>
      <dc:creator>dimrirahul</dc:creator>
      <dc:date>2020-09-29T23:07:37Z</dc:date>
    </item>
  </channel>
</rss>

