
Splunk systemd unit file in versions 7.2.2 and newer - how do I stop this prompting for the root password? (Q&A)

gjanders
SplunkTrust

As per the various other systemd-related Answers posts:
Is there a systemd unit file for Splunk?
Splunk 7.2.2 - systemd - Root privileges required when starting/stopping Splunk?

There has been a lot of confusion since Splunk added the systemd start/stop option in Splunk 7.2.2.
However, systemd is required for the use of workload management, so what are the options?

Duane's blogpost on Splunk 7.2.2 and systemd does an excellent job of summarising the scenario and solutions.

So the question is: how do we get splunk stop/start to work with the systemd unit file, without a modern version of polkit (which is not available on any Red Hat 7.x version) and without using the systemctl stop/start commands? In other words, I want splunk stop/start to work as they did before the systemd unit file was in use.

1 Solution

gjanders
SplunkTrust

Duane's blogpost, Splunk 7.2.2 and systemd, has a section called "Better Polkit Changes"; this is what I found was required on Red Hat 7.5/7.6.

The steps I used were:

splunk enable boot-start -user splunk

This creates a systemd unit file. I found it creates either Splunkd.service or splunkd.service in the /etc/systemd/system/ directory.

By default this unit file (as of Splunk 7.2.5) will result in Splunk being killed on shutdown/stop of Splunk. To avoid this, add these additional lines under the [Service] section of the unit file (credit to Splunk support for this suggestion):

# Send $KillSignal only to main (splunkd) process, if any of the child processes is still alive after $TimeoutStopSec, SIGKILL them.
KillMode=mixed
# Splunk doesn't shutdown gracefully on SIGTERM
KillSignal=SIGINT

Then Splunk will shut down correctly when stopped with either systemctl stop Splunkd or systemctl stop splunkd, depending on which Splunk version created the unit file.
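
As an alternative to editing the generated unit file by hand, the same two settings can be placed in a systemd drop-in file, which systemd merges with the unit. This is a swapped-in technique rather than what the answer above does, and it is only a minimal sketch: the helper name write_splunk_dropin, the file name override.conf, and the optional directory argument are assumptions for illustration.

```shell
#!/bin/bash
# Sketch only: write a systemd drop-in carrying the two settings above.
# write_splunk_dropin is a hypothetical helper, not a Splunk or systemd command.
write_splunk_dropin() {
    local unit="${1:-Splunkd}"                     # use "splunkd" if that is your unit name
    local systemd_dir="${2:-/etc/systemd/system}"  # assumption: default unit directory
    local dropin_dir="${systemd_dir}/${unit}.service.d"

    mkdir -p "$dropin_dir" || return 1
    cat > "${dropin_dir}/override.conf" <<'EOF'
[Service]
# Send KillSignal to the main splunkd process only
KillMode=mixed
# Splunk does not shut down gracefully on SIGTERM
KillSignal=SIGINT
EOF
}

# Example (as root), followed by a reload so systemd picks up the change:
#   write_splunk_dropin Splunkd && systemctl daemon-reload
```

Keeping the settings in a drop-in means they live outside the file that splunk enable boot-start generates. Running systemctl cat Splunkd afterwards should list both the unit file and the drop-in.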
If you want to run splunk stop, you will need to create two files. The first is a polkit rule file (Duane's GitHub has an example):

polkit.addRule(function(action, subject) { 
    if (action.id == "org.freedesktop.systemd1.manage-units" 
        && subject.user == "splunk") { 
        try { 
            polkit.spawn(["/usr/local/bin/polkit_splunk", ""+subject.pid]); 
            return polkit.Result.YES; 
        } 
        catch (error) { 
            return polkit.Result.AUTH_ADMIN; 
        } 
    }
});

This file belongs in /etc/polkit-1/rules.d/. If you are running an OS with systemd 226 or newer, you could alternatively use the "Polkit changes" from the blogpost, as suggested by twinspop. If you are using Red Hat 7.x, please use the above.

In addition to the polkit rule file, you will need to create the file /usr/local/bin/polkit_splunk (available on GitHub).

The code will be:

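#!/bin/bash -x
# Called by the polkit rule with the PID of the requesting process.
# Exit 0 allows the action; any other exit status denies it.

# Split the caller's command line into words, e.g. "systemctl stop Splunkd"
COMM=($(ps --no-headers -o cmd -p "$1"))

# Allow only start, stop or restart ...
if [[ "${COMM[1]}" == "start" ]] ||
   [[ "${COMM[1]}" == "stop"  ]] ||
   [[ "${COMM[1]}" == "restart" ]]; then

      # ... and only when the target is the Splunk unit
      if [[ "${COMM[2]}" == "Splunkd" ]] ||
         [[ "${COMM[2]}" == "Splunkd.service" ]]; then
          exit 0
      fi
fi

exit 1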

Note that you may need to replace "Splunkd" with "splunkd" depending on your unit file name (which will match the SPLUNK_SERVER_NAME setting in $SPLUNK_HOME/etc/splunk-launch.conf). You will also need to ensure the script has execute permissions:

chmod 755 /usr/local/bin/polkit_splunk
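
Both files can also be put in place with install, which copies a file and sets its mode in one step. The sketch below is an illustration under stated assumptions, not part of the original steps: install_polkit_files is a hypothetical helper, 10-Splunkd.rules is an assumed file name (polkit only reads files ending in .rules), and the optional third argument is a staging prefix for dry runs. The 644 mode for the rule file follows a later reply in this thread.

```shell
#!/bin/bash
# Hypothetical helper: copy the polkit rule and the helper script into place.
install_polkit_files() {
    local rules_src="$1"      # local copy of the polkit rule shown above
    local helper_src="$2"     # local copy of the polkit_splunk script
    local prefix="${3:-}"     # empty = install into the real filesystem (needs root)

    # Rule file: readable by everyone, writable by the owner only
    install -D -m 0644 "$rules_src" \
        "${prefix}/etc/polkit-1/rules.d/10-Splunkd.rules" || return 1
    # Helper script: same result as the chmod 755 step above
    install -D -m 0755 "$helper_src" \
        "${prefix}/usr/local/bin/polkit_splunk"
}

# Example dry run into a scratch directory:
#   install_polkit_files ./10-Splunkd.rules ./polkit_splunk /tmp/stage
```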

Once this is in place the splunk stop/start command works fine as the splunk user...

Furthermore, the addition of KillMode=mixed and KillSignal=SIGINT means that splunk stop does not result in the splunk process being killed on shutdown.

Finally, anyone using init.d on Red Hat 7.4 or newer may wish to test the following scenario:

  • Let splunk start as part of the OS
  • Run splunk stop, splunk start from the command line
  • Reboot the server
  • Check the splunkd.log file to see if the shutdown occurred as expected, or look for the "dirty" message when splunk is starting up again.
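
The last check in the list can be scripted with grep. This is a rough sketch: find_dirty_messages is a hypothetical helper, the default path assumes Splunk's usual log location under $SPLUNK_HOME (falling back to /opt/splunk), and the search term is simply the word "dirty" used above, since the exact message text may differ between versions.

```shell
#!/bin/bash
# Hypothetical helper: list splunkd.log lines that mention "dirty".
find_dirty_messages() {
    local log="${1:-${SPLUNK_HOME:-/opt/splunk}/var/log/splunk/splunkd.log}"
    # -i: ignore case, -n: prefix each match with its line number.
    # Exit status is 0 when at least one line matched, 1 when none did.
    grep -in 'dirty' "$log"
}

# Example, after the reboot:
#   find_dirty_messages && echo "splunkd.log mentions a dirty shutdown"
```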

On Oracle Linux 7.4/7.5 (based on Red Hat 7.4/7.5) the above results in Splunk being terminated on OS shutdown rather than stopping cleanly. The systemd unit file resolves this issue!

In addition to the above, a suggestion from xpac is:

# Give Splunk time to shutdown - especially busy indexers can take time
TimeoutStopSec=10min

At least according to the documentation, the default stop wait period appears to be 90 seconds before SIGKILL is sent to the process. 10 minutes is a more reasonable time for a busy Splunk process to stop.

Finally, newer versions of systemd have a setting called TasksMax. This setting defaults to 512 in some systemd versions and will therefore need to be increased within the systemd unit file for Splunk. Refer to the systemd documentation for more information.

You can also set the ulimit settings within the systemd unit file to ensure they apply correctly on OS startup. Here is an example, including TasksMax set to unlimited:

LimitCORE=0
LimitDATA=infinity
LimitNICE=0
LimitFSIZE=infinity
LimitSIGPENDING=385952
LimitMEMLOCK=65536
LimitRSS=infinity
LimitMSGQUEUE=819200
LimitRTPRIO=0
LimitSTACK=infinity
LimitCPU=infinity
LimitAS=infinity
LimitLOCKS=infinity
LimitNOFILE=1024000
LimitNPROC=512000
TasksMax=infinity
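
To confirm that the limits actually reached the running process, the kernel's view can be read from /proc/<pid>/limits. A minimal sketch, assuming a Linux /proc filesystem; show_proc_limits is a hypothetical helper, and the three rows it prints correspond to LimitCORE, LimitNPROC and LimitNOFILE.

```shell
#!/bin/bash
# Hypothetical helper: print selected resource limits of a running process.
show_proc_limits() {
    local pid="$1"
    # Each row lists the soft limit, hard limit and unit for one resource
    grep -E '^Max (open files|processes|core file size) ' "/proc/${pid}/limits"
}

# Example against the service's main process (the unit name may be splunkd):
#   pid=$(systemctl show -p MainPID Splunkd | cut -d= -f2)
#   show_proc_limits "$pid"
#   systemctl show -p TasksMax -p TimeoutStopUSec Splunkd
```

If the values shown do not match the unit file, a systemctl daemon-reload and a restart of the service are usually needed before the new limits apply.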

An additional note: anyone upgrading from an older systemd-enabled Splunk UF or Enterprise server to Splunk 8 or newer should see Run Splunk Enterprise as a systemd service, in particular:

If you configured Splunk Enterprise version 7.3.x or earlier to run as a systemd service, upon upgrade to version 8.0.0, on initial start, Splunk Enterprise modifies the existing systemd configuration as follows:
It removes the ExecStartPost and User properties from the Splunkd.service unit file.
It checks the systemd environment, identifies the cgroup path, and automatically sets permissions for the correct cgroup directories.

You can either update your systemd unit file or let Splunk attempt to do it for you (sudo splunk may work)


nightowl
Loves-to-Learn

Has anyone found a good solution for resolving this issue on Ubuntu systems?


isoutamo
SplunkTrust
I suppose that if you have a recent enough version of systemd, this should work on any Linux?
r. Ismo

bandit
Motivator

Summary of the issue:
Splunk 6.0.0 - Splunk 7.2.1 defaults to using init.d when enabling boot start
Splunk 7.2.2 - Splunk 7.2.9 defaults to using systemd when enabling boot start
Splunk 7.3.0 - Splunk 8.x defaults to using init.d when enabling boot start

systemd defaults to prompting for root credentials upon stop/start/restart of Splunk

Here is a simple fix if you have encountered this issue and prefer to use the traditional init.d scripts vs systemd.

Splunk Enterprise/Heavy Forwarder example (note: replace the splunk user below with the account you run splunk as):

sudo /opt/splunk/bin/splunk disable boot-start
sudo /opt/splunk/bin/splunk enable boot-start -user splunk -systemd-managed 0

Splunk Universal Forwarder example (note: replace the splunk user below with the account you run splunk as):

sudo /opt/splunkforwarder/bin/splunk disable boot-start
sudo /opt/splunkforwarder/bin/splunk enable boot-start -user splunk -systemd-managed 0
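
After switching, it can help to confirm which mechanism is actually in place. The sketch below only looks for the files each mode creates. It is an assumption-based check, not a Splunk command: boot_start_mode is a hypothetical helper, the unit names Splunkd and splunkd come from the accepted answer, SplunkForwarder is assumed as the Universal Forwarder's default unit name, and /etc/init.d/splunk is assumed as the init script path.

```shell
#!/bin/bash
# Hypothetical helper: report whether boot-start uses systemd or init.d.
boot_start_mode() {
    local root="${1:-}"    # optional alternate root, useful for testing
    local unit
    for unit in Splunkd splunkd SplunkForwarder; do
        if [[ -f "${root}/etc/systemd/system/${unit}.service" ]]; then
            echo "systemd (${unit}.service)"
            return 0
        fi
    done
    if [[ -f "${root}/etc/init.d/splunk" ]]; then
        echo "init.d"
        return 0
    fi
    echo "none"
    return 1
}

# Example:
#   boot_start_mode
```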

chrisyounger
SplunkTrust

Very nice writeup and excellent investigations. I think it's also worth mentioning (for anyone else that finds this) that if you don't want to use systemd, you can still use the old init.d method by adding the flag -systemd-managed 0 to the boot-start command. More info here: https://docs.splunk.com/Documentation/Splunk/latest/Admin/RunSplunkassystemdservice#Additional_optio...

chrisyounger
SplunkTrust

Done thanks


esalesap
Path Finder

If all this is known, why doesn't Splunk add it to what it does when you enable boot-start in the first place?


vgollapudi
Communicator

Worked like a charm!!

FYI: make sure the file created under /etc/polkit-1/rules.d/ has permissions 644.
