Systemd and manual detention

nicofantinato · 01-12-2022

Hello everyone,

I just configured Splunk as a systemd service on my indexers. The start command works fine, but after the stop command (systemctl stop Splunkd) the service status reports some errors:

[root@pe-sec-idx-02 system]# systemctl status Splunkd
● Splunkd.service - Systemd service file for Splunk, generated by 'splunk enable boot-start'
Loaded: loaded (/etc/systemd/system/Splunkd.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2022-01-12 15:18:32 CET; 9s ago
Process: 1462 ExecStop=/opt/splunk/bin/splunk _internal_launch_under_systemd (code=exited, status=1/FAILURE)
Process: 31484 ExecStop=/bin/sleep 10 (code=exited, status=0/SUCCESS)
Process: 31225 ExecStop=/sbin/runuser -l splunk -c /opt/splunk/bin/splunk edit cluster-config -manual_detention on -auth admin:D1c3mbr3Sec (code=exited, status=0/SUCCESS)
Process: 20750 ExecStartPost=/bin/bash -c chown -R 1001:1001 /sys/fs/cgroup/memory/system.slice/%n (code=exited, status=0/SUCCESS)
Process: 20746 ExecStartPost=/bin/bash -c chown -R 1001:1001 /sys/fs/cgroup/cpu/system.slice/%n (code=exited, status=0/SUCCESS)
Process: 20643 ExecStartPost=/sbin/runuser -l splunk -c /opt/splunk/bin/splunk edit cluster-config -manual_detention off -auth admin:D1c3mbr3Sec (code=exited, status=0/SUCCESS)
Process: 19156 ExecStartPost=/bin/sleep 60 (code=exited, status=0/SUCCESS)
Process: 19155 ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd (code=exited, status=52)
Main PID: 19155 (code=exited, status=52)

Jan 12 15:12:20 pe-sec-idx-02 splunk[19155]: All installed files intact.
Jan 12 15:12:20 pe-sec-idx-02 splunk[19155]: Done
Jan 12 15:12:21 pe-sec-idx-02 splunk[19155]: Checking replication_port port [9887]: 2022-01-12 15:12:21.354 +0100 splunkd started (build 7651b7244cf2)
Jan 12 15:13:17 pe-sec-idx-02 systemd[1]: Started Systemd service file for Splunk, generated by 'splunk enable boot-start'.
Jan 12 15:18:03 pe-sec-idx-02 systemd[1]: Stopping Systemd service file for Splunk, generated by 'splunk enable boot-start'...
Jan 12 15:18:16 pe-sec-idx-02 systemd[1]: Splunkd.service: control process exited, code=exited status=1
Jan 12 15:18:16 pe-sec-idx-02 splunk[19155]: 2022-01-12 15:18:16.021 +0100 Interrupt signal received
Jan 12 15:18:34 pe-sec-idx-02 systemd[1]: Stopped Systemd service file for Splunk, generated by 'splunk enable boot-start'.
Jan 12 15:18:34 pe-sec-idx-02 systemd[1]: Unit Splunkd.service entered failed state.
Jan 12 15:18:34 pe-sec-idx-02 systemd[1]: Splunkd.service failed.

Despite this output, the service stops successfully.

As you can see, I added some instructions to the service unit file to put the indexer (which is part of a cluster) into manual detention before stopping it, and to turn manual detention off once Splunk has started. To repeat, the stop and start commands work fine, but I always get the error messages above when I stop the service.

Am I doing something wrong?

This is my service unit file:

#This unit file replaces the traditional start-up script for systemd
#configurations, and is used when enabling boot-start for Splunk on
#systemd-based Linux distributions.

[Unit]
Description=Systemd service file for Splunk, generated by 'splunk enable boot-start'
After=network.target

[Service]
Type=simple
Restart=always
ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd
ExecStartPost=/bin/sleep 60
ExecStartPost=/sbin/runuser -l splunk -c '/opt/splunk/bin/splunk edit cluster-config -manual_detention off -auth admin:D1c3mbr3Sec'
ExecStop=/sbin/runuser -l splunk -c '/opt/splunk/bin/splunk edit cluster-config -manual_detention on -auth admin:D1c3mbr3Sec'
ExecStop=/bin/sleep 10
ExecStop=/opt/splunk/bin/splunk _internal_launch_under_systemd
LimitNOFILE=64000
LimitNPROC=16000
SuccessExitStatus=51 52
RestartPreventExitStatus=51
RestartForceExitStatus=52
User=splunk
Delegate=true
CPUShares=1024
CPUQuota=1400%
MemoryLimit=30G
PermissionsStartOnly=true
ExecStartPost=/bin/bash -c "chown -R 1001:1001 /sys/fs/cgroup/cpu/system.slice/%n"
ExecStartPost=/bin/bash -c "chown -R 1001:1001 /sys/fs/cgroup/memory/system.slice/%n"
KillMode=mixed
KillSignal=SIGINT
TimeoutStopSec=10min

[Install]
WantedBy=multi-user.target

richgalloway · 01-12-2022

If you remove the manual detention commands, does the error still occur?

There's no need to put the indexer into detention because a stopped indexer can't receive data, anyway.

---
If this reply helps you, Karma would be appreciated.

nicofantinato · 01-12-2022

Actually, with the manual detention commands removed, the service starts without any error or warning.

On the other hand, without them, once I start the service the replication factor and search factor remain unmet, whereas if I put the node into manual detention before restarting the service, both the replication factor and the search factor are met.
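
One thing that may be worth checking, offered as an untested sketch rather than a confirmed fix: in the status output above, the command that exits with status=1 is the third ExecStop (/opt/splunk/bin/splunk _internal_launch_under_systemd), not the manual detention command, which reports status=0/SUCCESS. Assuming that launch command is only meant for ExecStart, the stop sequence could keep the detention step and leave the actual shutdown to systemd, which signals the main process with KillSignal once the ExecStop commands have finished. The `<password>` placeholder is hypothetical and stands in for the real credentials; all other directives would stay as in the unit file above.

```ini
[Service]
# Stop sequence only; every other directive stays as in the original unit file.

# Put the indexer into manual detention first (replace <password> with the real one).
ExecStop=/sbin/runuser -l splunk -c '/opt/splunk/bin/splunk edit cluster-config -manual_detention on -auth admin:<password>'

# Give the cluster a moment to register the detention state.
ExecStop=/bin/sleep 10

# Assumption: no third ExecStop that re-runs _internal_launch_under_systemd.
# After the ExecStop commands finish, systemd stops whatever is still running
# according to KillMode/KillSignal, so splunkd receives SIGINT.
KillMode=mixed
KillSignal=SIGINT
TimeoutStopSec=10min
```

After editing the unit, systemctl daemon-reload is needed before the change takes effect. Whether the unit then ends in a clean inactive state instead of failed would still need to be verified on a test indexer.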
