Hello everyone,
I have just configured Splunk as a systemd service on my indexers. The start command works fine, but the stop command (systemctl stop Splunkd) leaves the unit in a failed state. This is the status afterwards:
[root@pe-sec-idx-02 system]# systemctl status Splunkd
● Splunkd.service - Systemd service file for Splunk, generated by 'splunk enable boot-start'
Loaded: loaded (/etc/systemd/system/Splunkd.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2022-01-12 15:18:32 CET; 9s ago
Process: 1462 ExecStop=/opt/splunk/bin/splunk _internal_launch_under_systemd (code=exited, status=1/FAILURE)
Process: 31484 ExecStop=/bin/sleep 10 (code=exited, status=0/SUCCESS)
Process: 31225 ExecStop=/sbin/runuser -l splunk -c /opt/splunk/bin/splunk edit cluster-config -manual_detention on -auth admin:D1c3mbr3Sec (code=exited, status=0/SUCCESS)
Process: 20750 ExecStartPost=/bin/bash -c chown -R 1001:1001 /sys/fs/cgroup/memory/system.slice/%n (code=exited, status=0/SUCCESS)
Process: 20746 ExecStartPost=/bin/bash -c chown -R 1001:1001 /sys/fs/cgroup/cpu/system.slice/%n (code=exited, status=0/SUCCESS)
Process: 20643 ExecStartPost=/sbin/runuser -l splunk -c /opt/splunk/bin/splunk edit cluster-config -manual_detention off -auth admin:D1c3mbr3Sec (code=exited, status=0/SUCCESS)
Process: 19156 ExecStartPost=/bin/sleep 60 (code=exited, status=0/SUCCESS)
Process: 19155 ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd (code=exited, status=52)
Main PID: 19155 (code=exited, status=52)
Jan 12 15:12:20 pe-sec-idx-02 splunk[19155]: All installed files intact.
Jan 12 15:12:20 pe-sec-idx-02 splunk[19155]: Done
Jan 12 15:12:21 pe-sec-idx-02 splunk[19155]: Checking replication_port port [9887]: 2022-01-12 15:12:21.354 +0100 splunkd started (build 7651b7244cf2)
Jan 12 15:13:17 pe-sec-idx-02 systemd[1]: Started Systemd service file for Splunk, generated by 'splunk enable boot-start'.
Jan 12 15:18:03 pe-sec-idx-02 systemd[1]: Stopping Systemd service file for Splunk, generated by 'splunk enable boot-start'...
Jan 12 15:18:16 pe-sec-idx-02 systemd[1]: Splunkd.service: control process exited, code=exited status=1
Jan 12 15:18:16 pe-sec-idx-02 splunk[19155]: 2022-01-12 15:18:16.021 +0100 Interrupt signal received
Jan 12 15:18:34 pe-sec-idx-02 systemd[1]: Stopped Systemd service file for Splunk, generated by 'splunk enable boot-start'.
Jan 12 15:18:34 pe-sec-idx-02 systemd[1]: Unit Splunkd.service entered failed state.
Jan 12 15:18:34 pe-sec-idx-02 systemd[1]: Splunkd.service failed.
Despite this output, the service does stop successfully.
As you can see, I added some commands to the service unit file that put the indexer (which is part of a cluster) into manual detention before it stops, and turn manual detention off again once Splunk has started. To be clear, both stop and start work, but I get the error messages above every time I stop the service.
Am I doing something wrong?
This is my service unit file:
#This unit file replaces the traditional start-up script for systemd
#configurations, and is used when enabling boot-start for Splunk on
#systemd-based Linux distributions.
[Unit]
Description=Systemd service file for Splunk, generated by 'splunk enable boot-start'
After=network.target
[Service]
Type=simple
Restart=always
ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd
ExecStartPost=/bin/sleep 60
ExecStartPost=/sbin/runuser -l splunk -c '/opt/splunk/bin/splunk edit cluster-config -manual_detention off -auth admin:D1c3mbr3Sec'
ExecStop=/sbin/runuser -l splunk -c '/opt/splunk/bin/splunk edit cluster-config -manual_detention on -auth admin:D1c3mbr3Sec'
ExecStop=/bin/sleep 10
ExecStop=/opt/splunk/bin/splunk _internal_launch_under_systemd
LimitNOFILE=64000
LimitNPROC=16000
SuccessExitStatus=51 52
RestartPreventExitStatus=51
RestartForceExitStatus=52
User=splunk
Delegate=true
CPUShares=1024
CPUQuota=1400%
MemoryLimit=30G
PermissionsStartOnly=true
ExecStartPost=/bin/bash -c "chown -R 1001:1001 /sys/fs/cgroup/cpu/system.slice/%n"
ExecStartPost=/bin/bash -c "chown -R 1001:1001 /sys/fs/cgroup/memory/system.slice/%n"
KillMode=mixed
KillSignal=SIGINT
TimeoutStopSec=10min
[Install]
WantedBy=multi-user.target
If you remove the manual detention commands, does the error still occur?
There's no need to put the indexer into detention because a stopped indexer can't receive data, anyway.
Actually, with the manual detention commands removed, the service starts without any error or warning.
On the other hand, without them the replication factor and search factor remain unsatisfied once I start the service, whereas if I put the node into manual detention before restarting the service, the replication and search factors are met.
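A possible explanation for the failed state, going only by the status output in the question: the one step that exits non-zero is the third ExecStop line (splunk _internal_launch_under_systemd, status=1/FAILURE), while the detention command and the sleep both exit with 0. That command is the same one ExecStart uses to launch splunkd, so it is probably not a valid way to stop it. Below is a minimal sketch of the stop-related part of the [Service] section with that line removed. It assumes that the existing KillMode=mixed and KillSignal=SIGINT settings are what actually shut splunkd down (the journal line "Interrupt signal received" points that way). The <password> value is a placeholder, and the fragment is untested on a cluster peer.

```ini
[Service]
# Put the peer into manual detention before shutdown.
# With PermissionsStartOnly=true this runs as root, hence runuser.
# <password> is a placeholder, not a real credential.
ExecStop=/sbin/runuser -l splunk -c '/opt/splunk/bin/splunk edit cluster-config -manual_detention on -auth admin:<password>'
# Give the cluster a moment to register the detention state.
ExecStop=/bin/sleep 10
# No "ExecStop=... _internal_launch_under_systemd" line here (assumption:
# that line is what returns status=1). Once the ExecStop commands finish,
# systemd sends KillSignal to the main process to stop splunkd.
KillMode=mixed
KillSignal=SIGINT
TimeoutStopSec=10min
```

All other settings would stay as posted. After editing the unit file, run systemctl daemon-reload so that systemd picks up the change, then check systemctl status Splunkd again after a stop.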