Installation

CentOS Stream 9: systemd service file for Splunkd altered after manual start

triptraptresko
Explorer

Summary:

On a CentOS Stream 9 system, after installing Splunk in /opt/splunk and configuring it to start on boot with systemd, I've noticed unusual behavior. Using manual Splunk commands (/opt/splunk/bin/splunk [start | stop | restart]) alters the Splunkd.service file in /etc/systemd/system/, creating a timestamped backup. This change prevents Splunk from starting using systemctl commands and consequently on boot, defeating the purpose of the systemd setup. Using chattr to make the service file immutable is a current workaround. This behavior seems specific to CentOS Stream 9.

How to recreate issue:

On a CentOS Stream 9 machine, install Splunk under /opt/splunk and run it as user 'splunk'.
Stop Splunk, then enable boot-start with -systemd-managed 1.

After enabling boot-start, a file is created at /etc/systemd/system/Splunkd.service. Starting and stopping Splunk with systemctl works normally.

However, if you run sudo /opt/splunk/bin/splunk [start | stop | restart], Splunk itself changes /etc/systemd/system/Splunkd.service and creates a backup with a timestamp, e.g. Splunkd.service_2023_09_21_06_49_05.
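A quick way to see whether a manual command has rewritten the unit is to look for the timestamped backups next to it. The sketch below assumes the backups follow the Splunkd.service_<timestamp> naming shown above; list_unit_backups is a hypothetical helper name, not a Splunk command.

```shell
# List timestamped backups of the Splunkd unit file.
# Assumption: backups are named Splunkd.service_<timestamp>, as in the example above.
# list_unit_backups is a hypothetical helper, not part of Splunk or systemd.
list_unit_backups() {
    dir=${1:-/etc/systemd/system}
    for f in "$dir"/Splunkd.service_*; do
        # An unmatched glob stays literal, so only print entries that exist.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling list_unit_backups with no argument checks /etc/systemd/system; empty output means no backup has been created yet.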

Starting Splunk with systemctl again then fails, e.g.:

sudo systemctl start Splunkd

Failed to start Splunkd.service: Unit Splunkd.service failed to load properly, please adjust/correct and reload service manager: Device or resource busy
See system logs and 'systemctl status Splunkd.service' for details.

As a result, Splunk does not start after a reboot, which defeats the whole point of enabling systemd.

This error message shows up because the Splunkd.service file has been altered. To get systemctl working again, I run

sudo systemctl daemon-reload

But as soon as one tries to do a manual start|stop|restart command, the same issue arises.
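The reload step can be made conditional. This is a minimal sketch, assuming systemd reports the rewritten unit through its NeedDaemonReload property (not verified for this exact failure). SYSTEMCTL is a hypothetical override variable and reload_if_needed a hypothetical helper; both exist only so the logic can be exercised without systemd.

```shell
# SYSTEMCTL is a hypothetical override; it defaults to the real systemctl.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Run daemon-reload only when systemd says the unit file changed on disk.
# Assumption: the altered Splunkd.service shows NeedDaemonReload=yes.
# Run as root on a real host, since daemon-reload needs privileges.
reload_if_needed() {
    unit=${1:-Splunkd.service}
    state=$("$SYSTEMCTL" show -p NeedDaemonReload --value "$unit")
    if [ "$state" = "yes" ]; then
        "$SYSTEMCTL" daemon-reload
    fi
}
```

This only avoids unnecessary reloads; it does not stop the next manual start|stop|restart from rewriting the file again.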

 

When diffing the new service file and old service file:

diff Splunkd.service Splunkd.service_2023_09_21_06_49_05

26c26
< MemoryLimit=3723374592
---
> MemoryLimit=3723378688

MemoryLimit is the only value that changes with each subsequent 'backup' of the service file. It just alternates between these two values.
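The two values differ by exactly 4096 bytes. That happens to match a common memory page size, but whether this explains the flip is only a guess. The sketch below pulls MemoryLimit out of two unit files and prints the difference; memory_limit_of and memory_limit_delta are hypothetical helper names.

```shell
# Print the first MemoryLimit value (in bytes) found in a systemd unit file.
# Hypothetical helper, not part of Splunk or systemd.
memory_limit_of() {
    sed -n 's/^MemoryLimit=//p' "$1" | head -n 1
}

# Print the absolute difference in bytes between the MemoryLimit
# values of two unit files.
memory_limit_delta() {
    a=$(memory_limit_of "$1")
    b=$(memory_limit_of "$2")
    if [ "$a" -ge "$b" ]; then
        echo $((a - b))
    else
        echo $((b - a))
    fi
}
```

For example, memory_limit_delta Splunkd.service Splunkd.service_2023_09_21_06_49_05 compares the current unit with the backup from the diff above.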

 

ChatGPT suggested making the service file immutable with

sudo chattr +i /etc/systemd/system/Splunkd.service

After this change, every manual start | stop | restart prints a WARNING message:
[Screenshot of the WARNING message: triptraptresko_0-1695281935316.png]

But it won't mess up your service file, so Splunk will start after a reboot.
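Because an immutable file cannot be changed even by root, the flag has to be removed again (chattr -i) before anything legitimately needs to rewrite the unit. A minimal sketch for checking the flag from lsattr output follows; is_immutable_line is a hypothetical helper name, and it assumes the usual lsattr layout of a flags field followed by the path.

```shell
# Return success if an lsattr output line shows the immutable (i) flag.
# Assumption: the line has the form "<flags> <path>", as lsattr prints it.
is_immutable_line() {
    flags=${1%% *}    # keep only the first space-delimited field (the flags)
    case "$flags" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example on the affected host:
#   is_immutable_line "$(lsattr /etc/systemd/system/Splunkd.service)" && echo immutable
#   sudo chattr -i /etc/systemd/system/Splunkd.service   # remove the flag again
```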

So it is Splunk itself that changes the service file. However, this issue was discovered on CentOS Stream 9, and I cannot reproduce it on earlier versions.

Does anybody know what might cause this behavior?

art-mis
Engager

@PickleRick Hello. I use sudo because the Splunk documentation recommends running commands such as /opt/splunk/bin/splunk start|stop|restart with sudo:
https://docs.splunk.com/Documentation/Splunk/9.1.2/Admin/RunSplunkassystemdservice#Manage_clusters_u... 

The documentation also states: "Under systemd, splunk start|stop|restart commands are mapped to systemctl start|stop|restart commands." Therefore, I believe it should make no difference how exactly the restart is carried out.

If you want to reproduce the problem yourself, here is a sample list of steps:

1) Create an Ubuntu 22.04 VM, for example in Google Cloud Platform.
2) Install Splunk Enterprise 9.1.2:
dpkg -i splunk-9.1.2-b6b9c8185839-linux-2.6-amd64.deb
3) Enable the systemd unit for Splunk:
/opt/splunk/bin/splunk enable boot-start -systemd-managed 1 -user splunk -group splunk --accept-license
4) Run commands such as /opt/splunk/bin/splunk start|stop|restart and compare the result with the systemd unit status; you will see errors.

Well, in the end, you have commands like /opt/splunk/bin/splunk offline, which are not called through systemd.

isoutamo
SplunkTrust

Hi

IMHO: even though the docs say that you must use sudo when you run "splunk start/stop/restart", they don't say that you should use that instead of "systemctl start/stop/restart Splunkd"! This should be stated more clearly there. Docs feedback is definitely needed.

Once you have configured systemd, you should use only systemctl, not splunk start/stop/restart directly. The only splunk command you still need is "splunk offline" when you are stopping one node in an indexer cluster. All other actions should use "systemctl start/stop/restart Splunkd"!
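The rule above can be sketched as a small lookup that prints which command to run for a given action. splunk_service_cmd is a hypothetical helper name; the commands it prints are the ones quoted in this thread, and the /opt/splunk install path is an assumption taken from the original post.

```shell
# Print the command to use for a Splunk service action once systemd
# manages Splunkd. Hypothetical helper; it only prints, it runs nothing.
splunk_service_cmd() {
    case "$1" in
        start|stop|restart)
            echo "sudo systemctl $1 Splunkd" ;;
        offline)
            # Not mapped to systemd; used when taking an indexer cluster node down.
            echo "/opt/splunk/bin/splunk offline" ;;
        *)
            echo "unsupported action: $1" >&2
            return 1 ;;
    esac
}
```

For example, splunk_service_cmd restart prints "sudo systemctl restart Splunkd".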

r. Ismo

Docs feedback has been submitted.

PickleRick
SplunkTrust

I must say that Splunk's Linux packaging is sometimes sub-par (and I suppose the docs are written by more or less the same people and can contain errors).

If you have the systemd unit in place, start and stop the service using systemctl - that's what the service unit is for.

PickleRick
SplunkTrust

Why are you trying to start splunk with splunk start using root?

Also - if you have the systemd unit, use it.

art-mis
Engager

I see the same behaviour with Ubuntu 22 in GCP and Splunk Enterprise 9.1.2.
Splunk management through Systemd looks broken.

 