
Splunk upgrade 7 -> 8.0 failure

gauravmsharma
Path Finder

The Splunk upgrade process from 7 to 8 seems very confusing.

I stop Splunk as root with systemctl stop splunk, because if I stop it as the splunk user it starts again, since Splunk is configured as a systemd service.

Then, as root, I edit the splunkd.service file, because the new splunkd service file should not contain User=splunk and some other directives. I use the file given by Splunk here:

https://docs.splunk.com/Documentation/Splunk/8.0.3/Admin/RunSplunkassystemdservice

I am using an RPM-based install, and I run:

rpm -i --replacepkgs --prefix=/splunkdirectory/ splunk_package_name.rpm

This replaces the existing installed 7.x package with the new 8.x package. The command cannot be run as the splunk user; I have to be root, otherwise I get this error:

error: can't create transaction lock on /var/lib/rpm/.rpm.lock (Permission denied)
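
The rpm database under /var/lib/rpm is writable only by root, which is why the package step fails for the splunk user even though Splunk itself runs as splunk. A minimal sketch of a guard for that step follows; `is_root_uid` is a hypothetical helper name (not part of Splunk or rpm), and the package name and prefix are the placeholders from the post.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when the given numeric UID is root (0).
# The UID is passed as an argument so the check can be tried without being root.
is_root_uid() {
    [ "$1" -eq 0 ]
}

# Example use before the package step (placeholders as in the post above):
#   if is_root_uid "$(id -u)"; then
#       rpm -i --replacepkgs --prefix=/splunkdirectory/ splunk_package_name.rpm
#   else
#       echo "run the rpm step as root" >&2
#   fi
```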

Next I start Splunk as per the upgrade recommendation from Splunk:

sudo splunk start

 

https://docs.splunk.com/Documentation/Splunk/8.0.3/Admin/RunSplunkassystemdservice#Upgrade_considera...

 

This gives me lots of "Invalid key in stanza" messages during startup, and Splunk is now running as a root process:

 

-bash-4.2$ ps -ef | grep splunk
root 31719 31229 0 18:08 pts/0 00:00:00 sudo su - splunk
root 31721 31719 0 18:08 pts/0 00:00:00 su - splunk
splunk 31722 31721 0 18:08 pts/0 00:00:00 -bash
root 31806 1 8 18:09 ? 00:00:04 splunkd -p 8089 start
root 31808 31806 0 18:09 ? 00:00:00 [splunkd pid=31806] splunkd -p 8089 start [process-runner]
root 31834 31808 0 18:09 ? 00:00:00 mongod --dbpath=/opt/splunk/var/lib/splunk/kvstore/mongo --port=8191 --timeStampFormat=iso8601-utc --smallfiles --oplogSize=200 --keyFile=/opt/splunk/var/lib/splunk/kvstore/mongo/splunk.key --setParameter=enableLocalhostAuthBypass=0 --replSet=A818B836-060F-4BA2-A42E-82AE5CF11FFA --sslMode=requireSSL --sslAllowInvalidHostnames --sslPEMKeyFile=/opt/splunk/etc/auth/server.pem --sslPEMKeyPassword=xxxxxxxx --sslDisabledProtocols=noTLS1_0,noTLS1_1 --sslCipherConfig=ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDH-ECDSA-AES256-GCM-SHA384:ECDH-ECDSA-AES128-GCM-SHA256:ECDH-ECDSA-AES128-SHA256:AES256-GCM-SHA384:AES128-GCM-SHA256:AES128-SHA256 --nounixsocket --noscripting
root 31938 31808 2 18:09 ? 00:00:01 /opt/splunk/bin/python -O /opt/splunk/lib/python2.7/site-packages/splunk/appserver/mrsparkle/root.py --proxied=127.0.0.1,8065,8443
root 31996 31808 0 18:10 ? 00:00:00 /opt/splunk/bin/splunkd instrument-resource-usage -p 8089 --with-kvstore
splunk 32063 31722 0 18:10 pts/0 00:00:00 ps -ef
splunk 32064 31722 0 18:10 pts/0 00:00:00 grep --color=auto splunk
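
A quick way to confirm which account owns the daemon can be sketched as a small filter. `splunkd_owners` is a hypothetical helper name, and it assumes input lines of the form "user command ...", such as `ps -eo user=,args=` prints on Linux.

```shell
#!/bin/sh
# Print the distinct users that own splunkd processes.
# Reads "user command args..." lines on stdin and matches only lines whose
# command word is exactly "splunkd" or a path ending in "/splunkd".
splunkd_owners() {
    awk '$2 ~ /(^|\/)splunkd$/ { print $1 }' | sort -u
}

# Example use on a live system:
#   ps -eo user=,args= | splunkd_owners
```

If this prints root after an upgrade, files under the Splunk directory may have been written as root in the meantime.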

 

I tried stopping the Splunk process and starting it again as the splunk user. The Splunk process starts OK, but the systemd service is dead:

systemctl status splunk
● splunkd.service - Systemd service file for Splunk, generated by 'splunk enable boot-start'
Loaded: loaded (/etc/systemd/system/splunkd.service; enabled; vendor preset: disabled)
Active: inactive (dead) (Result: exit-code)

After starting the service through systemd, it is still in a failed state, complaining about permissions:

 

splunkd.service - Systemd service file for Splunk, generated by 'splunk enable boot-start'
Loaded: loaded (/etc/systemd/system/splunkd.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Mon 2020-09-14 18:17:07 UTC; 871ms ago
Process: 32730 ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd (code=exited, status=4)
Main PID: 32730 (code=exited, status=4)

 systemd[1]: splunkd.service: main process exited, code=exited, status=4/NOPERMISSION
 systemd[1]: Unit splunkd.service entered failed state.
 systemd[1]: splunkd.service failed.
 splunkd.service holdoff time over, scheduling restart.
 systemd[1]: Stopped Systemd service file for Splunk, generated by 'splunk enable boot-start'.
 systemd[1]: start request repeated too quickly for splunkd.service
 Failed to start Systemd service file for Splunk, generated by 'splunk enable boot-start'.
 systemd[1]: Unit splunkd.service entered failed state.
 systemd[1]: splunkd.service failed.

Has anyone faced the same issues? Am I doing things in the correct order, do I need to change the order, or am I missing a step in between?


gauravmsharma
Path Finder

It failed with the methods suggested by both @isoutamo and @maraman_splunk.


gauravmsharma
Path Finder

systemctl stop splunkd (run as root)
works fine
Updated Splunk using the rpm and tgz methods (run as root)
works fine
chown -R splunk:splunk /opt/splunk/
works fine
Changed the content of the splunkd.service file to the latest content from the docs site, plus the two ExecStartPre lines suggested in this thread
(as splunk) /...path/to/splunk/bin/splunk start --accept-license --answer-yes
Splunk Software License Agreement 10.21.2019
Failed with the message below:

This appears to be an upgrade of Splunk.
--------------------------------------------------------------------------------)

Splunk has detected an older version of Splunk installed on this machine. To
finish upgrading to the new version, Splunk's installer will automatically
update and alter your current configuration files. Deprecated configuration
files will be renamed with a .deprecated extension.

You can choose to preview the changes that will be made to your configuration
files before proceeding with the migration and upgrade:

If you want to migrate and upgrade without previewing the changes that will be
made to your existing configuration files, choose 'y'.
If you want to see what changes will be made before you proceed with the
upgrade, choose 'n'.


Perform migration and upgrade without previewing configuration changes? [y/n] y

-- Migration information is being logged to '/opt/splunk/var/log/splunk/migration.log.2020-10-23.19-53-25' --

Migrating to:
VERSION=8.0.2.1
BUILD=f002026bad55
PRODUCT=splunk
PLATFORM=Linux-x86_64

Copying '/opt/splunk/etc/myinstall/splunkd.xml' to '/opt/splunk/etc/myinstall/splunkd.xml-migrate.bak'.

Checking saved search compatibility...

Handling deprecated files...

Checking script configuration...

Copying '/opt/splunk/etc/myinstall/splunkd.xml.cfg-default' to '/opt/splunk/etc/myinstall/splunkd.xml'.
Deleting '/opt/splunk/etc/system/local/field_actions.conf'.

The following apps might contain lookup table files that are not exported to other apps:

splunk_monitoring_console

Such lookup table files could only be used within their source app. To export them globally and allow other apps to access them, add the following stanza to each /opt/splunk/etc/apps/<app_name>/metadata/local.meta file:

[lookups]
export = system

For more information, see http://docs.splunk.com/Documentation/Splunk/latest/AdvancedDev/SetPermissions#Make_objects_globally_....

An error occurred: Failed to run splunkd rest:
stdout:
--
stderr:splunkd: /opt/splunk/src/util/HttpClientRequest.cpp:1760: void HttpClientTransaction::_handleProxyConnect(): Assertion `_poolp->hasSslContext()' failed.
Dying on signal #6 (si_code=-6), sent by PID 1657 (UID 1002). Attempting to clean up pidfile

--

First-time run failed!

Method 2

1) (as root) Stop it: systemctl stop splunkd

2) Ran disable boot-start, which failed with the message below:
/opt/splunk/bin/splunk disable boot-start
error reading information on service splunk: No such file or directory
Disabled.

The splunkd.service file still exists, though; it was not deleted.

3) Performed the upgrade as in the method above.

4) /opt/splunk/bin/splunk enable boot-start -systemd-managed 1 -user splunk

Failed with the error message below:

Perform migration and upgrade without previewing configuration changes? [y/n] y

-- Migration information is being logged to '/opt/splunk/var/log/splunk/migration.log.2020-10-23.19-47-04' --

Migrating to:
VERSION=8.0.2.1
BUILD=f002026bad55
PRODUCT=splunk
PLATFORM=Linux-x86_64

Copying '/opt/splunk/etc/myinstall/splunkd.xml' to '/opt/splunk/etc/myinstall/splunkd.xml-migrate.bak'.

Checking saved search compatibility...

Handling deprecated files...

Checking script configuration...

Copying '/opt/splunk/etc/myinstall/splunkd.xml.cfg-default' to '/opt/splunk/etc/myinstall/splunkd.xml'.
Deleting '/opt/splunk/etc/system/local/field_actions.conf'.

The following apps might contain lookup table files that are not exported to other apps:

splunk_monitoring_console

Such lookup table files could only be used within their source app. To export them globally and allow other apps to access them, add the following stanza to each /opt/splunk/etc/apps/<app_name>/metadata/local.meta file:

[lookups]
export = system

For more information, see http://docs.splunk.com/Documentation/Splunk/latest/AdvancedDev/SetPermissions#Make_objects_globally_....

An error occurred: Failed to run splunkd rest:
stdout:
--
stderr:splunkd: /opt/splunk/src/util/HttpClientRequest.cpp:1760: void HttpClientTransaction::_handleProxyConnect(): Assertion `_poolp->hasSslContext()' failed.
Dying on signal #6 (si_code=-6), sent by PID 1572 (UID 1002). Attempting to clean up pidfile

--
I also tried deleting the splunkd.service file manually and recreating it from the content mentioned, but it failed again. In addition, after the failure, the splunk.service file that I created manually is removed.


maraman_splunk
Splunk Employee

Hmm, I'm not sure what your systemd file looks like at the moment.

One way to fix this may be the following.

As root:

 

/opt/splunk/bin/splunk disable boot-start

 

Note: you need to adapt the path to your Splunk installation directory.

(This should remove the systemd unit file.)

Then recreate it from v8:

 

/opt/splunk/bin/splunk enable boot-start -systemd-managed 1 -user splunk

This should recreate the systemd file with the v8 version.

 

Alternatively, you could just add the following to the systemd unit file:

 

# Change needed for 8.0+: use ExecStartPre instead of ExecStartPost so the permissions are changed before Splunk starts (replace the IDs with those of the splunk user and group)
ExecStartPre=/bin/bash -c "chown -R 1001:1001 /sys/fs/cgroup/cpu/system.slice/%n"
ExecStartPre=/bin/bash -c "chown -R 1001:1001 /sys/fs/cgroup/memory/system.slice/%n"

 

 

followed by

 

systemctl daemon-reload
systemctl restart splunkd.service
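
Rather than typing the numeric IDs by hand, the two ExecStartPre lines can be generated from the account's real UID and GID. This is only a sketch: `cgroup_chown_lines` is a hypothetical helper name, and the cgroup paths are copied from the lines above rather than checked against any particular distribution.

```shell
#!/bin/sh
# Print the two ExecStartPre lines for a given numeric UID ($1) and GID ($2).
# "%%n" in the printf format emits a literal "%n", which systemd expands
# to the unit name when it reads the unit file.
cgroup_chown_lines() {
    uid=$1
    gid=$2
    for controller in cpu memory; do
        printf 'ExecStartPre=/bin/bash -c "chown -R %s:%s /sys/fs/cgroup/%s/system.slice/%%n"\n' \
            "$uid" "$gid" "$controller"
    done
}

# Example use for the splunk account:
#   cgroup_chown_lines "$(id -u splunk)" "$(id -g splunk)"
```

The output can be pasted into the [Service] section before running the daemon-reload and restart.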

 

 

Also, if Splunk ever wrote files as root, stop Splunk and give those files back to the splunk user before starting the service as splunk again.
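
To see whether anything was left behind by a root-owned run, the tree can be scanned for files with the wrong owner. A minimal sketch, assuming the /opt/splunk path and the splunk account used elsewhere in this thread; `not_owned_by` is a hypothetical helper name.

```shell
#!/bin/sh
# List every file or directory under $1 that is NOT owned by user $2.
# $2 may be a user name or a numeric UID.
not_owned_by() {
    find "$1" ! -user "$2" -print
}

# Example use (as root, with Splunk stopped):
#   not_owned_by /opt/splunk splunk
#   chown -R splunk:splunk /opt/splunk
```

An empty listing means ownership is already consistent and the chown is not needed.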

 


gauravmsharma
Path Finder

My existing splunkd.service file looks like this:

[Unit]
Description=Splunk Enterprise 7.0.0
After=network.target
Wants=network.target

[Service]
Type=forking
Restart=always
RestartSec=30s
User=splunk
Group=splunk
LimitNOFILE=65536
LimitNPROC=16384
TimeoutSec=300
ExecStart=/opt/splunk/bin/splunk start --accept-license --answer-yes --no-prompt
ExecStop=/opt/splunk/bin/splunk stop
ExecReload=/opt/splunk/bin/splunk restart
PIDFile=/opt/splunk/var/run/splunk/splunkd.pid

[Install]
WantedBy=multi-user.target
# If you want to use $(systemctl [start|stop|restart] splunk) instead of splunkd ...
Alias=splunk.service


gauravmsharma
Path Finder

And this is the new file content, which I created while Splunk was stopped (on the Splunk 8.0 system):

#This unit file replaces the traditional start-up script for systemd
#configurations, and is used when enabling boot-start for Splunk on
#systemd-based Linux distributions.

[Unit]
Description=Systemd service file for Splunk, generated by 'splunk enable boot-start'
After=network.target

[Service]
Type=simple
Restart=always
ExecStart=/opt/splunk/bin/splunk _internal_launch_under_systemd
KillMode=mixed
KillSignal=SIGINT
TimeoutStopSec=360
LimitNOFILE=65536
SuccessExitStatus=51 52
RestartPreventExitStatus=51
RestartForceExitStatus=52
Delegate=true
CPUShares=1024

[Install]
WantedBy=multi-user.target


isoutamo
SplunkTrust

Hi

Basically, this is what you should do when upgrading Splunk:

  • (as root) Stop it: systemctl stop splunkd (or whatever the service name is on your system)
  • (as root) Update it, e.g. rpm .... (yum or dnf is preferred on newer RH-based systems)
  • (as root) chown -R splunk:splunk /..../path/to/splunk_dir
  • (as splunk) /...path/to/splunk/bin/splunk start --accept-license --answer-yes
  • (as splunk) ..../splunk stop
  • (as root) systemctl start splunkd (or whatever you have named it)

Now, as @maraman_splunk proposes: stop, disable boot-start, chown, start, stop, enable boot-start, and start it. Then it should be OK. Of course, this assumes you have already checked that any apps, dashboards, etc. are compatible with version 8.
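
One reading of that combined order can be sketched as a script that only prints the commands, so the sequence can be reviewed before anything is run. `print_upgrade_plan` is a hypothetical helper name; the /opt/splunk default, the splunkd service name, and the rpm placeholders are assumptions taken from earlier posts in this thread.

```shell
#!/bin/sh
# Print (do not run) the combined upgrade sequence, one command per line,
# prefixed with the account that should run it.
# $1 = Splunk install dir (default /opt/splunk), $2 = service name (default splunkd)
print_upgrade_plan() {
    home=${1:-/opt/splunk}
    service=${2:-splunkd}
    cat <<EOF
root:   systemctl stop $service
root:   $home/bin/splunk disable boot-start
root:   rpm -i --replacepkgs --prefix=/splunkdirectory/ splunk_package_name.rpm
root:   chown -R splunk:splunk $home
splunk: $home/bin/splunk start --accept-license --answer-yes
splunk: $home/bin/splunk stop
root:   $home/bin/splunk enable boot-start -systemd-managed 1 -user splunk
root:   systemctl start $service
EOF
}

# Example use:
#   print_upgrade_plan /opt/splunk splunkd
```

Printing rather than executing keeps the sketch safe to run anywhere; each line still has to be run by hand under the account shown.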

Also, there are currently some parameters/attributes in the splunk.service file that need to be changed before it is put into use. You can find those quite easily with a web search.

r. Ismo

 
