Hello Splunkers,
A few days ago, most of the serverclasses on our Deployment Server uninstalled the output app on their own.
As a result, splunkd was restarted on the UFs and data stopped being forwarded from the hosts.
For info, each serverclass in our environment consists of a deployment app with inputs.conf, where we specify the sources, and another deployment app called 'output_app' with outputs.conf, which forwards the data to the indexer cluster.
Example logs from one of the affected UFs:
06-29-2022 12:15:47.893 +0200 INFO DeployedServerclass - Serverclass=inputs_test_prod is uninstalling app=/opt/splunkforwarder/etc/apps/output_app
06-29-2022 12:15:47.893 +0200 INFO DeployedApplication - Removing app=output_app at='/opt/splunkforwarder/etc/apps/output_app'
06-29-2022 12:15:47.904 +0200 WARN DC:DeploymentClient - Restarting Splunkd...
06-29-2022 12:15:47.905 +0200 WARN Restarter - Splunkd is configured to run as a systemd service, skipping external restart process
06-29-2022 12:15:47.905 +0200 INFO HttpPubSubConnection - Running phone uri=/services/broker/phonehome/connection_123.456.7.89_8089_z1il0123.xyz.ai_z1il0123.zyx.ai_95C4E8F1-731A-4280-9F09-93B03EAFB3DE
06-29-2022 12:15:48.206 +0200 INFO loader - Shutdown HTTPDispatchThread
06-29-2022 12:15:48.206 +0200 INFO ShutdownHandler - Shutting down splunkd
06-29-2022 12:15:48.206 +0200 INFO ShutdownHandler - shutting down level "ShutdownLevel_Begin"
06-29-2022 12:15:48.207 +0200 INFO ShutdownHandler - shutting down level "ShutdownLevel_FileIntegrityChecker"
06-29-2022 12:15:48.207 +0200 INFO ShutdownHandler - shutting down level "ShutdownLevel_JustBeforeKVStore"
06-29-2022 12:15:48.207 +0200 INFO ShutdownHandler - shutting down level "ShutdownLevel_KVStore"
06-29-2022 12:15:48.207 +0200 INFO ShutdownHandler - shutting down level "ShutdownLevel_DFM"
06-29-2022 12:15:48.207 +0200 INFO ShutdownHandler - shutting down level "ShutdownLevel_Thruput"
06-29-2022 12:15:48.207 +0200 INFO ShutdownHandler - shutting down level "ShutdownLevel_TcpInput1"
06-29-2022 12:15:48.207 +0200 INFO TcpInputProc - Running shutdown level 1. Closing listening ports.
06-29-2022 12:15:48.207 +0200 INFO TcpInputProc - Done setting shutdown in progress signal.
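To see how many UFs and serverclasses were hit, the uninstall events can be pulled out of splunkd.log. Below is a minimal Python sketch, assuming the log lines look like the DeployedServerclass line in the excerpt above. The function name `find_uninstalls` and the regex are my own and only illustrative, not anything shipped with Splunk.

```python
import re

# Pattern modelled on the "is uninstalling app=" line from the excerpt above.
# Whitespace is matched loosely because spacing in splunkd.log can vary.
UNINSTALL_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4})"
    r"\s+INFO\s+DeployedServerclass\s+-\s+"
    r"Serverclass=(?P<serverclass>\S+)\s+is uninstalling app=(?P<app_path>\S+)"
)


def find_uninstalls(lines):
    """Return (timestamp, serverclass, app_path) for every uninstall event."""
    events = []
    for line in lines:
        match = UNINSTALL_RE.match(line.strip())
        if match:
            events.append(
                (match["ts"], match["serverclass"], match["app_path"])
            )
    return events


# Two lines copied from the excerpt above; only the first is an uninstall.
sample = [
    "06-29-2022 12:15:47.893 +0200 INFO DeployedServerclass - "
    "Serverclass=inputs_test_prod is uninstalling "
    "app=/opt/splunkforwarder/etc/apps/output_app",
    "06-29-2022 12:15:47.904 +0200 WARN DC:DeploymentClient - "
    "Restarting Splunkd...",
]

for ts, serverclass, app_path in find_uninstalls(sample):
    print(f"{ts}  {serverclass}  {app_path}")
```

On a real UF you would pass the open splunkd.log file handle instead of `sample`.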
outputs.conf
# Turn off indexing on the master
[indexAndForward]
index = false
[tcpout]
defaultGroup = splunk_prod
forwardedindex.filter.disable = true
indexAndForward = false
[tcpout:splunk_prod]
server=z1il0001.zyx.ai.zz:9997,z1il0002.zyx.ai.zz:9997, z1il0003.zyx.ai.zz:9997, z1il0004.zyx.ai.zz:9997, z1il0005.zyx.ai.zz:9997, z1il0006.zyx.ai.zz:9997
autoLB = true
Have you ever encountered such an issue? How is it possible that a serverclass gets rid of an app by itself?
The last change we made was a Deployment Server upgrade from 8.2.3.3 to 9.0, but we did that on 24.06.
Any idea what the root cause could be?
Greetings,
Dzasta
Hi @justynap_ldz,
after you uninstalled the app from one serverclass, is it still present in the other serverclasses, or was it uninstalled from all of them?
As I said, sometimes uninstalling an app from one serverclass uninstalls it from all the serverclasses.
Ciao.
Giuseppe
Hi @gcusello,
your quick responses are very much appreciated! Thank you 😉
Yes, that is the only reasonable explanation for our issue. Together with the other Splunk admins on my team, we found out that opening a serverclass in Forwarder Management, going to the Apps section, clicking Edit on the app and then Uninstall may actually uninstall the app from all serverclasses ( 😮 )
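For anyone who wants to verify the damage on the Deployment Server side, here is a rough Python sketch that lists which serverclasses in serverclass.conf no longer map a given app. It relies on the `[serverClass:<name>:app:<app>]` stanza naming used in serverclass.conf. The function name, the sample text and the names `inputs_other_prod`, `inputs_test_app` and `inputs_other_app` are made up for illustration; only `inputs_test_prod` and `output_app` come from this thread.

```python
import re

# Stanza names as used in serverclass.conf:
#   [serverClass:<name>]            defines a serverclass
#   [serverClass:<name>:app:<app>]  maps an app to that serverclass
CLASS_RE = re.compile(r"^\[serverClass:(?P<serverclass>[^:\]]+)\]$")
APP_RE = re.compile(
    r"^\[serverClass:(?P<serverclass>[^:\]]+):app:(?P<app>[^\]]+)\]$"
)


def serverclasses_missing_app(conf_text, app_name):
    """Return the sorted serverclass names that do not map app_name."""
    all_classes = set()
    classes_with_app = set()
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        app_match = APP_RE.match(line)
        if app_match:
            all_classes.add(app_match["serverclass"])
            if app_match["app"] == app_name:
                classes_with_app.add(app_match["serverclass"])
            continue
        class_match = CLASS_RE.match(line)
        if class_match:
            all_classes.add(class_match["serverclass"])
    return sorted(all_classes - classes_with_app)


# Hypothetical serverclass.conf content, for illustration only.
sample_conf = """
[serverClass:inputs_test_prod]
whitelist.0 = *

[serverClass:inputs_test_prod:app:inputs_test_app]

[serverClass:inputs_other_prod]
whitelist.0 = *

[serverClass:inputs_other_prod:app:inputs_other_app]

[serverClass:inputs_other_prod:app:output_app]
"""

print(serverclasses_missing_app(sample_conf, "output_app"))
```

In practice you would read the text from wherever your serverclass.conf lives on the Deployment Server and compare the result with the serverclasses that are supposed to carry the output app.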
Hi @justynap_ldz,
I made this same mistake some time ago, which is why I was almost certain of the cause!
Good for you, see you next time!
Ciao and happy splunking
Giuseppe
P.S.: Karma Points are appreciated 😉
Hi @justynap_ldz,
in my experience, when a serverclass uninstalls an app, it is always because the admin made a configuration error.
One of the most common errors is trying to delete an app from one serverclass and instead deleting it from all serverclasses.
Anyway, find the serverclasses where the uninstalled app is (or should be) present and check their configuration.
Ciao.
Giuseppe
Hi @gcusello,
That was my first guess, too. However, we make such changes in Splunk Web (I mean when we delete an app from a serverclass), so it is technically not possible to uninstall one output app from over 100 serverclasses at once.
I am not able to find any reasonable explanation for this issue...