Splunk Enterprise

Splunk is taking a very long time to stop

snosurfur
Engager

Stopping splunkd is taking up to 6 minutes to complete.  We have a process that snapshots the instance and we are stopping splunkd prior to taking that snapshot.  Previously with v9.0.1 we did not experience this; now we are on v9.2.1.
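
For reference, a minimal sketch of the pre-snapshot step (the SPLUNK_HOME path and the snapshot placeholder are not our exact script):

SPLUNK_HOME=/opt/splunk                      # placeholder install path
echo "Stopping splunkd at $(date)"
time "$SPLUNK_HOME/bin/splunk" stop          # on 9.2.1 this step now takes up to ~6 minutes
echo "splunkd stopped at $(date)"
# <snapshot the instance here>
"$SPLUNK_HOME/bin/splunk" start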

While shutting down, I am monitoring splunkd.log, and the only errors I am seeing have to do with the HFs: 'TcpInputProc [65700 tcp] - Waiting for all connections to close before shutting down TcpInputProcessor'.

Has anyone else experienced something similar post upgrade?

 


Akeydel
Explorer

I get this too. 

In splunkd.log, we see the shutdown process start, but then it just... doesn't shut down... until it times out.
It looks like the rest of the shutdown sequence completes, but HttpPubSubConnection keeps going.

Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_Tailing"
08-15-2024 22:27:57.171 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="TailingProcessor"
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182482 Shutdown] - Will reconfigure input.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182482 Shutdown] - Calling addFromAnywhere in TailWatcher=0x7f4e53dfda10.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Shutting down with TailingShutdownActor=0x7f4e77429300 and TailWatcher=0x7f4e53dfda10.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Pausing TailReader module...
08-15-2024 22:27:57.171 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 0 to 1 (pseudoPause).
08-15-2024 22:27:57.171 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 0 to 1 (pseudoPause).
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Removing TailWatcher from eventloop...
08-15-2024 22:27:57.176 +0000 INFO TailingProcessor [182712 MainTailingThread] - ...removed.
08-15-2024 22:27:57.176 +0000 INFO TailingProcessor [182712 MainTailingThread] - Eventloop terminated successfully.
08-15-2024 22:27:57.177 +0000 INFO TailingProcessor [182712 MainTailingThread] - Signaling shutdown complete.
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 1 to 2 (signalShutdown).
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - Shutting down batch-reader
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 1 to 2 (signalShutdown).
08-15-2024 22:27:57.177 +0000 INFO Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_IdataDO_Collector"
08-15-2024 22:27:57.177 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="IdataDO_Collector"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_PeerManager"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="BundleStatusManager"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="DistributedPeerManager"
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182624 TcpPQReaderThread] - TcpInput queue shut down cleanly.
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182624 TcpPQReaderThread] - Reader thread stopped.
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182623 TcpListener] - TCP connection cleanup complete
08-15-2024 22:28:52.001 +0000 INFO HttpPubSubConnection ..... 
...
...
INFO IndexProcessor [199494 MainThread] - handleSignal : Disabling streaming searches.

After the shutdown messages, Splunk just keeps writing log lines from HttpPubSubConnection - Running phone...., and nothing else shows up in the logs. I re-ran "./splunk stop" in another session, and it finally logged one more line and actually stopped.
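
In case it helps anyone else, a rough sketch of the workaround: issue stop, wait a while, and re-issue stop if splunkd is still up (the 120-second wait is arbitrary, not a Splunk default, and /opt/splunk is a placeholder path):

SPLUNK_HOME=/opt/splunk                      # placeholder install path
"$SPLUNK_HOME/bin/splunk" stop &             # first stop attempt, backgrounded so we can check on it
sleep 120                                    # arbitrary wait, not a Splunk default
if "$SPLUNK_HOME/bin/splunk" status | grep -q "splunkd is running"; then
    echo "splunkd still running, issuing stop again"
    "$SPLUNK_HOME/bin/splunk" stop           # the second stop is what finally made it exit for me
fi
wait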


hrawat
Splunk Employee

@snosurfur wrote:

Stopping splunkd is taking up to 6 minutes to complete. 

 with the HFs.  'TcpInputProc [65700 tcp] - Waiting for all connections to close before shutting down TcpInputProcessor '.

Has anyone else experienced something similar post upgrade?

 


Has anything changed on the sending (UF/HF) side?
The HF (receiver) waits for senders to disconnect gracefully before it force-terminates connections, after waiting for ~110 sec (default).
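
One way to check whether you are hitting that wait: count the forwarder connections still established on the receiving port while the stop is running. A rough sketch (9997 is assumed as your splunktcp port):

# count established connections on the receiving port (9997 assumed here)
ss -tan state established '( sport = :9997 )' | tail -n +2 | wc -l
# run this repeatedly during './splunk stop'; if the count stays non-zero, the receiver is sitting in
# the "Waiting for all connections to close" phase until the ~110 sec force-terminate kicks in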

isoutamo
SplunkTrust
Have you lately added more UFs and/or HFs that send logs to your indexer?

splunkreal
Motivator

Hello, is it an indexer? If yes, this can happen; if it's a search head, you can force-stop the current searches.
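
If it is a search head, a rough sketch of cancelling running jobs over the management port before stopping (localhost:8089, admin:changeme and <sid> below are placeholders for your environment):

# list current search jobs, then cancel one by its sid (credentials, port and <sid> are placeholders)
curl -k -u admin:changeme https://localhost:8089/services/search/jobs
curl -k -u admin:changeme https://localhost:8089/services/search/jobs/<sid>/control -d action=cancel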

* If this helps, please upvote or accept the solution if it solved your issue *