Splunk Enterprise

Splunk is taking a very long time to stop

snosurfur
Engager

Stopping splunkd is taking up to 6 minutes to complete.  We have a process that snapshots the instance and we are stopping splunkd prior to taking that snapshot.  Previously with v9.0.1 we did not experience this; now we are on v9.2.1.
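
For reference, a minimal sketch of the pre-snapshot step (the SPLUNK_HOME path and the snapshot placeholder are not our exact script):

SPLUNK_HOME=/opt/splunk                      # placeholder install path
echo "Stopping splunkd at $(date)"
time "$SPLUNK_HOME/bin/splunk" stop          # on 9.2.1 this step now takes up to ~6 minutes
echo "splunkd stopped at $(date)"
# <snapshot the instance here>
"$SPLUNK_HOME/bin/splunk" start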

While shutting down, I am monitoring splunkd.log, and the only errors I am seeing have to do with the HFs: 'TcpInputProc [65700 tcp] - Waiting for all connections to close before shutting down TcpInputProcessor'.

Has anyone else experienced something similar post upgrade?

 


Akeydel
Explorer

I get this too. 

In splunkd.log, we see the shutdown process start, but then it just... doesn't shut down... until it times out.
It looks like the rest of the shutdown sequence completes, but HttpPubSubConnection keeps going.

Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_Tailing"
08-15-2024 22:27:57.171 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="TailingProcessor"
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182482 Shutdown] - Will reconfigure input.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182482 Shutdown] - Calling addFromAnywhere in TailWatcher=0x7f4e53dfda10.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Shutting down with TailingShutdownActor=0x7f4e77429300 and TailWatcher=0x7f4e53dfda10.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Pausing TailReader module...
08-15-2024 22:27:57.171 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 0 to 1 (pseudoPause).
08-15-2024 22:27:57.171 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 0 to 1 (pseudoPause).
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Removing TailWatcher from eventloop...
08-15-2024 22:27:57.176 +0000 INFO TailingProcessor [182712 MainTailingThread] - ...removed.
08-15-2024 22:27:57.176 +0000 INFO TailingProcessor [182712 MainTailingThread] - Eventloop terminated successfully.
08-15-2024 22:27:57.177 +0000 INFO TailingProcessor [182712 MainTailingThread] - Signaling shutdown complete.
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 1 to 2 (signalShutdown).
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - Shutting down batch-reader
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 1 to 2 (signalShutdown).
08-15-2024 22:27:57.177 +0000 INFO Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_IdataDO_Collector"
08-15-2024 22:27:57.177 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="IdataDO_Collector"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_PeerManager"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="BundleStatusManager"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="DistributedPeerManager"
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182624 TcpPQReaderThread] - TcpInput queue shut down cleanly.
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182624 TcpPQReaderThread] - Reader thread stopped.
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182623 TcpListener] - TCP connection cleanup complete
08-15-2024 22:28:52.001 +0000 INFO HttpPubSubConnection ..... 
...
...
INFO IndexProcessor [199494 MainThread] - handleSignal : Disabling streaming searches.

After the shutdown messages, Splunk just keeps writing log lines from HttpPubSubConnection - Running phone...., and nothing else shows up in the logs. I re-ran "./splunk stop" in another session, and it finally logged one more line and actually stopped.
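
In case it helps anyone else, a rough sketch of the workaround: issue stop, wait a while, and re-issue stop if splunkd is still up (the 120-second wait is arbitrary, not a Splunk default, and /opt/splunk is a placeholder path):

SPLUNK_HOME=/opt/splunk                      # placeholder install path
"$SPLUNK_HOME/bin/splunk" stop &             # first stop attempt, backgrounded so we can check on it
sleep 120                                    # arbitrary wait, not a Splunk default
if "$SPLUNK_HOME/bin/splunk" status | grep -q "splunkd is running"; then
    echo "splunkd still running, issuing stop again"
    "$SPLUNK_HOME/bin/splunk" stop           # the second stop is what finally made it exit for me
fi
wait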


hrawat
Splunk Employee

@snosurfur wrote:

Stopping splunkd is taking up to 6 minutes to complete. 

 with the HFs.  'TcpInputProc [65700 tcp] - Waiting for all connections to close before shutting down TcpInputProcessor '.

Has anyone else experienced something similar post upgrade?

 


Has anything changed on the sending (UF/HF) side?
The HF (receiver) waits for senders to disconnect gracefully before it force-terminates connections, after waiting for ~110 sec (default).
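
One way to check whether you are hitting that wait: count the forwarder connections still established on the receiving port while the stop is running. A rough sketch (9997 is assumed as your splunktcp port):

# count established connections on the receiving port (9997 assumed here)
ss -tan state established '( sport = :9997 )' | tail -n +2 | wc -l
# run this repeatedly during './splunk stop'; if the count stays non-zero, the receiver is sitting in
# the "Waiting for all connections to close" phase until the ~110 sec force-terminate kicks in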

isoutamo
SplunkTrust
Have you lately added more UFs and/or HFs that send logs to your indexer?

splunkreal
Motivator

Hello, is it an indexer? If yes, this can happen; if it's a search head, you can force-stop the current searches.
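
If it is a search head, a rough sketch of cancelling running jobs over the management port before stopping (localhost:8089, admin:changeme and <sid> below are placeholders for your environment):

# list current search jobs, then cancel one by its sid (credentials, port and <sid> are placeholders)
curl -k -u admin:changeme https://localhost:8089/services/search/jobs
curl -k -u admin:changeme https://localhost:8089/services/search/jobs/<sid>/control -d action=cancel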

* If this helps, please upvote or accept the solution if it solved your issue *