Splunk Enterprise

Splunk is taking a very long time to stop

snosurfur
Engager

Stopping splunkd is taking up to 6 minutes to complete. We have a process that snapshots the instance, and we stop splunkd prior to taking that snapshot. Previously, on v9.0.1, we did not experience this; now we are on v9.2.1.
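
For context, our snapshot wrapper does roughly the following (the SPLUNK_HOME path and the snapshot step here are placeholders, not our exact script), so the extra minutes get added to every snapshot window:

#!/bin/bash
# Rough sketch of the snapshot wrapper; SPLUNK_HOME and the snapshot command
# are illustrative placeholders for our actual environment.
SPLUNK_HOME=/opt/splunk

start=$(date +%s)
"$SPLUNK_HOME/bin/splunk" stop            # blocks until splunkd has fully exited
echo "splunkd stop took $(( $(date +%s) - start )) seconds"

# take_instance_snapshot                  # placeholder for the actual snapshot step
# "$SPLUNK_HOME/bin/splunk" start         # restart once the snapshot completes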

While it is shutting down, I am monitoring splunkd.log, and the only errors I am seeing have to do with the HFs: 'TcpInputProc [65700 tcp] - Waiting for all connections to close before shutting down TcpInputProcessor '.
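
To catch those messages I just tail the log while the stop runs in another session, e.g.:

# follow shutdown-related messages during "splunk stop"
# (the grep pattern is just the component names mentioned above; adjust as needed)
tail -f "$SPLUNK_HOME/var/log/splunk/splunkd.log" | grep -E 'Shutdown|TcpInputProc'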

Has anyone else experienced something similar post upgrade?

 

0 Karma

Akeydel
Explorer

I get this too. 

In splunkd.log, we can see the shutdown process run, but then it just... doesn't shut down... until it times out.
It looks like the shutdown sequence completes, but HttpPubSubConnection keeps going.

Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_Tailing"
08-15-2024 22:27:57.171 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="TailingProcessor"
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182482 Shutdown] - Will reconfigure input.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182482 Shutdown] - Calling addFromAnywhere in TailWatcher=0x7f4e53dfda10.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Shutting down with TailingShutdownActor=0x7f4e77429300 and TailWatcher=0x7f4e53dfda10.
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Pausing TailReader module...
08-15-2024 22:27:57.171 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 0 to 1 (pseudoPause).
08-15-2024 22:27:57.171 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 0 to 1 (pseudoPause).
08-15-2024 22:27:57.171 +0000 INFO TailingProcessor [182712 MainTailingThread] - Removing TailWatcher from eventloop...
08-15-2024 22:27:57.176 +0000 INFO TailingProcessor [182712 MainTailingThread] - ...removed.
08-15-2024 22:27:57.176 +0000 INFO TailingProcessor [182712 MainTailingThread] - Eventloop terminated successfully.
08-15-2024 22:27:57.177 +0000 INFO TailingProcessor [182712 MainTailingThread] - Signaling shutdown complete.
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 1 to 2 (signalShutdown).
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - Shutting down batch-reader
08-15-2024 22:27:57.177 +0000 INFO TailReader [182712 MainTailingThread] - State transitioning from 1 to 2 (signalShutdown).
08-15-2024 22:27:57.177 +0000 INFO Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_IdataDO_Collector"
08-15-2024 22:27:57.177 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="IdataDO_Collector"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down level="ShutdownLevel_PeerManager"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="BundleStatusManager"
08-15-2024 22:27:57.178 +0000 INFO Shutdown [182482 Shutdown] - shutting down name="DistributedPeerManager"
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182624 TcpPQReaderThread] - TcpInput queue shut down cleanly.
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182624 TcpPQReaderThread] - Reader thread stopped.
08-15-2024 22:27:57.692 +0000 INFO TcpInputProc [182623 TcpListener] - TCP connection cleanup complete
08-15-2024 22:28:52.001 +0000 INFO HttpPubSubConnection ..... 
...
...
INFO IndexProcessor [199494 MainThread] - handleSignal : Disabling streaming searches.

Splunk continues to write log lines from HttpPubSubConnection - Running phone.... After the Shutdown entries, nothing else shows up in the logs. I re-ran "./splunk stop" in another session, and it finally logged one more line and actually stopped.
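
If anyone wants to compare where theirs stalls, I pull the tail end of the shutdown sequence out of the log afterwards with something like:

# last shutdown-level messages logged before splunkd finally exited
grep 'INFO Shutdown' "$SPLUNK_HOME/var/log/splunk/splunkd.log" | tail -n 20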

0 Karma

hrawat
Splunk Employee

@snosurfur wrote:

Stopping splunkd is taking up to 6 minutes to complete. 

... with the HFs: 'TcpInputProc [65700 tcp] - Waiting for all connections to close before shutting down TcpInputProcessor '.

Has anyone else experienced something similar post upgrade?

 


Has anything changed on the sending (UF/HF) side?
The HF (receiver) waits for senders to disconnect gracefully, and only force-terminates the connections after waiting ~110 sec (default).
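
One quick way to see how many sender connections are still open when the stop is issued (9997 here is just the default splunktcp receiving port; substitute yours):

# count established connections to the receiving port (port number is an assumption)
ss -tn state established '( sport = :9997 )' | tail -n +2 | wc -l

If that count stays high because senders hold their sockets open, the receiver sits in the 'Waiting for all connections to close' phase until the ~110 sec timeout expires.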

isoutamo
SplunkTrust
Have any additional UFs and/or HFs been added lately that send logs to your indexer?
0 Karma

splunkreal
Motivator

Hello, is it an indexer? If yes, this can happen; if it's a search head, you can force-stop the current searches.
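
If it is a search head, one way to cancel a running search job before the stop is via the REST API, roughly like this (host, credentials, and the SID are placeholders; you need to look up the job's SID first):

# cancel a specific running search job by SID via the management port
curl -k -u admin:changeme https://localhost:8089/services/search/jobs/<sid>/control -d action=cancel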

* If this helps, please upvote or accept as solution if it solved your issue *
0 Karma