Splunk Enterprise

Why is the indexer stuck with CLOSE_WAIT?

PickleRick
Ultra Champion

I have a cluster which sometimes reports one of the indexers as offline ("unable to distribute search to..." and so on). Usually when I connect to such an indexer it is under heavy load, so I assumed, not having had the time to investigate so far, that jobs had piled up on this indexer and the problem would simply go away, which it usually did.

But today one indexer appeared offline and was still reported as offline in the monitoring console some two hours later, so I started to take notice.

It turns out that it had run out of available threads for processing requests, since...

# ss -ptn| grep CLOSE-WAIT | wc -l
7056

That's not a normal state for a server. All other indexers had a nice round zero of CLOSE-WAIT connections.

These were all incoming connections to port 8089, and none of them were from forwarders.
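
For anyone who wants to reproduce that breakdown, here is a minimal Python sketch that groups CLOSE-WAIT sockets by peer address for one local port. It assumes the usual `ss -tn` column layout (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port); the function name `close_wait_by_peer` and the sample addresses are made up for illustration and are not taken from my environment.

```python
from collections import Counter


def close_wait_by_peer(ss_output, local_port=8089):
    """Count CLOSE-WAIT sockets per peer host for a given local port.

    Expects text in the column layout printed by `ss -tn`:
    State Recv-Q Send-Q Local-Address:Port Peer-Address:Port [...]
    """
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header, blank lines and sockets in any other state.
        if len(fields) < 5 or fields[0] != "CLOSE-WAIT":
            continue
        # Split on the last colon so IPv6 addresses keep their own colons.
        _, _, lport = fields[3].rpartition(":")
        peer_host, _, _ = fields[4].rpartition(":")
        if lport == str(local_port):
            counts[peer_host] += 1
    return counts


# Hypothetical sample output; every address here is invented.
SAMPLE = """\
State      Recv-Q Send-Q Local Address:Port   Peer Address:Port
CLOSE-WAIT 1      0      10.0.0.5:8089        10.0.0.21:41234
CLOSE-WAIT 1      0      10.0.0.5:8089        10.0.0.21:41236
CLOSE-WAIT 0      0      10.0.0.5:8089        10.0.0.22:50000
ESTAB      0      0      10.0.0.5:9997        10.0.0.30:51000
CLOSE-WAIT 0      0      10.0.0.5:9997        10.0.0.31:40000
"""
```

Feeding it the real output (for example via `subprocess.run(["ss", "-tn"], capture_output=True, text=True).stdout`) and calling `.most_common()` on the result should show which peers own the stuck connections.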

And now I'm perplexed, since CLOSE-WAIT is usually a sign of an application error: the peer has closed its end, but the local process has not yet closed its socket. If it were simply TIME-WAIT, I'd say some FIN/ACK packets got lost and the situation would return to normal after the proper timeout. But CLOSE-WAIT?
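
To show why CLOSE-WAIT points at the application rather than the network, here is a small generic Python sketch of how a socket ends up in that state. It has nothing to do with Splunk's internals: it is a loopback demo, and `demo_close_wait` is a name made up for illustration.

```python
import socket


def demo_close_wait():
    """Show a socket lingering after the peer has closed its end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)

    # The peer closes first, so its FIN reaches our side of the connection.
    client.close()

    # recv() returning b"" is how the application learns about that FIN.
    data = server_side.recv(1024)

    # Until the application calls close(), the kernel keeps this socket in
    # CLOSE-WAIT. No timeout clears it; only the owning process can.
    still_open = server_side.fileno() != -1

    server_side.close()  # this is the step a stuck application never reaches
    listener.close()
    return data, still_open
```

If the final `close()` never runs, for instance because the handling thread is blocked, the socket stays in CLOSE-WAIT for as long as the process lives, which would match thousands of such sockets piling up on one port.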

The patient is 8.1.4 on SLES 12SP3 (kernel 4.4.180-94.100-default)
