Getting Data In

Splunk closing TCP port 9997 (forwarder port)

caphrim007
Path Finder

I upgraded to 4.3.3 on an indexer that never had any problems before this point, and now the indexer is dropping all forwarded events on the floor, with messages like this:

07-11-2012 12:44:17.568 -0500 INFO TcpInputProc - Stopping IPv4 port 9997
07-11-2012 12:44:17.568 -0500 WARN TcpInputProc - Stopping all listening ports. Queues blocked for more than 300 seconds

I've seen similar questions on Splunk Answers, but the suggested resolutions (involving the fishbucket) don't seem to apply to my case.

I turned on Splunk debugging, but it doesn't lead me to any better conclusions.

What queues is it referring to? The box has plenty of spare CPU, disk, and RAM; it can't possibly be overloaded, since it's not doing anything.

Support is being a lame duck, taking their time staring at walls. In the meantime, my primary Splunk indexer is not indexing anything because it's not receiving anything from the forwarders.

Does anyone have any clues as to where I could look? If it's not resolved by tomorrow I'm reinstalling Splunk on the primary indexer, as this is not something that can wait.

Thanks in advance for any help and guidance you can provide.

1 Solution

hexx
Splunk Employee

The queues that are mentioned by that message are those that lead into the data pipelines where splunkd shapes your data into events before indexing them to disk.

This message would indicate that there is a bottleneck in one of those pipelines, which causes the queue that feeds it and all queues upstream to fill up, all the way to the queue that accepts incoming events from forwarders (splunktcpin).

This is obviously undesirable, but keep in mind that your forwarder events are not being dropped. Instead, the forwarders will pause their data inputs and resume once the indexer is able to process data again.

When seeing such a message, the first thing that you should do is to determine the fill percentage of the queues leading to the four main data pipelines: parsing -> merging -> typing -> indexing.

By determining which is the most downstream queue to be saturated, you can get an idea of why there is a bottleneck there.

A simple way to gain visibility of the state of event-processing queues is to use the "indexing performance" view of the Splunk on Splunk app. For details on how to install the app, check this Splunk Answer.
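If you can't get the app installed right away, a rough equivalent is to search the indexer's own metrics.log for queue fill and blockage. This is only a sketch, and the exact queue names it returns (parsingqueue, aggqueue, typingqueue, indexqueue, splunktcpin) can vary a bit between versions:

index=_internal source=*metrics.log* group=queue
| eval fill_pct = round(current_size_kb / max_size_kb * 100, 2)
| timechart perc90(fill_pct) by name

The most downstream queue that sits near 100% is generally where the bottleneck lives; everything upstream of it fills up as a consequence.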

If you can post a screenshot showing the panels of that view, I can try to help you further.

Incidentally, what is the case number that you opened with Splunk support? I can check in on it for you.


k_harini
Communicator

That was the issue for me... looping back to itself.


k_harini
Communicator

What was the issue? Please help, we are facing the same issue. If this is resolved, can you please share a snippet of your inputs.conf and outputs.conf files?


KpiBuff
Explorer

I try to avoid staring at walls whenever I can 😉


mikelanghorst
Motivator

It's a shot in the dark without more information, but I had this issue before. Are you using the deployment server in your environment? Is it possible your forwarders' outputs.conf got deployed to your indexer?

On the indexer:
./splunk cmd btool outputs list --debug

See if you're somehow looping the indexer's output back into its own input.
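For reference, the kind of btool output that indicates a loop would look roughly like this (the group and host names here are made up); if the server entry resolves to the indexer itself, the indexer is forwarding its own data back into port 9997, and the output queue will eventually block everything behind it:

[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
# If this resolves to this very indexer, events are looping back into port 9997.
server = myindexer.example.com:9997

If that's what you find, removing the stray outputs.conf (or the deployment-server app that pushed it) from the indexer and restarting splunkd should clear the blocked queues.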

hexx
Splunk Employee

That would be consistent with the high-level symptom described.


caphrim007
Path Finder

Sideview Utils. Next time I'll read before asking


caphrim007
Path Finder

It appears I have it installed, but when I go to use it, I get this error:

Splunk encountered the following unknown module: "sosFTR" . The view may not load properly
