Splunk App for Windows Infrastructure: Forwarding to indexer group default-autolb-group blocked

alancar
Explorer

Hi there,

I have been trying to set up a basic infrastructure for the Splunk App for Windows Infrastructure using the following link:

http://docs.splunk.com/Documentation/MSApp/1.1.0/MSInfra/AbouttheSplunkAppforMSInfrastructure

My sandbox environment (1 server + 1 client; both Windows Server 2012 R2) indicates that the sendtoindexer and Splunk_TA_windows apps have been successfully deployed to the single client.

However, when I try to confirm data collection via Search & Reporting, no results are returned.

I also notice that there is a message on top of the screen suggesting blocked forwarding as per the subject title.

Any suggestions what I may be missing?

Thanks,
Alan

malmoore
Splunk Employee

This issue is likely the result of an error in the instructions on setting up the 'send to indexer' app that eventually gets deployed to all universal forwarders in a Splunk App for Windows Infrastructure deployment.

A forwarder cannot forward data to itself. When it tries to, messages such as the one in this question appear. As part of setting up the "send to indexer" app, you were instructed to create an outputs.conf that directs forwarders to send data to one or more indexers. If you create the app on the same machine that acts as the indexer and deployment server, as these instructions assume, and then follow the next topic, which says to copy the app out of the apps directory and into the deployment apps directory, the original stays behind in the apps directory. The machine then tries to forward to itself, which unintentionally creates a 'forwarder loop'.

Instead, you should move the app out of the apps directory and into the deployment apps directory. Since a deployment server cannot be a client of itself, it will never receive the "send to indexer" app and this forwarding loop can be avoided.
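
As a rough illustration of "move, not copy", here is a minimal Python sketch. It assumes the app directory is named sendtoindexer (as in this thread) and that the deployment apps directory is etc/deployment-apps under the Splunk installation; the function name and the path passed in are hypothetical, so adjust them to your environment.

```python
import shutil
from pathlib import Path


def move_app_to_deployment_apps(splunk_home, app_name="sendtoindexer"):
    """Move (not copy) an app from etc/apps to etc/deployment-apps.

    splunk_home is the Splunk installation directory (an assumption;
    pass whatever path your installation actually uses).
    """
    src = Path(splunk_home) / "etc" / "apps" / app_name
    dst_dir = Path(splunk_home) / "etc" / "deployment-apps"
    dst = dst_dir / app_name

    if not src.is_dir():
        raise FileNotFoundError(f"app directory not found: {src}")
    if dst.exists():
        raise FileExistsError(f"app already present in deployment apps: {dst}")

    dst_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move leaves nothing behind in etc/apps, so the local instance
    # no longer loads the app's outputs.conf after a restart.
    shutil.move(str(src), str(dst))
    return dst


# Example call (hypothetical path):
# move_app_to_deployment_apps(r"C:\Program Files\Splunk")
```

A restart of the Splunk instance would still be needed afterwards for the removed outputs.conf to stop taking effect.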

The documentation has been fixed. Apologies for any inconvenience caused.

esix_splunk
Splunk Employee

Can you confirm you're trying to connect to tcp/9997 on the server and not tcp/9777? A port mismatch would cause this.

satishsdange
Builder

Did you follow the steps below?

Configure the Splunk Add-on for Windows

Before the add-on can collect Windows data, you must configure it.

  1. In the location where you unarchived the download file, locate the Splunk_TA_Windows directory.

  2. Inside this directory, make a subdirectory local.

  3. Copy the inputs.conf file in the default subdirectory to the local directory.

  4. Open the inputs.conf in the local subdirectory with a text editor, such as Notepad.

  5. Enable the Windows inputs you want to get data for. Do this by changing the value of the disabled attribute in each input stanza from 1 to 0.

http://docs.splunk.com/Documentation/MSApp/1.1.1/MSInfra/DownloadandconfiguretheSplunkAdd-onforWindo...
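
Step 5 above can also be scripted. The following is a minimal Python sketch, not part of the add-on: it flips disabled = 1 to disabled = 0 only in the stanzas you name and leaves everything else untouched. The function name is hypothetical, and the stanza names in the usage note are only examples; use the ones that actually appear in your local inputs.conf.

```python
import re

STANZA_HEADER = re.compile(r"^\[(.+)\]\s*$")
DISABLED_ON = re.compile(r"^\s*disabled\s*=\s*1\s*$")


def enable_stanzas(conf_text, stanzas):
    """Return conf_text with 'disabled = 1' changed to 'disabled = 0'
    inside the named stanzas only."""
    wanted = set(stanzas)
    current = None  # name of the stanza we are currently inside
    out = []
    for line in conf_text.splitlines():
        header = STANZA_HEADER.match(line)
        if header:
            current = header.group(1)
        elif current in wanted and DISABLED_ON.match(line):
            line = "disabled = 0"
        out.append(line)
    return "\n".join(out) + "\n"
```

For example, calling enable_stanzas(text, ["WinEventLog://Security"]) on the contents of local\inputs.conf would enable that one input and leave the others disabled. Write the result back to the file in the local directory, never to the copy in default.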

esix_splunk
Splunk Employee

Can you include your inputs on the indexer and outputs on the client? Additionally check the _internal index and see if there are any errors that stand out.

alancar
Explorer

Windows Firewall on the server is completely turned off.

Doing a netstat -a on the server gives the following:

TCP 172.16.1.1:9997 splunk-server:49556 ESTABLISHED
TCP 172.16.1.1:9997 SPLUNK-CLIENT:49168 ESTABLISHED

esix_splunk
Splunk Employee

The configurations seem correct; however, can you clarify which of the above is from where? There shouldn't be any local connections from splunk-server to splunk-server:9997 (unless you have a local UF/HF running as a separate instance on the same box). In other words, there should be no outputs.conf on the splunk-server instance.
Also, are you trying to use SSL or any other non-default settings?

To troubleshoot this, you can confirm the following:
1) From the splunk-client: telnet to splunk-server 9997, and confirm you have TCP connectivity.
2) From the splunk-client: $SPLUNK_HOME/bin/splunk btool outputs list --debug (and confirm the contents match the expected outputs). You can also run $SPLUNK_HOME/bin/splunk list forward-server.

From splunk-server:
1) $SPLUNK_HOME/bin/splunk btool inputs list --debug (and confirm the contents match the expected inputs).
2) Check in the _internal log and see if you are getting any logs from the splunk-client server.
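
If telnet is not installed on the client, the TCP connectivity check in step 1 can be approximated with a few lines of Python. This is only a sketch: the function name is hypothetical, and it tests whether a TCP connection can be opened, not whether Splunk accepts forwarded data on that port.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts the connect;
        # the 'with' block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example call from the client, using the host and port from this thread:
# print(can_connect("splunk-server", 9997))
```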

alancar
Explorer

Splunk-Server = indexer with "send to indexer" app + deployment server
Splunk-Client = Windows host with universal forwarder

Splunk-server has an outputs.conf based on the steps of 'Create the "send to indexer" app' section:

http://docs.splunk.com/Documentation/MSApp/1.1.0/MSInfra/Createthesendtoindexerapp

Netstat shows that port 9997 on splunk-server is being used by the local Splunkd Service. This instance is using ports 49178, 8191, 49305, 49160, and 9997. Another Splunk instance is using port 8191.

I cannot find any suspicious entries in the outputs.conf files on splunk-client or the inputs.conf file on splunk-server. (I prefer not to post the whole content of the files for now, so as not to spam the forum, but I can post specific sections upon request.)

No non-default settings have been introduced into the environment.

I can telnet to splunk-server:9997 from the local machine but cannot from splunk-client with a "Could not open connection to the host, on port 9997: Connect failed" error.

esix_splunk
Splunk Employee

I can telnet to splunk-server:9997 from the local machine but cannot from splunk-client with a "Could not open connection to the host, on port 9997: Connect failed" error.

This is indicative of some kind of firewall setting that is blocking remote connections to TCP 9997 on the indexer. You need to investigate this. Perhaps you are running Splunk as a user that doesn't have proper permissions, or perhaps the splunk-client machine isn't allowing an outbound connection.

On Windows Server 2012 you can disable or re-enable the firewall from an elevated command prompt (or PowerShell session) with netsh:

netsh advfirewall set allprofiles state off
netsh advfirewall set allprofiles state on

I'd run that on both machines.

Additionally, you do not need a "send to indexer" app on your indexer instance. Splunk is aware of local indexes, and in a non-distributed environment the indexer doesn't need to send to itself.

alancar
Explorer

I disabled the firewall using netsh on both machines but this did not resolve the problem.

Interestingly, from the client, I can telnet to port 8000 of splunk-server. The connectivity problem is specific to port 9777.

alancar
Explorer

That's a typo. Sorry.

I tried to telnet to port 9997 but still failed.

I decided to reboot and re-apply the netsh advfirewall setting to disable the firewall on both servers.

For the first 300 seconds, I can telnet to port 9997 of the server from the client.

I still get the "Forwarding to indexer group default-autolb-group blocked for xxx seconds." message from the 100th second after reboot.

After 300 seconds, my existing remote port 9997 telnet connection will get disconnected and I can no longer reconnect. Looking at the splunkd log, I can see the following:

WARN TcpOutputProc - Forwarding to indexer group default-autolb-group blocked for 300 seconds.
INFO TcpInputProc - Stopping IPv4 port 9997
WARN TcpInputProc - Stopping all listening ports. Queues blocked for more than 300 seconds
WARN TcpOutputFd - Connect to 172.16.1.1:9997 failed. No connection could be made because the target machine actively refused it.
ERROR TcpOutputFd - Connection to host=172.16.1.1:9997 failed
INFO TcpOutputProc - Detected connection to 172.16.1.1:9997 closed
INFO TcpOutputProc - Will close stream to current indexer 172.16.1.1:9997
INFO TcpOutputProc - Closing stream for idx=172.16.1.1:9997
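
To see how often and for how long forwarding was blocked, the splunkd log can be scanned for the TcpOutputProc warning shown above. This is a minimal Python sketch; the regular expression is based only on the message text quoted in this thread, and the function name and log path are hypothetical.

```python
import re

# Matches the warning text quoted above, e.g.
# "Forwarding to indexer group default-autolb-group blocked for 300 seconds."
BLOCKED = re.compile(
    r"Forwarding to indexer group (\S+) blocked for (\d+) seconds"
)


def blocked_events(lines):
    """Return (group_name, seconds_blocked) for each matching log line."""
    hits = []
    for line in lines:
        match = BLOCKED.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2))))
    return hits


# Example call (hypothetical path to splunkd.log):
# with open(r"C:\Program Files\Splunk\var\log\splunk\splunkd.log") as fh:
#     print(blocked_events(fh))
```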

esix_splunk
Splunk Employee

Without more details from _internal and the queue status, it seems to me that you have a large enough stream of data coming in that the instance cannot keep up with it at the indexing level.

I would disable all the inputs in the splunk-client's TAs, restart, and see whether you can connect and whether _internal is getting events.

Typically, VMs are underprovisioned for lab environments (VMware with two instances on a desktop or laptop with 7200 RPM SATA drives?). Insufficient IOPS could be an issue, since you're getting a blocked queue.

And again, on the splunk-indexer, make sure you don't have outputs pointing to itself.
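
One way to check for outputs pointing back at the same machine is to pull the server = entries out of outputs.conf and compare them with the indexer's own addresses. The following is a minimal Python sketch with hypothetical function names; it only looks at [tcpout:<group>] stanzas, expects plain host:port targets (no bracketed IPv6 addresses), and you have to supply the list of local host names and IPs yourself.

```python
import re

SERVER_LINE = re.compile(r"^server\s*=\s*(.+)$")


def forward_targets(outputs_conf_text):
    """Collect host:port targets from 'server = ...' lines
    inside [tcpout:<group>] stanzas."""
    targets = []
    in_tcpout_group = False
    for raw in outputs_conf_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_tcpout_group = line.startswith("[tcpout:")
        elif in_tcpout_group:
            match = SERVER_LINE.match(line)
            if match:
                # A server setting may list several comma-separated targets.
                targets += [t.strip() for t in match.group(1).split(",") if t.strip()]
    return targets


def points_at_self(outputs_conf_text, local_hosts):
    """Return the targets whose host part is one of local_hosts."""
    local = {h.lower() for h in local_hosts}
    return [
        t for t in forward_targets(outputs_conf_text)
        if t.rsplit(":", 1)[0].lower() in local
    ]
```

Run against an outputs.conf found on the indexer, with local_hosts set to something like ["172.16.1.1", "splunk-server"], a non-empty result would suggest the instance is configured to forward to itself.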

alancar
Explorer

It looks like it is the outputs.conf that is the culprit.

After removing outputs.conf from splunk\etc\apps\sendtoindexer\local folder, the problem disappeared after a reboot.

I think additional instructions need to be added to the page below, as it instructs you to place outputs.conf on the local box:

http://docs.splunk.com/Documentation/MSApp/1.1.0/MSInfra/Createthesendtoindexerapp

Thanks for all the help!

malmoore
Splunk Employee

The docs have been fixed to say to move, rather than copy, the "Send to indexer" app from the apps directory to the deployment apps directory to prevent this 'forwarding loop' from being created.

esix_splunk
Splunk Employee

Your Splunk config looks correct. According to that message, you're not getting a successful connection to the indexer at 172.16.1.1, port 9997.
This typically indicates a firewall blocking connections. If the indexer is on Linux, I would check whether iptables is running and whether that port is allowed through (service iptables status OR service iptables stop).
If you're on Windows, you need to check the server's firewall status and configuration and confirm that all inbound connections to TCP 9997 are allowed.

alancar
Explorer

TcpOutputFd - Connect to 172.16.1.1:9997 failed. No connection could be made because the target machine actively refused it.
ERROR TcpOutputFd - Connection to host=172.16.1.1:9997 failed

alancar
Explorer

Here is the indexer inputs.conf content:

[default]
host = splunk-server

[splunktcp://9997]
connection_host = none

Here is the SplunkUniversalForwarder\etc\apps\sendtoindexer\local outputs.conf content:

[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
server = 172.16.1.1:9997

[tcpout-server://172.16.1.1:9997]

From the splunkd.log file, I keep seeing the following:

ERROR TcpOutputFd - Connection to host=172.16.1.1:9997 failed
WARN TcpOutputProc - Applying quarantine to ip=172.16.1.1 port=9997 _numberOfFailures=2
