I have a ticket in with support but this may be faster.
My intermediate forwarder is not working correctly. After I restart it, everything works for a few minutes and then stops. I have checked everything I know to check.
Please help with suggestions. 600 systems are down!!!
Splunk Support was relatively quick to respond. Rajpal Bal got on the line and, at my request, quickly set up a WebEx. We looked at SOS and could see that the tcpout queue on the intermediate forwarder (IF) was full while tcpin on the indexers was very low. On Thursday there was a mix-up in DNS, but it did not affect the IF until Splunk was restarted yesterday. Rajpal suggested, and helped me add, the connection_host entry below to inputs.conf, which forces Splunk to use the IP address rather than look up DNS names. We did this on both the IF and the indexers. It did not resolve the issue immediately, but over a few hours the IF returned to normal behavior, and we can fix DNS on Monday.
Thanks, Rajpal, for the fast, appropriate help and the extra effort of staying beyond work hours to solve this tricky problem.
In the inputs.conf that contains the "splunktcp" stanza, set "connection_host = ip" for the relevant ports, as below:
[splunktcp://:<port>]
connection_host = ip
Follow @martin_mueller's advice and check the server's ulimit settings. On *nix systems, something that works for a few minutes and then stops usually indicates that a ulimit is too low.
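If it helps, here is a rough Python sketch for checking the open-file limit. It is only a starting point: the 64000 threshold is my assumption of a reasonable minimum (verify it against the Splunk documentation for your version), the helper names are made up, and it reports the limit of the process running the script, so run it as the same user that runs splunkd. A splunkd started by an init system may still see different limits.

```python
import resource

# Assumed minimum for open file descriptors; check the Splunk docs for your version.
ASSUMED_MIN_NOFILE = 64000


def limit_ok(soft, minimum=ASSUMED_MIN_NOFILE):
    """Return True if a soft limit is unlimited or at least `minimum`."""
    return soft == resource.RLIM_INFINITY or soft >= minimum


def open_file_limits():
    """Return (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    status = "looks fine" if limit_ok(soft) else "may be too low"
    print(f"open files: soft={soft} hard={hard} ({status})")
```

This is roughly what "ulimit -n" shows in a shell, just with a threshold comparison added.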
Usually there would be some indication of what's wrong in the IF's internal logs, especially splunkd.log.
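To narrow down where to look in splunkd.log, something like the Python sketch below can tally warning and error lines by component. It assumes each line is laid out as "date time timezone LEVEL Component - message"; if your log differs, adjust the pattern. The function name is hypothetical, and the log path is passed on the command line (typically $SPLUNK_HOME/var/log/splunk/splunkd.log).

```python
import re
import sys
from collections import Counter

# Assumed line layout: "<date> <time> <tz> <LEVEL> <Component> - <message>"
LINE_RE = re.compile(r"^\S+\s+\S+\s+\S+\s+(WARN|ERROR|FATAL)\s+(\S+)\s+-\s")


def count_problems(lines):
    """Count WARN/ERROR/FATAL lines, keyed by (level, component)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_problems.py /path/to/splunkd.log
    with open(sys.argv[1], errors="replace") as handle:
        for (level, component), n in count_problems(handle).most_common(20):
            print(f"{n:6d}  {level:5s}  {component}")
```

The components with the highest counts around the time forwarding stops are a sensible place to start reading.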