Hi all,
The OS on our HFs was recently migrated from CentOS to RHEL. Since then, the HFs have not been sending any input data to Splunk, though I can still see their internal logs.
In the internal logs I can see this error: Cooked connection to ip=<indexer ip> timed out
The other error I can see is:
message from "/opt/splunk/bin/python3.7 /opt/splunk/etc/apps/splunk_assist/bin/uiassets_modular_input.py" splunk.AuthenticationFailed: [HTTP 401] Client is not authenticated.
need help as almost 9 forwarders are not reporting right now.
These seem to be two separate issues.
The connection timeout errors are typically caused by problems at the network level: misconfigured routing, firewalls, or closed firewall ports on the destination servers. You might want to diagnose that (telnet/netcat and tcpdump are your friends).
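If you'd rather script the telnet/netcat check so it can be repeated from each HF, something along these lines might do. It's only a sketch: `can_connect` is a name I made up, the timeout is arbitrary, and the host and port are whatever your outputs.conf points at (9997 is mentioned later in this thread).

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the full TCP handshake, like telnet does.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Hypothetical usage, run on the HF:
#   can_connect("<indexer ip>", 9997)
```

Keep in mind that a successful handshake only proves the port is reachable at that moment. It says nothing about intermittent drops, so tcpdump on both ends is still worth running.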
The other error is completely unrelated and seems to come from the Splunk Assist app which has nothing to do with your HF/idx connectivity.
And I am getting internal logs from the HFs, just not the syslog inputs that are configured there.
OK. So the connectivity as such is working and the HFs _do_ send data to indexers. It's just the receiving side that's not working.
Check the usual suspects: the firewall, the inputs... With UDP and multiple interfaces (or even a single interface and no default route) you could also hit an rp_filter problem. Did you disable SELinux?
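To test the receiving side in isolation, you could fire a single hand-made syslog datagram at the HF from another host and watch for it in tcpdump and in Splunk. A rough sketch, assuming the input is UDP on port 514 (the thread doesn't say which protocol or port the inputs use, so adjust both); `send_test_syslog` and the message text are made up for illustration.

```python
import socket


def send_test_syslog(host, port=514, message="<14>hf-input-test: hello from test sender"):
    """Send one UDP datagram that looks like a syslog line; return bytes sent."""
    # <14> is the syslog priority: facility user (1) * 8 + severity info (6).
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(message.encode("utf-8"), (host, port))


# Hypothetical usage, run from a host that normally sends syslog to the HF:
#   send_test_syslog("<hf ip>", 514)
```

UDP gives the sender no feedback, so the return value only tells you the datagram left the socket. If tcpdump on the HF shows it arriving but Splunk never indexes it, the packet is being dropped between the interface and the listening socket.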
Well, on further investigation everything seems to be OK. But tcpdump shows splunk is not acknowledging the incoming syncs somehow. iptables is OK, the inputs are there, and the ports are listening. All looks good. No errors in the logs.
"splunk is not acknowledging the incoming syncs" - I'm a little puzzled here.
Do you mean that the TCP handshake is not performed properly?
What sequence of packets do you see on the wire on the sending side and on the receiving side?
If you see only the SYN from the client arriving at the server and no SYN/ACK sent back from the server, the typical culprits are:
1) iptables (the packets are seen on the wire but not passed through the firewall to the socket) - but you said you checked that.
2) rp_filter - on multihomed hosts, if a packet is received on the wrong interface and rp_filter is on, the packet is silently dropped. It's a tricky case because "everything seems to be in order" but it's not working.
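For the rp_filter case, the current settings can be read straight from /proc on the HF. Below is a small sketch of how one might list them. The function names are mine, and the only thing taken from the kernel documentation is that the value applied to an interface is the higher of the `all` and the per-interface setting.

```python
from pathlib import Path


def read_rp_filter(conf_dir="/proc/sys/net/ipv4/conf"):
    """Return {name: rp_filter value} for every entry under conf_dir."""
    values = {}
    for entry in sorted(Path(conf_dir).glob("*/rp_filter")):
        values[entry.parent.name] = int(entry.read_text().strip())
    return values


def effective_rp_filter(values):
    """Per interface, the kernel applies max(all, interface)."""
    all_value = values.get("all", 0)
    return {name: max(all_value, value)
            for name, value in values.items()
            if name not in ("all", "default")}


# Hypothetical usage on the HF:
#   print(effective_rp_filter(read_rp_filter()))
```

A value of 0 means no source validation, 1 is strict mode and 2 is loose mode. Strict mode on a multihomed box is the combination that tends to cause the silent drops described above.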
SELinux is disabled. I think something has changed at the firewall level, going to check that...
Thanks @PickleRick, I will come back and let you know how it went.
This started happening after these boxes were migrated from CentOS 7 to RHEL 8. But telnet seems to be working just fine: I can telnet to the indexers on 9997.