Solved: Heavy forwarder stops forwarding when tcp syslog mirror destination fails?

brodieg · ‎07-15-2015

Hi,
I am successfully mirroring a filtered set of events at a heavy forwarder and sending them to a local TCP Syslog target (syslog-ng) and all other events on to the primary indexer on a different host (using [tcpout]).

When the local TCP Syslog endpoint is stopped, the primary indexer also stops receiving events, even though it is healthy and unrelated to the TCP Syslog endpoint. It appears that the heavy forwarder doesn't cope when a configured TCP Syslog target fails. Does that mean my TCP Syslog server has become a single point of failure for all forwarding from the heavy forwarder? Or, more likely, have I got my config wrong? 🙂

If the outage is short (i.e. the TCP Syslog target is restored), all events appear in the primary indexer, as if they had been held back on the forwarder. I'm not sure what happens if there is an extended outage of the TCP Syslog target, though.

Any ideas on how I can make the forwarder more resilient to TCP Syslog endpoint failure?

Thanks for any pointers!

GB

outputs.conf...

[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
disabled = 0
server = splunkindexer.network.internal:9997


[syslog]

[syslog:writetofiles]
server = 127.0.0.1:2514
type = tcp

transforms.conf..

[routeAll]
REGEX=.
DEST_KEY=_TCP_ROUTING
FORMAT=default-autolb-group

[syslogRouting]
REGEX=index.php\"
DEST_KEY=_SYSLOG_ROUTING
FORMAT=writetofiles

props.conf

[default]
TRANSFORMS-routing=routeAll,syslogRouting

[syslog]
TRANSFORMS-routing=routeAll,syslogRouting

splunkd.log

..this appears relevant from the forwarder:

07-15-2015 17:05:30.348 +1000 INFO  TailingProcessor - Could not send data to output queue (parsingQueue), retrying...
07-15-2015 17:10:55.765 +1000 WARN  TcpInputProc - Stopping all listening ports. Queues blocked for more than 300 seconds

alacercogitatus · ‎07-16-2015

The problem here is the TCP connection. TCP requires a completed handshake before anything can be sent. With the endpoint down there is no handshake, so nothing can be sent and the queues on the forwarder start to fill. Once they fill, they back up all of the queues, including the other tcpout outputs. I had the same problem, and the fix was to use UDP. This WILL cause data loss to your syslog-ng while it is down, BUT your primary Splunk instance will still have the data that is lost on the syslog output. It's a pretty simple config change.

[syslog:writetofiles]
 server = 127.0.0.1:2514
 type = udp
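
For reference, a sketch of how the whole outputs.conf from the question would look with that change applied. Only the type line differs from the original; the # comment lines are added here for illustration and are not part of the original config.

outputs.conf

[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
disabled = 0
server = splunkindexer.network.internal:9997

[syslog]

[syslog:writetofiles]
server = 127.0.0.1:2514
# UDP is connectionless, so a stopped syslog-ng should no longer back up the forwarder's queues.
# Events sent while syslog-ng is down are not delivered to it.
type = udp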

brodieg · ‎07-16-2015

Thanks! Yes, I had considered UDP as well, but the loss of data was something I was trying to avoid if possible.

I have done some testing with the situation reversed, where TCP Syslog is working but the downstream tcpout (indexer) host is not available, and similar behavior happens, i.e. TCP Syslog eventually stops. So this isn't a TCP Syslog only thing; it's a case (as you say) of the forwarder not being able to successfully hand off events to a configured target.
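
One possible way to buy time in this reversed case, as a sketch only: the tcpout stanza in outputs.conf accepts a maxQueueSize setting that enlarges the in-memory output queue for the indexer. The 100MB value below is an assumption, not a tested recommendation. It only delays the blocking during a short indexer outage; once the queue fills, the forwarder backs up as before, and it does nothing for the TCP Syslog output.

outputs.conf

[tcpout:default-autolb-group]
disabled = 0
server = splunkindexer.network.internal:9997
# Assumed value: a larger in-memory output queue to ride out short indexer outages.
maxQueueSize = 100MB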

I have now had a (better) read of the Admin manual regarding forwarding, and all the load-balancing options make a lot more sense now, but you can't load-balance TCP Syslog 😞

Thanks again for your insights!

GB
