Does a Heavy Forwarder have the same limitation as a Universal Forwarder in bold below?
Please note, I already know that syslog or syslog-ng is preferred over a direct network input for Splunk.
From:
http://docs.splunk.com/Documentation/Splunk/6.1.3/Forwarding/Setuploadbalancingd
Important: Universal forwarders are not able to switch indexers when monitoring TCP network streams of data (including Syslog) unless an EOF is reached or an indexer goes down, at which point the forwarder will switch to the next indexer in the list. Because the universal forwarder does not parse the data and identify event boundaries before forwarding the data to the indexer (unlike a heavy forwarder), it has no way of knowing when it's safe to switch to the next indexer unless it receives an EOF.
From your linked article:
Because the universal forwarder does not parse the data and identify event boundaries before forwarding the data to the indexer (unlike a heavy forwarder), it has no way of knowing when it's safe to switch to the next indexer unless it receives an EOF.
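To make that passage concrete, here is a toy Python sketch of why event boundaries matter when switching indexers. It is not Splunk's implementation: the function names are made up for illustration, and it assumes one event per newline-terminated line. Switching at an arbitrary byte offset tears an event in half, while deferring the switch to the next boundary keeps every event whole.

```python
def split_blind(stream: bytes, switch_at: int):
    """Switch indexers at an arbitrary byte offset, as a forwarder that
    does not parse events would have to do mid-stream."""
    return stream[:switch_at], stream[switch_at:]


def split_on_boundary(stream: bytes, switch_at: int):
    """Defer the switch until the next newline at or after switch_at.
    Assumption for this sketch: one event per newline-terminated line."""
    boundary = stream.find(b"\n", switch_at)
    if boundary == -1:
        # No boundary left in the stream, so it is never safe to switch.
        return stream, b""
    cut = boundary + 1  # keep the newline with the event it terminates
    return stream[:cut], stream[cut:]


def whole_events(chunk: bytes):
    """Return (complete events, torn remainder) for one indexer's chunk."""
    *events, remainder = chunk.split(b"\n")
    return events, remainder


stream = b"event one\nevent two\nevent three\n"

# Blind switch: the first indexer receives a torn piece of "event two".
blind_first, blind_second = split_blind(stream, 14)
blind_events, blind_torn = whole_events(blind_first)

# Boundary-aware switch: the first indexer receives only whole events.
safe_first, safe_second = split_on_boundary(stream, 14)
safe_events, safe_torn = whole_events(safe_first)

print(blind_events, blind_torn)
print(safe_events, safe_torn)
```

In the blind case the first indexer ends up with a fragment it cannot complete, which is the situation the universal forwarder avoids by staying on one indexer until an EOF.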
So, you want to dedicate a box to a heavy forwarder (HF) and not use syslog or syslog-ng? If so, you'll be all right.
I would suggest, though, that for a small overhead you get large benefits from separating syslog handling from Splunk.
Consider this: if you want to add a syslog sourcetype or new parsing, you need to restart the HF. This will interrupt your syslog data flow, which restricts that activity to maintenance windows only.
However, if you have syslog-ng write to a local file and have the HF read, parse, and forward it, then adding a new data type is a config change on the syslog side with a smaller restart. Parsing changes can then happen at will, because an HF restart will only delay data indexing, not drop it.
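The "delay, not drop" behaviour comes from the file acting as a buffer: the syslog daemon keeps appending while the reader is down, and the reader picks up from its last position afterwards. A rough Python sketch of that idea follows. The `append` and `read_new_lines` helpers and the in-memory checkpoint are hypothetical, chosen only for illustration; Splunk's file monitor does its own position tracking, and this is not how it is implemented.

```python
import os
import tempfile


def append(path: str, line: bytes) -> None:
    """Stand-in for the syslog daemon appending one message to the file."""
    with open(path, "ab") as f:
        f.write(line + b"\n")


def read_new_lines(path: str, offset: int):
    """Read complete lines appended since `offset`.
    Returns (lines, new_offset); a trailing partial line is left for later."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    end = data.rfind(b"\n") + 1  # 0 when no complete line is available
    return data[:end].splitlines(), offset + end


fd, path = tempfile.mkstemp(suffix=".log")
os.close(fd)

append(path, b"msg 1")
before_restart, checkpoint = read_new_lines(path, 0)

# The reader is "restarting" here; the writer keeps going regardless.
append(path, b"msg 2")
append(path, b"msg 3")

# After the restart the reader resumes from its checkpoint: late, but complete.
after_restart, checkpoint = read_new_lines(path, checkpoint)
print(before_restart, after_restart)

os.remove(path)
```

With a direct network input there is no such buffer, so whatever arrives while the HF is restarting has nowhere to wait.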
Luckily, I've usually got a syslog-ng host to work with. However, I may run into a scenario where I'm only allowed to use a Heavy Forwarder with a network monitor input.
That's why I added the "please note" in the question.
As far as I'm aware, the heavy forwarder does not have the same limitation, since it can make use of the parsing queues. I've used it in environments where we were receiving on UDP ports and scraping from FIFOs.
I'm not finding much to actually back this up, however. There's a bit more on the wiki about which parameters are declared where:
http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings%3F