Is it advisable to deploy heavy forwarders to all clients vs universal forwarders? We have an interest in cutting down on the amount of data indexed and being transmitted across our network. Should I expect any large performance degradation locally with 400+ heavy forwarders "doing their thing"?
No, it is not, at least in most cases.
Universal forwarders are designed to use few resources on clients, and they limit their network bandwidth usage (256 KBps by default).
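For reference, that bandwidth cap lives in limits.conf on the forwarder. A minimal sketch, assuming you want to state the default explicitly or tune it (where you place the file is up to your deployment):

```ini
# limits.conf on the universal forwarder
[thruput]
# Maximum forwarding rate in KB per second.
# 256 is the universal forwarder default; 0 removes the limit.
maxKBps = 256
```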
By contrast, heavy forwarders are full Splunk instances and are very useful in specific cases such as:
- Acting as intermediate collectors to fit your network layout and improve your Splunk architecture (isolating indexers from the rest of your network, moving index-time parsing from the indexers to heavy forwarders, improving scalability, and so on)
- Managing central points of data such as NFS shares, syslog servers, data pulled from databases, and so on
This is why you use universal forwarders for "normal clients" and heavy forwarders only where a specific use case really calls for them.
Heavy forwarders are a very powerful piece of a Splunk architecture, but they are not meant for every use.
Note that one of the most important differences between UF and HF is that a UF does not do any index-time parsing, while an HF does full index-time parsing.
Hope this helps.
Some related schemas:
Finally, to control and optimize network bandwidth, consider placing an HF in the same VLAN as your UF clients and having it forward their data.
That way you optimize and control the data flow. (Controlling data flow from a few endpoints is much easier than from 400 clients.)
Note that UF and HF can compress the data flow using zlib or SSL, which also helps reduce network usage.
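A rough sketch of what that could look like in outputs.conf on each UF, pointing at an intermediate HF with compression turned on. The group name, host name and port are made up for illustration:

```ini
# outputs.conf on each universal forwarder
[tcpout]
defaultGroup = vlan_hf

# "vlan_hf" is a hypothetical group name
[tcpout:vlan_hf]
# hypothetical host and receiving port of the intermediate heavy forwarder
server = hf01.example.local:9997
# Compress the data stream. Depending on your version, the receiving
# splunktcp input on the HF may need a matching compressed = true.
compressed = true
```

If you forward over SSL instead, compression is handled at the SSL layer (as far as I know, via useClientSSLCompression in outputs.conf), so check the outputs.conf spec for your version before combining the two.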
The reason I was exploring heavy forwarders for my clients vs. universal forwarders is fear of blowing my license quota. At the moment we are only interested in 15 Windows event codes and the security logs for several RHEL 5 servers. Long term we plan to greatly expand our utilization of Splunk for its monitoring and client management abilities.
If I had universal forwarders deployed everywhere, would the license usage be counted as we forward to the indexers, or is it counted at the indexer? FYI, our clients have more than enough processing power to handle heavy forwarders.
Thank you for your input.
You are correct: you can use heavy forwarders to filter data before it hits your indexers. That is one of the primary reasons to use them.
David Paper did a presentation at .conf last year that illustrates some deployment scenarios and tips you might want to consider. See "Getting The Most Out of Your Splunk License: Keeping the Junk Out of Splunk."
From what I understand from reading the Splunk documentation and some of the answers here, using the Heavy Forwarder affects local performance only and should not affect performance of my indexers. Am I correct to assume this?
Yes, the heavy forwarder can itself parse and index data separately from your true indexers.
Take note of lguinn's point about the filtering capabilities of the current universal forwarder, too. I forgot about that. 🙂
Only the data that gets indexed counts against your licence, so using an HF will not by itself change your licence usage.
Personally, I strongly recommend designing architectures with heavy forwarders for the reasons given above. As Chris explained too, you can also filter out junk data at the HF level before it is indexed, which reduces licence usage, limits bandwidth, and so on.
Because an HF does index-time parsing, unwanted events can be sent to the null queue before they reach the indexers, and the other parsing tasks are handled by the HF instead of the indexers (which leaves the indexers more capacity for their other tasks).
This is a question of architecture design. HFs are a good answer for complex networks, for example ones with multiple VLANs, and for the other cases described above.
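As an illustration, null-queue routing on the HF is done with a props.conf / transforms.conf pair. This is only a sketch: the source path and the regex are assumptions, so substitute your own source or sourcetype and the pattern that matches your junk events:

```ini
# props.conf on the heavy forwarder
# /var/log/secure is an assumed source for the RHEL security logs
[source::/var/log/secure]
TRANSFORMS-null = setnull

# transforms.conf on the heavy forwarder
[setnull]
# hypothetical pattern for events you do not want indexed
REGEX = session (opened|closed) for user
DEST_KEY = queue
FORMAT = nullQueue
```

Events matching the regex are dropped on the HF, so they never cross the network to the indexers and never count against the licence.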
Thanks for the input. I am interested in specific information and do not want to flood my indexers with unneeded data. I have been running a test implementation with 48 heavy forwarders and have not noticed any degradation in performance. We are about to move into a distributed environment for the final implementation and plan on using HFs.
Thank you for the input.
In Splunk 6.x, you can restrict the event codes that are forwarded, even on a Universal Forwarder:
Monitor Windows Event Log Data
Near the bottom of the page, it shows how to use whitelists and blacklists in inputs.conf to forward only the event codes that you want. This page also has links to related and useful information on collecting Windows data.
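Something along these lines in inputs.conf on the UF, for example. The event codes below are placeholders, not a recommendation; list the ones you actually care about:

```ini
# inputs.conf on the universal forwarder (Splunk 6.x)
[WinEventLog://Security]
disabled = 0
# illustrative event codes only
whitelist = 4624,4625,4634
```

With a whitelist only the listed event codes are forwarded. If it is easier to name what you want to exclude, use blacklist in the same stanza instead.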
The plan in my shop is to continue deploying heavy forwarders to our clients. I would like to control the amount of "junk" that ends up at my indexer. Based on several months of testing with heavy forwarders, I have not noticed any degradation of performance on the local machines running them.