Hello Splunk community.
I have searched for an answer to this question quite a lot and went through many articles, but it’s still unclear to me.
Can someone please explain when we would want to use a heavy forwarder instead of a universal forwarder? I would really appreciate a real use case where, in order to get data into Splunk, we would want to go with a heavy forwarder instead of a universal forwarder, and why.
Thanks in advance, for spending time replying to my post ❤️
I'm sure you know already that the Universal Forwarder just forwards data from files, from event logs, or from scripts.
Some example scenarios where you would want a heavy forwarder include:
* You are collecting logs using apps like DBConnect, Salesforce, HTTP modular input, etc. (These apps tend to be managed using the web interface, so a heavy forwarder is better)
* You would like to perform parsing operations on data before it is indexed. E.g. you might want to send certain data to one indexer cluster and other data to another indexer cluster.
* You would like to collect events using the HTTP Event Collector (HEC), but you don't want to expose the HEC interface of your indexers.
Thank you for your reply, marnall.
I have some additional questions to your scenarios.
1. Why does the fact that the apps are managed through the web interface make it better to collect logs using a heavy forwarder? For example, I have an MSSQL database, and I collect some data from its tables directly with DBConnect. I don't need any kind of forwarder to get that data into Splunk, so why would I want to use a heavy forwarder?
2. "you might want to send certain data to one indexer cluster and other data to another indexer cluster."
Does it mean that this kind of operation is impossible on the universal forwarder? Also what are the benefits of parsing data before it's indexed? Does it mean that when we do the "fast" mode search we will see the fields that were extracted by the HF?
3. I haven't worked with HEC, so I am sorry if this is a very simple or dumb question, but what does it mean to "expose the HEC interface of your indexers"? Also, why would we want to avoid that?
I have only been using Splunk for one month, so I am sorry if I am complicating things 😄
Thank you for your time, marnall!
1. DB Connect has to run on a full Splunk Enterprise instance, typically a Heavy Forwarder, because it depends on many components that are prerequisites and that the Universal Forwarder does not include.
See the link below for more details:
https://docs.splunk.com/Documentation/DBX/3.16.0/DeployDBX/HowSplunkDBConnectworks
2. In short: the UF can also send data to different destinations using Splunk's built-in forwarding functions. As a simple example, if you have Splunk on premises and Splunk Cloud, you can send to both if desired. What generally needs an HF is routing or filtering individual events based on their content, because that requires parsing. Parsing on the HF can also bring performance gains: the HF examines the data and transforms it, and the pipeline has many sub-parts. As for fast mode, when data is parsed before indexing, fields extracted at index time are available in searches regardless of the search mode. Fast mode is just one of three search modes (fast, smart, and verbose), which differ in how much field discovery is done at search time.
See the three links below for more details:
https://docs.splunk.com/Documentation/Splunk/9.0.4/Forwarding/Routeandfilterdatad
https://docs.splunk.com/Documentation/Splunk/9.2.1/Deploy/Datapipeline
https://docs.splunk.com/Documentation/SplunkCloud/9.1.2312/Search/Changethesearchmode
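To make the event-level routing from point 2 a bit more concrete, here is a rough Python sketch of the decision a heavy forwarder makes for each event. It is only an illustration and not Splunk code: the group names `security_indexers` and `ops_indexers` and the regex rule are hypothetical. In a real deployment this is normally configured in props.conf and transforms.conf (a transform that sets `_TCP_ROUTING`), with the target groups defined in outputs.conf.

```python
import re

# Hypothetical routing rules: (pattern, target group). The first match wins.
ROUTING_RULES = [
    (re.compile(r"sshd|sudo|authentication failure", re.IGNORECASE),
     "security_indexers"),
]
# Hypothetical fallback group for everything that matches no rule.
DEFAULT_GROUP = "ops_indexers"


def route_event(raw_event: str) -> str:
    """Return the name of the output group a raw event should be sent to."""
    for pattern, group in ROUTING_RULES:
        if pattern.search(raw_event):
            return group
    return DEFAULT_GROUP


def route_all(events):
    """Group raw events by destination, preserving their original order."""
    routed = {}
    for event in events:
        routed.setdefault(route_event(event), []).append(event)
    return routed


if __name__ == "__main__":
    sample = [
        "Jan 10 12:00:01 web01 sshd[311]: Failed password for root",
        "Jan 10 12:00:02 web01 nginx: GET /index.html 200",
    ]
    for group, events in route_all(sample).items():
        print(group, len(events))
```

The point of the sketch is that the decision depends on the content of each event, which is why it needs an instance that parses the data.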
3. If your data source can only send data to Splunk over an API, then the HTTP Event Collector (HEC) is a good option (it’s basically agentless).
https://docs.splunk.com/Documentation/SplunkCloud/latest/Data/UsetheHTTPEventCollector
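As a minimal sketch of what a HEC client looks like, the Python below builds a request for the `/services/collector/event` endpoint with the `Authorization: Splunk <token>` header. The host name `hf.example.com`, the token value, and the function names are placeholders I made up; port 8088 is the usual HEC default but may differ in your environment. Pointing the client at a heavy forwarder means it only ever talks to the HF, so the HEC port on the indexers does not have to be reachable from it.

```python
import json
import urllib.request


def build_hec_request(host, token, event, sourcetype="_json", port=8088):
    """Build (but do not send) a HEC request for a single event."""
    url = f"https://{host}:{port}/services/collector/event"
    payload = json.dumps({"event": event, "sourcetype": sourcetype})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_event(request, timeout=5):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (placeholder host and token, needs a reachable HEC endpoint):
# req = build_hec_request("hf.example.com", "<your-hec-token>", {"msg": "hello"})
# print(send_event(req))
```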
An HF is a full Splunk instance; the UF is more like an agent.
We mainly use the HF when we want to ingest data via a Technical Add-on that uses modular inputs (written in Python, etc.), or when we want to forward data, or parse / mask it, before it is sent to the Splunk indexers. Those are some of the use cases for an HF.
The UF can do some parsing for a few common data formats and can also be used for forwarding, but it is mainly used to collect logs.
So think about your use case. For example, do you need to collect logs from something like AWS? Then it is better to use an HF and forward the data. (You can use the SH, but then you may need more resources.)
For the UF: are you just collecting logs, or do you want to forward some data on?
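The masking use case can be pictured with a small Python sketch. This is not how Splunk implements it (there it is usually done with a `SEDCMD` setting or a transform applied at parse time); the pattern and the function name are hypothetical and only show the idea of rewriting an event before it is indexed.

```python
import re

# Hypothetical pattern: a standalone 16-digit number; the last 4 digits are kept.
CARD_PATTERN = re.compile(r"\b\d{12}(\d{4})\b")


def mask_event(raw_event: str) -> str:
    """Replace the first 12 digits of any 16-digit number with X characters."""
    return CARD_PATTERN.sub(r"XXXXXXXXXXXX\1", raw_event)


if __name__ == "__main__":
    print(mask_event("payment ok card=1234567812345678 amount=10"))
    # → payment ok card=XXXXXXXXXXXX5678 amount=10
```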