Hi guys,
I have a distributed environment with a cluster of indexers and 3 heavy forwarders. Each HF has an add-on/TA installed that pulls data over an HTTP API connection.
In this configuration, could I end up with duplicate events on the indexers?
What should I do?
Could I install the add-on/TA on only one HF? In that case, I'd have problems with high availability...
Thank you in advance
laura
If multiple servers run the same API calls, you will end up with duplicate data, since these inputs behave like scripted inputs and each server polls independently.
An over-engineered solution I can think of is installing the TA on a SHC and creating a custom command that makes the API call, then setting up a scheduled search which calls that command. Since it's a SHC, the search will only be run once, by one of the search heads, so you get HA and no dups.
You can create a SHC from those heavy forwarders, provided you have 3 of them.
This will require some work on your side, but it can be an interesting exercise.
Hi @lauraG85,
Yes, duplicate data is possible in your Splunk instance, and it depends entirely on the application the add-on is fetching data from. If the application stores a checkpoint value, then once data has been fetched by any heavy forwarder it will not be handed out again, and you will not end up with duplicate data.
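To make the checkpoint idea concrete, here is a minimal sketch in Python. It is not how any particular add-on works: the SourceApp class and its fetch_new method are hypothetical stand-ins for an application that tracks on its own side what has already been delivered.

```python
class SourceApp:
    """Hypothetical source application that keeps the checkpoint itself."""

    def __init__(self, events):
        self._events = list(events)
        self._checkpoint = 0  # index of the next event not yet delivered

    def add_event(self, event):
        self._events.append(event)

    def fetch_new(self):
        """Return only events after the checkpoint, then advance it."""
        new_events = self._events[self._checkpoint:]
        self._checkpoint = len(self._events)
        return new_events


app = SourceApp(["e1", "e2", "e3"])

# Two heavy forwarders poll the same application one after the other.
hf1_batch = app.fetch_new()  # the first poller gets everything pending
hf2_batch = app.fetch_new()  # the second poller gets nothing new

app.add_event("e4")
hf2_next = app.fetch_new()   # only the event added since the last fetch
```

If the checkpoint lives on each heavy forwarder instead (which I believe is common for modular inputs, but check your add-on), every HF keeps its own position and each one will fetch the same events.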
I don't know which add-ons you are using or what data you are fetching, so speaking generally: if you don't want to install a Splunk Universal Forwarder on the server, or can't because of some limitation, you could use the HTTP Event Collector (HEC) to send data from the application directly to Splunk, and put a load balancer in front of multiple heavy forwarders for HA.
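As a rough illustration of the HEC option, the sketch below builds (but does not send) a POST request to the HEC event endpoint using only the Python standard library. The hostname hec-lb.example.com and the token are placeholders I made up; the host would be your load balancer in front of the heavy forwarders, and the port assumes the default HEC port 8088.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype="_json"):
    """Build a POST request for Splunk's HEC event endpoint (not sent here)."""
    payload = json.dumps({"event": event, "sourcetype": sourcetype}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=payload,
        headers={
            "Authorization": "Splunk " + token,  # HEC token auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder load balancer address and token, for illustration only.
req = build_hec_request(
    "https://hec-lb.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"message": "hello from the application"},
)

# To actually send it: urllib.request.urlopen(req)
```

Because the application pushes each event once and the load balancer picks one heavy forwarder per request, no two forwarders pull the same data.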