Getting Data In

Heavy forwarder setup - looking at clustering or other HA options

Aiders1
Observer

Hi all,

I'm new to this forum and have found quite a few ideas and solutions to issues that admins hit.

The organisation I work for is standing up a new site and has requested a new pair of heavy forwarders to be installed.

The issue we have been mulling over is how to provide a highly available forwarder cluster at this site. The forwarders will be based on Linux, will process data from the network (syslog, NetFlow, etc.) and will also process files located on an NFS share (a service-provider-managed CIFS/NFS share).

We are using Splunk Cloud but have a deployment server on-prem to manage forwarders on the internal networks.

My question: is there a solution that provides a clustered pair of forwarders acting as an active/passive cluster, with support both for processing files and for accepting network traffic?

cheers

aiders


PickleRick
SplunkTrust

Without using some external solution, you don't have the option to "pair" forwarders and have them monitor the same set of files.

You can monitor them independently from two different forwarders but then you'd obviously have duplicated data.

So a layer of two or more heavy forwarders will give you horizontal scaling and failover capability, but this happens _after_ your initial ingestion point (usually UFs).

HFs in this setup are highly available (active-active), but only for data forwarded from the initial collection point. You can't have "failoverable" inputs on the HFs themselves; it's the outputs logic on the previous layer that does all the work.
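For illustration, a minimal outputs.conf sketch for that collecting layer (the hostnames, port, and group name are example values, not anything from this thread):

[tcpout]
defaultGroup = hf_layer

[tcpout:hf_layer]
# The forwarder rotates through this list every autoLBFrequency seconds
# and skips unreachable hosts, which is what gives you the active-active layer.
server = hf1.example.com:9997, hf2.example.com:9997
autoLBFrequency = 30
# Indexer acknowledgement guards against losing events if an HF dies mid-stream.
useACK = true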


SanjayReddy
SplunkTrust

Hi @Aiders1 

I would recommend setting up heavy forwarders at different sites in an active-active configuration, so that if a few HFs go down, the others can still accept the data.

For network data, you can enable a listening port on every HF (for example, TCP port 2048) and configure all of the HF IP addresses in the syslog configuration of the sending device, as sketched below.
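As a minimal sketch, an inputs.conf stanza like this on each HF would open that listener (the port number and sourcetype are example values):

# Accept syslog-style traffic over TCP on port 2048
[tcp://2048]
sourcetype = syslog
# Record the sending device's IP as the host field
connection_host = ip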

OR
You can create an F5 load balancer virtual IP (VIP) that fronts all the HF IPs; the load balancer forwards data to the HFs based on availability, so you only need to configure the F5 VIP on the syslog device.

And on the servers where you want to collect data, install a UF and add the following configuration, which load-balances data between the forwarders:

[tcpout]
defaultGroup = HF

[tcpout:HF]
# the UF load-balances automatically across every server in this list
server = forwarder1:9997, forwarder2:9997

You can refer to the following links:

https://docs.splunk.com/Documentation/SplunkCloud/latest/Data/Monitornetworkports
https://docs.splunk.com/Documentation/Splunk/8.2.2/Data/Configureyourinputs 


fatmaBouaziz
Loves-to-Learn Everything

Hello,

Did you find a solution for your architecture?

I want to do the same thing for my client with an active-active HF cluster. I have DB Connect installed on my HFs and I don't want to replicate the data.

Can you please help me?

BR.


PickleRick
SplunkTrust

As I wrote before, there is no such functionality. Each HF is a separate entity and there is no synchronization of state or anything else between them. So there is no way to:

1) Make sure that a given input runs on only one of the forwarders at a time

2) In the case of inputs that need to keep state, let the other forwarders know about the state captured on one forwarder.

So an HF can be part of a "cluster" only in the sense of being in the path of events forwarded from elsewhere (like UF-HF-indexer). In such a setup the HF works pretty much statelessly, so the UF can easily load-balance between the available HFs. This architecture has its drawbacks though, so in general you're advised to send events directly to your indexer layer, without the intermediate HF.


fatmaBouaziz
Loves-to-Learn Everything

Hi, thank you.

However, I can't send my data directly to the indexers since we use Splunk Cloud. That won't be 100% secure.

Does DB Connect work well in an SHC? Can't I configure the HF cluster as an SHC?

BR.


PickleRick
SplunkTrust

"Secure" can be defined in many ways depending on context so I'm not going to argue here with you because I don't know the circumstances. Usually properly TLS-secured connections are secure enough. But YMMV.

Anyway, I don't understand where you're going with this SHC/HF question. How would you want to run your modular input (as I understand it, against your local database server) from the Cloud environment?

It seems you have some unusual and fairly tight requirements, which need engineering not at a "general architecture" level but rather with regard to your particular situation.

Anyway, from the technical point of view you could run your inputs on an SH, but it's not a supported or advised architecture. And it will not solve your "clustering" needs. The input would run completely independently of the SH clustering mechanics, so if you defined it on one of the SHC members, it would run only there. If you defined it on more than one SH, it would run simultaneously on all of them. The search scheduler has nothing to do with inputs.

You could try to install DB Connect on the search heads and run a scheduled search calling dbxquery and ending with collect, but that's so wrong on so many levels...
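For the record, a minimal sketch of that (not recommended) pattern, assuming a DB Connect connection named my_db_conn and a summary index named db_data, both hypothetical names:

| dbxquery connection=my_db_conn query="SELECT id, message FROM events"
| collect index=db_data

A scheduled search like this pulls rows with DB Connect's dbxquery command and writes them out as events with collect, but you get no rising-column checkpointing, so every rerun can duplicate data.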


Aiders1
Observer

Hi,

Thanks for the details. We are trying to steer away from F5/NetScalers for load-balancing the network traffic and forwarding it on to the HF cluster.

thanks

Aiders