Getting Data In

Preferential Load Balancing from Universal Forwarders to HF/EPs

jatkb
Engager

We are looking to deploy Edge Processors (EP) in a high-availability configuration, with two EP systems per site across multiple sites. We need to use Edge Processors (or Heavy Forwarders, I guess?) to ingest and filter/transform the event logs before they leave our environment and go to our MSSP Splunk Cloud.

Ideally, I want the Universal Forwarders (UF) to use the local site EPs. However, if those are unavailable, I would like the UFs to fail over to the EPs at another site.

I do not want to have the UFs use the EPs at another site by default, as this will increase WAN costs, so I can't simply list all the servers in the defaultGroup.

For example:

[tcpout]
defaultGroup=site_one_ingest

[tcpout:site_one_ingest]
disabled=false
server=10.1.0.1:9997,10.1.0.2:9997

[tcpout:site_two_ingest]
disabled=true
server=10.2.0.1:9997,10.2.0.2:9997

Is there any way to configure the UFs to prefer the local Edge Processors (site_one_ingest), but fail over to the second site (site_two_ingest) if those systems are not available?

Is it also possible for the configuration to support automated failback/recovery?


PickleRick
SplunkTrust

If you define multiple output groups, events are pushed to all of them at the same time (unless you override the routing per input or in transform).

If you have multiple destination hosts in an output group, they are handled in a round robin way. There's no other way using built-in mechanics.

You'd need to either use HTTP output and install an intermediate HTTP reverse proxy with health-checked and prioritized backends, or do some form of external "switching" of the destination, based either on dynamic network-level redirects or on DNS-based mechanisms. But all of those are generally non-Splunk solutions and add complexity to your deployment.
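As a rough illustration of the external "switching" idea, here is a minimal Python sketch of the decision logic only. It assumes that a plain TCP connect to the receiving port is an acceptable health signal (it only proves the port is open, not that the receiver is actually processing data). The names `tcp_reachable`, `pick_servers`, and `render_server_line` are hypothetical helpers, not part of Splunk. How the chosen list reaches the forwarder (rewriting outputs.conf and restarting, updating a DNS record, and so on) is deliberately left out.

```python
import socket
from typing import Callable, List, Sequence


def tcp_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to 'host:port' succeeds within the timeout."""
    host, port = endpoint.rsplit(":", 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False


def pick_servers(
    local: Sequence[str],
    remote: Sequence[str],
    probe: Callable[[str], bool] = tcp_reachable,
) -> List[str]:
    """Prefer the local site while at least one local receiver answers the probe.

    The full local list is returned (not just the healthy hosts) so that the
    forwarder's own load balancing can deal with a single failed receiver.
    """
    if any(probe(endpoint) for endpoint in local):
        return list(local)
    return list(remote)


def render_server_line(servers: Sequence[str]) -> str:
    """Format a server list the way it appears in the thread's outputs.conf examples."""
    return "server=" + ",".join(servers)


if __name__ == "__main__":
    # Addresses taken from the question; adjust for your environment.
    site_one = ["10.1.0.1:9997", "10.1.0.2:9997"]
    site_two = ["10.2.0.1:9997", "10.2.0.2:9997"]
    print(render_server_line(pick_servers(site_one, site_two)))
```

Run on a schedule, this also gives failback: as soon as a local EP answers the probe again, the local list is selected on the next run.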

gcusello
SplunkTrust

Hi @jatkb,

Connections between Splunk systems are usually configured with automatic load balancing, which gives you load distribution and failover management across the receivers (either HFs or IDXs):

[tcpout]
defaultGroup=autoloadbalancing

[tcpout:autoloadbalancing]
disabled=false
server=10.1.0.1:9997, 10.1.0.2:9997, 10.2.0.1:9997, 10.2.0.2:9997

Otherwise, I don't think it's possible to have automatic failover management.
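To see how a flat list like this interacts with the WAN concern from the original question, here is a small Python sketch that estimates the share of connections landing on the remote site. It assumes the forwarder spreads its connections evenly across the listed servers and that site one is 10.1.0.0/16; both are assumptions made for illustration, and `remote_share` is a hypothetical helper.

```python
from ipaddress import ip_address, ip_network
from typing import Sequence


def remote_share(servers: Sequence[str], local_net: str) -> float:
    """Fraction of listed receivers that sit outside the local site's network."""
    net = ip_network(local_net)
    hosts = [s.strip().rsplit(":", 1)[0] for s in servers]
    remote = [h for h in hosts if ip_address(h) not in net]
    return len(remote) / len(hosts)


if __name__ == "__main__":
    # Server list from the autoloadbalancing example above.
    servers = "10.1.0.1:9997, 10.1.0.2:9997, 10.2.0.1:9997, 10.2.0.2:9997".split(",")
    print(f"{remote_share(servers, '10.1.0.0/16'):.0%} of receivers are remote")
```

Under those assumptions, roughly half of the traffic from a site-one forwarder would cross the WAN even when both local receivers are healthy.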

Ciao.

Giuseppe
