Preferential Load Balancing from Universal Forwarders to HF/EPs

jatkb
Engager

We are looking to deploy Edge Processors (EPs) in a high-availability configuration, with two EP systems per site across multiple sites. We need to use Edge Processors (or Heavy Forwarders, I guess?) to ingest and filter/transform the event logs before they leave our environment and go to our MSSP's Splunk Cloud.

Ideally, I want the Universal Forwarders (UFs) to use the local site's EPs. However, if those are unavailable, I would like the UFs to fail over to the EPs at another site.

I do not want to have the UFs use the EPs at another site by default, as this will increase WAN costs, so I can't simply list all the servers in the defaultGroup.

For example:

[tcpout]
defaultGroup=site_one_ingest

[tcpout:site_one_ingest]
disabled=false
server=10.1.0.1:9997,10.1.0.2:9997

[tcpout:site_two_ingest]
disabled=true
server=10.2.0.1:9997,10.2.0.2:9997

Is there any way to configure the UFs to prefer the local Edge Processors (site_one_ingest), but then fail over to the second site (site_two_ingest) if those systems are not available?

Is it also possible for the configuration to support automated failback/recovery?

PickleRick
SplunkTrust

If you define multiple output groups, events are pushed to all of them at the same time (unless you override the routing per input or in transform).
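
As an illustration of the per-input override mentioned above, here is a minimal sketch using the `_TCP_ROUTING` setting in inputs.conf. The monitor path is hypothetical, and the group name is borrowed from the question. Note that this only pins an input to a group; it does not provide any failover between groups.

```ini
# inputs.conf on the UF (sketch; the monitored path is a made-up example)
[monitor:///var/log/example_app]
# Send events from this input only to the named tcpout group,
# which must be defined as [tcpout:site_one_ingest] in outputs.conf.
_TCP_ROUTING = site_one_ingest
```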

If you have multiple destination hosts in an output group, they are handled in a round robin way. There's no other way using built-in mechanics.

You'd need to either use HTTP output and install an intermediate HTTP reverse proxy with health-checked and prioritized backends, or do some form of external "switching" of the destination based on either dynamic network-level redirects or DNS-based mechanisms. But all of those are generally non-Splunk solutions and add complexity to your deployment.
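
To make the DNS-based option a bit more concrete, here is a rough sketch of what the forwarder side might look like. The hostname and group name are hypothetical, and the actual health checking and record switching would have to be done by an external DNS or traffic-management service, not by Splunk. The interval value is only an example; how quickly a switch takes effect would also depend on the DNS record's TTL.

```ini
# outputs.conf on the UF (sketch of the DNS-based approach, under assumptions)
[tcpout]
defaultGroup = preferred_ingest

[tcpout:preferred_ingest]
disabled = false
# Hypothetical name. An external, health-checked DNS service would need to
# answer with the site-one EP addresses while they are healthy and with the
# site-two EP addresses when they are not.
server = ingest.site-one.example.internal:9997
# How often, in seconds, the forwarder re-resolves the name above.
dnsResolutionInterval = 60
```

With this layout, failback would also be driven from outside Splunk: once the external service starts returning the site-one addresses again, the forwarder should pick them up on a later resolution.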

gcusello
SplunkTrust

Hi @jatkb ,

Usually, connections between Splunk systems are configured with auto load balancing, so you get load distribution and failover management between the receivers (either HFs or IDXs):

[tcpout]
defaultGroup=autoloadbalancing

[tcpout:autoloadbalancing]
disabled=false
server=10.1.0.1:9997, 10.1.0.2:9997, 10.2.0.1:9997, 10.2.0.2:9997

Otherwise, I don't think it's possible to have automatic failover management.

Ciao.

Giuseppe
