Deployment Architecture

Load balancing intermediate forwarders

ialahdal
Path Finder

We have intermediate forwarders that receive data from UFs and forward it to our indexer cluster, which consists of 4 indexers. My issue is that I see imbalanced data distribution and resource usage across the indexers.
For example, this is the indexing rate I am currently seeing:
indexer 1: 416K || CPU Usage 45%
indexer 2: 2.19M || CPU Usage 53%
indexer 3: 831K || CPU Usage 94%
indexer 4: 1.30M || CPU Usage 90%

Current outputs.conf:

[tcpout]
defaultGroup = indexer_cluster

forceTimebasedAutoLB = true

forwardedindex.2.whitelist = (_audit|_introspection|_internal)

[tcpout:indexer_cluster]
server = 0.0.0.46:0000, 0.0.0.47:0000, 0.0.0.48:0000, 0.0.0.76:0000

where the 0s stand in for the real indexer IPs and ports

gcusello
SplunkTrust

Hi ialahdal,
your question isn't entirely clear...

Anyway, if you have four Indexers and two Heavy Forwarders, the HFs send data mainly to two Indexers (don't ask me why: I asked the same question in the Splunk Architecture training and didn't get an answer!).
Universal Forwarders, by contrast, balance load correctly across multiple Indexers.
So the way to fully balance the load (if that's possible for you) is to use four HFs for the four Indexers.

Anyway, with an Indexer cluster this isn't a big problem, because the Indexers with less indexing load have more capacity available for searches, so the total load is more or less the same.

Ciao.
Giuseppe

mhoustonludlam_
Splunk Employee

Ideally you want to have twice as many input pipelines as you have indexer cluster members. If you are using two HFs for input, then each HF should be configured for at least 4 parallel ingestion pipelines. Depending on your HF hardware resources, you can run a larger number of parallel pipelines (e.g., 8 or more) to get a more even data distribution.
Keep in mind, as the other comments note: if you have only one input stream, this may not help you. Line breaker and event breaker are your friends here too.
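As a rough sketch of the pipeline suggestion above: parallel ingestion pipelines are set in server.conf on each intermediate forwarder. The value 4 is taken from the "two HFs, four indexers" example in this reply and is an assumption, not a recommendation for every environment. Each extra pipeline uses additional CPU and memory on the forwarder, so check the hardware headroom before raising it.

```ini
# server.conf on each intermediate (heavy) forwarder
# Illustrative value only: 2 HFs x 4 pipelines = 8 pipelines for 4 indexers.
[general]
parallelIngestionPipelines = 4
```

A restart of the forwarder is needed for the change to take effect. Each pipeline picks its own target indexer, which is why more pipelines tend to spread data more evenly.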

ialahdal
Path Finder

gcusello

My main issue is about load balancing.
It seems as if forceTimebasedAutoLB = true is causing an intermediate forwarder to focus on sending data to one indexer and to shift to another only once the default switching interval has been reached. Could I also use autoLBVolume at the same time to control the amount of data sent?

But then I also see this in the outputs.conf documentation:

  • The volume of data, in bytes, to send to an indexer before a new indexer is randomly selected from the list of indexers provided in the server setting of the target group stanza.
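For illustration, here is a minimal sketch of how the two settings might sit together in the target group from the original post. The numeric values are assumptions chosen only to show the syntax, and the server list is left masked as in the original. As I read the outputs.conf spec, the forwarder checks autoLBVolume first and falls back to autoLBFrequency if the volume threshold has not been reached, so the two can coexist, but verify this against the spec for your Splunk version.

```ini
# outputs.conf on the intermediate forwarder (values are illustrative assumptions)
[tcpout]
defaultGroup = indexer_cluster
forceTimebasedAutoLB = true

[tcpout:indexer_cluster]
server = 0.0.0.46:0000, 0.0.0.47:0000, 0.0.0.48:0000, 0.0.0.76:0000
# Switch to another indexer after this many seconds...
autoLBFrequency = 30
# ...or after this many bytes have been sent, whichever applies first.
autoLBVolume = 1048576
```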

gcusello
SplunkTrust

Hi ialahdal,
that's what I said: one HF mainly sends data to one Indexer.
forceTimebasedAutoLB = true could help you. I wouldn't use autoLBVolume; I'd prefer autoLBFrequency.
I usually keep the default values and let Splunk manage itself; as I said, when I see overloading I scale the architecture.

Ciao.
Giuseppe

robday2390
Engager

I am investigating a similar topic. I came across something in the load balancing manuals that said load balancing doesn't take effect on TCP streams unless there is an EOF in the stream. There was a suggestion to look into the EVENT_BREAKER setting to perhaps address this.

I'm guessing that since the data arriving at the intermediate tier is a TCP stream, it's more difficult to load balance.
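For anyone following the EVENT_BREAKER suggestion, a minimal sketch of what that might look like in props.conf on the universal forwarders. The sourcetype name `my_sourcetype` is hypothetical, and the regex assumes single-line events that end in a newline; multi-line data would need a pattern that matches its real event boundary.

```ini
# props.conf on the universal forwarder
# "my_sourcetype" is a placeholder; use the sourcetype of the actual input.
[my_sourcetype]
EVENT_BREAKER_ENABLE = true
# Assumes one event per line; the capture group marks the event boundary.
EVENT_BREAKER = ([\r\n]+)
```

With event breaking enabled, the forwarder knows where one event ends, so it can switch indexers at an event boundary instead of waiting for the end of the stream.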
