Deployment Architecture

Index cluster data imbalance with high vol data sources

brent_weaver
Builder

We are running an index cluster with 53 indexers and are finding that our high-volume sources cause data imbalance. The index cluster sits behind an AWS ELB. The high-volume sources seem "sticky" about which indexer they write to, which causes the imbalance. I am looking to use indexer discovery in my next build in the hope that it will mitigate some of this behavior and distribute the writes more intelligently.

This data source comes into an HEC tier and is then forwarded to the indexers. The index rebalance seems to be working fine, but I want to avoid this issue altogether!

Any thoughts are welcome. Thanks in advance!


mescober_splunk
Splunk Employee

If sticky sessions are enabled on the AWS ELB, subsequent HTTP requests will land on the same indexer. That is because HEC responses include a cookie.


s2_splunk
Splunk Employee

It sounds like you have an architectural constraint in your intermediary HEC tier.
How many Heavy Forwarders do you have receiving HEC traffic and forwarding to your indexers?
Ideally, you will need about 100 intermediary forwarding pipelines sending to your 53 indexer peers to prevent these data imbalance opportunities.
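
To get a rough feel for why the number of forwarding pipelines matters, here is a minimal Python sketch. It rests on a simplifying assumption that is not from this thread: during one load-balancing interval, each pipeline independently picks one of the 53 indexers uniformly at random. The function names are hypothetical, and the model ignores data volume per pipeline, so treat it as an illustration rather than a sizing tool.

```python
import random


def expected_idle_fraction(pipelines: int, indexers: int = 53) -> float:
    """Expected share of indexers that no pipeline selects in one interval.

    Assumes each pipeline picks one indexer uniformly at random,
    independently of the others (a simplification).
    """
    return (1 - 1 / indexers) ** pipelines


def simulated_idle_fraction(pipelines: int, indexers: int = 53,
                            intervals: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    idle = 0
    for _ in range(intervals):
        # Set of indexers that received at least one connection this interval.
        chosen = {rng.randrange(indexers) for _ in range(pipelines)}
        idle += indexers - len(chosen)
    return idle / (intervals * indexers)


if __name__ == "__main__":
    for p in (5, 20, 100):
        print(p, round(expected_idle_fraction(p), 3),
              round(simulated_idle_fraction(p), 3))
```

Under this model, a handful of pipelines leaves most indexers idle in any given interval, while more pipelines spread the connections across more of the peers.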

Also, do you have forceTimeBasedAutoLB enabled on your HEC forwarders' outputs.conf? Are you using the default autoLBFrequency?
You could enable forceTimeBasedAutoLB and lower your LB frequency. If you have a high-volume data source, the forwarder may not get an opportunity to switch indexers as frequently as you are expecting.
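
As a rough illustration of those two settings, an outputs.conf stanza on the HEC forwarders might look like the sketch below. The target group name `hec_indexers` and the host names are placeholders, and the value 10 is only an example of going below the default of 30 seconds, not a recommendation.

```ini
# outputs.conf on each HEC heavy forwarder (sketch; names are placeholders)
[tcpout]
defaultGroup = hec_indexers

[tcpout:hec_indexers]
# List the indexer peers directly rather than a load balancer address.
server = idx01.example.com:9997, idx02.example.com:9997, idx03.example.com:9997
# Switch indexers on the timer even in the middle of a continuous stream.
forceTimeBasedAutoLB = true
# Seconds between indexer switches (default is 30); illustrative value.
autoLBFrequency = 10
```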

Finally, having an (E)LB between your forwarders and your indexers is not a supported deployment.


harsmarvania57
SplunkTrust

Hi @brent_weaver,

I don't know much about AWS ELB, but Splunk recommends not using any third-party load balancer to send data between Splunk instances. To send data from the HF tier to the indexer cluster, use autoLB, which is Splunk's built-in automatic load balancing method.

When you run HEC on the HF tier, the traffic flow should look something like this: Application -> AWS ELB -> HF Tier -> Indexer Cluster.
