Single forwarder to multiple indexers load balancing

guillaume_puyo · ‎01-23-2015

Hi everyone,

I'm implementing Splunk for the first time in an enterprise environment. I have read a lot of documentation about the product, but there is one aspect I'm missing and I was hoping you could help me with it:

Splunk balances the load by switching each forwarder's flow between indexers; over time, the flow moves to any of the available indexers.

One of our forwarders is a huge syslog-ng receiver that centralizes a lot of data from many sources. I'm worried that when the flow from this forwarder hits one of the indexers, it may overwhelm it, since in this scenario switching indexers wouldn't help. How does Splunk handle such situations? Is it even possible to overwhelm an indexer? Is it something we need to deal with at the source, for example by splitting flows?

Thanks a lot!

lguinn2 · ‎01-23-2015

There are folks who have done exactly what you want. Hopefully several of them will respond and give you a variety of ideas.
First, I don't know why load balancing wouldn't work. I would also set forceTimebasedAutoLB = true in outputs.conf on the forwarder. This setting is helpful for files that are very active and potentially very large.

One of the things that can bog down a forwarder is monitoring "too many" files. I don't know if there is an absolute limit to the number of files that can be monitored, but in my experience monitoring over 5000 files can bog down a forwarder, causing it to consume lots of memory and CPU and to run poorly. In this case, it might make sense to "split flows" by running two Splunk forwarder processes, each monitoring a non-overlapping part of the log files. However, the best solution is often to simply move inactive log files out of the monitored directory. The fewer files that Splunk must monitor, the better for performance.

I don't know if you can "overwhelm" an indexer exactly, but you can overload it. Indexer load depends on both the quantity of inbound data and the amount of searching. To some extent, the inbound volume is naturally throttled by network bandwidth in many environments. I don't get many questions about this, so I assume it is rare. In any case, the best solution that I know is load-balancing. And if the indexers are overloaded, add another indexer!

HTH

guillaume_puyo · ‎01-23-2015

Many thanks 🙂
If my forwarder is going to send so much data that it overloads an indexer peer node, then I presume load balancing won't help much: it only switches the flow from one indexer to another, and they all have the same capacity. So adding nodes might not help either. Having separate forwarder processes, as you mentioned, looks like a solution, since they wouldn't all hit the same indexer node at once.
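
To make the concern above concrete, here is a rough toy model in Python. It is not Splunk code: the indexer names, the 20 MB/s rate, and the 30-second switch interval are made-up numbers, and the strict round-robin rotation is a simplifying assumption about how the forwarder picks its next target. It only shows the arithmetic: switching evens out the total volume per indexer over time, but whichever indexer is currently selected still receives the forwarder's full rate.

```python
from collections import defaultdict
from itertools import cycle


def simulate(indexers, rate_mb_s, switch_every_s, duration_s):
    """Toy model: one forwarder sends to exactly one indexer at a time
    and rotates to the next one every `switch_every_s` seconds."""
    received = defaultdict(float)  # total MB delivered to each indexer
    peak = defaultdict(float)      # highest MB/s seen in any single second
    targets = cycle(indexers)      # assumption: simple round-robin rotation
    current = next(targets)
    for second in range(duration_s):
        # Switch target at each interval boundary (but not at second 0).
        if second and second % switch_every_s == 0:
            current = next(targets)
        received[current] += rate_mb_s
        peak[current] = max(peak[current], rate_mb_s)
    return dict(received), dict(peak)


# Hypothetical figures, chosen only for illustration.
DURATION_S = 900
received, peak = simulate(
    ["idx1", "idx2", "idx3"], rate_mb_s=20.0, switch_every_s=30, duration_s=DURATION_S
)

for name in sorted(received):
    average = received[name] / DURATION_S
    print(f"{name}: average {average:.2f} MB/s, peak {peak[name]:.2f} MB/s")
```

In this model each indexer averages one third of the forwarder's rate over the run, yet its peak equals the full rate. That is why adding indexers lowers the average load per indexer but not the burst a single indexer sees, and why splitting the source into several forwarder processes addresses the peak.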

lguinn2 · ‎10-06-2016

Also, two interesting Splunk blog posts on syslog:

http://blogs.splunk.com/2016/03/11/using-syslog-ng-with-splunk/
http://blogs.splunk.com/2016/05/05/high-performance-syslogging-for-splunk-using-syslog-ng-part-1/

And there are other articles, etc. about the same thing...

a212830 · ‎01-23-2015

Have you considered spreading the workload on your syslog-ng server by adding a server and putting a load-balancer in front of them? That is a common setup...
