Splunk with a load balancer?

Motivator

I'm at a client that is interested in knowing the abilities of Splunk to work behind load balancers.

  • LB on search heads: I assume traditional round-robin load-balancing would not work with search results being cached locally
  • LB on TCP inputs: For a periodic file upload into a farm of Splunk indexers/LFs accepting input, I assume an LB would work
  • LB vs Splunk: I assume Splunk AutoLB works as well or better than a traditional LB if put between LWFs and Indexers

Are these assumptions correct? I would appreciate any other knowledge on this topic!

Splunk Employee
  • Load balancing on search heads won't really work because there is not a good way to replicate local state between search heads (search job results, private and public configs).
  • Unclear what you mean. When you say "TCP inputs", that implies that data is being streamed over a TCP connection. This may or may not work, but presumably only if the sending client does not split events across different TCP connections. On the other hand, you talk about uploading a file periodically, which is a completely different mechanism.
  • A regular load balancer will not work between an LWF and an indexer; you will end up with corrupted events. You must use autoLB from the LWF.
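
To make the "corrupted events" concern concrete, here is a toy Python sketch. It is not Splunk's forwarder protocol or its autoLB implementation; the function names, the newline-delimited event format, and the indexer names are all assumptions made for illustration. It contrasts a balancer that cuts the byte stream at arbitrary points with a sender that only switches targets on event boundaries, which is roughly the property a forwarder-side balancer can guarantee and a generic network LB cannot.

```python
from itertools import cycle
from typing import Dict, List


def naive_chunk_balance(stream: bytes, targets: List[str], chunk_size: int) -> Dict[str, List[bytes]]:
    """Toy model: cut the stream every chunk_size bytes and rotate targets.

    The cut points ignore event boundaries, so a single event can end up
    split across two targets.
    """
    out: Dict[str, List[bytes]] = {t: [] for t in targets}
    rotation = cycle(targets)
    for start in range(0, len(stream), chunk_size):
        out[next(rotation)].append(stream[start:start + chunk_size])
    return out


def event_aware_balance(stream: bytes, targets: List[str], events_per_switch: int) -> Dict[str, List[bytes]]:
    """Toy model: rotate targets only between whole events.

    Events are assumed to be newline-delimited (an assumption for this sketch).
    """
    out: Dict[str, List[bytes]] = {t: [] for t in targets}
    rotation = cycle(targets)
    current = next(rotation)
    for n, event in enumerate(stream.splitlines(keepends=True)):
        # Switch target only at an event boundary, never mid-event.
        if n and n % events_per_switch == 0:
            current = next(rotation)
        out[current].append(event)
    return out


if __name__ == "__main__":
    data = b"event one\nevent two\nevent three\nevent four\n"
    indexers = ["idx1", "idx2"]  # hypothetical indexer names
    print(naive_chunk_balance(data, indexers, chunk_size=12))
    print(event_aware_balance(data, indexers, events_per_switch=2))
```

In the naive case the first chunk ends in the middle of the second event, so one indexer receives the start of that event and another receives the rest. In the event-aware case every piece delivered to an indexer is a complete event.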

Legend

BTW, Splunk now has search head pooling, which addresses the problem of replicating local state. But the remaining problems still exist and are significant reasons NOT to use load balancers for any data inbound to an indexer.

Splunk Employee

You may want to consult our sales team, as they can set up time with an engineer to help answer these types of questions directly.

Splunk Employee

Traditional load balancing with multiple search heads will work for search queries, provided the session is "sticky" (each user's requests keep going to the same search head). However, if you want to retain your user preferences between search heads, there are some known limitations. Items such as saved searches and reports require synchronization between the search heads. You should consult a Splunk technical person directly if you decide to go down this road.
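
As a rough illustration of what "sticky" means here, the Python sketch below maps a session identifier to one search head deterministically, so repeated requests from the same session land on the same host. This is only a sketch of the general idea: real load balancers usually implement stickiness with cookies or source-IP affinity, and the function name and host names are hypothetical.

```python
import hashlib
from typing import List


def pick_search_head(session_id: str, search_heads: List[str]) -> str:
    """Return the same search head for the same session_id every time.

    Uses a stable hash (SHA-256) rather than Python's built-in hash(),
    which is randomized per process for strings.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(search_heads)
    return search_heads[index]


if __name__ == "__main__":
    heads = ["sh1.example.com", "sh2.example.com"]  # hypothetical hosts
    first = pick_search_head("session-abc", heads)
    second = pick_search_head("session-abc", heads)
    print(first, first == second)
```

Note that a plain modulo mapping like this reshuffles sessions whenever the list of search heads changes, which is one reason state that lives only on a single search head remains a limitation.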

Load balancing a network input for most use cases should work fine. This would include syslog or similar input.

Using a Splunk Forwarder to load balance between indexers is preferred. Advantages include access to metrics, queueing of data, and input tracking.

Splunk Employee

Does Splunk have any recommendations and/or config samples for load balancers should one be required?
