Load balancing and queuing

Mahieu
Communicator

Hello there,

I'm deploying a Splunk environment with two forwarders and two indexers.
I have a primary indexer and a backup one. The backup is brought up manually if the primary goes down, so only one indexer is up at any time. Both forwarders are up at all times.

If the primary indexer fails, someone has to step in and launch the second one. During that time, my forwarders will still be sending logs (and a lot of them).

To avoid losing logs, I'm using a queue on the forwarders. This is what my outputs.conf (on both forwarders) looks like (141 is my primary indexer, 142 is the backup):

[tcpout:192.168.100.141_9997]
server = 192.168.100.141:9997
maxQueueSize = 4GB

[tcpout-server://192.168.100.141:9997]

[tcpout]
defaultGroup = 192.168.100.141_9997,192.168.100.142_9997

[tcpout:192.168.100.142_9997]
server = 192.168.100.142:9997

[tcpout-server://192.168.100.142:9997]

As you can see, I can't set a queue for the backup indexer. I tried, but because that indexer is unreachable (which is the normal situation), the queue keeps growing and eventually takes Splunk down.

If my primary indexer crashes, here's what happens:
- meanwhile, the queue grows on the forwarders,
- I manually bring up the backup indexer,
- once the backup indexer is up, the forwarders automatically send their logs to it (as per outputs.conf).

That's perfect, BUT if I want to go back to the normal situation, no queue is configured for the backup indexer, so I'll be losing logs, and that's exactly what I'm trying to avoid.

One solution would be to manually add the maxQueueSize = 4GB parameter to the [tcpout:192.168.100.142_9997] stanza, but I'm pretty sure there's a better way to do it by modifying my outputs.conf file.

Any ideas?

Thanks very much for your help in advance.
Let me know if the description of my problem isn't clear enough.

Mathieu

1 Solution

Mahieu
Communicator

I've tried this, which seems to work much better:

[tcpout]
defaultGroup = lb

[tcpout:lb]
server = 192.168.100.141:9997, 192.168.100.142:9997
autoLB = true
maxQueueSize = 4GB

What do you reckon?
Thanks in advance.
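
For anyone comparing the two versions, here is the same configuration again as an annotated sketch. The comments reflect one reading of how outputs.conf behaves, not something confirmed in this thread: the assumption is that several groups listed in defaultGroup each receive a copy of the data, while several servers inside one group share a single output queue.

```ini
[tcpout]
# Only one target group, so (assumption) events are not cloned to two groups.
defaultGroup = lb

[tcpout:lb]
# Both indexers sit in the same group; the forwarder sends to whichever one is reachable.
server = 192.168.100.141:9997, 192.168.100.142:9997
# Set explicitly here; some Splunk versions may already load-balance by default (assumption).
autoLB = true
# One queue for the whole group, so it applies whichever indexer is currently up.
maxQueueSize = 4GB
```

If that reading is right, the queue should only fill while both indexers are down, and it should drain to whichever one comes back first.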
