Getting Data In

Make Syslog-ng Server HA with load balancing

Explorer

Hi all,

We'd like to make our syslog-ng server, which is a heavy forwarder instance, highly available (HA).

The plan is to clone our syslog server, front both machines with a load balancer, and set it to active/passive.

That way, if the active syslog server experiences any issues, a script will enable the data inputs on the passive machine, the load balancer will switch the passive node to active, and ingestion will pick up where it left off.
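A minimal sketch of that active/passive plan, assuming HAProxy as the load balancer (the hostnames, port, and health checks are assumptions, not from the thread):

```
# Hypothetical HAProxy sketch of the active/passive plan above.
# hf-primary / hf-standby are assumed hostnames; the "backup" keyword
# keeps the standby idle until the primary's health check fails.
listen syslog_tcp
    bind 0.0.0.0:514
    mode tcp
    server hf-primary hf-primary.example.com:514 check
    server hf-standby hf-standby.example.com:514 check backup
```

With `backup`, HAProxy itself handles the failover, so the external script would only be needed to enable the Splunk data inputs on the standby.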

Could this work?

1 Solution

Legend

Hi clozach,
I have implemented Splunk syslog servers in many projects.
I always use two Heavy Forwarders running in active/active mode behind a Load Balancer that distributes traffic between them and provides failover.
With active/active mode you have no switchover problems.
Just two points of attention:

  • when you configure the Load Balancer, set it in transparent mode, so that the Heavy Forwarders see the original source IP address of the sending devices,
  • when you have to work on the Heavy Forwarders, do it one server at a time; in other words, don't use a Deployment Server to manage them, set the configurations manually, so that there is always one Heavy Forwarder up and running.
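A sketch of the active/active setup described above, again assuming HAProxy (the hostnames and port are assumptions; transparent, source-preserving mode requires kernel TPROXY support and an HAProxy build that includes it):

```
# Hypothetical HAProxy sketch of the active/active setup above.
listen syslog_tcp
    bind 0.0.0.0:514
    mode tcp
    balance roundrobin
    # Preserve the sender's source IP so the HFs log the real origin
    source 0.0.0.0 usesrc clientip
    server hf1 hf1.example.com:514 check
    server hf2 hf2.example.com:514 check
```

Each connection goes to exactly one HF, and if one HF fails its health check, all traffic flows to the other.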

Ciao.
Giuseppe




Explorer

Hi Giuseppe,

Thanks so much for the insight.

How do you prevent duplicate data in this architecture? When considering it, I couldn't come up with a way to avoid ingesting the same data twice. We have technologies configured to syslog data to our heavy forwarder (obviously lol), but in this architecture you would have to configure them to send logs to both heavy forwarders, which would then parse and send the same logs to the indexers.

Is it the job of the indexer cluster to prevent indexing of duplicate data? I think the same question applies to API pulls if they are configured identically.

I could be understanding something wrong, so correct me where needed.

Thanks again!


Legend

Hi @clozach,
as @nplamondon said, the Load Balancer is there precisely to give you continuity of service without double indexing of events.
The flow is the following:

  • your source servers send syslog events to the Load Balancer,
  • the Load Balancer sends each event to one Heavy Forwarder that it finds active,
  • the Heavy Forwarder sends the event to the Indexers.

The Load Balancer has two jobs:

  • during normal operation it distributes events across both HFs, but sends each event to only one HF, never the same event to both,
  • during a fault or maintenance on one HF, it sends all logs to the remaining active HF.
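The same flow can be sketched with nginx's stream module, which many shops use for syslog load balancing; the hostnames, port, and module availability (nginx built with `--with-stream`) are assumptions:

```
# Hypothetical nginx "stream" sketch of the flow above: one shared
# listener, two HF backends, each connection/datagram sent to exactly
# one backend, with automatic failover if a backend is marked down.
stream {
    upstream heavy_forwarders {
        server hf1.example.com:514;   # assumed HF syslog input port
        server hf2.example.com:514;
    }
    server {
        listen 514;
        listen 514 udp;               # syslog over UDP as well
        proxy_pass heavy_forwarders;
        proxy_responses 0;            # syslog senders expect no reply
    }
}
```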

Ciao.
Giuseppe


Explorer

Hi Giuseppe,

Any advice regarding my comment on 11/4? I would really appreciate it.

Thank you,
Christian


Legend

Hi @clozach,
sorry for my delay, but I was out!
Anyway, I'm not an expert on APIs; what kind of API inputs are you talking about?
This configuration is for push inputs; in other words, the source servers send their messages (syslog) to a shared address (IP or DNS), and those messages are distributed between the HFs.

Ciao.
Giuseppe


Explorer

Hi Giuseppe,

No worries at all! The APIs are modular inputs making GET requests.

I think I understand conceptually how this method works for syslog, but we would also like those API inputs to be HA.

If that requires new machines, then so be it, but I'm curious whether it's possible in the architecture you've described.

Thanks,
Christian


Legend

Hi @clozach,
conceptually, if these APIs push events to an address (IP or DNS), you can use the Load Balancer to distribute traffic between the active HFs and solve the HA problem, but you have to check how they work.

It's different if these APIs run on a Splunk server and pull data from an external appliance (e.g. BlueCoat), because you have to install them on one HF, so you don't have an HA architecture, only a cold-standby solution.

I don't think you need an additional server; monitor your servers and watch the workload on the infrastructure (CPU and RAM), so you can tune it later by adding more power or a new server, but I wouldn't start with a new one.

Ciao.
Giuseppe


Explorer

Hi Giuseppe,

Would this be the same for any API inputs we have on the same box?

With API inputs in active/active, both heavy forwarders would pull the same information; the load balancer wouldn't be able to differentiate it, and both would send the same data for indexing.

Any idea how we can work around this?

Thanks again for all the help!


SplunkTrust

Your load balancer should only send each event to one host, so duplication shouldn't be an issue.

One thing to be mindful of is how your LB treats UDP traffic. I found that my F5s were fixating on one server unless I set the stream timeout to 0.
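On an F5, the fix for that UDP pinning behavior can be sketched in tmsh; the profile and virtual-server names here are assumptions, and you should verify the exact options against your TMOS version:

```
# Hypothetical tmsh sketch of the UDP tuning described above.
# With a default UDP profile, an open idle flow can pin all datagrams
# from one sender to one pool member; per-datagram load balancing (or
# an immediate idle timeout) makes the LB pick a member per datagram.
create ltm profile udp syslog_udp datagram-load-balancing enabled idle-timeout immediate
modify ltm virtual vs_syslog_514 profiles replace-all-with { syslog_udp }
```

Note the trade-off: per-datagram balancing can split multi-line events from one sender across both HFs, so test with your actual sources.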
