According to the Splunk Validated Architectures guidance on designing HA between 2 syslog servers,
the documentation says: Using a network load-balancer (L4/L7) to provide a highly available syslog collection infrastructure is
discouraged due to potential data loss and fidelity concerns (IP lost in proxy). The recommended approach is
to use a cluster resource manager solution (e.g. pacemaker/keepalived) or a VM-based syslog server
deployment instead. Splunk does not have any insight into available technologies at your company, so please
engage the in-house experts in your network team to design and deploy such an HA solution.
Has anyone tried this, or is there another common-practice approach?
Hi @Wohamed_wakkad, it looks like you have 4 helpful replies from SplunkTrust members already. I will just add the following.
Load Balancer Method (in front of syslog)
- Challenges around SRC-NAT; output to an IP rather than an FQDN makes things harder later
- You can resolve this through routing table entries, hosts files, or other options such as F5 iRules, but there is some complexity to manage
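To make the SRC-NAT point concrete, here is a small sketch of why the original sender gets lost. It assumes the receiver files events per source address, which is a common pattern; the base path and all IP addresses are made up for illustration.

```python
from pathlib import Path

def log_path(base: Path, peer_ip: str) -> Path:
    """Per-source log file, keyed on the address the receiver actually sees."""
    return base / peer_ip / "syslog.log"

BASE = Path("/var/log/remote")        # hypothetical spool directory
SENDERS = ("10.0.0.11", "10.0.0.12")  # hypothetical devices
LB_IP = "10.0.1.5"                    # hypothetical load balancer address

# Direct delivery (or a VIP held by the receiver): the peer address is the
# real device, so each sender gets its own file.
direct = {log_path(BASE, ip) for ip in SENDERS}

# Behind a load balancer doing source NAT: every packet arrives from the
# load balancer, so all senders collapse into a single file.
natted = {log_path(BASE, LB_IP) for _ in SENDERS}

print(len(direct), len(natted))
```

With direct delivery the two senders stay separate; behind SRC-NAT they are indistinguishable unless the host is recovered some other way (for example from the message header, or by preserving the client IP on the load balancer).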
Keepalived method
- VRRP solution, a standard Linux offering that is popular and easy to implement
- I would also recommend the Rsyslog Assistant for writing a good config that will output to folders, etc.
- I have used this at 3 clients in the past couple of years, and it is reliable long term and straightforward
I replied to you already in the other thread - https://community.splunk.com/t5/Deployment-Architecture/load-balancer/m-p/760346/highlight/true#M299...
Anyway, as I said there, no syslog solution will ever be 100% reliable. So you're asking the wrong question. The right one would be: what do you want to protect yourself against, what is your appetite for risk, and so on.
We have used F5 in front of our syslog servers, but as said, you must know how to configure the F5, e.g. to use the correct profile. Otherwise you will definitely lose events. Our load was a couple of billion events per day, and in our tests with a correctly configured F5 we did not lose any events. With an incorrect configuration we lost a lot.
I guess it depends on your architecture and hosting arrangements, but I have used keepalived for this purpose (as well as similar situations, like operating an HA load balancer on a single IP), where we essentially create a virtual IP (VIP) which is held by the 'alive' host and is taken over by another host if the original host fails its health check.
If you are using AWS services then there may be other approaches to this, though. Typically I see connections to syslog servers being persisted to a single IP which then handles all the incoming data; it's usually hard to set up a source to load balance across multiple syslog receivers, which is why you need a highly available endpoint like this.
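As a rough illustration of the health check side of this: keepalived can run an external script (via a `vrrp_script` block) and treats a zero exit status as healthy and a non-zero one as failed. A minimal sketch of such a check is below. It assumes the syslog daemon listens on TCP port 514 on the local host, and the function names are just for illustration. A UDP-only listener cannot be probed this way, so you would need a different check (e.g. on the process or service state) in that case.

```python
import socket

def syslog_port_open(host: str = "127.0.0.1", port: int = 514,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the syslog listener succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable: treat all of these as unhealthy.
        return False

def health_exit_code(host: str = "127.0.0.1", port: int = 514) -> int:
    """Map the check to an exit status: 0 = healthy, 1 = failed."""
    return 0 if syslog_port_open(host, port) else 1

# When saved as a standalone check script, finish with:
#     import sys; sys.exit(health_exit_code())
```

When the check fails on the host holding the VIP, keepalived can lower that host's priority or mark it as faulted so that the peer takes over the address, depending on how the check is configured.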
Hi @Wohamed_wakkad ,
One additional hint: don't use Splunk network inputs; use a syslog receiver (like rsyslog) that writes to a file which Splunk then reads.
This requires more disk resources but gives you more availability, because it keeps running even when Splunk is down.
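Just to illustrate the pattern (this is not a replacement for rsyslog, only a sketch of what the receiver does): the receiver takes each datagram and appends it to a file on disk, and Splunk monitors those files separately. The function name, the directory layout, and the sample message below are assumptions for the demo, which runs over loopback with a temporary directory.

```python
import socket
import tempfile
from pathlib import Path

def receive_once(sock: socket.socket, spool: Path) -> Path:
    """Read one UDP datagram and append it to <spool>/<peer_ip>/syslog.log."""
    data, (peer_ip, _port) = sock.recvfrom(65535)
    target = spool / peer_ip / "syslog.log"
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("ab") as fh:
        # One event per line, regardless of whether the sender added a newline.
        fh.write(data.rstrip(b"\n") + b"\n")
    return target

# Demo over loopback: the receiver has no dependency on Splunk being up.
with tempfile.TemporaryDirectory() as tmp:
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))  # ephemeral port for the demo, not 514
    rx.settimeout(2.0)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.sendto(b"<134>Oct 11 22:14:15 fw01 app: test event", rx.getsockname())
    written = receive_once(rx, Path(tmp))
    content = written.read_bytes()
    tx.close()
    rx.close()

print(written.parent.name, content)
```

The events sit on disk until a forwarder picks them up, which is where the extra disk usage comes from and also why a Splunk restart or outage does not drop them at the receiving end.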
I don't like SC4S,
Ciao.
Giuseppe