Hi all, new to Splunk. We regularly burn down and redeploy our heavy forwarders, so their IPs change frequently. I need a way to keep the UFs pointed at the HFs, but I've read that using an AWS ELB isn't recommended.
To add to the challenge, everything has to stay encrypted with TLS.
What is the recommended way to handle constantly changing IPs when managing hundreds of UFs?
And how do people ensure that the UFs are always talking to the geographically nearest HFs?
Many thanks,
Oz
I'm not sure I have an answer, but I do have a few thoughts that may (or may not) help.
1. Avoid intermediate forwarders (IFs) unless absolutely necessary. They add complexity (especially in this case) and introduce failure points. Also, if not done well, they can degrade search performance. Instead of IFs, have UFs send directly to indexers.
2. Use DNS. DNS was designed to abstract IP addresses.
3. Ensure each UF has more than one destination in outputs.conf. That way, if the IP address of one destination changes, there's another that can be used so data doesn't stop flowing. It also helps with the distribution of data across indexers.
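For example, a minimal outputs.conf along those lines might look like this — the DNS names, cert path, and password are placeholders, and TLS setting names vary a bit by Splunk version, so check the outputs.conf spec for yours:

```ini
[tcpout]
defaultGroup = regional_ifs

[tcpout:regional_ifs]
# Two or more destinations (placeholder hostnames); the UF
# load-balances across this list, so one changing IP doesn't stop data
server = if-a.example.com:9997, if-b.example.com:9997
# How often (seconds) the UF switches destinations when load balancing
autoLBFrequency = 30
# TLS settings - paths/password are examples only
clientCert = $SPLUNK_HOME/etc/auth/client.pem
sslPassword = <cert_password>
sslVerifyServerCert = true
```

Because `server` takes hostnames, pairing this with DNS (point 2) means an IP rotation only requires a DNS update, not a UF reconfiguration.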
Thanks Rich,
So, we require the IFs because our on-prem logs forward to them, and in the event of a failure they cache the logs so we don't lose anything; we often can't cache at source due to CPU and disk overhead.
Each IF has 1 TB of storage for this function.
We have 6 HFs currently but will need to scale up in the future, so we wanted to use AWS ELBs in front of them to spread the load without having to reconfigure the UFs, which live in environments where we can't get access to the hosts to reconfigure them. Plus, we will have hundreds of UFs forwarding to the 6 IFs. The 6 IFs are currently deployed in 6 different regions for failover, and we want to configure each UF to talk only to the geographically closest 2 or 3 ELBs.
The IFs get rolled regularly, with Terraform and Ansible redeploying them as code, which means the IPs rotate; and if we need to scale up, we can put multiple IFs behind the single DNS entry of an ELB.
Surely someone must have dealt with this issue before?
We're coming from ELK, where we had multiple Logstash collectors behind single ELBs without issue, and we could scale the number of Logstash instances up and down on the fly.
If there is a better way to handle this, please let me know. Really looking for guidance here, as I'm new to Splunk. The learning curve is steep!
At peak we had about 55 Logstash instances running behind 6 ELBs.
TIA!
Hi
There are a couple of ways to do it. The right solution depends on how you deploy those IHFs and/or what you prefer:
Both approaches work, but as I said, it depends on how you create those nodes on AWS — select whichever fits best. Since you are already using Terraform and Ansible to create those nodes, it will be easy to add this functionality to your code.
An AWS NLB sounds like the easy solution, and it could work, but as @richgalloway said, it is not supported by Splunk. If I have understood right, there is something in the S2S protocol that can keep load balancing from working as expected. I have heard rumours that LBs have been tested, and maybe are used in some places, but I don't have any real information about it.
One more option to look at: since the new 8.x UFs support HEC as an output channel, maybe you could start using that, with LBs in front of HEC. I haven't tried that option myself and I don't know what new issues it might create, but at least it should be supported by Splunk.
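For reference, the UF side of that HEC option is also configured in outputs.conf — roughly like the sketch below. The token and URL are placeholders, and I haven't verified this setup myself, so treat it as a starting point and check the docs for your version:

```ini
[httpout]
# HEC token issued by the receiving Splunk tier (placeholder value)
httpEventCollectorToken = 00000000-0000-0000-0000-000000000000
# HTTPS endpoint - this could be a load balancer in front of the HEC receivers
uri = https://hec-lb.example.com:8088
```

Since HEC is plain HTTPS rather than S2S, the usual S2S/load-balancer caveats shouldn't apply.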
r. Ismo
Thanks for the help. I think we are going to go with a hybrid DNS lookup with geo-IP routing, so a single DNS name routes to the geographically closest heavy forwarder. We can use Route 53 to update the IPs when we deploy new hosts or scale out with extra hosts.
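Since the IFs are already deployed with Terraform, that Route 53 geolocation setup can live in the same code. A rough sketch — the zone variable, record name, and IP reference are assumptions, not our actual config:

```hcl
# Geolocation record: European clients resolving if.example.com
# (placeholder name) get the EU forwarder's IP
resource "aws_route53_record" "if_eu" {
  zone_id        = var.zone_id          # assumed hosted-zone variable
  name           = "if.example.com"
  type           = "A"
  ttl            = 60                   # short TTL so rotated IPs propagate quickly
  set_identifier = "eu"
  records        = [var.if_eu_ip]       # hypothetical IP output from the IF module

  geolocation_routing_policy {
    continent = "EU"
  }
}

# A default ("*") record is also needed so clients outside any
# configured geolocation still resolve to something
resource "aws_route53_record" "if_default" {
  zone_id        = var.zone_id
  name           = "if.example.com"
  type           = "A"
  ttl            = 60
  set_identifier = "default"
  records        = [var.if_us_ip]

  geolocation_routing_policy {
    country = "*"
  }
}
```

One record set per region, and redeploying an IF just means Terraform updating the record's IP — no UF reconfiguration needed.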