I need to forward from a few Universal Forwarders to some third-party indexers, and I'm not keen on putting the endpoint IPs in the UF config. I'd rather connect to a single local VIP on a load balancer and let it worry about the endpoints. Is there any risk in doing so? I'd be monitoring endpoint health with frequent TCP checks, but beyond that the UF would only have one IP to connect to. If it gets a broken connection, will it mark that one endpoint as unavailable, even if it has valid connections passing through to other remote endpoints behind the VIP? Can I configure it to hold, say, 4 connections concurrently, with the UF oblivious to the fact that multiple indexers are receiving those data streams?
There is a huge risk in doing this. The UF-to-indexer connection has some intelligence. Arbitrarily routing packets with an external load balancer will probably break the parsing of data into events.
Don't do it.
I understand that you don't want to maintain a list of indexers on every forwarder. It can be ugly and painful. Here are a few alternatives:
DNS name list - set up DNS so that a single name (e.g. indexer.myco.com) points to multiple indexers. Put only the DNS name (indexer.myco.com) in outputs.conf on the forwarder. PRO: a single entry in outputs.conf. CON: can put a lot of load on your DNS servers.
Use the deployment server to manage outputs.conf on the forwarders. PRO: you can manage outputs.conf from a single location and ensure it is the same on every forwarder. CON: you must set up the deployment server (unless you are using it already).
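As a rough sketch of what those two alternatives might look like on a forwarder: the group name (third_party_indexers), the receiving port 9997, and the deployment server host deploy.myco.com:8089 are assumptions for illustration, so substitute whatever your environment actually uses.

```
# outputs.conf on the forwarder (DNS name list approach)
# Group name and port are placeholders, not values from this thread.
[tcpout]
defaultGroup = third_party_indexers

[tcpout:third_party_indexers]
# A single DNS name that resolves to several indexers
server = indexer.myco.com:9997
```

```
# deploymentclient.conf on the forwarder (deployment server approach)
# deploy.myco.com is a hypothetical host; 8089 is the usual management port.
[deployment-client]

[target-broker:deploymentServer]
targetUri = deploy.myco.com:8089
```

With the second approach, outputs.conf itself would live in an app that the deployment server pushes out, so the forwarder only needs to know where the deployment server is.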
Oh for a working DNS server in this environment! That'd be lovely!
It wouldn't be arbitrary packets, though. This is at the TCP level, so there's no chance of half messages being passed to the wrong indexer as long as they are sent on a single TCP stream, which I presume is a given. What is this intelligence? Are there documents that explain it? If we round-robin across a few different indexers with DNS, how is there more risk in round-robining with a load balancer? I see that there is a difference, in that the UF will see different IPs in its connection pool, but I wonder whether that has any actual consequence. Isn't a connection a connection, whether it's to the same IP or not?
When the forwarder makes its TCP connection with one indexer, it will often send multiple packets over the connection before switching to another indexer. The forwarder could easily split an event between packets - since an event can consist of multiple lines. In fact, the forwarder does not really know where the event boundaries lie. So the forwarder continues to send to the same indexer until it hits EOF on the data stream that it is monitoring. Then it switches.
There are variations, this could change, and it isn't formally documented. But this is the short explanation of "why."
So there is a consequence. Again, I advise against this, and so will every other Splunk consultant.
Also, I feel your DNS pain...