We've currently got just one DS in our environment handling about 1100 forwarders. Performance has been pretty stable at this level, but I'm wondering what the cap is. We may be tasked with adding another 5000-6000 UFs to capture logs from our workstations (thanks, Splunk, for removing Win7 support in 6.5+ by the way /s). How do other admins balance their clients across multiple (if necessary) deployment servers?
I'm wondering if it may be more feasible to configure Windows Event Collector, stand up a couple of 2012 boxes, and then collect from those with a UF/HF for this new set of clients.
We have tested with 5000 clients and a 30-minute polling interval. It works smoothly.
The only problem is the time it takes to deploy config changes to all the clients (which may be OK for most customers), but you need to think about it. A good link is this. For 5000 clients and 50MB of apps, it takes 50 minutes to apply the change.
Also, as you scale out, serverclass.conf may become unmanageable. We script it from CSV files rather than using wildcards, to prevent conflicts.
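As a rough back-of-the-envelope check on those figures (5000 clients, 50MB of apps, 50 minutes), here is a small Python sketch. It assumes every client pulls the full 50MB and that the rollout is spread evenly over the 50 minutes. Neither assumption is confirmed in this thread, and the function name is only illustrative.

```python
def aggregate_throughput_mb_per_s(clients: int, bundle_mb: float, minutes: float) -> float:
    """Average MB/s the deployment server would serve during a full rollout.

    Assumes each client downloads the whole bundle exactly once.
    """
    total_mb = clients * bundle_mb
    return total_mb / (minutes * 60)


if __name__ == "__main__":
    # Figures quoted above: 5000 clients, 50MB of apps, 50 minutes.
    rate = aggregate_throughput_mb_per_s(5000, 50, 50)
    print(f"{rate:.1f} MB/s")  # prints "83.3 MB/s"
```

Under those assumptions the DS is serving roughly 83 MB/s on average, so network and disk on the DS are worth a look before you grow the client count or the app size.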
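To give an idea of what scripting serverclass.conf from a CSV could look like, here is a minimal Python sketch. The CSV layout (serverclass, app, host columns), the host names, and the app names are all made up for illustration and are not the poster's actual script. It writes one whitelist.N entry per exact hostname instead of a wildcard.

```python
import csv
import io
from collections import defaultdict

# Hypothetical input: one row per (server class, app, host) assignment.
SAMPLE_CSV = """serverclass,app,host
win_workstations,Splunk_TA_windows,ws-0001
win_workstations,Splunk_TA_windows,ws-0002
win_workstations,outputs_base,ws-0001
"""


def build_serverclass_conf(csv_text: str) -> str:
    """Turn CSV rows into serverclass.conf stanzas with explicit hostnames."""
    hosts = defaultdict(list)  # server class -> exact hostnames
    apps = defaultdict(list)   # server class -> apps to deploy
    for row in csv.DictReader(io.StringIO(csv_text)):
        sc, app, host = row["serverclass"], row["app"], row["host"]
        if host not in hosts[sc]:
            hosts[sc].append(host)
        if app not in apps[sc]:
            apps[sc].append(app)

    lines = []
    for sc in sorted(hosts):
        lines.append(f"[serverClass:{sc}]")
        # One numbered whitelist entry per host, no wildcards.
        for n, host in enumerate(hosts[sc]):
            lines.append(f"whitelist.{n} = {host}")
        lines.append("")
        for app in apps[sc]:
            lines.append(f"[serverClass:{sc}:app:{app}]")
            lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_serverclass_conf(SAMPLE_CSV))
```

Keeping the CSV in version control and regenerating the file on each change means two people can't quietly add overlapping wildcard patterns.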
Can you please explain a bit about how to use an intermediate deployment server?
This is a good point, and this is how we forward logs out of the DMZ, but I hadn't considered using them as Deployment Servers in our internal network (duh). Thanks for the additional info!
Off-topic side note: If you do use intermediary forwarders, make sure you have at least twice as many forwarders as indexers to prevent potential issues with even event distribution across your indexers. Intermediary forwarding tiers often introduce such issues if they are not properly architected.
You can employ parallel pipelines on your forwarders to make each one behave like multiple instances.
Would you still experience the uneven distribution of events even with forceTimebasedAutoLB enabled in outputs.conf on the intermediate forwarder?
We're rolling 6 indexers and have one IF in the DMZ for ACL reasons, and one internally as a syslog relay (with rsyslog). It doesn't appear to be an issue at this point (ingesting 250GB-300GB/day) but that doesn't necessarily mean it won't cause us issues as we expand.
We have tested DS performance on a 2-CPU box with 10000 deployment clients (10 apps, 10 server classes). You should be fine, but a lot depends on how many apps and server classes you have and what your expectations are with respect to deployment times.
Many larger customers also tune their phoneHomeInterval (phoneHomeIntervalInSecs in deploymentclient.conf) to a larger number to help with deployment server load.
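To see why a longer phone-home interval helps, here is a quick Python sketch of the average check-in rate the DS would see. The 7100 client count is just the 1100 existing forwarders plus the upper 6000 estimate from the question. It assumes check-ins are spread evenly across the interval, which is a simplification, and the function name is illustrative.

```python
def checkins_per_second(clients: int, phone_home_interval_secs: int) -> float:
    """Average phone-home requests per second hitting the deployment server.

    Assumes clients are spread evenly across the interval.
    """
    return clients / phone_home_interval_secs


if __name__ == "__main__":
    clients = 1100 + 6000  # existing forwarders plus the upper estimate
    # 60s is the default phoneHomeIntervalInSecs; 1800s is the 30-minute
    # interval mentioned earlier in the thread.
    for interval in (60, 600, 1800):
        rate = checkins_per_second(clients, interval)
        print(f"{interval:>5}s interval -> {rate:.1f} check-ins/sec")
```

Going from 60 seconds to 30 minutes cuts the average request rate by a factor of 30, at the cost of clients taking up to 30 minutes to notice a new app.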
We don't generally recommend going the WEC server route, because it requires custom processing to preserve the source hostname and causes some of our TAs for Windows-related apps to not work properly. I would stay away from that if you can.