Hi Gurus!
Here is the config first,
[ 50 Universal Forwarders : 300 GB/day total ] ==> [ 2 Load-Balancing Forwarders ] ==> [ 5 Indexers ] ==> [ 4 Search Heads ]
With this setup we found the aggregation queue filling up on the LB forwarders, which makes those two instances the bottleneck. Is there a general rule of thumb for daily data volume capacity on a Splunk instance set up as a load-balancing (intermediate) forwarder?
I know capacity differs depending on how events are truncated and processed, but roughly how many GB per day should a single LB forwarder be expected to handle?
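For reference, a metrics.log search along these lines (the host names lbf1/lbf2 are just placeholders for our two LB forwarders) shows how full each queue gets over time:

    index=_internal source=*metrics.log* group=queue (host=lbf1 OR host=lbf2)
    | eval fill_pct = round(current_size_kb / max_size_kb * 100, 1)
    | timechart span=5m max(fill_pct) BY name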
Appreciate your help!
There are a few more issues at play than just network throughput here. Several things come into play on the HFs: sourcetypes (parsing/aggregation requirements), disk I/O, memory, and CPU.
Look at your HFs and see which queues are consistently getting hit. If it's the aggregation queue, consider adding another HF to your environment, or offload the parsing of multiline events to the indexers. You can also look at adjusting the autoLB time, which will help offset some of the load (see the sketch below). Review your inputs and make sure you're not seeing errors that could be blocking or delaying the queues; ingesting tar/compressed log files can cause big problems.
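As a minimal sketch of the autoLB tweak (the group name and indexer hosts below are examples, not from your environment), you would shorten the switch interval in outputs.conf on the intermediate forwarder:

    # outputs.conf on the intermediate/LB forwarder (example group and hosts)
    # autoLBFrequency defaults to 30 seconds; a shorter interval spreads load faster
    [tcpout:primary_indexers]
    server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997
    autoLBFrequency = 10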
Another point to consider: if you are just ingesting single-line log files like syslog, run a UF in parallel on the HF. The UF will process and send those files much faster than an HF will. It's quite common in large-scale deployments to run both an HF and a UF on the same host and use a deployment server to manage the inputs.
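A rough sketch of that pattern (paths, index names, and indexer hosts are made-up examples): the UF instance on the same box carries the single-line inputs and ships them straight to the indexers, while the HF keeps the parsing-heavy multiline sources.

    # inputs.conf on the UF running alongside the HF (example path and index)
    [monitor:///var/log/remote-syslog/*.log]
    sourcetype = syslog
    index = network

    # outputs.conf on the same UF, sending directly to the indexer tier (example hosts)
    [tcpout:primary_indexers]
    server = idx1.example.com:9997, idx2.example.com:9997, idx3.example.com:9997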
Tuning the OS is another thing to look at: memory, CPU, and disk consumption. It depends on the environment; I've seen high-spec bare-metal machines handle over 200 GB a day as a forwarder.
Eric,
Thanks buddy!!
It's hard to give a general number for forwarding capacity because it depends heavily on network throughput. With faster network interfaces you can index a greater volume of data. Does that make sense?
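One setting that can mask the real network capacity is the forwarder's own throughput cap: universal forwarders ship with a 256 KBps default in limits.conf. Assuming that default hasn't already been changed, raising or removing it looks roughly like this:

    # limits.conf on the forwarder
    # UFs default to maxKBps = 256; 0 removes the cap entirely
    [thruput]
    maxKBps = 0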
Matt, how about this?
For example, at 300 GB per day the aggregation queue sometimes fills up, so it might be fair to say 300 GB / 2 LB forwarders = 150 GB per LB forwarder. So in this particular environment, could we call roughly 150 GB per day the practical limit per forwarder?
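One way to sanity-check that back-of-the-envelope 150 GB figure would be to measure what each LB forwarder actually pushed per day from its metrics.log, something along these lines (host names are placeholders):

    index=_internal source=*metrics.log* group=thruput name=thruput (host=lbf1 OR host=lbf2)
    | bin _time span=1d
    | stats sum(kb) AS kb_per_day BY _time, host
    | eval gb_per_day = round(kb_per_day / 1024 / 1024, 1)
    | table _time host gb_per_day

If the measured number sits well below 150 GB while the aggregation queue is already blocking, the limit is parsing capacity on the forwarder rather than raw daily volume.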