Unfortunately, I don't have a solution to this issue. I've been hitting it for some time now and still haven't identified the root cause or a fix. I've already engaged Splunk Support to help troubleshoot, and I've tried several configuration changes: increasing the ulimit, changing the maxKBps throughput, and changing the MAXEVENTS settings. The issue persists. The weird thing is that it only happens on half of the indexers. There are 10 indexers total in our Splunk infrastructure: 5 old indexers and 5 newly added for expansion. The hardware specs of the servers are almost identical, but for some reason the "Saturated Event-Processing Queues" alert only fires on the 5 new indexers. Whenever this happens, the affected indexers are still searchable, but indexing on them stops and the load is redistributed to the remaining healthy indexers.
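For anyone comparing old and new indexers, one thing that may help is tallying which queues actually report as blocked. Below is a minimal Python sketch, assuming the usual `group=queue, name=..., blocked=true` key=value layout of queue lines in Splunk's metrics.log. The function name `count_blocked_queues` and the sample lines are made up for illustration and are not taken from my environment.

```python
import re
from collections import Counter

# Matches key=value pairs such as "name=indexqueue" or "blocked=true".
PAIR = re.compile(r"(\w+)=([^,\s]+)")


def count_blocked_queues(lines):
    """Return a Counter mapping queue name -> number of lines with blocked=true."""
    blocked = Counter()
    for line in lines:
        fields = dict(PAIR.findall(line))
        # Only queue metrics that explicitly report blocked=true are counted.
        if fields.get("group") == "queue" and fields.get("blocked") == "true":
            blocked[fields.get("name", "unknown")] += 1
    return blocked


# Illustrative lines only (hypothetical values, not real log output).
SAMPLE = [
    "01-01-2020 10:00:00.000 +0000 INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499",
    "01-01-2020 10:00:30.000 +0000 INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499",
    "01-01-2020 10:00:30.000 +0000 INFO Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=6144, current_size_kb=6143",
    "01-01-2020 10:00:30.000 +0000 INFO Metrics - group=queue, name=typingqueue, max_size_kb=500, current_size_kb=12",
    "01-01-2020 10:00:30.000 +0000 INFO Metrics - group=thruput, name=index_thruput, instantaneous_kbps=12.5",
]

if __name__ == "__main__":
    for queue, hits in count_blocked_queues(SAMPLE).most_common():
        print(queue, hits)
```

Running this against the metrics.log of one old and one new indexer (for example by passing an open file object instead of `SAMPLE`) might show which queue fills first on the new hosts. It only counts lines and does not explain why a queue is blocked.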
For now, the band-aid approach is to restart the new indexers whenever the alert is triggered. This has become an annoying and painful daily process. I'd really appreciate hearing from anyone who has encountered this issue, successfully identified the root cause, and resolved it.
Thank you very much.