What is causing the Monitoring Console Health Check warning "Saturation of event-processing queues"?

tcmarquesi
Explorer

Monitoring saturation of event-processing queues in Heavy Forwarders

I have a distributed environment with multiple indexes, search heads, and a pair of heavy forwarders. Over the last few days, one of my heavy forwarders started reporting an issue: the Monitoring Console's Health Check warns "Saturation of event-processing queues". In addition, the heavy forwarder's performance has dropped considerably, delaying event delivery and causing script executions to fail. splunkd is constantly consuming 100% of its CPU core.

The docs (Identify and triage indexing performance problems) suggest determining the queue fill pattern through Monitoring Console > Indexing > Indexing Performance: Instance. However, that view seems to apply only to indexers, not to heavy forwarders.

How can I find out what is causing this issue, and how can I monitor it? How can I see when it starts and how long it lasts, so that I can correlate it with the behavior of other systems? Is this information available in the Monitoring Console?

Thanks in advance and regards,

Tiago

gjanders
SplunkTrust

In the Alerts For Splunk Admins app I have an alert called "IndexerLevel - Indexer Queues May Have Issues" (refer to the GitHub location if you don't want to download the app).

The Monitoring Console covers this under "Use the monitoring console to view indexing performance".

As for finding the cause, there are various posts on the Answers site; try this Google search for a start.
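To see when a queue starts to saturate and for how long, one option is to read the `group=queue` lines that splunkd writes to $SPLUNK_HOME/var/log/splunk/metrics.log on the heavy forwarder itself. The Python below is only a rough sketch of that idea. It assumes the usual `key=value` layout with `name`, `max_size_kb`, `current_size_kb` and an optional `blocked=true` field; the sample lines, the 90% threshold and the function names are illustrative, not taken from any Splunk tool.

```python
import re
from collections import defaultdict

# Matches key=value pairs in a metrics.log line, e.g. "name=parsingqueue".
KV_RE = re.compile(r"(\w+)=([^,\s]+)")

# Illustrative sample lines (assumed layout, not real data).
SAMPLE = [
    "01-15-2024 10:00:00.000 +0000 INFO  Metrics - group=queue, name=parsingqueue, max_size_kb=6144, current_size_kb=100, current_size=5, largest_size=9, smallest_size=0",
    "01-15-2024 10:00:00.000 +0000 INFO  Metrics - group=queue, name=aggqueue, max_size_kb=1024, current_size_kb=10, current_size=2, largest_size=4, smallest_size=0",
    "01-15-2024 10:00:30.000 +0000 INFO  Metrics - group=queue, name=parsingqueue, blocked=true, max_size_kb=6144, current_size_kb=6143, current_size=900, largest_size=900, smallest_size=800",
    "01-15-2024 10:00:30.000 +0000 INFO  Metrics - group=pipeline, name=parsing, processor=utf8, cpu_seconds=0.1, executes=10, cumulative_hits=100",
    "01-15-2024 10:01:00.000 +0000 INFO  Metrics - group=queue, name=parsingqueue, max_size_kb=6144, current_size_kb=6000, current_size=880, largest_size=900, smallest_size=850",
    "01-15-2024 10:01:30.000 +0000 INFO  Metrics - group=queue, name=parsingqueue, max_size_kb=6144, current_size_kb=50, current_size=3, largest_size=880, smallest_size=0",
]


def parse_queue_line(line):
    """Return (timestamp, queue_name, fill_ratio, blocked) or None for other lines."""
    if "group=queue" not in line:
        return None
    fields = dict(KV_RE.findall(line))
    try:
        max_kb = float(fields["max_size_kb"])
        cur_kb = float(fields["current_size_kb"])
    except (KeyError, ValueError):
        return None
    if max_kb <= 0:
        return None
    timestamp = " ".join(line.split()[:2])  # date and time tokens
    blocked = fields.get("blocked") == "true"
    return timestamp, fields.get("name", "?"), cur_kb / max_kb, blocked


def saturation_windows(lines, threshold=0.9):
    """Group consecutive saturated samples per queue into (start, end, samples)."""
    open_windows = {}
    windows = defaultdict(list)
    for line in lines:
        parsed = parse_queue_line(line)
        if parsed is None:
            continue
        ts, name, ratio, blocked = parsed
        if blocked or ratio >= threshold:
            start, _, count = open_windows.get(name, (ts, ts, 0))
            open_windows[name] = (start, ts, count + 1)
        elif name in open_windows:
            # Queue recovered: close the window that was open for it.
            windows[name].append(open_windows.pop(name))
    for name, window in open_windows.items():
        windows[name].append(window)  # still saturated at end of input
    return dict(windows)


if __name__ == "__main__":
    for queue, spans in saturation_windows(SAMPLE).items():
        for start, end, samples in spans:
            print(f"{queue}: saturated from {start} to {end} ({samples} samples)")
```

To run it against a real file, pass `open(path)` instead of `SAMPLE`. The start and end timestamps of each window can then be lined up with what other systems were doing at the time.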

ddrillic
Ultra Champion

You should obviously try to find the cause, but keep in mind that the default queue sizes are tiny. On our indexers we ended up with something like the following in $SPLUNK_HOME/etc/system/local/server.conf:

[queue=AEQ]
maxSize = 200MB

[queue=parsingQueue]
# Default maxSize = 6MB
maxSize = 3600MB

[queue=indexQueue]
maxSize = 4000MB

[queue=typingQueue]
maxSize = 2100MB

[queue=aggQueue]
# Default maxSize = 1MB
maxSize = 3500MB

This buffer of memory helped us to remain stable during peak usage time.
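Before copying sizes like these, it is worth adding up how much RAM the queues could take if they all filled at once. The Python below is a minimal sketch of that arithmetic. It embeds the stanzas above as a string, and the helper names are made up for illustration. It only handles maxSize values with a KB, MB or GB suffix, on the assumption that a value without a unit means something other than a byte size.

```python
import configparser

# The stanzas quoted above, embedded so the example is self-contained.
SERVER_CONF = """
[queue=AEQ]
maxSize = 200MB

[queue=parsingQueue]
# Default maxSize = 6MB
maxSize = 3600MB

[queue=indexQueue]
maxSize = 4000MB

[queue=typingQueue]
maxSize = 2100MB

[queue=aggQueue]
# Default maxSize = 1MB
maxSize = 3500MB
"""

# Conversion factors to megabytes for the suffixes handled here.
UNITS_MB = {"KB": 1 / 1024, "MB": 1, "GB": 1024}


def size_to_mb(value):
    """Convert a size such as '3600MB' to megabytes (unit suffix required)."""
    value = value.strip().upper()
    for unit, factor in UNITS_MB.items():
        if value.endswith(unit):
            return float(value[: -len(unit)]) * factor
    raise ValueError(f"no KB/MB/GB suffix in maxSize value: {value!r}")


def queue_budget_mb(conf_text):
    """Map each [queue=<name>] stanza to its maxSize in megabytes."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return {
        section.split("=", 1)[1]: size_to_mb(parser[section]["maxSize"])
        for section in parser.sections()
        if section.startswith("queue=") and "maxSize" in parser[section]
    }


if __name__ == "__main__":
    budget = queue_budget_mb(SERVER_CONF)
    for name, size in budget.items():
        print(f"{name}: {size:.0f} MB")
    print(f"total if every queue fills: {sum(budget.values()):.0f} MB")
```

For the values above the total comes to 13,400 MB, so the host needs that much headroom on top of whatever splunkd already uses.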
