Blocking on indexers

New Member

I am seeing a lot of blocking on my three indexers, in the range of 500-1000 blocked events a day per host. The heaviest are indexqueue and typingqueue, followed by aggqueue; splunktcpin is in the double-digit range.
The indexes are striped across all three indexers. I'm at a loss on where to begin looking. Has anyone else had this issue with blocking on their Splunk indexers?
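
For anyone wanting to reproduce per-queue counts like the ones above, here is a minimal Python sketch that tallies blocked events by queue name. It assumes the queue lines in metrics.log look like `group=queue, name=indexqueue, blocked=true, ...`; the function name `count_blocked` is made up for illustration, and the pattern may need adjusting for your Splunk version.

```python
import re
from collections import Counter

# Assumed shape of a blocked-queue line in metrics.log, e.g.
# "... INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, ..."
BLOCKED_RE = re.compile(r"group=queue,\s*name=(?P<name>\w+),\s*blocked=true")


def count_blocked(lines):
    """Return a Counter of blocked=true events keyed by queue name."""
    counts = Counter()
    for line in lines:
        match = BLOCKED_RE.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts
```

To use it, open a copy of metrics.log and pass the file object in, for example `count_blocked(open("metrics.log", errors="replace")).most_common()`, which lists the queues from most to least blocked.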


I believe I've identified the root cause: slow disk for the cold DB. Our hot/warm DBs are on locally attached (virtually, anyway) disks, and the cold DBs point to a CIFS share on a NetApp, so indexes.conf looks something like this:


coldPath = \\netapp\splunk\SplunkIndex02\DATA_2\databases\colddb
homePath = F:\CustomIndex\DATA_2\databases\db
thawedPath = \\netapp\splunk\SplunkIndex02\DATA_2\databases\thaweddb
maxWarmDBCount = 32

So I created another locally attached drive and used it as the coldPath on ONE of our three indexers. After 4 hours, we have not seen ANY blocking on the indexer with the locally attached drive, while the other indexers continue to block at the same rate as before. In this particular case, the slow disk was the cold DB. If there were a way to have Splunk roll the files to cold on a schedule rather than constantly, this would not be a problem.
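
A rough way to compare two candidate cold-path locations (say, the share and a local drive) is to time small synchronous writes on each. The Python sketch below is only a probe under that assumption: the function name `probe_write_latency` and its defaults are hypothetical, and it does not reproduce Splunk's actual I/O pattern when rolling files to cold.

```python
import os
import tempfile
import time


def probe_write_latency(directory, writes=50, size=4096):
    """Time small synchronous writes in `directory`; return latencies in ms."""
    payload = b"\0" * size
    latencies = []
    # Create a scratch file in the target directory and remove it afterwards.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the write through to the underlying storage
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies
```

Run it once against each directory and compare the maximum and average of the returned list; a large gap between the two locations would be consistent with what we saw here.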

Path Finder

Yeah, I've had this happen. How many GB is each indexer handling daily? A safe number is 100GB.


More info: I looked at a low-volume indexing period (800MB/indexer) and we still saw 28 indexqueue blocking events.


Approximately 30GB/day. However, this happens even at substantially lower indexing volumes.


Have you tried increasing the queue maxSize in splunk/etc/system/local/server.conf?

# Queue settings
[queue]
maxSize = [<integer>|<integer>[KB|MB|GB]]
        * Specifies default capacity of a queue.
        * If specified as a lone integer (for example, maxSize=1000), maxSize indicates the maximum number of events allowed
          in the queue.
        * If specified as an integer followed by KB, MB, or GB (for example, maxSize=100MB), it indicates the maximum
          RAM allocated for queue.
        * The default is 500KB.

[queue=<queueName>]
maxSize = [<integer>|<integer>[KB|MB|GB]]
        * Specifies the capacity of a queue. It overrides the default capacity specified in [queue].
        * If specified as a lone integer (for example, maxSize=1000), maxSize indicates the maximum number of events allowed
          in the queue.
        * If specified as an integer followed by KB, MB, or GB (for example, maxSize=100MB), it indicates the maximum
          RAM allocated for queue.
        * The default is inherited from maxSize value specified in [queue]



The interesting part is that if you look at disk queueing, disk response times and IOPS, there is not much to indicate a disk bottleneck. Queueing is less than 1, response time is sub-20ms, and IOPS are less than 100. We tested the disks before installing Splunk and were able to reach upwards of 3000 IOPS. Of note, these machines are virtualized but are not sharing resources with other servers; they are essentially dedicated from both a server AND a SAN perspective.


Seeing that many messages a day, I would be concerned that a larger queue size would just delay the issue, since it seems the data isn't being written to disk quickly enough.

New Member

Thanks! I bumped indexqueue to 2000 and will look into increasing any others.
