Splunk Search

I disabled global metadata on some indexers and now my indexing queue is blocked. Why?

DerekB
Splunk Employee

We were running 4.2 and had a sources.data file that had grown to many GB in size. To resolve this, we upgraded to 4.3.5, which has the ability to disable global metadata. When we disabled it, the behavior described in the title started. Why did this happen, and what can we do about it?

1 Solution

jbsplunk
Splunk Employee

This is a bug, tracked as SPL-60031, which will be fixed in 4.3.6; it is not a problem in 5.0. When global metadata is disabled, the marker file (.repairedBucket) is never cleared, because it is normally cleared inside DatabasePartitionPolicy::rebuildMetaData. As a result, the bucket manifest is regenerated very frequently (several times a second in some cases), which blocks and fills the indexing queue and causes backups into all other queues.

A workaround for the issue can be implemented by modifying this setting in indexes.conf:

serviceMetaPeriod = <nonnegative integer>

    Defines how frequently metadata is synced to disk, in seconds.
    Defaults to 25 (seconds).
    You may want to set this to a higher value if the sum of your metadata file sizes is larger than many
    tens of megabytes, to avoid the hit on I/O in the indexing fast path.
    Highest legal value is 4294967295.

Changing it from 25 to 150 and restarting the indexer seemed to help quite a bit with the case I saw of this behavior; after that, we no longer saw blocked index queues.
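As an illustration, a minimal indexes.conf change applying this workaround might look like the following. Here it is set in the [default] stanza so it covers all indexes; you could equally place it in the stanza for the specific affected index. The value 150 is just the one that helped in the case above; tune it for your environment.

    [default]
    # Workaround for SPL-60031: sync metadata to disk less often so the
    # bucket manifest is not regenerated several times a second.
    serviceMetaPeriod = 150

Remember to restart the indexer after making the change for it to take effect.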

