Splunk Search

I disabled global metadata on some indexers and now my indexing queue is blocked. Why?

DerekB
Splunk Employee

We had a sources.data file that was many GB in size, so we upgraded from 4.2 to 4.3.5, which adds the ability to disable global metadata. When we disabled it, the behavior described in the title started. Why did this happen, and what can we do about it?

1 Solution

jbsplunk
Splunk Employee

This is a bug, tracked as SPL-60031, which will be fixed in 4.3.6 and is not a problem in 5.0. When global metadata is disabled, the marker file (.repairedBucket) never gets cleared, because that file is normally cleared inside DatabasePartitionPolicy::rebuildMetaData. As a result, the bucket manifest is regenerated very frequently (several times a second in some cases), which blocks and fills the indexing queue and causes backups into all other queues.
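If you want to confirm that symptom first, a rough check against metrics.log along the following lines should show when the indexing queue reports itself as blocked (a sketch only; it assumes the standard group=queue events in metrics.log and the indexqueue queue name):

    index=_internal source=*metrics.log* group=queue name=indexqueue blocked=true
    | timechart span=1m count

A steady stream of results from that search lines up with the blocked-and-filling behavior described above.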

A workaround for the issue can be implemented by modifying this setting in indexes.conf:

serviceMetaPeriod = <nonnegative integer>

    Defines how frequently metadata is synced to disk, in seconds.
    Defaults to 25 (seconds).
    You may want to set this to a higher value if the sum of your metadata file sizes is larger than many
    tens of megabytes, to avoid the hit on I/O in the indexing fast path.
    Highest legal value is 4294967295.

Changing it from 25 to 150 seemed to help quite a bit in the case where I saw this behavior; after changing the setting and restarting the indexer, we no longer saw blocked indexing queues.
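As a rough sketch of the workaround, the setting can be raised in indexes.conf and the indexer restarted; the [default] stanza and the value 150 below are illustrative, and you can scope the setting to individual index stanzas instead:

    # indexes.conf -- example only; 150 is the value that helped in the case above
    [default]
    serviceMetaPeriod = 150

Putting it under [default] applies the longer metadata sync interval to every index on that indexer.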

