Hello,
We are still facing the following issue when we put in maintenance mode our Indexer Cluster and we stop one Indexer.
Basically all the Indexers stop ingesting data, increasing their queues, waiting for splunk-optimize to finish the job.
This usually happens when we stop an Indexer after a long time has passed since the last restart.
Below is an example of the error message that appears on all the Indexers at once, each one on a different bucket directory:
throttled: The index processor has paused data flow. Too many tsidx files in idx=myindex bucket="/xxxxxxx/xxxx/xxxxxxxxxx/splunk/db/myindex/db/hot_v1_648" , waiting for the splunk-optimize indexing helper to catch up merging them. Ensure reasonable disk space is available, and that I/O write throughput is not compromised.
Checking further inside the bucket directory, I was able to see hundreds of .tsidx files. splunk-optimize is the helper process that merges those .tsidx files.
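For reference, this is the kind of quick check I use to count the .tsidx files per hot bucket; it is just a sketch, and the index path is a placeholder for our real (redacted) volume path:

# count .tsidx files per hot bucket of one index (path is a placeholder)
find /path/to/splunk/db/myindex/db -maxdepth 2 -name "*.tsidx" \
| awk -F/ '{count[$(NF-1)]++} END {for (b in count) print count[b], b}' | sort -rn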
We are running Splunk Enterprise 9.0.2 and:
- on each Indexer the disks reach 150K IOPS
- we already applied the following settings, which mitigated the problem but did not solve it:
indexes.conf
[default]
maxRunningProcessGroups = 12
processTrackerServiceInterval = 0
Note: we kept maxConcurrentOptimizes=6 (the default), because maxConcurrentOptimizes must stay <= maxRunningProcessGroups. This was also confirmed by Splunk Support, who informed me that maxConcurrentOptimizes is no longer used (or has less effect) since 7.x and is kept mainly for compatibility.
- I know that since 9.0.x it is possible to manually run splunk-optimize on the affected buckets (sketched below), but this seems to me more a workaround than a solution: considering that a deployment can have multiple Indexers, it is not straightforward.
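For completeness, this is roughly what that manual run looks like on a single Indexer. It is only a sketch: the index path is a placeholder, and it assumes the -d/--directory option of splunk-optimize run as the splunk user:

# merge the .tsidx files of one affected hot bucket (path taken from the throttled message)
$SPLUNK_HOME/bin/splunk-optimize -d /path/to/splunk/db/myindex/db/hot_v1_648

# or loop over all hot buckets of that index on this Indexer
for bucket in /path/to/splunk/db/myindex/db/hot_v1_*; do
    $SPLUNK_HOME/bin/splunk-optimize -d "$bucket"
done

Repeating this on every Indexer of the cluster is exactly the part that does not scale, which is why I consider it a workaround.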
What do you suggest to solve this issue?
Thanks a lot,
Edoardo