Monitoring Splunk

Search peer has the following message: idx=_internal Throttling indexer, too many tsidx files in bucket='dir', is splunk optimizer running?

swmishra_splunk
Splunk Employee

Hello,
I am getting these messages. What action should I take? Disk usage is not even near half, so that shouldn't be the cause. Any guidance will be greatly appreciated.

Thanks

1 Solution

swmishra_splunk
Splunk Employee

First, check which specific indexes and which bucket directories the error is being reported for.
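A search along these lines against the internal logs should surface the affected buckets (a rough sketch; the rex pattern assumes the bucket path is quoted as in the message above, so adjust it to what your splunkd.log actually reports):

index=_internal sourcetype=splunkd "too many tsidx files"
| rex "bucket='(?<bucket>[^']+)'"
| stats count by bucket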

Generally, whenever an index generates too many small tsidx files (more than 25) in a bucket, Splunk is not able to optimize all of those files within the expected time period.

To optimize a bucket manually, run the command below against the specific bucket directory:

splunk-optimize -d|--directory <bucket_directory>
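For example, from $SPLUNK_HOME/bin (the bucket path below is hypothetical; use the directory named in the warning):

cd $SPLUNK_HOME/bin
./splunk-optimize -d /opt/splunk/var/lib/splunk/defaultdb/db/hot_v1_123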

Or you can make the below changes in indexes.conf to address the issue:

indexes.conf
[default]
# Allow more concurrent optimize processes per hot bucket (default: 6)
maxConcurrentOptimizes=25
# Allow more splunkd helper child processes overall (default: 8)
maxRunningProcessGroups=12
# Check child process status every second instead of every 15 seconds
processTrackerServiceInterval=0

Note that changing maxConcurrentOptimizes requires a splunkd restart; reloading the configuration is not sufficient.

Please go through the documentation below for a better understanding of Splunk index optimization:
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Optimizeindexes


edoardo_vicendo
Contributor

Hello, I'd like your thoughts on this topic.

With the suggested modification, I suspect there is a chance of completely filling the process queue (maxRunningProcessGroups) with splunk-optimize processes (maxConcurrentOptimizes), because maxConcurrentOptimizes=25 > maxRunningProcessGroups=12.

https://docs.splunk.com/Documentation/Splunk/8.2.2/Admin/Indexesconf

By default, maxConcurrentOptimizes=6 < maxRunningProcessGroups=8:

maxConcurrentOptimizes = <nonnegative integer>
* The number of concurrent optimize processes that can run against a hot
  bucket.
* This number should be increased if:
  * There are always many small tsidx files in the hot bucket.
  * After rolling, there are many tsidx files in warm or cold buckets.
* You must restart splunkd after changing this setting. Reloading the
  configuration does not suffice.
* The highest legal value is 4294967295.
* Default: 6

maxRunningProcessGroups = <positive integer>
* splunkd runs helper child processes like "splunk-optimize",
  "recover-metadata", etc. This setting limits how many child processes
  can run at any given time.
* This maximum applies to all of splunkd, not per index. If you have N
  indexes, there will be at most 'maxRunningProcessGroups' child processes,
  not N * 'maxRunningProcessGroups' processes.
* Must maintain maxRunningProcessGroupsLowPriority < maxRunningProcessGroups
* This is an advanced setting; do NOT set unless instructed by Splunk
  Support.
* Highest legal value is 4294967295.
* Default: 8

On the other hand, in the post below they left maxConcurrentOptimizes at its default and only increased maxRunningProcessGroups:

https://community.splunk.com/t5/Getting-Data-In/The-index-process-has-paused-data-flow-Too-many-tsid...

Therefore maxConcurrentOptimizes=6 < maxRunningProcessGroups=12

That configuration seems better, but if the aim of the change is to increase the number of parallel splunk-optimize processes (to avoid the warning message "The index processor has paused data flow. Too many tsidx files in idx= ..."), wouldn't it be better to increase maxConcurrentOptimizes while keeping it lower than maxRunningProcessGroups? For example, something like the sketch below.
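An illustrative sketch only (these values are an assumption on my part, not tested):

[default]
maxConcurrentOptimizes = 10
maxRunningProcessGroups = 12
processTrackerServiceInterval = 0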

Thanks a lot,
Edoardo


edoardo_vicendo
Contributor

We solved the issue with this configuration in indexes.conf:

[default]
maxRunningProcessGroups = 12
processTrackerServiceInterval = 0

What we observed is that each splunk-optimize process runs for only a few seconds, so it is probably more important to set processTrackerServiceInterval=0, allowing a new process to be spawned every second instead of every 15 seconds (the default).

With maxRunningProcessGroups=12 there is also more "room" for splunk-optimize processes.
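To confirm the fix, you can check that the throttling messages stop appearing (a sketch; adjust the quoted message text to what your Splunk version actually logs):

index=_internal sourcetype=splunkd ("Throttling indexer" OR "too many tsidx files")
| timechart count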

