We have been getting messages about a high percentage of small buckets. I set logging to DEBUG on one of our indexers (Windows, Splunk 7.3.4, indexer cluster) and used this SPL to try to see why buckets are rolling:
index=_internal sourcetype=splunkd component=HotBucketRoller "finished moving"
Looking at Interesting Fields, "caller" seems to indicate the reason for the bucket roll; however, only 2 of the 4 reasons make sense to me, and I can't find them documented anywhere. The values I return are:
size_exceeded, bucket_replication_failed
lru, marked
The first 2 are self-evident, but what are the last 2? I'm mostly interested in lru, as that makes up 30-40% of our bucket rolls.
Any insight on this? My Google-fu has failed.
Hi @mosmond,
LRU means "Least Recently Used". If you see that many LRU rolls, your event timestamps may have problems, or you are indexing delayed data. This can also create quarantined hot buckets. You can find more information below:
https://docs.splunk.com/Documentation/Splunk/latest/Admin/Indexesconf
maxHotBuckets = <positive integer> | auto
* Maximum number of hot buckets that can exist per index.
* When 'maxHotBuckets' is exceeded, the indexer rolls the hot bucket
containing the least recent data to warm.
* Both normal hot buckets and quarantined hot buckets count towards this
total.
* This setting operates independently of maxHotIdleSecs, which can also
cause hot buckets to roll.
* NOTE: the indexer applies this limit per ingestion pipeline. For more
information about multiple ingestion pipelines, see
'parallelIngestionPipelines' in the server.conf.spec file.
* With N parallel ingestion pipelines, the maximum number of hot buckets across
all of the ingestion pipelines is N * 'maxHotBuckets', but only
'maxHotBuckets' for each ingestion pipeline. Each ingestion pipeline
independently writes to and manages up to 'maxHotBuckets' number of hot
buckets. Consequently, when multiple ingestion pipelines are configured, there
may be multiple hot buckets with events on overlapping time ranges.
* The highest legal value is 4294967295
* If you specify "auto", the indexer sets the value to 3.
* This setting applies only to event indexes.
* Default: "auto"
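To illustrate why delayed or badly timestamped data can lead to many small buckets rolled with caller=lru, here is a toy Python simulation. It is only a sketch, not how Splunk is implemented: it assumes one hot bucket per calendar day and that the bucket written to least recently is the one that gets rolled once the limit is exceeded. The function name simulate_rolls and its parameters are made up for this example.

```python
from collections import OrderedDict


def simulate_rolls(event_days, max_hot_buckets=3):
    """Toy model of an LRU roll (assumption: one hot bucket per day).

    event_days is the day number of each event, in arrival order.
    Returns (rolled, hot): rolled is a list of (day, event_count) for
    buckets rolled to warm, hot maps day -> event_count for buckets
    that are still hot.
    """
    hot = OrderedDict()  # ordered from least to most recently written
    rolled = []
    for day in event_days:
        if day in hot:
            hot[day] += 1
            hot.move_to_end(day)  # mark this bucket as most recently used
        else:
            hot[day] = 1  # no bucket fits this event, open a new one
            if len(hot) > max_hot_buckets:
                # Limit exceeded: roll the least recently used bucket.
                rolled.append(hot.popitem(last=False))
    return rolled, dict(hot)


# Events arriving in time order: a bucket fills up before it is rolled.
in_order = [1] * 5 + [2] * 5 + [3] * 5 + [4] * 5
rolled_in_order, _ = simulate_rolls(in_order)

# Events from 5 different days interleaved (delayed data or bad timestamps):
# almost every event forces a roll of a bucket holding a single event.
interleaved = [1, 2, 3, 4, 5] * 2
rolled_interleaved, _ = simulate_rolls(interleaved)
```

In this model the in-order stream rolls one bucket that holds all 5 events of its day, while the interleaved stream rolls 7 buckets with a single event each, which is the "small buckets" pattern.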
If this reply helps you, an upvote is appreciated.
@scelikok That is just what I needed, thanks!