Hello everyone, I have been trying to understand how this alert works, because from my point of view it doesn't make sense.
This message NEVER disappears from our Splunk instances. I have been trying to find the real root cause, but it is not clear to me how this works.
I have this message:
The percentage of small buckets (75%) created over the last hour is high and exceeded the red thresholds (50%) for index=foo, and possibly more indexes, on this indexer. At the time this alert fired, total buckets created=11, small buckets=8
So I checked whether the logs have time parsing issues, and there are no issues with the logs indexed in the foo index.
Then I checked with this search:
index=_internal sourcetype=splunkd component=HotBucketRoller "finished moving hot to warm"
| eval bucketSizeMB = round(size / 1024 / 1024, 2)
| table _time splunk_server idx bid bucketSizeMB
| rename idx as index
| join type=left index
    [ | rest /services/data/indexes count=0
      | rename title as index
      | eval maxDataSize = case (maxDataSize == "auto", 750, maxDataSize == "auto_high_volume", 10000, true(), maxDataSize)
      | table index updated currentDBSizeMB homePath.maxDataSizeMB maxDataSize maxHotBuckets maxWarmDBCount ]
| eval bucketSizePercent = round(100*(bucketSizeMB/maxDataSize))
| eval isSmallBucket = if (bucketSizePercent < 10, 1, 0)
| stats sum(isSmallBucket) as num_small_buckets count as num_total_buckets by index splunk_server
| eval percentSmallBuckets = round(100*(num_small_buckets/num_total_buckets))
| sort - percentSmallBuckets
| eval isViolation = if (percentSmallBuckets > 30, "Yes", "No")
| search isViolation = Yes
| stats count
I ran that search over the last 2 days and the result is ZERO.
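As I understand it, the search above boils down to the arithmetic below. This is only a rough Python sketch of that logic, not how Splunk computes the health alert internally. The 10% "small bucket" cut-off, the 30% violation threshold, and the auto/auto_high_volume sizes are taken from the search itself; the function names and the sample bucket sizes are made up for illustration.

```python
from collections import defaultdict

# Mapping copied from the search: "auto" -> 750 MB, "auto_high_volume" -> 10000 MB.
AUTO_SIZES_MB = {"auto": 750, "auto_high_volume": 10000}


def max_data_size_mb(setting):
    """Resolve a maxDataSize setting to a size in MB (hypothetical helper)."""
    return float(AUTO_SIZES_MB.get(setting, setting))


def small_bucket_report(buckets, max_data_size, small_pct=10, violation_pct=30):
    """Mirror the search: flag buckets below small_pct of maxDataSize.

    buckets:        iterable of (index, size_bytes) for rolled hot buckets
    max_data_size:  dict of index -> maxDataSize setting
    Returns index -> (num_small, num_total, percent_small, is_violation).
    """
    totals = defaultdict(lambda: [0, 0])  # index -> [small, total]
    for index, size_bytes in buckets:
        size_mb = round(size_bytes / 1024 / 1024, 2)
        percent = round(100 * size_mb / max_data_size_mb(max_data_size[index]))
        if percent < small_pct:
            totals[index][0] += 1
        totals[index][1] += 1

    report = {}
    for index, (small, total) in totals.items():
        percent_small = round(100 * small / total)
        report[index] = (small, total, percent_small, percent_small > violation_pct)
    return report


# Hypothetical sample: one full-size bucket and two tiny ones for index "foo".
sample = [("foo", 797286400), ("foo", 5 * 1024 * 1024), ("foo", 20 * 1024 * 1024)]
print(small_bucket_report(sample, {"foo": "auto"}))
```

With this made-up sample, two of the three buckets are under 10% of 750 MB, so the index would be counted as a violation by the search's own rules.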
But the red flag is still there...
So I don't understand what is going on.
Here is the log entry that indicates that foo is rolling from hot to warm:
08-30-2022 02:12:27.121 -0400 INFO HotBucketRoller [1405281 indexerPipe] - finished moving hot to warm bid=foo~19~AAD3329E-C8D9-4607-90FB-167760B4EB6F idx=foo from=hot_v1_19 to=db_1661054400_1628568000_19_AAD3329E-C8D9-4607-90FB-167760B4EB6F size=797286400 caller=size_exceeded _maxHotBucketSize=786432000 (750MB), bucketSize=797315072 (760MB)
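To double-check my reading of that log entry, here is a small Python sketch that pulls the key=value pairs out of the line and compares size with _maxHotBucketSize. The regular expression and the function name are my own assumptions for illustration, not an official parser for splunkd.log.

```python
import re

LOG_LINE = (
    "08-30-2022 02:12:27.121 -0400 INFO HotBucketRoller [1405281 indexerPipe] - "
    "finished moving hot to warm bid=foo~19~AAD3329E-C8D9-4607-90FB-167760B4EB6F "
    "idx=foo from=hot_v1_19 "
    "to=db_1661054400_1628568000_19_AAD3329E-C8D9-4607-90FB-167760B4EB6F "
    "size=797286400 caller=size_exceeded "
    "_maxHotBucketSize=786432000 (750MB), bucketSize=797315072 (760MB)"
)


def parse_roll_event(line):
    """Extract the roll reason and size ratio from a HotBucketRoller line."""
    # Collect every key=value pair; values end at whitespace or a comma.
    fields = dict(re.findall(r"(\w+)=([^\s,]+)", line))
    size = int(fields["size"])
    max_size = int(fields["_maxHotBucketSize"])
    return {
        "index": fields["idx"],
        "caller": fields["caller"],
        "size_bytes": size,
        "max_bytes": max_size,
        "percent_of_max": round(100 * size / max_size),
    }


event = parse_roll_event(LOG_LINE)
print(event["index"], event["caller"], event["percent_of_max"])
```

For this entry the bucket is at roughly 101% of the maximum hot bucket size, so by the 10% rule used in my search it would not count as a small bucket.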
So, as far as I can see, the reason for the roll is logical: caller=size_exceeded, i.e. the bucket exceeded its maximum size.
Just for information, this index receives data only once a day, at midnight.
If you have any input, I would really appreciate it.