What does this warning mean and how does it impact the performance?
The percentage of small of buckets created (40) over the last hour is very high and exceeded the yellow thresholds (30) for index=win, and possibly more indexes, on this indexer
What to do to remediate?
The alert is triggered when the percentage of small buckets created (a small bucket is, by definition, smaller than 10% of maxDataSize for the index) exceeds the current threshold (30) over the last 24 hours.
Please check the relevant configuration (health.conf), for example:
displayname = Buckets
indicator:bucketscreatedlast60m:yellow = 40
indicator:percentsmallbucketscreatedlast24h:description = This indicator tracks the percentage of small buckets created over the last 24 hours. A small bucket is defined as less than 10% of the 'maxDataSize' setting in indexes.conf.
indicator:percentsmallbucketscreatedlast24h:red = 50
If you'd like to disable or suppress the message, please see the following Splunk doc: http://docs.splunk.com/Documentation/Splunk/7.2.1/DMC/Configurefeaturemonitoring
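If you'd rather relax the thresholds than disable the indicator entirely, one possible sketch (using the same indicator names shown above; the stanza name and values are illustrative, not recommendations) would be to add something like this to a local health.conf, e.g. $SPLUNK_HOME/etc/system/local/health.conf:

```
[feature:buckets]
# Raise the yellow/red thresholds for the small-bucket indicator
# (example values only - tune to your environment)
indicator:percentsmallbucketscreatedlast24h:yellow = 60
indicator:percentsmallbucketscreatedlast24h:red = 80
```

After changing this, verify the new thresholds take effect via the health report in the Monitoring Console.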
Because hot buckets roll to warm whenever Splunk is restarted, warm buckets can be created at a smaller size than the specified maximum ('maxDataSize') for the index.
If this is the case in your environment, Splunk has no control over this behaviour, and the smaller buckets will not cause any performance issues.
If the smaller buckets come from other causes, you may need to investigate further.
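One way to investigate is with the dbinspect search command, which reports the size and time span of each bucket. A rough sketch (index=win is taken from the warning above; substitute your own index):

```
| dbinspect index=win
| eval spanDays = (endEpoch - startEpoch) / 86400
| table bucketId, state, sizeOnDiskMB, spanDays
| sort - spanDays
```

Buckets that are small but span many days are a common sign of timestamp or timezone parsing problems in the incoming data.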
I am having the exact same issue. I have opened a case with Splunk Support, but all they did was copy and paste the response kheo provided.
In our environment we have not restarted Splunk for months, so this is not the cause of it prematurely rolling hot buckets.
Same issue here, started last night. Would be nice to know how to troubleshoot further. The affected index isn't particularly high volume, nor have I observed any other unusual activity lately.
See the other thread which is pretty much a duplicate of this one, here: https://answers.splunk.com/answers/701550/health-status-the-percentage-of-small-of-buckets-c.html#an...
Long story short: the issue is caused by data arriving out of chronological order, usually due to timezone parsing or sending issues. It needs to be fixed by correcting your timezone parsing, reconfiguring the sending systems to include a TZ offset, and/or setting the TZ of each source in props.conf.
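For the props.conf route, a minimal sketch might look like this (the sourcetype name and timezone here are placeholders for your own values; apply it on the instances that parse the data, i.e. indexers or heavy forwarders):

```
[my_sourcetype]
# Tell Splunk which timezone this source's timestamps are in,
# for events that carry no TZ offset of their own
TZ = America/New_York
```

With the correct TZ set, events index in proper chronological order, so Splunk no longer has to open extra hot buckets and roll them early.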
As per simonq's comment, the most common cause is a large variance in the timespan of the data coming in, which often relates to incorrect timestamp parsing.
In Alerts for Splunk Admins I have a dashboard for issues by sourcetype and alerts around this, github link here. The monitoring console in modern versions also has a "Data Quality" tab which would help you here.
If the variation in timestamps is actually required for some reason, you could increase the number of hot buckets via maxHotBuckets in indexes.conf; however, you would likely be better served by fixing any timestamp parsing issues (maxHotBuckets does not normally need to be adjusted).
Could you be more specific about where in your app I can find the right dashboard to show this?
Try the dashboards:
Issues Per Sourcetype (for information about sourcetype issues, you will need to know the sourcetypes in the index in question)
or start with "Rolled Buckets By Index"
Even using your app and looking in the log files, I couldn't find any evidence that it was actually creating 60% small buckets each hour as it claimed. I modified /opt/splunk/etc/system/local/indexes.conf and added the following to raise the internal index's hot bucket count from 3 to 5:
[_internal]
homePath = $SPLUNK_DB/_internaldb/db
coldPath = $SPLUNK_DB/_internaldb/colddb
thawedPath = $SPLUNK_DB/_internaldb/thaweddb
maxHotBuckets = 5
After a restart it's now showing green and has stayed that way for about 20 hours now.