Health Status : The percentage of small of buckets...

jflaherty · ‎11-21-2018

I have been getting the following type of message for the _internal and other indexes: The percentage of small of buckets created (75) over the last hour is very high and exceeded the red thresholds (50) for index=_internal, and possibly more indexes, on this indexer.

What could be the causes of this, and how do I go about troubleshooting it? I have not been able to find anything in the logs yet.

Thanks

Ajinkya1992 · ‎01-23-2019

I have the same problem with Splunk 7.2.3 in my test environment.
I have just installed Splunk and set up the whole architecture with 1 DS, 1 Cluster Master, 2 Indexers, 1 Search Head, and 1 Application server.

All my servers are in the same timezone and show exactly the same time.
Also, I have not even started forwarding any data to my indexers.
So I can rule out both possibilities: bad data coming in from data sources, and timezone issues.

Do you have any other suggestions that could help solve this issue?

jflaherty · ‎01-23-2019

You probably already did this, but if not, I would grep your splunkd.log for words like DateParserVerbose, WARN, and ERROR. You could also try to find out exactly which index has too many buckets. Knowing whether it is one of your own or one of the built-in Splunk indexes could help narrow your search for the issue.
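The grep suggestion above could be sketched in Python along these lines. This is only a rough illustration: the function name `count_date_parser_warnings` is made up, and the `Context: source=...|host=...` layout is assumed from the sample DateParserVerbose warning quoted elsewhere in this thread, so other splunkd.log lines may not match it.

```python
import re
from collections import Counter

# Assumed layout, taken from the sample warning quoted in this thread:
#   ... WARN DateParserVerbose - <message> Context: source=<path>|host=<host>|...
CONTEXT_RE = re.compile(r"Context: source=(?P<source>[^|]+)\|host=(?P<host>[^|]+)")


def count_date_parser_warnings(lines):
    """Count WARN DateParserVerbose lines per (source, host) pair."""
    counts = Counter()
    for line in lines:
        # Skip anything that is not a DateParserVerbose warning.
        if " WARN " not in line or "DateParserVerbose" not in line:
            continue
        match = CONTEXT_RE.search(line)
        if match:
            key = (match.group("source"), match.group("host"))
        else:
            # Warning without the assumed Context suffix.
            key = ("unknown", "unknown")
        counts[key] += 1
    return counts
```

To try it, open your splunkd.log (usually under $SPLUNK_HOME/var/log/splunk/), pass the file object to the function, and print `counts.most_common(10)` to see which sources trigger the warning most often.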

simonq · ‎01-21-2019

I too opened a case with Splunk regarding this, and we identified two causes. First, problems with the timezone parsing caused Splunk to think that events were months in the future. Second, the source systems were in different timezones, with some sending records timestamped in the local tz (e.g. EST, CST, MST) and others sending UTC, so those events appeared to be 5 hours in the future.

The months-in-the-future issues were fixed by modifying the timestamp parsers for those records. The timezone offset issue I am still trying to work out how to solve.
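To make the five-hour offset concrete, here is a small Python sketch of what happens when a timestamp with no zone information is written in UTC but interpreted as EST. It is not Splunk code. The helper name `apparent_skew` and the timestamp format are assumptions for illustration, and EST is modelled as a fixed UTC-5 offset that ignores daylight saving.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
EST = timezone(timedelta(hours=-5), "EST")  # fixed offset, ignores daylight saving


def apparent_skew(raw_timestamp, actual_tz, assumed_tz, fmt="%Y-%m-%d %H:%M:%S"):
    """Return how far in the future an event appears when its zone-less
    timestamp was written in actual_tz but is interpreted as assumed_tz."""
    naive = datetime.strptime(raw_timestamp, fmt)
    actual = naive.replace(tzinfo=actual_tz)    # when the event really happened
    assumed = naive.replace(tzinfo=assumed_tz)  # when the parser thinks it happened
    return assumed - actual


skew = apparent_skew("2018-12-19 15:08:35", actual_tz=UTC, assumed_tz=EST)
print(skew)  # 5:00:00
```

A positive result means the event lands in the future relative to its real time, which matches the symptom described above.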

julian0125 · ‎02-13-2019

Hello @simonq,

I have the same issue. Did you already fix it? It would be very helpful to me.

simonq · ‎02-14-2019

Hi Julian,

Yes, as my reply above says, we resolved this issue. It was mainly fixed by running a search to show us events "from the future" (e.g. "*" with the date/time range set from half an hour or so in the future to, say, a week in the future). That let us identify the data sources which were configured incorrectly and/or had datestamps being parsed incorrectly (e.g. UTC/GMT being parsed as 5 hours in the future, since we are in EST/UTC-5). We then fixed all of these to return correct/sane values, increased our hot buckets for the indexer DB, which had been set to 3 (I think I set it to 5), and finally restarted Splunk.

It's been a couple of weeks now and the warning has not returned.

jfaldmomacu · ‎07-09-2019

This helped me out. Thank you!

jacobpevans · ‎08-16-2019

In case anyone else finds this, I've done a pretty big write-up on two answers. Start here: https://answers.splunk.com/answers/725555/what-does-this-message-mean-regarding-the-health-s.html?ch...

Cheers,
Jacob

If you feel this response answered your question, please do not forget to mark it as such. If it did not, but you do have the answer, feel free to answer your own post and accept that as the answer.

julian0125 · ‎02-14-2019

That is perfect! Can you share your search with us? Thanks for your help @simonq

simonq · ‎02-14-2019

The search is literally "*" and then the date time range set to the future.

Esky73 · ‎02-14-2019

Something like:

index=* latest=+24h earliest=+1h

will show those events that are timestamped in the future.

tbalouch · ‎12-19-2018

I have been seeing a lot of this on Splunk Enterprise version 7.2.1

Esky73 · ‎01-14-2019

@jflaherty

I'm seeing this issue too. How did you get on with Splunk Support?

jflaherty · ‎01-23-2019

Not sure exactly what you are asking, but Splunk support closed the case right after they provided the DateParserVerbose error answer. As far as I can tell, that is likely the problem. I still have some bad data sources (databases etc.) whose output throws the error and creates too many buckets every once in a while.

jflaherty · ‎12-19-2018

I have been dealing with Splunk support on this issue. They think it may be because I have some data sources with events that are way off in time from other events from the same source (due to some devices incorrectly being set to a different time zone than all the others). When the time is that far off, apparently Splunk does not know how to deal with it and sticks it in a separate bucket. Do you see messages like this in your splunkd.log?

12-19-2018 10:08:38.875 -0500 WARN DateParserVerbose - Accepted time (Wed Dec 19 10:08:35 2018) is suspiciously far away from the previous event's time (Wed Dec 19 15:08:35 2018), but still accepted because it was extracted by the same pattern. Context: source=/log/switch/switch.log|host=XXXXX|switch|401549
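If you want to measure how far off the timestamps are, a short Python sketch like the one below can pull the two times out of a warning like the one above and subtract them. The regex and the `timestamp_gap` name are my own assumptions based only on this one sample line, so treat it as a starting point and not a full parser for splunkd.log.

```python
import re
from datetime import datetime

# Pattern assumed from the single sample warning quoted above.
TIME_RE = re.compile(
    r"Accepted time \((?P<accepted>[^)]+)\) is suspiciously far away "
    r"from the previous event's time \((?P<previous>[^)]+)\)"
)
TIME_FMT = "%a %b %d %H:%M:%S %Y"  # e.g. "Wed Dec 19 10:08:35 2018"


def timestamp_gap(line):
    """Return accepted time minus previous event time, or None if no match."""
    match = TIME_RE.search(line)
    if match is None:
        return None
    accepted = datetime.strptime(match.group("accepted"), TIME_FMT)
    previous = datetime.strptime(match.group("previous"), TIME_FMT)
    return accepted - previous


sample = (
    "12-19-2018 10:08:38.875 -0500 WARN DateParserVerbose - Accepted time "
    "(Wed Dec 19 10:08:35 2018) is suspiciously far away from the previous "
    "event's time (Wed Dec 19 15:08:35 2018), but still accepted because it "
    "was extracted by the same pattern. Context: "
    "source=/log/switch/switch.log|host=XXXXX|switch|401549"
)
gap = timestamp_gap(sample)
print(gap.total_seconds() / 3600)  # -5.0
```

A gap that is a whole number of hours, like the five hours here, points to a timezone mismatch and not to a clock that has drifted.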

tbalouch · ‎12-19-2018

You know, you actually bring up a great point. I will need to recheck the NTP settings on my Splunk Enterprise indexers just to make sure. Thanks, this is very useful!

Wasn't there a setting in indexes.conf that you could tweak to change the hot bucket rollover threshold from 50 to 75?
