Getting Data In

Splunk Add-On for AWS Small Hot Buckets - Why are we receiving this warning?

andrew_burnett
Path Finder

We are getting the small hot buckets warning for this index, but the timestamps look fine just with a few hours offset. Not quite sure where to go from here.

0 Karma

shivanshu1593
Builder

"A few hours offset" can be a big contributing factor here. How big of an offset are we talking here? If you haven't changed the value of maxHotBuckets for the index, it defaults to auto, which the indexers use to the set the value to 3. If the timestamp of the data is all over the place (Some in the future, some really old data), Splunk ingests it but force roll a hot bucket to create a new one for the data with unusual timestamp and if it happens frequently, you find a lot of small hot buckets being created in a short span of time.

  • Please check whether the data coming from AWS is being accepted and stored "in the future" by running a search such as index=yourindex sourcetype=aws:* earliest=+5m latest=+7d. If the volume is considerably large, this could be a big contributor to the error.
  • Please look for errors like "Accepted time is suspiciously far away from the previous event's time". This can tell you whether events were ingested with timestamps far back in the past, causing Splunk to create hot buckets for them.
  • Create a custom props.conf that defines TIME_PREFIX, MAX_TIMESTAMP_LOOKAHEAD and TIME_FORMAT, along with line breaking, to ensure that Splunk reads the timestamp properly from your data (see the sketch after this list).
  • Do a sanity check of the data itself. I've seen log sources in some environments where the timestamp within the log was all over the place all the time; I ended up using DATETIME_CONFIG = current to resolve the problem.
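For the props.conf points above, a minimal sketch might look like the following; the sourcetype name, timestamp format, and lookahead are illustrative placeholders that you would adjust to match what your AWS events actually look like:

# props.conf (illustrative values only)
[aws:example:sourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD = 32
# Last resort, only if the timestamps in the raw data are genuinely unreliable:
# DATETIME_CONFIG = current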

Hope this helps,


Thank you,
Shiv
###If you found the answer helpful, kindly consider upvoting/accepting it as the answer as it helps other Splunkers find the solutions to similar issues###
0 Karma

andrew_burnett
Path Finder

And all the timestamps are in the same TZ, so there are no weird differing times that I can see either.

0 Karma

andrew_burnett
Path Finder

So the add-on came with props, and what I mean by offset is that all the events are in a timezone 6 hours ahead of us, but when I search, it converts them to my time. When I tried the search, it failed with this message: "Unable to parse the search: Invalid time bounds in search: start=1654750800 > end=1654198380."

0 Karma

shivanshu1593
Builder

My bad. I wrote the time ranges incorrectly for the search. I've updated the answer above.
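In the meantime, a quick way to gauge how far off the event timestamps actually are is to compare event time with index time. This is just a sketch; yourindex is a placeholder for your index name:

index=yourindex sourcetype=aws:*
| eval lag_seconds = _indextime - _time
| stats min(lag_seconds) max(lag_seconds) avg(lag_seconds) by sourcetype

A consistently negative lag (index time earlier than event time) would mean the events are landing "in the future", which is exactly the pattern that forces hot buckets to roll.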

Thank you,
Shiv
###If you found the answer helpful, kindly consider upvoting/accepting it as the answer as it helps other Splunkers find the solutions to similar issues###
0 Karma

andrew_burnett
Path Finder

I have no events in the future.

0 Karma

shivanshu1593
Builder

Interesting. Could you kindly share which version of Splunk you are using, and what percentage of small buckets Splunk reports in the error message?

Please try running the search from this post and see if the index that you got the error message for gets identified. 

https://community.splunk.com/t5/Getting-Data-In/The-percentage-of-small-of-buckets-is-very-high-and-...
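If the search in that post is hard to dig out, something along these lines should approximate it; this is a sketch based on the size and idx fields that HotBucketRoller writes to splunkd.log, with an arbitrary 10 MB threshold for "small":

index=_internal sourcetype=splunkd component=HotBucketRoller "finished moving hot to warm"
| eval size_mb = round(size/1024/1024, 2)
| stats count(eval(size_mb < 10)) as small_buckets, count as total_buckets by idx, host
| eval small_pct = round(small_buckets / total_buckets * 100, 1)
| sort - small_pct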

 

Could you also check:

  • Whether the issue is present on all of your indexers.
  • Whether the affected indexers were restarted recently or had any issues accepting data (this can be found by looking into splunkd.log).
Thank you,
Shiv
###If you found the answer helpful, kindly consider upvoting/accepting it as the answer as it helps other Splunkers find the solutions to similar issues###
0 Karma

shivanshu1593
Builder

Could you please run the following search for the last 7 days and see if it returns the name of the affected indexer. If it doesn't return a result, please take the "Received shutdown signal." string and search for it in the splunkd.log of the indexer (if you have access to its box).

index=_internal "Received shutdown signal." sourcetype=splunkd component!="SearchParser" 
| dedup host 
| stats max(_time) as _time by host

 

Thank you,
Shiv
###If you found the answer helpful, kindly consider upvoting/accepting it as the answer as it helps other Splunkers find the solutions to similar issues###
0 Karma

andrew_burnett
Path Finder

Beyond just a second ago, when I restarted it, nothing is popping up.

0 Karma

shivanshu1593
Builder

Okay, this seems interesting though. We've eliminated the most common reasons for this issue. The only ones that remain are:

  • Events with timestamp extraction issues: search for the string "is suspiciously far away from the previous event's time" and check whether it is happening for your affected log source (see the sketch after this list).
  • A network connectivity issue between the HF and the indexers (which is highly unlikely).
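A minimal sketch of that first check, assuming the message is emitted by the DateParserVerbose component as it normally is:

index=_internal sourcetype=splunkd component=DateParserVerbose "suspiciously far away from the previous event's time"
| stats count by host

Then inspect the raw events it returns to see whether the context mentions your AWS source or sourcetype.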

 

Thank you,
Shiv
###If you found the answer helpful, kindly consider upvoting/accepting it as the answer as it helps other Splunkers find the solutions to similar issues###
0 Karma

andrew_burnett
Path Finder

Well, there are no errors in Splunk for that sourcetype, so nothing like that is being flagged in the data. And all the connections seem fine; there are other add-ons on that HF that are reporting fine.

0 Karma

andrew_burnett
Path Finder

I am running 8.2.4 with 69% small buckets, and it's only flagging on one of my indexers. And I don't see any errors in splunkd regarding that add-on.

0 Karma

andrew_burnett
Path Finder

Restarting the indexer got rid of the problem for now; I'm not sure it's going to fix the underlying problem.

0 Karma

shivanshu1593
Builder

Ah. Was the instance recently restarted as well? If there's no problem with the log source, you hopefully shouldn't face the issue again anytime soon. Restarting the indexers rolls all the hot buckets to warm, so that would have done the trick for now.
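If you want to keep an eye on whether small hot buckets start piling up again, a dbinspect sketch like this can help; yourindex is a placeholder:

| dbinspect index=yourindex
| where state="hot"
| table bucketId, splunkServer, sizeOnDiskMB, startEpoch, endEpoch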

Thank you,
Shiv
###If you found the answer helpful, kindly consider upvoting/accepting it as the answer as it helps other Splunkers find the solutions to similar issues###
0 Karma

andrew_burnett
Path Finder

It came back within 2 hours and is now affecting two indexers.

0 Karma

jamie00171
Communicator

Hi @andrew_burnett 

Can you run the following search please:

index=_internal component=HotBucketRoller idx=<insert impacted index name here>
| stats count by caller

which will show why the hot buckets are being rolled.

Then you can run:

index=_internal component=IndexWriter idx=<insert impacted index name here>

which should show more details about why new hot buckets are being created and, if it was due to the timestamp of an event, it will show it.

Thanks, 

Jamie

0 Karma

andrew_burnett
Path Finder

This is the output of the first search.

[Screenshot: results of the HotBucketRoller search (andrew_burnett_0-1654526730675.png)]

0 Karma