Solved: Can I optimise search by increasing hot buckets? - Splunk Community

Three questions in one.

Are hot buckets faster than warm buckets for search?
If so, is it because they are in memory or because the file is already open?
Is it a good idea to have 30+ hot buckets to speed up data access?

For background, we are indexing 100GB/day and searches over a few hours seem slow, so we are looking for ways to optimise.

Solution

Hot buckets are not faster; they're merely the ones that are being written to. Increasing the number of them can help search performance, but in a subtle way: see below.

Sometimes, when you're indexing a lot of data from different sources, the subtle time differences between machines mean that events arriving at the indexer are slightly offset from one another in time. Splunk likes to keep the timeline relatively smooth within a given bucket, so it might write event #1 to one bucket but event #2 to another, to align with the time of the events already in those buckets.

So now a new event arrives, and it has a timestamp that belongs in neither bucket #1 nor bucket #2. Splunk creates a new bucket. But if I now have more hot buckets than the maximum allowed, it's time to roll one to warm. Let's say we selected bucket #2 to go to warm. Now it's closed up, its files are no longer being written to, and it enters the warm state. But bucket #2 was only 100M when it was rolled. That's pretty small for a bucket, especially when you're indexing 100G / day.

This is where search performance comes in: if you're rolling buckets too fast and ending up with a lot of small buckets, search performance suffers, because Splunk has to open more and more buckets to find the events for a given time range.
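To check whether you actually have a lot of small buckets, something like the following search may help. It is only a sketch: it assumes the index in question is "main" (substitute your own index name) and relies on the `state` and `sizeOnDiskMB` fields returned by the `dbinspect` command.

```
| dbinspect index=main
| stats count AS buckets, avg(sizeOnDiskMB) AS avg_size_mb, min(sizeOnDiskMB) AS min_size_mb BY state
```

If the warm buckets turn out to be far smaller than the configured maximum bucket size, that suggests they are being rolled early rather than because they filled up.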

You can see why buckets are being rolled with a search like this one:

index=_internal source=*splunkd.log databasePartitionPolicy moving

You'll get events from Splunk which indicate why the bucket went from hot to warm. If it's for reasons like "exceeded maxHotBuckets", then you might not have enough. The "main" index has defaults set up for indexing a lot of data. It uses ten (10) max hot buckets, and uses the "auto_high_volume" parameter for a size limit (10G on 64-bit systems). If you're indexing at a high volume to an index other than main, it might benefit you to mimic some of the config of the main index.
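As a rough illustration of mimicking the main index, an indexes.conf stanza might look like the one below. The index name "app_logs" and its paths are hypothetical placeholders, not taken from your setup; only `maxHotBuckets` and `maxDataSize` are the settings discussed above.

```
# indexes.conf
# "app_logs" is a hypothetical index name; adjust the name and paths to your environment.
[app_logs]
homePath   = $SPLUNK_DB/app_logs/db
coldPath   = $SPLUNK_DB/app_logs/colddb
thawedPath = $SPLUNK_DB/app_logs/thaweddb
# Same values as described above for the main index
maxHotBuckets = 10
maxDataSize   = auto_high_volume
```

Treat these values as a starting point and re-run the roll-reason search afterwards to see whether buckets are still being rolled for exceeding the hot bucket limit.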

Finally, have a look here about ways to evaluate search performance, and optimize your searches.

I don't think hot buckets are faster. Hot and warm buckets occupy the same disk. I think the only main difference is that hot buckets are open for write operations. I do know that when Splunk restarts, hot buckets are immediately rolled to warm. Have you thought about segmenting your data into different indexes based on event type? Also, how much search optimization have you done, how many concurrent searches are running, and can you use summary indexing to roll up your events into smaller result sets? Do your extractions use a lot of regexes or delimiters to break up the data?