Getting Data In

Managing bucket sizes?

frankejj
Explorer

In my indexes I set maxTotalDataSizeMB = 8000

However, today my 18GB volume was full. I looked in the defaultdb directory and all of the files in the hot buckets were consuming all of the disk space. There was nothing located in warm/cold/frozen locations as I would have expected from reading the documentation on managing indexes/disk.

I manually deleted defaultdb/db/hot* and it freed up 15+ GB.

Can someone please explain why this does not seem to be working as expected or is there something else to check?

Thanks, John

jrodman
Splunk Employee

The maxTotalDataSizeMB limit is applied only once buckets have left the 'hot' state, that is, once they are no longer being actively written to. Since you can have many GB in hot buckets, actual disk usage can overshoot this value considerably for very small indexes.

Please understand that a typical customer call involves indexes in the size range from hundreds of GB to hundreds of TB, and these settings work well at those sizes.

For non-main indexes, the default bucket size is much smaller (100MB, I think?), and the default hot bucket count is 1, which means this issue doesn't end up mattering much there.

It sounds like you've created indexes with your own settings, which is good. You'll probably want to tune maxDataSize for those indexes (the bucket size constraint) as well as maxHotBuckets (2 or 3 are good numbers for most smaller indexes).
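
As a rough illustration of that tuning, an indexes.conf stanza might look like the sketch below. The index name my_small_index and all of the numbers are hypothetical placeholders, not recommendations; maxDataSize is given in MB.

```
# Hypothetical index; name, paths and values are illustrative assumptions.
[my_small_index]
homePath   = $SPLUNK_DB/my_small_index/db
coldPath   = $SPLUNK_DB/my_small_index/colddb
thawedPath = $SPLUNK_DB/my_small_index/thaweddb
# Overall size limit for the index, in MB.
maxTotalDataSizeMB = 8000
# Roll a hot bucket to warm once it reaches about 500 MB.
maxDataSize = 500
# Allow at most 2 hot buckets to be open at once.
maxHotBuckets = 2
```

With values like these, hot buckets should hold roughly 2 x 500 MB = 1 GB at most, so the amount by which disk usage can exceed maxTotalDataSizeMB stays small relative to the volume.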

Incidentally, the overall sizing story is admittedly a bit baroque, and in 4.2 we have the idea of setting aside a pool of space and telling a group of indexes to simply stay inside that space, which I think will address most users' needs.
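
For readers on later releases: a shared pool of this kind can be expressed with a volume stanza in indexes.conf. The sketch below assumes that syntax is available in your version; the volume name, path, index name and size are all hypothetical.

```
# Hypothetical volume acting as a shared pool; name, path and size are assumptions.
[volume:small_indexes]
path = /opt/splunk/var/lib/splunk/small
# Cap on the combined size of all index data stored on this volume, in MB.
maxVolumeDataSizeMB = 15000

# Hypothetical index that keeps its hot/warm and cold buckets inside the pool.
[app_logs]
homePath   = volume:small_indexes/app_logs/db
coldPath   = volume:small_indexes/app_logs/colddb
# thawedPath cannot reference a volume, so it uses a plain path.
thawedPath = $SPLUNK_DB/app_logs/thaweddb
```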

frankejj
Explorer

Thanks for the info. I will increase the disk space on the indexer volumes and try to tune them a bit further. I am mostly working with the main index (defaultdb) but will switch to a dedicated index if necessary.

Looking forward to 4.2 for some improvements in this area, but I can certainly make do with what is available at the moment.

Cheers!

rsimieng
Splunk Employee

Inside your $SPLUNK_HOME/etc/system/local/indexes.conf, did you apply maxTotalDataSizeMB specifically to an index of your own creation or to [default]?

As a test, create an index [test], limit its size with maxTotalDataSizeMB = 5, and run data through it. Then check your index size (gui --> manager is fine). If you get a chance, post your indexes.conf.
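
A minimal stanza for that test might look like the following sketch. The paths are assumptions that follow the usual $SPLUNK_DB layout; adjust them to your installation.

```
# Throwaway test index; paths are assumed, not taken from your setup.
[test]
homePath   = $SPLUNK_DB/test/db
coldPath   = $SPLUNK_DB/test/colddb
thawedPath = $SPLUNK_DB/test/thaweddb
# Deliberately tiny limit (in MB) so the size behavior shows up quickly.
maxTotalDataSizeMB = 5
```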
