
Why is my current index size larger than the max index size?

tchaeunsang1
New Member

Hello guys, I want to learn how indexes work, so I created an index and gave it a maximum size of 20MB. I'm collecting data in real time, but what I see is that the current size exceeds the max size. When I restart Splunk, I get 1MB as the current size. From what I know, that's because the data rolled from hot to warm, but why doesn't the same thing happen when the current size exceeds the maximum size?
Thanks.


spavin
Path Finder

Hi @tchaeunsang1,

The index you created may have looked something like this:

[testindexsize]
homePath = $SPLUNK_DB\$_index_name\db
coldPath = $SPLUNK_DB\$_index_name\colddb
thawedPath = $SPLUNK_DB\testindexsize\thaweddb
homePath.maxDataSizeMB = 0 
coldPath.maxDataSizeMB = 0
maxTotalDataSizeMB = 20

That makes an index with a maximum size of 20MB, with no other limits on how big the hot/warm or cold DBs can get.

When you add data, it goes into a hot bucket. The bucket size (maxDataSize) defaults to "auto", which is 750MB. That means your data will grow into a 750MB bucket before rolling to warm (assuming you have data constantly coming in).

Once it rolls to warm, that's when the maxTotalDataSizeMB kicks in. It sees that the hot/warm db is taking up too much space, and so it starts to roll buckets to get back down to 20MB. That means the warm bucket is removed, and you're back to under 20MB.

The rolling process happens when:

  • The DB size gets higher than maxTotalDataSizeMB (it takes into account hot buckets, but will only roll from warm to cold and freeze cold, so your hot bucket is safe)
  • The most recent timestamp in the bucket is older than the frozenTimePeriodInSecs
  • splunkd is restarted - this will roll all hot buckets into warm buckets
  • You manually run: splunk _internal call /data/indexes/testindexsize/roll-hot-buckets

Try repeating your experiment with maxDataSize=1

This represents the maximum size in MB for a hot DB to reach before a roll to warm is triggered.

You should then see your index keeping much closer to the 20MB limit.
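
A minimal sketch of that change, assuming the stanza from the example above lives in indexes.conf and the index is still named testindexsize (the paths and the other settings are copied from that example, not from your actual configuration):

```ini
[testindexsize]
homePath = $SPLUNK_DB\$_index_name\db
coldPath = $SPLUNK_DB\$_index_name\colddb
thawedPath = $SPLUNK_DB\testindexsize\thaweddb
homePath.maxDataSizeMB = 0
coldPath.maxDataSizeMB = 0
# Roll a hot bucket to warm once it reaches about 1MB (the default "auto" is 750MB)
maxDataSize = 1
# Overall size cap for the index
maxTotalDataSizeMB = 20
```

A bucket size this small is only sensible for a test like this one, since it produces a large number of tiny buckets.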


tchaeunsang1
New Member

Thanks a lot for your answer! Actually, I changed the bucket size to 10MB, so every time the current size reaches the max data size, the hot data rolls to warm and the current size decreases by 10MB (the bucket size).
