Getting Data In

Question about max total and hot/warm/cold size

vitojij183
Explorer

hi,

I configured my index like this:

# volume definitions

[volume:hotwarm_cold]
path = /mnt/fast_disk
maxVolumeDataSizeMB = 5976884

# index definition (calculation is based on a single index)

[main]
homePath = volume:hotwarm_cold/defaultdb/db
coldPath = volume:hotwarm_cold/defaultdb/colddb
thawedPath = $SPLUNK_DB/defaultdb/thaweddb
homePath.maxDataSizeMB = 768000
coldPath.maxDataSizeMB = 2304000
maxWarmDBCount = 4294967295
frozenTimePeriodInSecs = 10368000
maxDataSize = auto_high_volume
coldToFrozenDir = /mnt/fast_disk/defaultdb/frozendb

 

but in index management I see

Max Size of the Entire Index: 500000

What does Max Size of Entire Index do? I configured my hot/warm size to 750 GB, so what happens when my index reaches the Max Size of Entire Index value?

 

The second question: what does Max Size of Hot/Warm/Cold Bucket do, and what is the difference between auto and auto_high_volume?

 

best regards


96nick
Communicator

When your index hits 500000 MB (500 GB), the oldest data will be rolled to frozen. If you don't have a frozen path/script in place when that happens, the data will be deleted.

That 500 GB limit comes from the default indexes.conf; the setting behind it is maxTotalDataSizeMB. You'll have to raise it to a value you're comfortable with in order to take advantage of the 750 GB you gave hot/warm on that index, because the 500 GB cap is enforced regardless of the extra space you allocated to hot/warm and cold.
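For example, you could add something like this to your index stanza (the 3072000 here is just illustrative, the sum of your homePath and coldPath caps, 768000 + 2304000; pick whatever fits your disk):

[main]
homePath.maxDataSizeMB = 768000
coldPath.maxDataSizeMB = 2304000
# default is 500000 MB; this caps hot + warm + cold for the entire index
maxTotalDataSizeMB = 3072000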

The max size of Hot/Warm/Cold is simply what it says: the maximum amount of space (in MB) that each phase of the data lifecycle can take up. These are controlled by homePath.maxDataSizeMB (hot/warm) and coldPath.maxDataSizeMB (cold). When those values are hit, the oldest buckets are sent to the next phase of the data lifecycle. Side note: there is no way to separate out max values for hot and warm.
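Roughly, with a made-up index name and made-up numbers, the lifecycle caps look like this:

[example_index]
homePath = volume:hotwarm_cold/example_index/db
coldPath = volume:hotwarm_cold/example_index/colddb
thawedPath = $SPLUNK_DB/example_index/thaweddb
# when hot + warm exceeds this, the oldest warm bucket rolls to cold
homePath.maxDataSizeMB = 100000
# when cold exceeds this, the oldest cold bucket rolls to frozen (deleted if no frozen path/script)
coldPath.maxDataSizeMB = 300000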

maxDataSize applies to hot buckets only: it is the size a hot bucket can reach before it rolls to warm. Typically for high-volume indexes you would set auto_high_volume (which lets a hot bucket grow to about 10 GB on 64-bit systems, versus roughly 750 MB for auto) so that buckets still roll at a sensible rate. If you set an inactive/slow index to auto_high_volume, you risk the data staying in hot for a long time. This is bad since retention is only enforced after buckets leave hot, so your data will sit in hot buckets and be sad. More on that in this answer.
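For instance (both index names below are just placeholders):

# high-volume index: let hot buckets grow large before rolling to warm
[busy_index]
maxDataSize = auto_high_volume

# low-volume index: keep the default so smaller hot buckets roll to warm sooner
[quiet_index]
maxDataSize = auto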

 

Hope that helped! 
