Can anyone provide a simple stanza example for a 30 day retention in the indexes under 4.2.3, where:
Hot/Warm for 10 days with 100GB local storage of fast spindles available.
Then push to Cold for 20 days where there are TB's of storage.
Then delete on the 30th day. No frozen.
All thoughts are greatly appreciated.
[your_idx_here]
homePath.maxDataSizeMB = 100000      # ~100GB of hot/warm on the fast spindles
coldPath.maxDataSizeMB = 2000000     # ~2TB of cold storage
frozenTimePeriodInSecs = 2592000     # 30 days x 86400 seconds; data older than this is frozen (deleted)
maxTotalDataSizeMB = 2000000         # overall cap for the whole index
homePath = $SPLUNK_HOME/var/lib/splunk/your_idx_here/db/
coldPath = $SPLUNK_HOME/var/lib/splunk/your_idx_here/colddb/
maxDataSize = auto_high_volume       # ~10GB hot buckets for a high-volume index
By default, Splunk deletes frozen (archived) data. As long as you don't specify a path for a frozen archive, Splunk will simply delete data when it freezes.
Splunk 4.2.3 doesn't have a way to set an age limit specifically for hot/warm data. Data will stay on your fast disk until homePath reaches 100GB or the data hits 30 days (whichever comes first), and in the latter case it will basically skip the cold stage.
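For contrast, and only if you ever wanted to archive instead of delete at the 30-day mark, my understanding is you would add a frozen archive destination to the same stanza; leaving it out (as above) keeps the delete-on-freeze behavior. A minimal sketch, with a made-up archive path:
[your_idx_here]
# Hypothetical only - omit this line to have Splunk delete frozen data
coldToFrozenDir = /archive/your_idx_here/frozendb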
good luck!
Don't forget to vote 😉
No worries on that... I have headroom. I'll run with this for a few days and monitor before changing settings across all indexers. thx!
I should also add that you should set sizes to leave some headroom on the filesystem. I.e., if you have exactly 100GB for the filesystem, you might set the max size of the index to 90GB. Modern versions of Splunk will ensure you keep at least 2GB free, but may shut down indexing if this limit is hit.
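As a rough sketch of what that headroom might look like for the 100GB hot/warm volume in this thread (the 90GB figure is just an illustration, not a documented recommendation):
[your_idx_here]
# Filesystem is exactly 100GB, so cap hot/warm below that to avoid filling the disk
homePath.maxDataSizeMB = 90000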
Not sure why Simon set that value originally, and I don't have the complete story here, so if there was a specific reason, you may want to find out from him.
That being said, you are absolutely correct: homePath.maxDataSizeMB = 100000 will be triggered before you reach anywhere near 300 warm buckets.
A good way to look at the way Splunk ages and archives data is as a series of triggers (size, age, count). If any one of these triggers is tripped, action occurs.
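To make that concrete, here is how the settings already mentioned in this thread map onto those three triggers (values are the ones used above, not recommendations):
[your_idx_here]
# Size triggers
maxDataSize = auto_high_volume     # roll a hot bucket to warm when it reaches ~10GB
homePath.maxDataSizeMB = 100000    # roll warm to cold when hot/warm totals 100GB
maxTotalDataSizeMB = 2000000       # freeze (delete) the oldest data when the index hits 2TB
# Age trigger
frozenTimePeriodInSecs = 2592000   # freeze (delete) data older than 30 days
# Count trigger
maxWarmDBCount = 300               # roll the oldest warm bucket to cold past 300 warm buckets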
You gotta ask SYep about those settings. He and your engineers came out and set it up. So it sounds like even if I set maxWarmDBCount to 300 and leave maxHotBuckets at 3, homePath.maxDataSizeMB will still make sure I don't go over 100GB and run out of space.
Sorry, auto_high_volume should have been included in my original answer. That will ensure that your buckets are 10GB in size (instead of 100MB). Why have you set maxWarmDBCount = 5? The default is 300. It is okay if 5 fits your retention scheme; just understand that your hot/warm data will never get above 80GB (5x10GB for warm + 3x10GB for hot).
Also note that after making these changes you may have to wait and watch how things roll (how long will depend on how much data Splunk is eating daily).
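If the intent is to let the 100GB size cap, rather than the warm bucket count, drive the warm-to-cold roll, a minimal sketch of the relevant lines might look like this (restoring the 300 default discussed above):
[main]
maxDataSize = auto_high_volume   # ~10GB buckets, per the advice above
maxWarmDBCount = 300             # back to the default; at 5 it caps hot/warm around 80GB (5x10GB + 3x10GB)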
Also, the dates and timestamps of the hot/warm and cold buckets all coincide with the time I made the change.
I used to have maxDataSize set to auto_high_volume, but took it out per what you had shown above. So currently I have:
[default]
maxWarmDBCount = 5
maxHotBuckets = 3

[main]
homePath = /opt/splunk/var/lib/splunk/defaultdb/db
coldPath = /storage/defaultdb/colddb
homePath.maxDataSizeMB = 100000
coldPath.maxDataSizeMB = 2000000
frozenTimePeriodInSecs = 2592000
maxTotalDataSizeMB = 2000000
You should look in the directories to see what the files show. Each bucket will have a date range on it which should help you figure out what is what. There are a few other settings that may control the size of data, including how many warm buckets you keep (maxWarmDBCount), whether you have maxDataSize = auto or auto_high_volume, and how often you rotate buckets (rotatePeriodInSecs).
It is very possible that you have maxDataSize = auto on the index(es) you are most concerned about, and thus they are writing only 100MB per bucket.
Let me know what you find!
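For reference, each warm/cold bucket directory name encodes the newest and oldest event times as epoch seconds, which is how you can read the date range straight off the disk. The numbers and id below are made up, just to show the shape:
db_1312848000_1310256000_42    # newest event time _ oldest event time _ bucket id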
Hmm... my hot/warm index was at 102GB. With these settings and a restart of Splunk, once the splunk_optimizer finished running, my hot/warm is down to 31GB and I have 71GB in the coldDB. Doesn't seem like it did what it was supposed to do by the numbers set. Thoughts?
Take a look here:
The 'putting it all together' section has the example you're wanting to review
You can't manage the home path (where hot/warm data is stored) based on a time period; it needs to be based on size. Like this:
homePath.maxDataSizeMB = 1000
Similarly, you've got to use size for the cold path.
coldPath.maxDataSizeMB = 1000
You can set 'frozenTimePeriodInSecs' in the stanza to control the overall retention period (age), after which data is frozen.
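Putting those pieces together, a minimal sketch with placeholder sizes (the 1000MB values above are just examples, size them for your own volumes):
[your_idx_here]
homePath.maxDataSizeMB = 1000      # hot/warm rolls to cold once this size is reached
coldPath.maxDataSizeMB = 1000      # cold rolls to frozen once this size is reached
frozenTimePeriodInSecs = 2592000   # anything older than 30 days is frozen (deleted if no archive path is set)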