Deployment Architecture

Simple 30-day retention under 4.2.3?

Path Finder

Can anyone provide a simple stanza example for 30-day retention in the indexes under 4.2.3, where:

  1. Hot/warm for 10 days, with 100GB of local storage on fast spindles available.

  2. Then push to cold for 20 days, where there are TBs of storage.

  3. Then delete on the 30th day. No frozen.

All thoughts are greatly appreciated.

1 Solution

Splunk Employee

[your_idx_here]
homePath.maxDataSizeMB = 100000
coldPath.maxDataSizeMB = 2000000
frozenTimePeriodInSecs = 2592000
maxTotalDataSizeMB = 2000000
homePath = $SPLUNK_HOME/var/lib/splunk/your_idx_here/db/
coldPath = $SPLUNK_HOME/var/lib/splunk/your_idx_here/colddb/
maxDataSize = auto_high_volume

By default, Splunk deletes all archived (frozen) data. As long as you don't specify a path for a frozen directory, Splunk will delete data when it ages out.
Splunk 4.2.3 doesn't have a way to specify an age limit for hot/warm data. Therefore, data will stay on your fast disk until it reaches 100GB or hits 30 days (whichever comes first), and will essentially skip the cold stage.
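For reference, the frozenTimePeriodInSecs value in the stanza above is just the 30-day retention window expressed in seconds. A quick sketch of the arithmetic (plain Python; the helper name is illustrative, not a Splunk API):

```python
# Convert a retention window in days to the seconds value
# expected by indexes.conf's frozenTimePeriodInSecs.
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def retention_secs(days: int) -> int:
    """Retention window in days -> frozenTimePeriodInSecs value."""
    return days * SECONDS_PER_DAY

print(retention_secs(30))  # 2592000, matching the stanza above
```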


Splunk Employee

Good luck!

Don't forget to vote 😉


Path Finder

No worries on that; I have headroom. I'll run with this for a few days and monitor before changing settings across all indexers. Thanks!


Splunk Employee

I should also add that you should set sizes to leave some headroom on the filesystem. For example, if you have exactly 100GB on the filesystem, you might set the max size of the index to 90GB. Modern versions of Splunk will ensure you keep at least 2GB free, but may shut down indexing if this limit is hit.
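The headroom rule above can be sketched as a small helper. The 10% figure is this sketch's assumption matching the 100GB→90GB example, not a Splunk default; the function name is illustrative:

```python
def index_cap_mb(filesystem_mb: int, headroom_fraction: float = 0.10) -> int:
    """Size an index cap to leave some headroom on its filesystem.
    headroom_fraction is an illustrative choice, not a Splunk setting."""
    if not 0 <= headroom_fraction < 1:
        raise ValueError("headroom_fraction must be in [0, 1)")
    return int(filesystem_mb * (1 - headroom_fraction))

# 100GB (100000MB) filesystem -> cap the index near 90GB, per the advice above
print(index_cap_mb(100_000))  # 90000
```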


Splunk Employee

Not sure why Simon set that value originally, and I don't have the complete story here, so if there was a specific reason, you may want to find out from him.
That being said, you are absolutely correct: homePath.maxDataSizeMB = 100000 will be triggered before you reach anywhere near 300 warm buckets.
A good way to think about how Splunk archives data is as a series of triggers (size, age, count). If any one of these triggers is tripped, action occurs.
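The "series of triggers" idea can be sketched as a toy model — this is an illustration of the size/age/count rule of thumb, not Splunk internals, and all names are hypothetical:

```python
def first_tripped_trigger(size_mb, age_secs, warm_count,
                          max_size_mb, max_age_secs, max_warm):
    """Return which trigger fires first ('size', 'age', or 'count'),
    or None. Any one tripped trigger is enough to cause action."""
    if size_mb >= max_size_mb:
        return "size"
    if age_secs >= max_age_secs:
        return "age"
    if warm_count >= max_warm:
        return "count"
    return None

# homePath.maxDataSizeMB = 100000 trips long before maxWarmDBCount = 300
print(first_tripped_trigger(100_000, 0, 12,
                            100_000, 2_592_000, 300))  # size
```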


Path Finder

You'd have to ask SYep about those settings. He and your engineers came out and set it up. So it sounds like even if I set maxWarmDBCount to 300 and leave maxHotBuckets at 3, homePath.maxDataSizeMB will still make sure I don't go over the 100GB and run out of space.


Splunk Employee

Sorry, auto_high_volume should have been included in my original answer. That will ensure that your buckets are 10GB in size (instead of 100MB). Why have you set maxWarmDBCount = 5? The default is 300. It is okay if 5 fits your retention scheme; just understand that your hot/warm data will never get above 80GB (5×10GB warm + 3×10GB hot).
Also note, after making these changes, you may have to wait and watch how things roll (how long will depend on how much data Splunk is ingesting daily).
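The 80GB ceiling works out as simple bucket arithmetic (the 10GB figure is the auto_high_volume bucket size stated in the answer above):

```python
BUCKET_GB = 10  # auto_high_volume bucket size, per the answer above

max_warm_db_count = 5  # the poster's setting (default is 300)
max_hot_buckets = 3

# Hot/warm data can never exceed (warm + hot buckets) x bucket size.
hot_warm_ceiling_gb = (max_warm_db_count + max_hot_buckets) * BUCKET_GB
print(hot_warm_ceiling_gb)  # 80, well under the 100GB homePath cap
```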


Path Finder

Also, the dates and timestamps of the hot/warm and cold buckets all coincide with the time I made the change.


Path Finder

I used to have maxDataSize set to auto_high_volume, but took it out per what you had shown above. So currently I have:
[default]
maxWarmDBCount = 5
maxHotBuckets = 3

[main]
homePath = /opt/splunk/var/lib/splunk/defaultdb/db
coldPath = /storage/defaultdb/colddb
homePath.maxDataSizeMB = 100000
coldPath.maxDataSizeMB = 2000000
frozenTimePeriodInSecs = 2592000
maxTotalDataSizeMB = 2000000


Splunk Employee

You should look in the directories to see what the files show. Each bucket will have a date range on it, which should help you figure out what is what. There are a few other settings that may control the size of data, including how many warm buckets you keep (maxWarmDBCount), whether you have maxDataSize = auto or auto_high_volume, and how often you rotate buckets (rotatePeriodInSecs).

It is very possible that you have maxDataSize = auto on the index(es) you are most concerned about, and thus they are writing only 100MB per bucket.

Let me know what you find!
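Warm and cold bucket directories encode their event time range in the name (db_<newest-epoch>_<oldest-epoch>_<id>). A sketch of decoding that range, assuming that naming convention; the function name is illustrative:

```python
from datetime import datetime, timezone

def bucket_time_range(dirname: str):
    """Parse a bucket directory name of the form
    db_<newestEpoch>_<oldestEpoch>_<id> into (oldest, newest)
    UTC datetimes. Assumes the db_*_*_* naming convention."""
    _, newest, oldest, _ = dirname.split("_")
    to_dt = lambda s: datetime.fromtimestamp(int(s), tz=timezone.utc)
    return to_dt(oldest), to_dt(newest)

# Hypothetical bucket covering one week of events
oldest, newest = bucket_time_range("db_1315526400_1314921600_42")
print(oldest.date(), newest.date())  # 2011-09-02 2011-09-09
```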


Path Finder

Hmm... my hot/warm index was at 102GB, and with these settings and a restart of Splunk, once the splunk_optimizer finished running, my indexes are down to 31GB and I have 71GB in the coldDB. Doesn't seem like it did what it was supposed to do by the numbers set. Thoughts?


Splunk Employee

Take a look here:

http://docs.splunk.com/Documentation/Splunk/4.2.3/Admin/HowSplunkstoresindexes#How_to_configure_inde...

The 'Putting it all together' section has the example you want to review:

  1. You can't manage the home path (where hot/warm data is stored) by time period; it has to be based on size, like this:

    homePath.maxDataSizeMB = 1000

  2. Similarly, you've got to use size for the cold path:

    coldPath.maxDataSizeMB = 1000

  3. You can set frozenTimePeriodInSecs in the stanza to control the overall retention age.
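Putting those three constraints together, the numbers in a stanza can be sanity-checked mechanically. A hypothetical helper, not a Splunk tool; parameter names are illustrative:

```python
def check_retention(home_mb, cold_mb, total_mb, frozen_secs, days):
    """Sanity-check the knobs: home and cold paths are size-based,
    overall retention is time-based via frozenTimePeriodInSecs.
    Returns a list of problems (empty means consistent)."""
    problems = []
    if frozen_secs != days * 86400:
        problems.append("frozenTimePeriodInSecs != retention days")
    if total_mb < home_mb:
        problems.append("maxTotalDataSizeMB smaller than homePath cap")
    return problems

# The accepted answer's 30-day stanza checks out cleanly
print(check_retention(100_000, 2_000_000, 2_000_000, 2_592_000, 30))  # []
```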