Hello,
I'm looking to set up a log retention policy for a specific index, for example index=test.
Here's what I'd like to configure:
- Total retention time = 24 hours
- First 12 hours in hot+warm, then
- Next 12 hours cold.
- After that, the data should be archived (not deleted).
How exactly should I configure this, please? Also, does the number of buckets need to be adjusted to support this setup properly on such a short timeframe?
Thanks in advance for your help.
Hi
This question is asked quite often, and you can find many explanations from the community quite easily.
I've added here some posts which you should read to better understand the challenges behind your needs.
But shortly, here is what those mean when we look at your request.
There are many attributes you need to use to achieve your target, but I'm quite sure that you cannot combine them so that you get 100% of what you are requesting.
@livehybrid has already answered you with one example as a starting point.
The 1st issue is that you cannot force the warm -> cold transition by time. The only options are the number of warm buckets and the size of homePath; if you are using volumes, the total volume size is also used, but usually you have some other indexes on the same volume too. None of these depend on time, just on the number and size of hot+warm buckets.
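To make that concrete, here is a minimal sketch of those size-based controls (the values are made-up placeholders you would have to tune, not recommendations):

[test]
# warm -> cold rolls once there are more than this many warm buckets (per indexer)
maxWarmDBCount = 4
# ... or once hot+warm for this index on this indexer exceeds this size (MB)
homePath.maxDataSizeMB = 10240

Whichever limit is hit first wins, and neither of them knows anything about event timestamps.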
The 2nd issue is that, depending on data volumes and the number of indexers, it is even harder to control the number of buckets. All these settings apply per indexer; there is no relation to other indexers or to the indexes they hold. Actually it's not even indexer dependent, it's dependent on the number of indexing pipelines. So if you have e.g. 10 indexers, all those parameters which @livehybrid presented must be multiplied by 10, and if you have e.g. 2 ingesting pipelines per indexer, you must multiply the previous result by 2. And as each indexer/pipeline normally has 3 open hot buckets, you must again multiply the previous result by 3 (or by some other value if you have changed that bucket amount).
This means that when you are estimating the number of warm buckets needed to keep that 12h of data in hot+warm, you must divide your data by (3 * #pipelines * #indexers) to get an estimate of what maxWarmDBCount you should use.
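A hypothetical worked example (all numbers invented purely for illustration):

10 indexers * 2 pipelines * 3 open hot buckets = 60 buckets filling in parallel
12h of ingest ~ 120 full buckets of maxDataSize (estimated from your daily volume)
120 / 60 = 2  ->  try maxWarmDBCount = 2 per indexer for this index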
And to get this working correctly, your source systems' events must be spread equally across all your indexers for that calculation to hold. Of course, this also expects that your data volume is flat over time. If your data volume follows e.g. a sine curve, it's quite obvious that this cannot work.
One more thing: if your events are not contiguous in time (e.g. from time to time there are some old logs, or some events with future timestamps), those trigger the creation of a new bucket and close the old hot one even though it's not full.
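You can watch how your buckets actually behave with a standard dbinspect search (nothing here is specific to your setup, just the usual bucket metadata):

| dbinspect index=test
| eval spanHours=round((endEpoch-startEpoch)/3600,1)
| table bucketId state startEpoch endEpoch spanHours sizeOnDiskMB

If you see many small buckets with overlapping time spans, the timestamp issues above are probably the reason.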
I suppose that the above are not even all the aspects one must take care of to achieve what you are asking.
You could try to reach your objective, but don't be surprised if you cannot get it to work exactly.
r. Ismo
Hi @BRFZ
You can configure the index in indexes.conf to enforce your requirements. Try this as an example:
[test]
homePath = $SPLUNK_DB/test/db
coldPath = $SPLUNK_DB/test/colddb
thawedPath = $SPLUNK_DB/test/thaweddb
# cap a hot bucket's event span at 12h so it rolls hot -> warm
maxHotSpanSecs = 43200
# default bucket size; can be reduced for faster bucket rolling
maxDataSize = auto
# keep a small number of warm buckets so the oldest roll to cold
maxWarmDBCount = 1
# total retention of 24h before freezing
frozenTimePeriodInSecs = 86400
# archive frozen buckets to this path instead of deleting them
coldToFrozenDir = /archive/test
With this setup, data will move from hot→warm after 12h (due to maxHotSpanSecs), and the oldest warm buckets will be rolled to cold (enforced by the low maxWarmDBCount). Data will be kept for 24h in total before being archived.
The number of buckets (maxWarmDBCount, etc.) should be kept low to ensure data moves through the states quickly with such a short retention. Splunk is optimised for longer retention; very short retention and frequent bucket transitions can increase management overhead, which is why small buckets are generally discouraged. However, given the short retention period, you shouldn't end up with too many buckets here.
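One extra check that may help: after deploying, you can confirm which settings actually take effect for the index with btool (the path below assumes a default Linux install):

$SPLUNK_HOME/bin/splunk btool indexes list test --debug

The --debug flag shows which .conf file each effective value comes from.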