I am currently working on a data retention and log collection policy to meet M-21-31,
and I am not sure whether the config below would meet the requirement.
Current Requirement:
Hot: 6 months
Warm: 24 months
Cold: 18 months
Archive or Frozen: 18 months with data ceiling and data deletion
Would adding these settings to the index stanza meet the above requirements?
If not, please let me know what the settings or config should look like.
indexes.conf (add the settings below to the index stanza):
maxHotSpanSecs = 15778476 - would provide around 6 months of hot bucket data
maxHotIdleSecs = 15778476
NOT sure about the warm bucket setting needed to get 24 months of warm bucket data
coldPath.maxDataSizeMB = 47335428 - would provide around 18 months of cold bucket data
frozenTimePeriodInSecs = 47335428 - would provide around 18 months of archived / frozen data
coldToFrozenDir = "$SPLUNK_HOME/myfrozenarchive" - send archived/frozen data to this location so it is not deleted
Regardless of what M-21-31 is, there is a very important issue with your "config".
The only time-based retention you can apply here is cold (assuming we're talking on-prem and we're not talking SmartStore - that's territory I don't feel very comfortable with).
Warm and hot are limited using different criteria. You can try to estimate their limits but that's only gonna be that - a rough estimate.
Also - that will be the _limit_: it tells you when data will _surely_ get rolled out to the next tier and eventually to frozen, whereas you most probably want it the other way around - a limit under which the data will surely _not_ get rolled.
There probably aren't too many people here who know the M-21-31 requirements. However, there probably are a lot of people who could help you comply with those requirements if you tell us what they are.
FWIW, maxHotSpanSecs is a maximum value, not a fixed value. Hot buckets could roll to warm before that time span is reached. Also, having a single bucket that spans 6 months is not a good idea - it could get too large. For best control of retention time, set the hot bucket time limit to 1 day (86400).
There is no mechanism for controlling how long a bucket is warm. Those only roll to cold based on size or count.
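As a quick sanity check on the numbers in the original config, here is a minimal Python sketch. It assumes an average month of 365.2425 / 12 days, which happens to reproduce the values in the question; the helper names are made up for illustration. Note that coldPath.maxDataSizeMB is a size in megabytes, not a time span, so the same number means something very different there.

```python
# Average Gregorian month in seconds (365.2425 days / 12). This is an
# assumption, picked because it reproduces the values used in the question.
SECONDS_PER_DAY = 86_400
SECONDS_PER_MONTH = 365.2425 * SECONDS_PER_DAY / 12


def seconds_to_months(seconds: int) -> float:
    """Convert a *Secs setting to average months."""
    return seconds / SECONDS_PER_MONTH


def megabytes_to_terabytes(megabytes: int) -> float:
    """Convert a *SizeMB setting to terabytes (1024-based)."""
    return megabytes / 1024 / 1024


# Time-based settings from the question.
for name, value in [("maxHotSpanSecs", 15778476),
                    ("frozenTimePeriodInSecs", 47335428)]:
    print(f"{name} = {value} is about {seconds_to_months(value):.1f} months")

# The same number used as a size cap is read as megabytes.
cold_cap_mb = 47335428
print(f"coldPath.maxDataSizeMB = {cold_cap_mb} is about "
      f"{megabytes_to_terabytes(cold_cap_mb):.1f} TB of cold storage")
```

So the two *Secs values do correspond to roughly 6 and 18 months, but the cold setting as written is a storage cap and says nothing about how long data stays cold.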
PickleRick
Thank you for this information. I understand how buckets, indexes, and indexers work and how the data retention process works in Splunk.
This is a virtual Splunk cloud environment (Splunk is installed on cloud VMs), and we are NOT using SmartStore.
I am just not sure how to configure the indexes.conf file / the individual
indexes.conf stanza to reflect the data retention requirements
of:
Hot/Warm for 30 months
Cold for 30 months
frozen for 30 months
Long story short - it's not possible. Hot/warm cannot be time-limited. As simple as that. However many fancy calculations you do based on average bucket sizes and so on, do a few restarts across your clusters or get some bad quality data and you end up with many small buckets rolling out of warm faster than you can say "bucket lifecycle".
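To put a rough number on that, here is a back-of-the-envelope Python sketch. It assumes the warm tier is limited only by maxWarmDBCount (300 is the documented default) and ignores any size cap on the home path; the function name and the roll rates are hypothetical.

```python
def warm_retention_days(max_warm_db_count: int, buckets_rolled_per_day: float) -> float:
    """Days until a count-limited warm tier starts rolling its oldest bucket to cold."""
    if buckets_rolled_per_day <= 0:
        raise ValueError("buckets_rolled_per_day must be positive")
    return max_warm_db_count / buckets_rolled_per_day


# 300 is the default maxWarmDBCount; the roll rates below are made up.
for rate in (1, 5, 20):
    days = warm_retention_days(300, rate)
    print(f"{rate:>2} buckets/day -> warm holds about {days:.0f} days")
```

Every restart rolls the open hot buckets, so the number of buckets rolled per day is not something you fully control, and the time covered by warm moves with it.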
Anyway, it's relatively strange to see the same storage size allocated for hot/warm as for cold. Usually since cold is slower and cheaper there is way more of cold space than hot/warm.
Of course, keeping frozen buckets stored for an adequate period of time is up to you, so you can easily script it to wait for X days before removing the exported buckets.
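A minimal sketch of such a cleanup script in Python follows. It assumes the archived buckets keep their usual directory names (db_<newest>_<oldest>_<id>, or rb_ for replicated copies, with epoch-second timestamps) and that age is measured from the newest event in the bucket; both are assumptions to verify against your own archive. The function names and the example path are hypothetical.

```python
import re
import shutil
import time
from pathlib import Path

# Assumed bucket directory naming: db_<newest>_<oldest>_<id>[_<guid>]
# (rb_ prefix for replicated copies); the times are epoch seconds.
BUCKET_NAME = re.compile(r"^(?:db|rb)_(\d+)_(\d+)_\d+")


def expired_buckets(frozen_dir, keep_days, now=None):
    """Return bucket directories whose newest event is older than keep_days."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86_400
    expired = []
    for entry in sorted(Path(frozen_dir).iterdir()):
        match = BUCKET_NAME.match(entry.name)
        # Anything that does not look like a bucket directory is left alone.
        if entry.is_dir() and match and int(match.group(1)) < cutoff:
            expired.append(entry)
    return expired


def purge(frozen_dir, keep_days, dry_run=True):
    """List (and, if dry_run is False, delete) the expired buckets."""
    for bucket in expired_buckets(frozen_dir, keep_days):
        print(("would remove " if dry_run else "removing ") + str(bucket))
        if not dry_run:
            shutil.rmtree(bucket)
```

For example, purge("/opt/splunk/myfrozenarchive", keep_days=1460) would only list buckets whose newest event is older than about four years; pass dry_run=False once the list looks right, and run it from cron or a similar scheduler.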
Mr. Galloway
Thank you for your reply and input
The retention requirements are Data / Log Retention:
Hot / Warm for 30 months, which I broke out as:
Hot: 6 months
Warm: 24 months
But I understand from your reply that 6 months is too long, and I will adjust it to 1 or 2 days.
Cold: 18 months
Archive or Frozen: 18 months with data ceiling and data deletion
Revised indexes.conf
maxHotSpanSecs = 86400 or 172800 (1 or 2 day of hot bucket data)
maxHotIdleSecs = 86400 or 172800
maxWarmDBCount = The maximum number of warm buckets.
How does one limit the warm buckets by SIZE? I would prefer to use SIZE and NOT the number of buckets.
coldPath.maxDataSizeMB = 47335428
frozenTimePeriodInSecs = 47335428
coldToFrozenDir = "$SPLUNK_HOME/myfrozenarchive"
See this presentation https://conf.splunk.com/files/2017/slides/splunk-data-life-cycle-determining-when-and-where-to-roll-...
It will tell you what you're dealing with.
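On the size question: homePath.maxDataSizeMB caps hot and warm together for an index, so it is the size-based knob for that tier. Below is a rough Python sketch for estimating a value. The daily ingest figure is hypothetical, the 0.5 on-disk ratio is only a common rule of thumb, and splitting evenly across indexers is an assumption; as noted earlier in the thread, the result is still an estimate of time, not a guarantee.

```python
def home_path_size_mb(daily_ingest_gb: float, retention_days: int,
                      disk_ratio: float = 0.5, indexers: int = 1) -> int:
    """Estimate a per-indexer homePath.maxDataSizeMB for a target hot/warm period.

    disk_ratio is the assumed on-disk size relative to raw ingest
    (rule of thumb only; measure your own indexes before relying on it).
    """
    total_gb = daily_ingest_gb * disk_ratio * retention_days
    return int(total_gb * 1024 / indexers)


# Hypothetical example: 10 GB/day into this index, about 30 months (913 days).
size_mb = home_path_size_mb(daily_ingest_gb=10, retention_days=913)
print(f"homePath.maxDataSizeMB = {size_mb}")
```

The same arithmetic can be reused for coldPath.maxDataSizeMB with the cold period, leaving frozenTimePeriodInSecs as the only truly time-based limit.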