Hi everyone,
I'm planning to create some indexes whose retention settings remove old data, to meet compliance requirements.
I wanted to start small and created this index, but I'm having a hard time understanding how it works.
[test]
homePath = /splunkdb/test/db
coldPath = /splunkdb/test/colddb
thawedPath = /splunkdb/test/thaweddb
coldToFrozenDir = /splunkdb/test/frozen
maxTotalDataSizeMB = 5000
maxHotBuckets = 3 (3 hours in Hot)
maxHotSpanSecs = 3600 (1 hour)
maxHotIdleSecs = 0
maxWarmDBCount = 3 (3 hours in Warm)
frozenTimePeriodInSecs = 7200 (2 hours)
From my understanding, I should have 6 hours of data in hot/warm buckets.
What I don't understand is how frozenTimePeriodInSecs works.
It is now 10:40, and it has been almost 18 hours since I created that index at 16:14.
I have 7 buckets in my frozen folder.
The oldest data is from 23:40, so I have almost 10 hours of searchable events.
I don't have anything in my colddb.
These files are in my db folder:
-rw-------. 1 splunk splunk 10 Aug 7 16:14 CreationTime
-rw-------. 1 splunk splunk 0 Aug 7 20:02 db_1565223179_1565219580_0.rbsentinel
-rw-------. 1 splunk splunk 0 Aug 7 20:01 db_1565226054_1565223180_1.rbsentinel
-rw-------. 1 splunk splunk 0 Aug 7 21:02 db_1565229653_1565226054_2.rbsentinel
-rw-------. 1 splunk splunk 0 Aug 7 22:01 db_1565233253_1565229654_3.rbsentinel
-rw-------. 1 splunk splunk 0 Aug 7 23:42 db_1565236853_1565233254_4.rbsentinel
-rw-------. 1 splunk splunk 0 Aug 8 03:21 db_1565240453_1565236854_5.rbsentinel
-rw-------. 1 splunk splunk 0 Aug 8 10:21 db_1565246453_1565240454_6.rbsentinel
drwx--x---. 2 splunk splunk 6 Aug 7 16:14 GlobalMetaData
drwx--x---. 3 splunk splunk 4096 Aug 8 11:24 hot_v1_7
drwx--x---. 3 splunk splunk 4096 Aug 8 11:24 hot_v1_8
drwx--x---. 3 splunk splunk 4096 Aug 8 11:25 hot_v1_9
These files are in my frozen folder:
drwx--x---. 3 splunk splunk 21 Aug 7 20:02 db_1565223179_1565219580_0
drwx--x---. 3 splunk splunk 21 Aug 7 20:01 db_1565226054_1565223180_1
drwx--x---. 3 splunk splunk 21 Aug 7 21:02 db_1565229653_1565226054_2
drwx--x---. 3 splunk splunk 21 Aug 7 22:01 db_1565233253_1565229654_3
drwx--x---. 3 splunk splunk 21 Aug 7 23:42 db_1565236853_1565233254_4
drwx--x---. 3 splunk splunk 21 Aug 8 03:21 db_1565240453_1565236854_5
drwx--x---. 3 splunk splunk 21 Aug 8 10:21 db_1565246453_1565240454_6
I was thinking that after 6 hours (6 buckets), the oldest bucket would be deleted every hour. So after 18 hours, I was expecting to have only 3 hours of data deleted. But I think I was wrong, and I no longer know how it works.
Can anyone please help me understand how this frozen bucket rotation/rollover works? With my current settings, how much searchable data should I be able to see?
Best,
Arsalan
A bucket is frozen once the newest event in that bucket is at least frozenTimePeriodInSecs old.
You cannot expect to have 6 hours of searchable data with maxHotSpanSecs = 3600 and frozenTimePeriodInSecs = 7200. That's a total of 3 hours. The exact amount of searchable data depends on the incoming data rate and accuracy of timestamps (among other factors).
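To make that rule concrete, here is a minimal Python sketch that reads the newest-event time out of a bucket directory name and adds frozenTimePeriodInSecs to it. It assumes the usual db_<newest>_<oldest>_<id> naming for non-clustered buckets; the helper names are made up for illustration. Splunk checks buckets periodically, so the actual freeze can happen some time after the computed moment, never before it.

```python
from datetime import datetime, timezone

FROZEN_TIME_PERIOD_SECS = 7200  # value from the [test] stanza in the question


def parse_bucket_name(name):
    """Split 'db_<newest>_<oldest>_<id>' into (newest_epoch, oldest_epoch, id)."""
    prefix, newest, oldest, bucket_id = name.split("_")
    if prefix != "db":
        raise ValueError(f"not a db_ bucket name: {name}")
    return int(newest), int(oldest), int(bucket_id)


def earliest_freeze_epoch(name, frozen_secs=FROZEN_TIME_PERIOD_SECS):
    """Epoch time at which the bucket's newest event reaches frozen_secs of age."""
    newest, _oldest, _bucket_id = parse_bucket_name(name)
    return newest + frozen_secs


if __name__ == "__main__":
    # Two bucket names taken from the frozen folder listing in the question.
    for name in ("db_1565223179_1565219580_0", "db_1565246453_1565240454_6"):
        eligible = earliest_freeze_epoch(name)
        stamp = datetime.fromtimestamp(eligible, tz=timezone.utc).isoformat()
        print(f"{name}: eligible to freeze from {stamp}")
```

Note that the age of the oldest event in the bucket plays no part in this calculation; only the newest event matters.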
@richgalloway
That's a total of 3 hours in the hot buckets.
There will also be 3 hours in warm, according to maxWarmDBCount = 3.
Because of maxHotSpanSecs=3600, every hour it creates a new bucket.
The timestamps should be correct.
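One way to check the one-bucket-per-hour assumption against the directory listing is to compute the event-time span of each bucket from its name. This is a rough Python sketch, assuming the db_<newest>_<oldest>_<id> naming convention; the function name is hypothetical.

```python
def bucket_span_secs(name):
    """Return newest minus oldest event time (seconds) for 'db_<newest>_<oldest>_<id>'."""
    _prefix, newest, oldest, _bucket_id = name.split("_")
    return int(newest) - int(oldest)


# Bucket names copied from the frozen folder listing in the question.
BUCKETS = [
    "db_1565223179_1565219580_0",
    "db_1565226054_1565223180_1",
    "db_1565229653_1565226054_2",
    "db_1565233253_1565229654_3",
    "db_1565236853_1565233254_4",
    "db_1565240453_1565236854_5",
    "db_1565246453_1565240454_6",
]

spans = [bucket_span_secs(name) for name in BUCKETS]

if __name__ == "__main__":
    for name, span in zip(BUCKETS, spans):
        print(f"{name}: {span} s ({span / 3600:.2f} h)")
```

For these names, five buckets span 3599 seconds, one spans 2874 seconds, and the last spans 5999 seconds, so the spans are close to an hour but not uniform.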
So how should I proceed?
For GDPR, we have to delete application logs after 3 months.
I have to understand how this works in order to implement it on a bigger scale. This is what I was planning to do for 6 months of retention:
maxTotalDataSizeMB = 500000
maxHotBuckets = 3 (3 days of data)
maxHotSpanSecs = 86400 (1 day)
maxHotIdleSecs = 0
maxWarmDBCount = 177
frozenTimePeriodInSecs = 15552000 (180 days)
You can't get 6 hours of data with a 2-hour retention time.
Also, if GDPR says delete after 3 months, why is frozenTimePeriodInSecs 6 months?
@richgalloway
The 6 months was just an example.
So for 6 months, in this case, do you agree with those values?
Someone on Slack just told me to modify only frozenTimePeriodInSecs (180 days) and maxHotSpanSecs (7 days).
I don't know which is the preferred way to do this.
I agree with the Slacker.
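Both settings take values in seconds, so the day counts mentioned in this thread need converting before they go into indexes.conf. A small Python sketch of that arithmetic follows; the helper name is hypothetical, and the 180-day and 7-day figures are simply the ones quoted from Slack, not a recommendation.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def days_to_secs(days):
    """Convert a whole number of days to seconds."""
    return days * SECONDS_PER_DAY


# Day counts quoted in the thread, converted to the units indexes.conf expects.
settings = {
    "frozenTimePeriodInSecs": days_to_secs(180),
    "maxHotSpanSecs": days_to_secs(7),
}

if __name__ == "__main__":
    for key, value in settings.items():
        print(f"{key} = {value}")
```

For a 3-month requirement, the same helper can be called with whatever day count your compliance team defines as 3 months.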