Retention time settings and effectiveness in the real world

Path Finder

I'm trying to configure a six-month retention period for a particular index.
However, I can't seem to get it to work.

My indexes.conf stanza for this index is:

[myindex]
blockSignSize = 100
maxHotSpanSecs = 604800
coldPath.maxDataSizeMB = 2500000
maxHotBuckets = 10
homePath.maxDataSizeMB = 228000
maxHotIdleSecs = 604800
maxDataSize = 10000
frozenTimePeriodInSecs = 15552000
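
For reference, here is a small sketch that converts the time-based settings in the stanza above into days. It is only arithmetic on the values already posted; the helper name `secs_to_days` is made up for illustration and is not part of Splunk.

```python
# Convert the second-based indexes.conf values from the stanza into days.
SECONDS_PER_DAY = 24 * 60 * 60


def secs_to_days(seconds: int) -> float:
    """Return the number of days represented by a count of seconds."""
    return seconds / SECONDS_PER_DAY


# Values copied from the [myindex] stanza.
settings = {
    "frozenTimePeriodInSecs": 15552000,
    "maxHotSpanSecs": 604800,
    "maxHotIdleSecs": 604800,
}

for name, value in settings.items():
    print(f"{name} = {value} s = {secs_to_days(value):g} days")
```

So the retention period is 180 days (about six months), and a hot bucket is configured to span or sit idle for up to 7 days.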

I checked my buckets with this command:

| dbinspect index=myindex timeformat="%s" 
| rex field=path "^(?<base>.*)/[^\/]+" 
| stats min(earliestTime) as earliestTime by base
| convert timeformat="%d/%m/%Y %T" ctime(earliestTime)

I see events from 2006. How is this possible?
I know that every event in a bucket (each db_* directory in cold storage) has to be older than the retention period before the bucket is frozen, so I tuned the settings above a bit, without any success.

# ls -lh /dbcolddata/sistemi/legale/colddb/*|grep '\-129' -B4 -A2
-rw------- 1 root root  512  3 ago 14:14 Strings.data

/dbcolddata/sistemi/legale/colddb/db_1310808120_1297165140_2273:
totale 68K
-rw------- 1 root root  42K  3 ago 14:14 1310808120-1297165140-5545846168426133067.tsidx
-rw------- 1 root root  435  3 ago 14:14 Hosts.data
-rw------- 1 root root   53  3 ago 14:14 optimize.result
--
-rw------- 1 root root  770 18 ago 14:33 Strings.data

/dbcolddata/sistemi/legale/colddb/db_1310979120_1310394360_2871:
totale 40K
-rw------- 1 root root  14K 21 ago 01:00 1310979120-1310394360-1290718078530768585.tsidx
-rw------- 1 root root  103 21 ago 01:00 Hosts.data
-rw------- 1 root root   53 21 ago 01:00 optimize.result
--
-rw------- 1 root root 1,1K 23 ago 00:26 Strings.data

/dbcolddata/sistemi/legale/colddb/db_1312247367_1297205580_2252:
totale 720M
-rw------- 1 root root 1,9M 23 ago 00:26 1312244605-1297205580-8212688385823764075.tsidx
-rw------- 1 root root 557M 23 ago 00:27 1312246567-1311766260-4368690169117699691.tsidx
-rw------- 1 root root  85M 23 ago 00:26 1312247367-1305852960-3943105201241615979.tsidx

Since this db directory will probably never fill up, the events inside it (and the same goes for every other db directory) will never all be older than frozenTimePeriodInSecs, so the bucket never becomes eligible for freezing. Is that right?
maxDataSize is set high because I have separate storage for hot (SAS) and cold (SATA) buckets, and I want the former to avoid creating lots of db directories; plus, I'm on 64-bit.
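
To make the concern concrete, here is a rough sketch that reads the newest and oldest event times out of the bucket directory names in the listing above (the names follow the pattern db_<newest>_<oldest>_<id>). It assumes a bucket only becomes eligible for freezing once its newest event is older than frozenTimePeriodInSecs; the function names are hypothetical helpers, not Splunk commands.

```python
import re
from typing import NamedTuple

SECONDS_PER_DAY = 86400
FROZEN_TIME_PERIOD_SECS = 15552000  # value from the [myindex] stanza

# Bucket directory names look like db_<newestEpoch>_<oldestEpoch>_<id>.
BUCKET_RE = re.compile(r"db_(?P<newest>\d+)_(?P<oldest>\d+)_(?P<bucket_id>\d+)$")


class Bucket(NamedTuple):
    newest: int
    oldest: int
    bucket_id: int


def parse_bucket(path: str) -> Bucket:
    """Extract the epoch times encoded in a bucket directory path."""
    match = BUCKET_RE.search(path.rstrip("/:"))
    if match is None:
        raise ValueError(f"not a bucket directory: {path!r}")
    return Bucket(
        int(match["newest"]), int(match["oldest"]), int(match["bucket_id"])
    )


def span_days(bucket: Bucket) -> float:
    """Time range covered by the bucket, in days."""
    return (bucket.newest - bucket.oldest) / SECONDS_PER_DAY


def max_event_age_days(bucket: Bucket, frozen_secs: int = FROZEN_TIME_PERIOD_SECS) -> float:
    """Age the oldest event reaches at the moment the bucket can first freeze."""
    return (bucket.newest - bucket.oldest + frozen_secs) / SECONDS_PER_DAY


def is_freezable(bucket: Bucket, now: int, frozen_secs: int = FROZEN_TIME_PERIOD_SECS) -> bool:
    """Assumed rule: the newest event must be older than the retention period."""
    return now - bucket.newest > frozen_secs


# Directory names taken from the ls output above.
paths = [
    "/dbcolddata/sistemi/legale/colddb/db_1310808120_1297165140_2273:",
    "/dbcolddata/sistemi/legale/colddb/db_1310979120_1310394360_2871:",
    "/dbcolddata/sistemi/legale/colddb/db_1312247367_1297205580_2252:",
]

for path in paths:
    bucket = parse_bucket(path)
    print(
        f"bucket {bucket.bucket_id}: spans {span_days(bucket):.1f} days, "
        f"oldest event kept up to {max_event_age_days(bucket):.1f} days"
    )
```

Under that assumption, bucket 2252 spans about 174 days, so its oldest events would be roughly 354 days old before the whole bucket can be frozen. That is why a wide bucket time span works against a precise six-month cutoff.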

Do I need to lower maxDataSize to a fairly small value, e.g. 1000, to force buckets to cover a narrower time span?

Alternatively, is there any command to manually roll buckets from cold to frozen, other than deleting events with * | delete?

This is a real constraint for me: in Italy, regulatory compliance requires that logs older than six months be rotated out or flushed (this applies only to ISPs), so I need to be very precise with dates when rolling cold to frozen.

Thanks.


Re: Retention time settings and effectiveness in the real world

Path Finder

Can anyone help?


Re: Retention time settings and effectiveness in the real world

Esteemed Legend

Maybe checking in a different way would be useful. I track our retention with a search like this:

index=_internal sourcetype=splunkd bucketmover "will attempt to freeze"
| rex field=_raw "/splunk(?:/[^/]*)?/(?<indexname>[^/]*)/db/db_(?<newestTime>[^_]*)_(?<oldestTime>[^_]*)_.*"
| dedup indexname
| eval retentionDays=(now()-oldestTime)/(60*60*24)
| stats values(retentionDays) as retentionDays by indexname