When is old data deleted from indexes? How does frozenTimePeriodInSecs get applied?

Motivator

Hi

I have an index that has its frozenTimePeriodInSecs set to 90 days. When I inspect that index with the rest command, I see that the index has events from 2008:

| rest /services/data/indexes | search title=xx | eval now=now() | convert ctime(now) as now |fields title,frozenTimePeriodInSecs,minTime,now

title  frozenTimePeriodInSecs  minTime                   now
xx     7776000                 2008-04-01T22:00:29+0200  06/18/2013 12:39:49

When I search for the events from 2008:

index=xx | convert ctime(_indextime) as indextime | eval delta=_indextime-_time | table _time,indextime,delta

I can see that the indextime of the old events is within the 90-day span of the index.

Is there a quarantine applied to recently indexed events with a funny date? I can't imagine that the indextime is relevant for the frozenTimePeriodInSecs. Does anyone know how this setting is applied?

Thanks

Chris

1 Solution

Ultra Champion

Operations on indexed data are performed at the bucket level, not on individual events within a bucket. Buckets are only frozen (deleted/archived) when the newest event in the bucket is older than frozenTimePeriodInSecs.

Thus, you might have a bucket that contains both new and really old data, but the really old data won't be frozen until all of the data in the bucket is 'too old'. Perhaps you imported some old historical data (which would explain the difference between _time and _indextime for some events). Or perhaps your timestamps were misinterpreted (as an older date).

_indextime, however, is not considered when frozenTimePeriodInSecs is applied.
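
To make the bucket-level rule concrete, here is a minimal Python sketch of the check described above. It is an illustration only, not Splunk's actual implementation: the `Bucket` class, the `is_freezable` function, the arbitrary `now` value and the strict "older than" comparison at the boundary are all assumptions made for the example. Only the 7776000-second retention value comes from the question.

```python
from dataclasses import dataclass

DAY = 86400
FROZEN_TIME_PERIOD_IN_SECS = 7776000  # 90 days, the value from the question


@dataclass
class Bucket:
    """Hypothetical stand-in for an index bucket; only the event-time span matters."""
    earliest_event: int  # epoch seconds (_time) of the oldest event in the bucket
    latest_event: int    # epoch seconds (_time) of the newest event in the bucket


def is_freezable(bucket: Bucket, now: int,
                 frozen_secs: int = FROZEN_TIME_PERIOD_IN_SECS) -> bool:
    """Return True only when the NEWEST event is older than the retention period.

    The oldest event and the index time play no part in the decision.
    """
    return now - bucket.latest_event > frozen_secs


now = 20_000 * DAY  # arbitrary reference time, chosen for the example

# Bucket holding five-year-old events next to events from ten days ago.
mixed = Bucket(earliest_event=now - 1900 * DAY, latest_event=now - 10 * DAY)

# Bucket whose newest event is itself past the 90-day limit.
old = Bucket(earliest_event=now - 1900 * DAY, latest_event=now - 91 * DAY)

print(is_freezable(mixed, now))  # False: the recent events keep the old ones alive
print(is_freezable(old, now))    # True: every event in the bucket is too old
```

The `mixed` bucket mirrors the situation in the question: the events from 2008 stay searchable because they share a bucket with recent data.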

/K

SplunkTrust

Hi chris,

Have you checked splunkd.log for any messages from the BucketMover component of splunkd?

index=_internal source=*splunkd.log* *BucketMover* NOT INFO

Maybe they will give you some ideas.

cheers, MuS

Explorer

Maybe this will give you similar info:
| dbinspect index=*
| rename state as category
| stats min(startEpoch) as earliestTime max(endEpoch) as latestTime sum(sizeOnDiskMB) as MB by index category
| convert timeformat="%m/%d/%Y" ctime(earliestTime) as earliestTime ctime(latestTime) as latestTime

Motivator

Thanks, Kristian.

Builder

Hi @kristian.kolb

You have mentioned that "Buckets are only frozen (deleted/archived) when the newest event in the bucket is older than frozenTimePeriodInSecs."

The newest event keeps changing every second.
How can we determine when the bucket will get deleted?
Does the retention policy apply only to cold buckets? What if the warm buckets have data older than 60 days and have not rolled to cold because they haven't reached the bucket's rolling limit?

SplunkTrust

dammit /K you're too fast.....
you answered while I was typing my answer 🙂
