Getting Data In

Halp! My data is being rolled to frozen and I don't know why!

Splunk Employee

I need to know why my data is being rolled to frozen - is it because of time or disk space?

1 Solution

Splunk Employee

With this handy dandy query, you can find out why!

index=_internal sourcetype=splunkd component=BucketMover "Will attempt to freeze"
| rex field=_raw "/logs/splunk_logs/(?P<index_name>[^\/]+)/(db|colddb)/db_(?P<latest_epoch>[\d]+)_(?P<earliest_epoch>[\d]+)_(?P<bucket_number>[\d]+)\' (?P<reason>.*)"
| convert ctime(earliest_epoch) as Earliest_Data
| convert ctime(latest_epoch) as Latest_Data
| convert ctime(_time) as Log_TimeStamp
| table Log_TimeStamp,index_name,bucket_number,Earliest_Data,Latest_Data,reason
| sort - Log_TimeStamp

(Bucket directories are named db_<latest epoch>_<earliest epoch>_<bucket id>, so the first number in the name is the latest event time and the second is the earliest.)

Breakdown of the fields:

Log_TimeStamp = _time of the log entry
index_name = index that's being frozen
bucket_number = bucket that's being frozen
Earliest_Data = earliest date of events that were in that bucket
Latest_Data = latest date of events that were in that bucket
reason = the reason the bucket was rolled to frozen
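
If you just want a quick tally of how often each cause shows up, here is a minimal sketch layered on the same BucketMover events. It only assumes that the time-based message contains "frozenTimePeriodInSecs" (as seen in the reason field above) and lumps everything else together as size/other:

index=_internal sourcetype=splunkd component=BucketMover "will attempt to freeze"
| eval freeze_cause=if(match(_raw, "frozenTimePeriodInSecs"), "time (frozenTimePeriodInSecs exceeded)", "size or other")
| stats count by freeze_cause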

Edit: Additional query for Windows!
Just an updated version for Windows (since there's some wackiness around escaping backslashes):

index=_internal sourcetype=splunkd component=BucketMover "Will attempt to freeze"
| rex field=_raw "(?P<index_name>[^\\\\]+)\\\\(db|colddb)\\\\db_(?P<latest_epoch>[\d]+)_(?P<earliest_epoch>[\d]+)_(?P<bucket_number>[\d]+)' (?P<reason>.*)"
| convert ctime(earliest_epoch) as Earliest_Data
| convert ctime(latest_epoch) as Latest_Data
| convert ctime(_time) as Log_TimeStamp
| table Log_TimeStamp,index_name,bucket_number,Earliest_Data,Latest_Data,reason
| sort - Log_TimeStamp

SplunkTrust

I also have a few searches for this in the Alerts for Splunk Admins app (also available on GitHub).

For example:
IndexerLevel - Buckets are been frozen due to index sizing

index=_internal sourcetype=splunkd (source=*splunkd.log) "BucketMover - will attempt to freeze" NOT "because frozenTimePeriodInSecs=" 
| rex field=bkt "(rb_|db_)(?P<newestDataInBucket>\d+)_(?P<oldestDataInBucket>\d+)"
| eval newestDataInBucket=strftime(newestDataInBucket, "%+"), oldestDataInBucket = strftime(oldestDataInBucket, "%+") 
| table message, oldestDataInBucket, newestDataInBucket

Note that the above alert excludes time-based rolling; it detects the case where you hit the size limits and data is rolled to frozen.
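
For the complementary (time-based) case, a similar sketch that keeps only the frozenTimePeriodInSecs messages and trends them per day, using the same source filter as above:

index=_internal sourcetype=splunkd (source=*splunkd.log) "BucketMover - will attempt to freeze" "because frozenTimePeriodInSecs="
| timechart span=1d count AS buckets_frozen_by_time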

Builder

I came across this post in my own searches. I used the search from @bosburn_splunk but updated it for my instance (v7.0), and thought I'd share what I used in case the syntax has changed from what was originally posted in 2014.

What you'll need to edit in the SPL is the path within the logs. You'll see the patterns below. I am matching on both patterns, which is why the rex in my query starts with ((pattern1)|(pattern2)).

bkt='<path>...'
OR
candidate='<path>...'

My working query

index=_internal sourcetype=splunkd component=BucketMover "Will attempt to freeze" 
| rex field=_raw "'((\/mnt\/local\/(?<bucket>[^\/]+))|(\/opt\/splunk\/var\/lib\/splunk))\/(?<index_name>[^\/]+)\/(db|colddb)\/(db|rb)_(?<latest_epoch>\d+)_(?<earliest_epoch>\d+)_(?<bucket_number>\d+)_(?<guid>[A-Z0-9-]+)' because (?<Reason>.*)"
| convert ctime(earliest_epoch) as Earliest_Data 
| convert ctime(latest_epoch) as Latest_Data 
| convert ctime(_time) as Log_TimeStamp 
| table Log_TimeStamp,index_name,bucket,bucket_number,Earliest_Data,Latest_Data,Reason 
| sort - Log_TimeStamp

I referenced the bucket naming conventions table in the Splunk documentation.
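
If you care more about volume than individual buckets, a rough aggregation layered on the same rex might look like the sketch below. It reuses the field names and path assumptions from the query above, so adjust the paths in the same way:

index=_internal sourcetype=splunkd component=BucketMover "Will attempt to freeze" 
| rex field=_raw "'((\/mnt\/local\/(?<bucket>[^\/]+))|(\/opt\/splunk\/var\/lib\/splunk))\/(?<index_name>[^\/]+)\/(db|colddb)\/(db|rb)_(?<latest_epoch>\d+)_(?<earliest_epoch>\d+)_(?<bucket_number>\d+)_(?<guid>[A-Z0-9-]+)' because (?<Reason>.*)"
| stats count AS frozen_buckets, min(earliest_epoch) AS oldest_epoch BY index_name, Reason
| convert ctime(oldest_epoch) AS Oldest_Data
| fields - oldest_epoch
| sort - frozen_buckets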

Path Finder

Had to modify rex for my db's from /logs/splunk_logs to /mydatadir/\w+

Extraordinary, thanks!
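
If you'd rather not hard-code the path at all, here's a path-agnostic sketch. It only assumes the standard db_/rb_ bucket naming and the quoted-path "... ' because ..." message format shown earlier in the thread; the doubled backslashes are there so the regex matches either / or \ as the path separator:

index=_internal sourcetype=splunkd component=BucketMover "Will attempt to freeze"
| rex field=_raw "[/\\\\](?<index_name>[^/\\\\]+)[/\\\\](db|colddb)[/\\\\](db|rb)_(?<latest_epoch>\d+)_(?<earliest_epoch>\d+)_(?<bucket_number>\d+)[^']*' because (?<reason>.*)"
| convert ctime(earliest_epoch) AS Earliest_Data ctime(latest_epoch) AS Latest_Data ctime(_time) AS Log_TimeStamp
| table Log_TimeStamp, index_name, bucket_number, Earliest_Data, Latest_Data, reason
| sort - Log_TimeStamp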
