How do I restore data from a frozen bucket?


We lost last quarter's data for one specific index. The indexers have 4 TB of disk space, of which 1.9 TB is in use.
Later, with the help of a support engineer, we realized that the index had been allocated only 500 GB of space, and hence the data was deleted.
(For the time being we have increased maxTotalDataSizeMB to 700 GB.)
As we do not have an archive path setup, we will not be able to restore the data.

In our internal testing, we copied buckets from the frozen db into thaweddb in a new index and ran the rebuild command.
We were able to get data back from 2010 to 2014 from the frozen db.
So my question is: if we have data from 2010 to 2014, how do I find the data for the last quarter? The data is missing only from March 10th to June 12th.



The simple and probably disheartening answer is "where is the data itself?"

As you found, if you can get your hands on the data, as you did with the 2010-2014 buckets, you can likely figure out a way to make it searchable again. If you do not have that data, Splunk can't generate it out of thin air. If you have backups from that time, you could look there.

If you still have any of the original source data around, you could ingest it again (check the MAX_DAYS_AGO = <integer> setting in props.conf to be sure you'll get correct timestamps, since these will be old events!). Perhaps you have backups of the syslog/whatever server's log files from that period? You could restore those and index them again, after adjusting your timestamp settings as mentioned above.
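
A quick way to sanity-check that setting before re-ingesting is to work out how old the oldest event is, in days, and compare it with the limit. The sketch below is plain Python, not a Splunk API; the function names, the dates and the limit are example values only, so use the MAX_DAYS_AGO value actually in effect in your props.conf.

```python
from datetime import date

def days_old(oldest_event, today):
    """Age in whole days of the oldest event to be re-ingested."""
    return (today - oldest_event).days

def needs_higher_limit(oldest_event, today, max_days_ago):
    """True if events this old would exceed the given MAX_DAYS_AGO value."""
    return days_old(oldest_event, today) > max_days_ago

# Example values only: oldest missing event and the day of re-ingestion
oldest = date(2016, 3, 10)
today = date(2016, 8, 15)
limit = 2000  # example value; check what your props.conf really uses

print(days_old(oldest, today), needs_higher_limit(oldest, today, limit))
```

If the second value comes back True, raise MAX_DAYS_AGO for that sourcetype before indexing the restored files.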

Otherwise that data may be gone for good.


Rich is completely right here. You either have the data, or you don't. Admittedly there is some grey area here. You could have the data and not realize it, or you could think you have the data but actually not. The important docs for the mechanics of thawing the data are here:

But, I think what might be most useful is to help you understand how buckets are named. When you see a bucket named something like:


This bucket covers a time range of Mon Aug 1 09:11:26 EDT 2016 to Tue Aug 2 14:58:42 EDT 2016. The two numbers in the directory name are time_t (Unix epoch) values for the newest and oldest events in the bucket, with the newest listed first. So you will have to find the buckets whose names give a time range that overlaps the period you are looking for.
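
To make that overlap check concrete, here is a minimal Python sketch. It assumes the usual naming of db_<newest>_<oldest>_<id> (newest epoch first); the directory names, the year and the helper names are all made up for illustration, so adjust them to what you actually see on disk.

```python
import re
from datetime import datetime, timezone

# Assumed layout: db_<newest_epoch>_<oldest_epoch>_<local_id>
# (anything after the id, such as a GUID suffix, is ignored).
BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_(\d+)")

def bucket_range(name):
    """Return (oldest, newest) epoch seconds, or None if the name does not match."""
    m = BUCKET_RE.match(name)
    if m is None:
        return None
    newest, oldest = int(m.group(1)), int(m.group(2))
    return oldest, newest

def overlapping(names, start, end):
    """Keep bucket names whose time range intersects [start, end]."""
    hits = []
    for name in names:
        rng = bucket_range(name)
        if rng is None:
            continue  # not a bucket directory we recognize
        oldest, newest = rng
        if oldest <= end and newest >= start:
            hits.append(name)
    return hits

def epoch(year, month, day):
    """Midnight UTC of the given date as epoch seconds."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())

# Hypothetical window: March 10 to June 12 (the year is a placeholder)
start, end = epoch(2016, 3, 10), epoch(2016, 6, 12)

# Hypothetical directory names, e.g. gathered with os.listdir() on the frozen path
names = [
    "db_1459468800_1457568000_12",
    "db_1420070400_1262304000_3",
    "db_1470000000_1465000000_20",
]

print(overlapping(names, start, end))
```

Any bucket this prints is a candidate for thawing. If nothing in your frozen or backup locations overlaps the window, the data for that period is not there.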
