How to retrieve the actual data file from indexed data?

ankithreddy777
Contributor

Hi,
My file got indexed. Unfortunately, both the original file and the indexed data were deleted, but we have a backup of the indexed data.
We are trying to retrieve the raw data from the indexed data backup and ingest the file into Splunk again. How can we retrieve the original data from the indexed data?

gjanders
SplunkTrust

You probably want to read Restore archived indexed data. Splunk does not have your original file as such; it has the data inside buckets.
While the buckets do contain the raw data, you would not normally decompress it and re-index it. Instead, follow the above procedure to restore archived indexed data, since you mention you have a backup of it.

Note that re-indexing the data counts against your license, whereas restoring a frozen bucket does not.

ankithreddy777
Contributor

Hi, thank you for your reply. I have a backup of the complete index files. Can I just delete all the index files in each bucket except the rawdata directory and place that bucket in the thawed path? Or should I place the entire bucket in the thawed path, including the index files?

As per the Splunk docs, archived data (rawdata) needs to be placed in the thawed directory so that the index files can be rebuilt. I need your suggestion.

gjanders
SplunkTrust

Restore the entire bucket and then run the splunk rebuild command. I have done that before.

If that runs into problems, you could delete everything in the bucket other than the rawdata directory, and then remove everything inside rawdata except journal.gz.

I've only had to restore thawed data once!
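The fallback above can be sketched in Python. This is only an illustration, not a Splunk tool: the function name `copy_journal_only` and the example paths are hypothetical, and it assumes the backup bucket has the usual `rawdata/journal.gz` layout. Rather than deleting files in place, it copies just the journal into a fresh bucket directory under the thawed path, so the backup stays untouched.

```python
import shutil
from pathlib import Path


def copy_journal_only(bucket_dir, thawed_dir):
    """Copy only rawdata/journal.gz from a backed-up bucket into thawed_dir.

    Hypothetical helper: the backup is never modified, and the new bucket
    directory keeps the original bucket name (db_... or rb_...).
    """
    bucket = Path(bucket_dir)
    journal = bucket / "rawdata" / "journal.gz"
    if not journal.is_file():
        raise FileNotFoundError(f"no journal.gz in {bucket / 'rawdata'}")

    # Fails if the bucket already exists in the thawed path, to avoid
    # silently overwriting a previous restore.
    target_rawdata = Path(thawed_dir) / bucket.name / "rawdata"
    target_rawdata.mkdir(parents=True, exist_ok=False)

    shutil.copy2(journal, target_rawdata / "journal.gz")
    return target_rawdata.parent
```

You would call it once per bucket, for example `copy_journal_only("/backup/myindex/db_...", "/opt/splunk/var/lib/splunk/myindex/thaweddb")` (both paths are placeholders), and then run the rebuild command on the returned directory.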

ankithreddy777
Contributor

Thank you. I have buckets starting with
db_3463746.....
rb_64238787....

Which type of bucket should I move to the thawed path to restore the entire data?

Can we run the rebuild command on all buckets in the thawed path of that index at once?

Thanks in advance

gjanders
SplunkTrust

db is the original bucket and rb is a replicated bucket from another cluster member. I'm assuming you're running indexer clustering?

As per https://docs.splunk.com/Documentation/Splunk/6.5.2/Indexer/Restorearchiveddata:

Clustered data thawing

You can thaw archived clustered data onto individual peer nodes the same way that you thaw data onto any individual indexer. However, as described in "Archive indexed data", it is difficult to archive just a single copy of clustered data in the first place. If, instead, you archive data across all peer nodes in a cluster, you can later thaw the data, placing the data into the thawed directories of the peer nodes from which it was originally archived. You will end up with replication factor copies of the thawed data on your cluster, since you are thawing all of the original data, including the copies.

Note: Data does not get replicated from the thawed directory. So, if you thaw just a single copy of some bucket, instead of all the copies, only that single copy will reside in the cluster, in the thawed directory of the peer node where you placed it.

I would just write a for loop to run the rebuild command...
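A rough sketch of such a loop in Python, under some assumptions: the `splunk` binary is on the PATH (otherwise pass its full path), every bucket directory in the thawed path starts with `db_` or `rb_`, and the helper names `find_thawed_buckets` and `rebuild_all` are made up for this example. It defaults to a dry run that only returns the `splunk rebuild <bucket path>` commands, so you can review them first.

```python
import subprocess
from pathlib import Path


def find_thawed_buckets(thawed_dir):
    """Return bucket directories (db_* and rb_*) directly under thawed_dir."""
    return sorted(
        p for p in Path(thawed_dir).iterdir()
        if p.is_dir() and p.name.startswith(("db_", "rb_"))
    )


def rebuild_all(thawed_dir, splunk_bin="splunk", dry_run=True):
    """Build one 'splunk rebuild <bucket>' command per bucket.

    With dry_run=True (the default) nothing is executed; the commands are
    only returned. With dry_run=False each command runs in turn and the
    loop stops at the first failure.
    """
    commands = [
        [splunk_bin, "rebuild", str(bucket)]
        for bucket in find_thawed_buckets(thawed_dir)
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `rebuild_all("/opt/splunk/var/lib/splunk/myindex/thaweddb")` (a placeholder path) lists what would run; passing `dry_run=False` runs the rebuilds one bucket at a time.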
