Deployment Architecture

Backup Indexes to S3 without Hadoop

jbruce506
Explorer

Here's the situation - we have a non-developer, new to Splunk, without access to Hadoop (or any basic understanding of it) trying to backup indexed data to AWS S3. The documentation provides a lot of detail on how indexed data is stored but it doesn't give any definitive details on how to backup the data. There's a number of references to Hadoop using the Data Roll or Hunk, but we're not using Hadoop at all.

What would be the simplest way to 1) do daily incremental backups of the warm buckets to S3 and 2) archive frozen buckets to Glacier so no data is lost?

Tags (2)
0 Karma

jbruce506
Explorer

After re-reading the documentation several times over AND piecing together other info from Answers, I think this may be a slightly less complex than what I originally thought. From what I gather, you don't actually need a Hadoop cluster in place to implement Hadoop Data Roll. You need to install the Hadoop client version 2.6 or better and Java version 1.4 or better on the Splunk indexer/search head. Once installed, there are configuration options in the Splunk Web UI to setup index archiving with prewritten scripts that can backup to either a Hadoop cluster HDFS or Amazon S3 bucket, as seen here, https://docs.splunk.com/Documentation/Hunk/6.4.8/Hunk/ArchiveSplunkindexes.

0 Karma
Get Updates on the Splunk Community!

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...