All Apps and Add-ons

How to put cold and frozen data on S3 in AWS

RecoMark0
Path Finder

Is it possible to directly put cold and frozen data on S3 in AWS? The only solution I found was through Shuttl, from this question:
https://answers.splunk.com/answers/56522/frozen-archives-into-amazon-s3.html?utm_source=typeahead&ut...

However, we would like to find a solution that does not use a script. Is this possible?

Thank you

1 Solution

MuS
SplunkTrust
SplunkTrust

Hi RecoMark0,

well, the docs http://docs.splunk.com/Documentation/Splunk/6.2.4/Admin/Indexesconf are pretty clear on this:

coldToFrozenScript = [path to script interpreter] <path to script>
    * Specifies a script to run when data will leave the splunk index system.
      Essentially, this implements any archival tasks before the data is
      deleted out of its default location.

It can only be done via a script; if no script is defined, Splunk simply deletes the frozen data.
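For what it's worth, the skeleton of such a script is small: Splunk invokes it with the path of the bucket about to be frozen as its first argument. Here's a hedged sketch — the bucket name `example-bucket`, the key layout, and the use of boto3 are all assumptions for illustration, not anything from the docs:

```python
#!/usr/bin/env python
# Sketch of a coldToFrozenScript: Splunk calls it with the path of the
# bucket that is about to be frozen as the first argument. If the script
# exits successfully, Splunk proceeds to remove the bucket locally.
import os
import sys

def s3_prefix_for(bucket_path):
    """Derive an S3 key prefix (<index>/<bucket_dir>) from a bucket path
    like /opt/splunk/var/lib/splunk/main/colddb/db_169..._5."""
    bucket_path = bucket_path.rstrip("/")
    bucket_dir = os.path.basename(bucket_path)
    index_name = os.path.basename(os.path.dirname(os.path.dirname(bucket_path)))
    return "%s/%s" % (index_name, bucket_dir)

def archive_to_s3(bucket_path, s3_bucket="example-bucket"):
    """Upload every file in the frozen bucket to S3 (bucket name is a
    placeholder). boto3 is imported lazily; it is a third-party dependency."""
    import boto3
    s3 = boto3.client("s3")
    for root, _dirs, files in os.walk(bucket_path):
        for name in files:
            local = os.path.join(root, name)
            key = s3_prefix_for(bucket_path) + "/" + os.path.relpath(local, bucket_path)
            s3.upload_file(local, s3_bucket, key)

if __name__ == "__main__":
    if len(sys.argv) == 2:
        archive_to_s3(sys.argv[1])
```

Wired up in indexes.conf as `coldToFrozenScript = "/opt/splunk/bin/python" "<path to script>"` per the spec above.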

cheers, MuS


awurster
Contributor

Hey @RecoMark0, here's our example:
https://bitbucket.org/asecurityteam/atlassian-add-on-cold-to-frozen-s3/overview

I had trouble executing it with Splunk's bundled Python — it couldn't locate s3cmd 😕 — so we just forced it to use the OS Python instead.

see also this discussion:
https://answers.splunk.com/answers/56522/frozen-archives-into-amazon-s3.html#answer-351891


sarnagar
Contributor

Hi @awurster,

We have a similar goal of moving data to S3. I was going through your link and had a few queries.
https://bitbucket.org/asecurityteam/atlassian-add-on-cold-to-frozen-s3/overview

It would be really helpful if you could clarify, please, as I'm new to AWS.

1) What does "launching Splunk with an instance profile" (below) mean? Where do I configure this?

Usage

Ensure your Splunk indexers are launched with an instance profile which permits uploading to the S3 bucket.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::example-bucket"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::example-bucket/*"
            ]
        }
    ]
}

2) Can your script be implemented in a clustered Splunk architecture as well?


sduff_splunk
Splunk Employee
Splunk Employee

The way Splunk handles Cold and Frozen buckets is outlined here,
http://docs.splunk.com/Documentation/Splunk/6.2.4/Indexer/HowSplunkstoresindexes

For cold buckets, you only have the option of providing a filesystem location, which is where the data will roll to once it reaches a certain age. For frozen, you could use the Splunk app Shuttl, which is built for exactly the task you are after.

Alternatively, you could mount S3 as a filesystem location (e.g., http://tecadmin.net/mount-s3-bucket-centosrhel-ubuntu-using-s3fs/). I would imagine performance would not be great, and I don't think it would be a supported solution.

I recommend that you use Shuttl — what is your aversion to doing so?

kevinmanson
Explorer

Aversion to using Shuttl? Isn't that a dead project?

How is Splunk cloud doing this for warm buckets?
Splunk Cloud’s backup/archiving process encrypts customer data within separate Simple Storage Service (S3) buckets using AES 256-bit encryption. Keys are rotated on a routine basis and are under continuous monitoring. Archiving takes place when customer hot buckets roll to warm buckets, a process that regularly occurs based on every 10GB of data ingested or every 24 hours (whichever comes first).

awurster
Contributor

Ditto. Just to be clear, there are disclaimers written all over the Shuttl repo:

Shuttl development has stalled. There's no known developer working on this project.

Shuttl seems to work for Splunk 6.x when it's built off the develop branch, but it's experimental.

...

WARNING: The examples provided in this guide are for testing only and should not be used in production environments without consulting Splunk documentation.



basu42002
Path Finder

Hello Everyone,

I am using the settings below in my indexes.conf file, but the script never gets executed; instead the frozen files are deleted.
frozenTimePeriodInSecs = 1382400
coldToFrozenScript = "/opt/splunk/bin/python" "/opt/splunk/etc/apps/atl-cold-to-frozen-s3/bin/coldToFrozenS3.py"

Do I need to create subfolders in the S3 bucket? Executing the script manually works even without the subfolders.

However, if I manually execute the script (something like "python coldtofrozens3.py arguments"), it copies the data to S3.
I have also tried coldToFrozenDir=, which works.
But the coldToFrozenScript setting never works, and I am unable to test the script, as I am losing the frozen data.

Can someone please help/suggest what is going wrong here?
