Development on the Shuttl app seems to have stopped. Is this app still supported, and is it going to be actively developed?
I started using this app to archive our frozen buckets to S3, but I disabled it due to concerns about the format of the stored data, as I wasn't able to retrieve the data manually from S3 in any meaningful way.
I would like to start using this app again, but not if development has stalled.
I have the same problem. I want to store frozen buckets in S3 in CSV format, and I'm using Shuttl to do this. The problem is that the CSV files stored on S3 are not comma-separated text but something else that is not human-readable. The CSV files for every bucket all have the same size (21 bytes) and the same content. The CSV files are nested in a path of this type:
In addition to these files, at the root of the bucket I found files like block_9139990103400054340, but they are not human-readable either.
However, if I restore the S3 data through the Shuttl interface it works, and I find the correct data in Splunk.
So, archiving and restoring Splunk frozen data with Shuttl works, but the CSV data stored on S3 is unreadable with any other tool. Is this normal? Am I doing something wrong?
Elaborate on the format you would prefer the data to be in when it finally lands in AWS. You state that it could not be retrieved in a meaningful way. The default movement from cold to frozen is buckets, so you could copy the buckets back into Splunk and search them. The other option is CSV, where you could interpret the logs quite easily. Further, if you move them to AWS, what stops you from setting up Splunk on AWS and pointing it at the native buckets sent by Shuttl? What stops you from indexing them if they are in CSV format? I am confused as to what format you want the logs in when they land in their final resting place.
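To make the "copy the buckets back into Splunk" option concrete, the standard thaw procedure looks roughly like this. This is only a sketch: SPLUNK_HOME, the index path, and the bucket directory name are placeholders, and the copy/rebuild commands are shown commented out because they touch a live Splunk install.

```shell
# Hypothetical thaw of a native frozen bucket pulled back out of S3.
# SPLUNK_HOME, the index (defaultdb), and the bucket name are placeholders.
SPLUNK_HOME=${SPLUNK_HOME:-/opt/splunk}
BUCKET=db_1388534400_1388448000_42
THAW_DIR="$SPLUNK_HOME/var/lib/splunk/defaultdb/thaweddb"

# 1. Copy the bucket directory into the index's thaweddb:
#      cp -r "$BUCKET" "$THAW_DIR/"
# 2. Rebuild its metadata so Splunk can search it:
#      "$SPLUNK_HOME/bin/splunk" rebuild "$THAW_DIR/$BUCKET"
echo "would thaw $BUCKET into $THAW_DIR"
```

This only works if what you pull out of S3 is an intact bucket directory, which is exactly what is in question below.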
Yes, the default movement from cold to frozen means that you can copy buckets back into Splunk, but the way these buckets are stored in S3 means I can't copy them back out and into Splunk. I also couldn't run Splunk in AWS and read them there.
I suggest you try setting up Shuttl to archive into S3, then access S3 using some other file transfer application (such as DragonDisk or CyberDuck) and see whether you can meaningfully extract any buckets the way you suggest.
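As a quick sanity check after pulling the archive down with such a tool, you can test whether what landed is actually a usable bucket. A sketch, with the sync command commented out and the bucket/prefix names made up:

```shell
# Pull the archive down with any generic S3 client, e.g. (names made up):
#   aws s3 sync s3://my-shuttl-archive/ ./restored-archive

# A real Splunk bucket carries its compressed raw events in rawdata/journal.gz;
# the ~21-byte stub files described in this thread will fail this check.
looks_like_splunk_bucket() {
  dir=$1
  [ -f "$dir/rawdata/journal.gz" ] && \
    [ $(wc -c < "$dir/rawdata/journal.gz") -gt 21 ]
}
```

If every directory in the downloaded archive fails this check while Shuttl itself can still restore the data, then the on-S3 layout is Shuttl-specific rather than a plain copy of the buckets.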
BTW, the Shuttl app does allow you to migrate the data in CSV format as well as the bucket format, or both at the same time, if you want to migrate buckets back or use the CSV-formatted data for other purposes.
I would like the data stored on S3 to be in a format that isn't tied to Shuttl like it seemingly is now. Preferably in plain files and directories, just as if I had specified coldToFrozenDir in indexes.conf.
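For reference, the coldToFrozenDir behaviour mentioned above is just an indexes.conf setting: instead of deleting frozen buckets, Splunk copies each one as an ordinary directory tree under the given path. The index name and path here are examples:

```ini
[main]
# On freezing, each bucket is copied here as a plain
# db_<newestTime>_<oldestTime>_<id> directory instead of being deleted.
coldToFrozenDir = /opt/splunk-archive/frozen/main
```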
Try using the CSV format. During a test I migrated Splunk-indexed data to a Hadoop data node using Shuttl with CSV and was able to search the data immediately using Hunk.
Yes, but have you actually tried to view the data directly on S3 rather than through the Shuttl app? It is stored as a whole lot of block123456789... files in the root directory. There is an archiveroot folder that contains something resembling the directory structure of an archived bucket, but all the files in it are about 21 bytes. I can't see a way of mapping these little files to the block_* files in the root!