Hint: check out the Splunk dump command. Assuming this is for already-indexed data, and depending on how much data your searches return, it can be a quick-and-dirty way to write results to the search head's local disk; from there you can just have some process run the aws s3 cp commands. The benefit is that you get formatted output with only the fields you want to retain in the plaintext files that end up on disk. Output can also be compressed, files can be rolled at a size threshold, etc. It can also be triggered via the REST API or federated search.

Downsides: depending on the definition of "tonnes" and how much time those tonnes span, the searches may need to be well thought out and broken down into chunks of time. The command is marked "internal"/unsupported, but depending on your needs that may be fine. It also requires a single long-running search (per chunk) that could fail partway through; see the first sketch below for one way to chunk and dispatch these.

That failure mode is where something like posting the search job via the REST API, programmatically pulling down the results, and validating them with some logic might be more robust, depending on your needs (second sketch below).

Other options would likely have you doing surgery on Splunk-formatted buckets, which is probably lower level than you need or want to go if some other system needs to eat the output. And if this is also new data streaming in, I'd be looking at Ingest Actions to output to S3.
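A minimal sketch of the chunked-dump idea, dispatching one dump search per time window via the REST API. The host, credentials, index, fields, chunk size, and the dump options (basefilename is the required argument; rollsize/compress/format you should verify against the Search Reference for your version) are all assumptions to adapt. verify=False is just for the sketch, since 8089 often has a self-signed cert.

import time
import requests

BASE = "https://splunk-sh.example.com:8089"   # hypothetical search head
AUTH = ("svc_export", "changeme")             # hypothetical service account

def dispatch_chunk(earliest, latest):
    # One dump search per time window; basefilename encodes the window
    spl = (
        "search index=web sourcetype=access_combined "
        "| fields _time, host, status, uri "
        f"| dump basefilename=export_{earliest}_{latest} "
        "rollsize=100 compress=3 format=csv"
    )
    r = requests.post(
        f"{BASE}/services/search/jobs",
        auth=AUTH, verify=False,
        data={"search": spl,
              "earliest_time": earliest, "latest_time": latest,
              "output_mode": "json"},
    )
    r.raise_for_status()
    return r.json()["sid"]

def wait(sid):
    # Poll the job until it finishes; return False if it failed
    while True:
        r = requests.get(f"{BASE}/services/search/jobs/{sid}",
                         auth=AUTH, verify=False,
                         params={"output_mode": "json"})
        r.raise_for_status()
        content = r.json()["entry"][0]["content"]
        if content["isDone"]:
            return not content["isFailed"]
        time.sleep(10)

# Walk one day in 1-hour chunks. Dump files land on the search head under
# $SPLUNK_HOME/var/run/splunk/dispatch/<sid>/dump/ for a follow-up
# aws s3 cp / sync process to pick up.
start, end, step = 1735689600, 1735776000, 3600
for t in range(start, end, step):
    sid = dispatch_chunk(t, t + step)
    print(sid, "ok" if wait(sid) else "FAILED: rerun this chunk")

A failed chunk here just means re-dispatching one small window instead of rerunning one giant search.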
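And a sketch of the more robust path: post the job, poll it, page the results down with offset/count, do a crude row-count validation, then push to S3 with boto3. The endpoints are the standard search/jobs REST API; the query, file paths, and bucket name are placeholders, and the resultCount check only makes sense as-is for a non-transforming search.

import time
import requests
import boto3

BASE = "https://splunk-sh.example.com:8089"   # hypothetical search head
AUTH = ("svc_export", "changeme")

def run_job(spl, earliest, latest):
    # POST the job, then poll until done; return sid plus expected row count
    r = requests.post(f"{BASE}/services/search/jobs", auth=AUTH, verify=False,
                      data={"search": spl, "earliest_time": earliest,
                            "latest_time": latest, "output_mode": "json"})
    r.raise_for_status()
    sid = r.json()["sid"]
    while True:
        j = requests.get(f"{BASE}/services/search/jobs/{sid}", auth=AUTH,
                         verify=False, params={"output_mode": "json"})
        j.raise_for_status()
        content = j.json()["entry"][0]["content"]
        if content["isDone"]:
            if content["isFailed"]:
                raise RuntimeError(f"job {sid} failed")
            return sid, int(content["resultCount"])
        time.sleep(10)

def fetch_results(sid, path, page=50000):
    # Page results down as CSV; each page repeats the header, keep it once.
    # Page size is bounded by maxresultrows in limits.conf, so verify yours.
    rows, offset = 0, 0
    with open(path, "w") as out:
        while True:
            r = requests.get(f"{BASE}/services/search/jobs/{sid}/results",
                             auth=AUTH, verify=False,
                             params={"output_mode": "csv",
                                     "count": page, "offset": offset})
            r.raise_for_status()
            lines = r.text.splitlines()
            data = lines[1:] if lines else []
            if not data:
                break
            if offset == 0:
                out.write(lines[0] + "\n")
            out.write("\n".join(data) + "\n")
            rows += len(data)
            offset += len(data)
    return rows

sid, expected = run_job(
    "search index=web sourcetype=access_combined | fields _time, host, status",
    "-24h", "now")
got = fetch_results(sid, "/tmp/export.csv")
if got != expected:   # crude validation logic; adapt to your definition of "complete"
    raise RuntimeError(f"row count mismatch: {got} vs {expected}")
boto3.client("s3").upload_file("/tmp/export.csv", "my-export-bucket",
                               f"splunk/{sid}.csv")

Since you hold the sid the whole time, you can retry a failed page or re-pull the whole job without re-running the search.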