Hi,
I have a Splunk server that has tonnes of data in it. What we would like to do is have a system on a dedicated search head that runs a search, then exports the data it finds to an S3 bucket for another system to ingest and analyze.
I have looked at several add-ons, including Export Everything and S3 Uploader for Splunk, but neither of them has clear instructions and I am having issues.
Are there any clear resources on how to set up the connection to export search results from Splunk into an S3 bucket?
Hint: check out the Splunk dump command.
Assuming this is for already-indexed data, and depending on how much data your searches return, this can be a quick and dirty way to dump results to the search head's local disk; from there you can just have some process run the aws s3 cp commands.
The benefit of this is you can get formatted output with the fields you want to retain in the plaintext data that ends up in the files. The output can also be compressed, the files can be rolled at a size you choose, and the search can be triggered via the REST API or federated search.
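Roughly, the flow looks like the sketch below when run on the search head itself. The index, field list, credentials, bucket name (my-analytics-bucket) and the dump parameters are just illustrative, so double-check the dump options against the docs for your version; the dispatch path is where dump typically drops its chunk files.

# run the export search with dump so results land as rolled, compressed chunk files on local disk
$SPLUNK_HOME/bin/splunk search 'index=web earliest=-1d@d latest=@d | dump basefilename=web_export format=csv fields="_time,host,uri,status" compress=3 rollsize=100' -maxout 0 -auth admin:changeme

# dump writes under the search job's dispatch directory; grab the newest one and push it up
DUMP_DIR=$(ls -td "$SPLUNK_HOME"/var/run/splunk/dispatch/*/dump 2>/dev/null | head -1)
aws s3 cp "$DUMP_DIR" s3://my-analytics-bucket/splunk-export/$(date +%F)/ --recursive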
Downside: depending on the definition of "tonnes" and how much time those tonnes span, the searches may need to be well thought out and broken down into chunks of time.
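For example, a crude way to chunk it is a shell loop that exports one day per search (same placeholder index and credentials as above, GNU date assumed):

# walk back one day per search so no single dump has to cover the whole range
for d in $(seq 1 30); do
  # midnight boundaries for that day as epoch seconds, which earliest/latest accept
  EARLIEST=$(date -d "${d} days ago 00:00" +%s)
  LATEST=$(date -d "$((d-1)) days ago 00:00" +%s)
  $SPLUNK_HOME/bin/splunk search "index=web earliest=${EARLIEST} latest=${LATEST} | dump basefilename=web_export_${EARLIEST} format=csv" -maxout 0 -auth admin:changeme
done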
The command is marked as "internal" and unsupported, but depending on your needs that may be fine.
It also requires a single long-running search that could fail.
This is where something like posting the search job via the REST API, then programmatically pulling down the results and validating them with some logic, might be more robust depending on your needs.
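A hedged sketch of that REST flow against the search head's management port (8089 by default), again with placeholder credentials, index and bucket, and jq doing the JSON parsing; a real script would also bail out if the job ends up FAILED instead of DONE.

# create the search job and capture its sid
SID=$(curl -sk -u admin:changeme https://localhost:8089/services/search/jobs \
  -d search='search index=web earliest=-1d@d latest=@d' -d output_mode=json | jq -r .sid)

# poll until the job reports DONE
until [ "$(curl -sk -u admin:changeme -G "https://localhost:8089/services/search/jobs/${SID}" \
  -d output_mode=json | jq -r '.entry[0].content.dispatchState')" = "DONE" ]; do
  sleep 10
done

# pull all results as CSV, check we got more than a header row, then ship the file to S3
curl -sk -u admin:changeme -G "https://localhost:8089/services/search/jobs/${SID}/results" \
  -d output_mode=csv -d count=0 -o "results_${SID}.csv"
[ "$(wc -l < "results_${SID}.csv")" -gt 1 ] && \
  aws s3 cp "results_${SID}.csv" "s3://my-analytics-bucket/splunk-export/results_${SID}.csv"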
Other options would likely have you looking at surgery on Splunk-formatted buckets, which is probably lower level than you need/want to go if some other system needs to eat the data.
If this is also about new data streaming in, I'd be looking at Ingest Actions to output to S3.