The goal is to export data from Splunk indexes to Hadoop.
We have a search head cluster. The documentation doesn't mention deploying the Splunk Hadoop Connect app in a cluster.
The page below says it's not supported in a clustered SH:
https://answers.splunk.com/answers/368847/is-hadoop-connect-supported-in-a-splunk-search-hea.html?ut...
So my question is:
If you want to export the data that's on your indexers, how about Hadoop Data Roll?
That's a brief answer because I wasn't sure of the why and the specifics of getting the data to Hadoop. Knowing more about that would help me offer alternative solutions. Cool?
Hi Burch,
The goal is to move data from Splunk to Hadoop/S3 for longer retention. Currently we store data in Splunk for only two months.
We want to send the data to Hadoop so the analytics team can analyse it further with Hive, etc.
I understand that:
1) We can simply forward data to another system as it arrives at Splunk,
or
2) We can export data after it has been cooked by Splunk.
However, I'm not sure how to achieve the first part.
Hadoop Connect allows data export to Hadoop, but it doesn't seem to be supported in a clustered architecture, so I'm looking for other options.
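For what it's worth, the first option (forwarding raw data to another system as it arrives) can be sketched at the forwarder level with an outputs.conf stanza. This is only a sketch: the host name and port are placeholders for whatever non-Splunk receiver you stand up in front of HDFS.

```ini
# outputs.conf on the forwarder -- sketch only; "hadoop-ingest.example.com"
# and port 5140 are placeholder values for your own ingest endpoint
[tcpout]
defaultGroup = hadoop_export

[tcpout:hadoop_export]
server = hadoop-ingest.example.com:5140
# Send raw data instead of Splunk's cooked/parsed wire format,
# so a non-Splunk receiver (e.g. a collector writing to HDFS) can read it
sendCookedData = false
```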
I believe Hadoop Data Roll is used when I want to use Splunk again for analytics on the archived data, which we don't need. Will this archived index follow the same two-month retention period? Please correct me if I'm wrong.
Thanks for the extra detail. It sounds like Hadoop Data Roll is exactly what you want. It's the ideal way to roll data from Splunk to Hadoop. Yes, it will still allow you to search the data, but you don't have to do that, AND it will have DIFFERENT retention than the index did in Splunk. Read through the documentation and I'm confident you'll agree it's 100% exactly what you want to use.
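To make that concrete, here is a minimal sketch of what a Hadoop Data Roll setup in indexes.conf can look like. The provider name, HDFS paths, and the 60-day threshold are all placeholder assumptions; check the documentation for your Splunk version for the exact settings.

```ini
# indexes.conf -- sketch only; provider name, paths, and thresholds
# below are placeholder assumptions
[provider:hadoop_archive]
vix.family = hadoop
vix.env.HADOOP_HOME = /opt/hadoop
vix.env.JAVA_HOME = /opt/java
vix.fs.default.name = hdfs://namenode.example.com:8020

[main_archive]
vix.provider = hadoop_archive
# Archive buckets from the "main" index once they are older than
# 60 days (5184000 seconds) -- this is independent of the index's
# own retention inside Splunk
vix.output.buckets.from.indexes = main
vix.output.buckets.older.than = 5184000
vix.output.buckets.path = hdfs://namenode.example.com:8020/splunk-archive
```

The key point for your retention question is that the archive threshold and the index's frozenTimePeriodInSecs are separate knobs, so the copy in Hadoop can outlive the two-month window in Splunk.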
Thank you for the notes. I will go through the documentation and get back in case of further queries.
Hi @SloshBurch ,
I did go through the documentation and had a few queries. I have posted these queries at the below link as well:
Can I use Hadoop tools like Hive, Pig, etc. for analytics on the archived data sent to Hadoop via Hadoop Data Roll?
AND
Can I use Hadoop tools like Hive, Pig, etc. for analytics on the archived data sent to Hadoop via Splunk Hadoop Connect export?
I came across Splunk Archive Bucket Reader, an additional app for analyzing the archived data with Hadoop applications like Pig, Hive, and Spark.
Is this app mandatory if I want to analyse the archived data in Hadoop?
Is a Splunk Analytics for Hadoop license required for sending archived data to Hadoop?
Thanks @saranya_fmr. I was at .conf2017 and then another work trip hence the delayed response. I will respond in the new thread you created. Thanks for starting that.
Also, if we answered your initial question, please make sure to accept that answer so others know whether it's worth exploring this thread.