Is the Splunk Hadoop Connect app supported in a search head cluster?

Communicator

The goal is to export data from Splunk indexes to Hadoop.

We have a search head cluster. The documentation doesn't mention deploying the Splunk Hadoop Connect app in a cluster.

The page below says it's not supported in a search head cluster.
https://answers.splunk.com/answers/368847/is-hadoop-connect-supported-in-a-splunk-search-hea.html?ut...

So my question is:

  1. Is it supported in a search head cluster? If not, what are the other options for sending Splunk index data to Hadoop?
1 Solution

Ultra Champion

If you want to export data from the indexers, how about Hadoop Data Roll?

This is a brief answer because I wasn't sure why you want the data in Hadoop or what the specifics are. Knowing more about that would help in suggesting alternative solutions. Cool?


Communicator

Hi Burch,

The goal is to move the data from Splunk to Hadoop/S3 for longer data retention. Currently we keep data in Splunk for only two months.
We want to send the data to Hadoop, where the analytics team wants to analyse it further with Hive, etc.
I understand that:
1) We can simply forward data to another system as it arrives at Splunk
or
2) We can export data after it has been cooked by Splunk.
However, I'm not sure how to achieve the first option.
Hadoop Connect allows data export to Hadoop, but it doesn't seem to be supported in a clustered architecture, so I'm looking for other options.

I believe Hadoop Data Roll is meant for cases where I want to use Splunk again to analyse the archived data, which we don't need. The archived index will follow the same two-month retention period, right? Please correct me if I'm wrong.


Ultra Champion

Thanks for the extra detail. It sounds like Hadoop Data Roll is exactly what you want. It's the ideal way to roll data from Splunk to Hadoop. Yes, it will still let you search the archived data from Splunk, but you don't have to do that, AND the archive will have DIFFERENT retention than the index had in Splunk. Read through the documentation and I'm confident you'll agree that it's exactly what you want to use.
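To make the suggestion concrete, here is a rough sketch of what a Hadoop Data Roll archive definition in indexes.conf might look like. The setting names are given as I recall them from the Hadoop Data Roll documentation, so verify them against the docs for your Splunk version. The provider name, stanza name, host, paths, and the 30-day threshold are all hypothetical placeholders, not values from this thread.

```
# indexes.conf (sketch only; every name, path, and value here is a placeholder)

# Hadoop provider: tells Splunk where Java, the Hadoop client, and HDFS live
[provider:hadoop_archive_provider]
vix.family = hadoop
vix.env.JAVA_HOME = /usr/lib/jvm/java
vix.env.HADOOP_HOME = /opt/hadoop
vix.fs.default.name = hdfs://namenode.example.com:8020
vix.splunk.home.hdfs = /user/splunk/workdir

# Archive index: copies buckets from the "main" index into HDFS
[main_archive]
vix.provider = hadoop_archive_provider
vix.output.buckets.from.indexes = main
# Age threshold in seconds: 2592000 = 30 days
vix.output.buckets.older.than = 2592000
vix.output.buckets.path = /user/splunk/archive/main
```

The age threshold only controls when buckets are copied to HDFS. How long the copies stay in HDFS is up to you on the Hadoop side, which is why the archive's retention can differ from the index's two months.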


Communicator

Thank you for the notes. I will go through the documentation and get back to you in case of further queries.


Communicator

Hi @SloshBurch ,

I did go through the documentation and had a few queries. I have posted them at the link below as well:

https://answers.splunk.com/answers/577310/difference-between-data-format-for-the-data-sent-t.html?mi...

  1. Firstly, what is the difference between the format of data sent via Hadoop Connect export and data sent via Hadoop Data Roll? I believe Hadoop Connect exports search results, while Hadoop Data Roll sends the raw data (journal.gz).
  2. Can I use Hadoop tools like Hive, Pig, etc. for analytics on the archived data sent to Hadoop via Hadoop Data Roll?
    AND
    Can I use Hadoop tools like Hive, Pig, etc. for analytics on the data sent to Hadoop via Splunk Hadoop Connect export?

  3. I came across Splunk Archive Bucket Reader, an additional app for analysing the archived data with Hadoop applications like Pig, Hive, and Spark.
    Is this app mandatory if I want to analyse the data in Hadoop?

  4. Is a Splunk Analytics for Hadoop license required for sending archived data to Hadoop?


Ultra Champion

Thanks @saranya_fmr. I was at .conf2017 and then on another work trip, hence the delayed response. I will respond in the new thread you created. Thanks for starting that.

Also, if we answered your initial question, please make sure to accept that answer so others know whether it's worth exploring this thread.
