Hunk App for AWS ELB - Setup

jtaylor_ticketb
Explorer

I'm having trouble getting the Hunk App for AWS ELB working. The error shown in the Hunk App is:

[elb_log_provider] Error while running external process, return_code=255. See search.log for more info
[elb_log_provider] RuntimeException - Failed to create a virtual index filesystem connection: Server IPC version 9 cannot communicate with client version 4. Advice: Verify that your vix.fs.default.name is correct and available.

Here's what I've done so far:

  • Created my EMR cluster for analyzing ELB access logs exactly per Amazon's tutorial.
  • Launched an instance of the Hunk AMI from the Amazon Marketplace.
  • Logged into the Splunk app at http://{public ip}:8000/ and changed my password.
  • SCP'd splunk-6.0.2-196940-Linux-x86_64.tgz to /opt/splunk_packages.
  • Logged into the instance and updated Splunk as follows:

    sudo /opt/splunk/bin/splunk stop
    sudo tar -xvzf splunk-6.0.2-196940-Linux-x86_64.tgz -C /opt
    sudo /opt/splunk/bin/splunk start --accept-license --answer-yes
    
  • Logged into the Splunk app at http://{public ip}:8000/ and uploaded the Hunk app from a file.

  • Updated the Hunk app per the in-app docs.

    • On this step I was unclear on the part that says "Change the HDFS working directory attribute to reflect a read/writable HDFS path in your cluster," so it's likely I didn't get this right. I created /hunk/working-dir/ on the local file system of all EMR nodes.
  • In my Amazon security groups for the EMR master and slave, I opened up all TCP, UDP, and ICMP from the Hunk instance to the EMR instances.

Can someone please advise on the next steps to troubleshoot?

Thanks!

1 Solution

Ledion_Bitincka
Splunk Employee

This looks like a Hadoop client library version mismatch: the version numbers reported in the error message indicate that the EMR cluster is running Hadoop 2.x while Hunk is trying to use Hadoop 1.x client libraries to communicate with it.

Can you please first confirm the EMR version and Hadoop client library version in Hunk before proceeding any further?
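
A minimal sketch of reading the two version numbers in that error, assuming only the mapping stated above (IPC version 9 means Hadoop 2.x, IPC version 4 means Hadoop 1.x). The function names `ipc_to_hadoop_line` and `explain_ipc_error` are made up for illustration and are not part of Hunk or Hadoop.

```shell
#!/usr/bin/env bash
# Map a Hadoop IPC protocol version to the release line discussed in this thread.
# Only the two values from the error message are covered; anything else is "unknown".
ipc_to_hadoop_line() {
  case "$1" in
    4) echo "Hadoop 1.x" ;;
    9) echo "Hadoop 2.x" ;;
    *) echo "unknown" ;;
  esac
}

# Extract the server and client IPC versions from the error text and explain them.
explain_ipc_error() {
  local msg="$1" server client
  server=$(printf '%s\n' "$msg" | sed -n 's/.*Server IPC version \([0-9][0-9]*\).*/\1/p')
  client=$(printf '%s\n' "$msg" | sed -n 's/.*client version \([0-9][0-9]*\).*/\1/p')
  echo "cluster (server): $(ipc_to_hadoop_line "$server")"
  echo "Hunk (client): $(ipc_to_hadoop_line "$client")"
}

explain_ipc_error "Server IPC version 9 cannot communicate with client version 4"
```

To confirm the versions directly, `hadoop version` on the EMR master and the `bin/hadoop version` command under Hunk's configured Hadoop Home (for example the /opt/hadoop/apache/hadoop-1.0.3 path quoted elsewhere in this thread) should each print the release they belong to.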

0 Karma

rdagan_splunk
Splunk Employee

If you need additional help, you may want to look at this video, Hunk by the Hour: http://aws.amazon.com/elasticmapreduce/hunk/

0 Karma

rdagan_splunk
Splunk Employee

Hadoop Home = Yes. Change it to your actual Hadoop Home on the client (Hunk Node)
Working Dir = Yes. Change it to something like /user/
Job Queue = No. The default is good enough unless you have a Multi User environment in Hadoop
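
On the working directory: it has to be a path inside HDFS, not a directory on each node's local disk. A rough sketch of creating one and checking that it is writable, assuming a Hadoop 2.x client is on the PATH of the node where you run it. The `WORKING_DIR` default, the `RUN` switch, and the `hdfs_workdir_commands` helper are illustrative only; by default the script just prints the commands.

```shell
#!/usr/bin/env bash
# Print (or run) the commands that create an HDFS working directory for Hunk.
# WORKING_DIR and RUN are illustrative assumptions, not Hunk settings.
WORKING_DIR="${WORKING_DIR:-/hunk/working-dir}"

hdfs_workdir_commands() {
  local dir="$1"
  # -p creates parent directories; this flag assumes a Hadoop 2.x client.
  echo "hadoop fs -mkdir -p $dir"
  # Write and remove an empty marker file to confirm the path is writable.
  echo "hadoop fs -touchz $dir/.write-test"
  echo "hadoop fs -rm $dir/.write-test"
}

if [ "${RUN:-0}" = "1" ]; then
  # Execute the commands, stopping at the first failure.
  hdfs_workdir_commands "$WORKING_DIR" | sh -e
else
  # Dry run: only show what would be executed.
  hdfs_workdir_commands "$WORKING_DIR"
fi
```

Running it with `RUN=1` as the user that Hunk's MapReduce jobs run under would be the more meaningful check, since directory permissions in HDFS are per user.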

0 Karma

albertho
New Member

Okay thanks. I'm still trying to get things configured right. I'll let you know if I have any problems. You guys are really responsive. Thanks!

0 Karma

albertho
New Member

Hi. I'm trying to get Hunk set up and I'm looking at the configs under Settings, Virtual Indexes. This is my first time dealing with Hadoop and HDFS, so I'll probably be asking some silly questions.

I would like to follow the instructions in the blog post with Hunk.
http://docs.splunk.com/Documentation/Hunk/latest/Hunk/InstallHunkAWSwithEMR

Some questions:
Do I need to change the path to Hadoop Home? /opt/hadoop/apache/hadoop-1.0.3
Do I need to change working dir? /hunk/working-dir
Do I need to change job queue? default

Thanks!

0 Karma

jtaylor_ticketb
Explorer

Hadoop was set up according to the tutorial on Amazon titled: Tutorial: Query Elastic Load Balancing Access Logs with Amazon Elastic MapReduce

That tutorial selects Hadoop 2.2.0.

Thanks for the responses, but we gave up. There were too many undocumented steps to getting the Hunk AMI running. Or, possibly more accurately: the documentation is spread across too many sources for someone unfamiliar with Splunk and Hunk to have a good chance of success.

0 Karma

jtaylor_ticketb
Explorer

Sure thing. The March 2014 email and blog post from Amazon announcing Elastic Load Balancing Access Logs has both a link to Amazon's tutorial for setting up EMR for your ELB logs and a link to a Splunk blog post about the Hunk App. We tried following one and then the other, but it looks like those two sets of instructions are incompatible at the moment.

0 Karma

Ledion_Bitincka
Splunk Employee

Thanks for your honest feedback! We'll certainly look into improving the first time run experience. We always welcome any suggestions you might have.

0 Karma

rdagan_splunk
Splunk Employee

It looks as if you are trying to configure this application with Splunk and not with Hunk. Are you able to connect to AWS using Hunk?

0 Karma