I'm having trouble getting the Hunk App for AWS ELB working. The error shown in the Hunk App is:
[elb_log_provider] Error while running external process, return_code=255. See search.log for more info
[elb_log_provider] RuntimeException - Failed to create a virtual index filesystem connection: Server IPC version 9 cannot communicate with client version 4. Advice: Verify that your vix.fs.default.name is correct and available.
Here's what I've done so far:
Logged into the Splunk app at http://{public ip}:8000/ and changed my password.
Used SCP to copy splunk-6.0.2-196940-Linux-x86_64.tgz to /opt/splunk_packages.
Logged into the instance and updated Splunk as follows:
sudo /opt/splunk/bin/splunk stop
sudo tar -xvzf splunk-6.0.2-196940-Linux-x86_64.tgz -C /opt
sudo /opt/splunk/bin/splunk start --accept-license --answer-yes
Logged into the Splunk app at http://{public ip}:8000/ and uploaded the Hunk app from a file.
Updated the Hunk app per the in-app docs.
In my Amazon security groups for the EMR master and slave, I opened up all TCP, UDP, and ICMP from the Hunk instance to the EMR instances.
Can someone please advise on the next steps to troubleshoot?
Thanks!
This looks like a Hadoop client library version mismatch: the version numbers in the error message indicate that the EMR cluster is running Hadoop 2.x while Hunk is trying to use Hadoop 1.x client libraries to communicate with it.
Can you please first confirm the EMR version and Hadoop client library version in Hunk before proceeding any further?
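To make the reading of the error concrete, here is a minimal shell sketch that pulls the two IPC version numbers out of the message and maps them to a Hadoop release line. The function name ipc_to_hadoop is made up for illustration, and the mapping only covers the two values seen in this thread (4 for Hadoop 1.x, 9 for Hadoop 2.x); treat anything else as unknown.

```shell
# Error text copied from the question in this thread.
msg='Server IPC version 9 cannot communicate with client version 4'

# Extract the server-side and client-side IPC version numbers.
server_ipc=$(printf '%s\n' "$msg" | sed -n 's/.*Server IPC version \([0-9][0-9]*\).*/\1/p')
client_ipc=$(printf '%s\n' "$msg" | sed -n 's/.*client version \([0-9][0-9]*\).*/\1/p')

# Hypothetical helper: map an IPC version to a Hadoop release line.
# Only the two values discussed in this thread are covered.
ipc_to_hadoop() {
  case "$1" in
    4) echo "Hadoop 1.x" ;;
    9) echo "Hadoop 2.x" ;;
    *) echo "unknown" ;;
  esac
}

echo "cluster: $(ipc_to_hadoop "$server_ipc"), client: $(ipc_to_hadoop "$client_ipc")"
```

If the two sides come out different, as they do here, the client libraries under Hadoop Home on the Hunk node probably need to match the cluster's Hadoop version.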
If you need additional help you may want to look at this video - Hunk by the Hour: http://aws.amazon.com/elasticmapreduce/hunk/
Hadoop Home = Yes. Change it to your actual Hadoop Home on the client (Hunk Node)
Working Dir = Yes. Change it to something like /user/
Job Queue = No. The default is good enough unless you have a Multi User environment in Hadoop
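One way to check what Hadoop Home currently points at is to ask the client under that directory for its version. This is a rough sketch to run on the Hunk node: check_hadoop_home is a made-up helper name, and the fallback path is simply the default value quoted in the question, so substitute your own.

```shell
# Hypothetical helper: print the version of the Hadoop client installed
# under the given directory, or fail if there is no executable bin/hadoop.
check_hadoop_home() {
  if [ -x "$1/bin/hadoop" ]; then
    # "hadoop version" prints the release on its first line.
    "$1/bin/hadoop" version | head -n 1
  else
    echo "no executable bin/hadoop under $1" >&2
    return 1
  fi
}

# Falls back to the default Hadoop Home from the question if HADOOP_HOME is unset.
check_hadoop_home "${HADOOP_HOME:-/opt/hadoop/apache/hadoop-1.0.3}" || true
```

If this reports a 1.x release while the EMR cluster runs 2.x, Hadoop Home should be pointed at a client install that matches the cluster.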
Okay thanks. I'm still trying to get things configured right. I'll let you know if I have any problems. You guys are really responsive. Thanks!
Hi. I'm trying to get Hunk set up and I'm looking at the configs under Settings, Virtual Indexes. This is my first time dealing with Hadoop and HDFS, so I'll probably be asking some silly questions.
I would like to follow the instructions in the blog post with Hunk.
http://docs.splunk.com/Documentation/Hunk/latest/Hunk/InstallHunkAWSwithEMR
Some questions:
Do I need to change the path to Hadoop Home? /opt/hadoop/apache/hadoop-1.0.3
Do I need to change working dir? /hunk/working-dir
Do I need to change job queue? default
Thanks!
Hadoop was set up according to the tutorial on Amazon titled: Tutorial: Query Elastic Load Balancing Access Logs with Amazon Elastic MapReduce
That tutorial selects Hadoop 2.2.0.
Thanks for the responses, but we gave up. There were too many undocumented steps to getting the Hunk AMI running. Or possibly more accurate: the documentation is spread across too many sources for someone unfamiliar with Splunk and Hunk to have a good chance of success.
Sure thing. The March 2014 email and blog post from Amazon announcing Elastic Load Balancing Access Logs has both a link to Amazon's tutorial for setting up EMR for your ELB logs and a link to a Splunk blog post about the Hunk App. We tried following one and then the other, but it looks like those two sets of instructions are incompatible at the moment.
Thanks for your honest feedback! We'll certainly look into improving the first time run experience. We always welcome any suggestions you might have.
It looks as if you are trying to configure this application with Splunk and not with Hunk. Are you able to connect to AWS using Hunk?