We are getting this error:
[psb_cloudera] IOException - Error while waiting for MapReduce job to complete, job_id=[cloudera-node1.ngid.centurylink.net:8088/cluster/app/application_1400017911623_0008 job_1400017911623_0008], state=FAILED, reason=Application application_1400017911623_0008 failed 2 times due to AM Container for appattempt_1400017911623_0008_000002 exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
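If log aggregation is enabled on the cluster, the container logs behind this error can be pulled with the YARN CLI, using the application id from the message:

yarn logs -applicationId application_1400017911623_0008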
Pig, Impala, Hive, and everything else work. I can run the samples fine, and other MR2 jobs work. Can someone send a sample of the provider configuration?
I had a similar problem working with a customer in a Cloudera 5.7 environment.
After some digging, the customer found the solution:
It was the missing $MR2_CLASSPATH value!
Kudos to @ohoppe!
If you get errors, look for distribution-specific settings or values!
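One quick way to find where a distribution defines and consumes a variable like $MR2_CLASSPATH is to grep the Hadoop client config (the directory here is an assumption for a typical CDH client install):

grep -rn "MR2_CLASSPATH" /etc/hadoop/conf/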
In the same vein, on the subject of Hunk Virtual Index configuration:
We had an issue where MR jobs kicked off by Splunk/Hunk were not showing up in the Job History UI. The logs were being written to hdfs:/tmp/splunk... with splunk:superuser as the owner.
For CDH5, you must configure the provider so the job history files land where the Job History server can read them (a sketch follows below).
Then the logs are written to hdfs:/user/splunk with the correct permissions, and the Job History server (which runs as mapred) can pick up the logs.
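A sketch of what this looks like, assuming the fix is relocating the MapReduce staging directory out of /tmp (Hunk passes vix.-prefixed Hadoop properties through to the job configuration; the specific property here is my assumption, not a confirmed setting from the original post):

[provider:CDH5]
# assumption: move the MR staging dir (and thus the job history files)
# from the /tmp default to /user so they land under /user/splunk, where
# the JobHistory server running as mapred can read them
vix.yarn.app.mapreduce.am.staging-dir = /user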
Here are my configs for getting Hunk 6.1 to work with the CDH5 VM. There are two particular configs you need to set (vix.yarn.application.classpath & vix.mapred.job.map.memory.mb) for Hunk 6.1 to be able to submit MR/YARN jobs into the VM.
cat $SPLUNK_HOME/etc/apps/search/local/indexes.conf

[provider:CDH5-VM]
vix.family = hadoop
vix.command.arg.3 = $SPLUNK_HOME/bin/jars/SplunkMR-s6.0-hy2.0.jar
vix.env.HADOOP_HOME = /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47
vix.env.JAVA_HOME = /usr/java/jdk1.7.0_45-cloudera
vix.fs.default.name = hdfs://localhost.localdomain:8020
vix.splunk.home.hdfs = /user/cloudera/hunk/workdir
vix.mapreduce.framework.name = yarn
vix.yarn.resourcemanager.address = localhost.localdomain:8032
vix.yarn.resourcemanager.scheduler.address = localhost.localdomain:8030
### unset this so the default value from Hadoop's xml conf files is used
vix.yarn.application.classpath =
### in CDH5 VM the max container size is 1GB, but Hunk's default is 2GB - lower it
vix.mapred.job.map.memory.mb = 1024

[test]
vix.provider = CDH5-VM
vix.input.1.path = /user/cloudera/hunk/data/...
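With the provider and virtual index in place, a quick smoke test from the Splunk search bar is just to search the virtual index (the index name comes from the [test] stanza above):

index=test | head 100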
I've also blogged about how to use Hunk to troubleshoot itself and Hadoop while setting it up.
Here are some additional notes when working with Hunk and Hadoop.
Make sure that the hosts file and resolv.conf are configured. For example, the host running Splunk should have the Hadoop hosts in /etc/hosts and resolv.conf, and/or configured in DNS. If you see a Java connect exception when the job reports back with an address of 0.0.0.0, the networking issues need to be corrected; basically, the hostname cannot be resolved.
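For example, entries like these on the Splunk host (the IPs and the second node are placeholders; use your cluster's real addresses):

# /etc/hosts on the Splunk search head -- placeholder IPs
192.168.1.10   cloudera-node1.ngid.centurylink.net   cloudera-node1
192.168.1.11   cloudera-node2.ngid.centurylink.net   cloudera-node2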
The Hadoop install on the Splunk host should match the CDH/Hortonworks/MapR version running on the cluster.
Verify you can run a Hadoop job on the Splunk box as the splunk user:
$ /opt/hadoop-2.3.0-cdh5.0.0/bin/hadoop jar /opt/hadoop-2.3.0-cdh5.0.0/share/hadoop/mapreduce2/hadoop-mapreduce-examples-2.3.0-cdh5.0.0.jar pi 10 10
Try an HDFS command from the Splunk host (correct the IP address below to yours):
hadoop fs -ls hdfs://22.214.171.124:8020/user/splunk
No, that did not work. I got it to work by doing the following for CDH5:
1) Go to the directory that contains hadoop-config.sh.
2) Edit hadoop-config.sh and add a set -x line.
3) Run the file to obtain the classpath.
4) Cut and paste the classpath that shows up in the last line of the output.
5) Now set vix.yarn.application.classpath to that classpath (see the sketch below).
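Here is the procedure end to end as a shell sketch; the parcel path is an assumption for a CDH 5 parcel install, so adjust it to wherever your hadoop-config.sh actually lives:

# assumption: CDH parcel layout -- adjust to your install
cd /opt/cloudera/parcels/CDH/lib/hadoop/libexec

# add a 'set -x' line near the top so the script traces the
# CLASSPATH it assembles (step 2 above)
vi hadoop-config.sh

# run it; the trace goes to stderr, and the assembled classpath
# shows up in the last lines
bash hadoop-config.sh 2>&1 | tail -n 5

# then in indexes.conf:
#   vix.yarn.application.classpath = <classpath from the trace>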
Can you try (un)setting the following value in your provider, either in apps/search/local/indexes.conf or through the Virtual Index manager pages?
[provider-stanza]
...
# unset the value for this conf by setting it to an empty string
vix.yarn.application.classpath =
That is a problem we've come across on a couple of CDH versions; however, we need to look at the container logs to determine the root cause of the failure. If the above doesn't work, please include the provider config and the logs from the failed container attempts.
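For the provider config, btool will dump the effective settings along with the file each one comes from:

$SPLUNK_HOME/bin/splunk btool indexes list --debug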