Splunk Analytics for Hadoop & AWS EMR: java.io.IOException: Error while waiting for MapReduce job to complete

pkeenan87
Communicator

I have set up Splunk Analytics for Hadoop and configured it to use AWS EMR to search data in S3. Streaming searches work fine, but as soon as a search starts a MapReduce (MR) job I get the following error:


Exception - java.io.IOException: Error while waiting for MapReduce job to complete, job_id=job_1540315856620_0642, state=FAILED, reason=Task failed task_1540315856620_0642_m_000002

pkeenan87
Communicator

The issue turned out to be that I had Data Model Acceleration turned on and it was running across my virtual indices. I had to modify the cim_* macros to stop Splunk from launching MR jobs on my Hadoop cluster every 5 minutes (the cluster was very small because I was only testing). Once I did that, searches started working fine.
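
The post does not show the edited macros. As a rough sketch, assuming the goal is to restrict a CIM index macro to native indexes only, the Python helper below renders a macros.conf stanza that leaves out virtual indexes. The macro name cim_Authentication_indexes and every index name are illustrative assumptions, not the poster's actual configuration.

```python
def cim_index_macro(macro_name, indexes, virtual_indexes):
    """Render a macros.conf stanza limited to non-virtual indexes.

    macro_name, indexes and virtual_indexes are hypothetical inputs;
    check the macro names shipped with your CIM add-on before using this.
    """
    excluded = set(virtual_indexes)
    # Keep the caller's ordering, dropping anything marked as virtual.
    allowed = [name for name in indexes if name not in excluded]
    if not allowed:
        raise ValueError("no native indexes left after excluding virtual ones")
    clause = " OR ".join(f"index={name}" for name in allowed)
    return f"[{macro_name}]\ndefinition = ({clause})\n"


if __name__ == "__main__":
    print(cim_index_macro(
        "cim_Authentication_indexes",          # assumed macro name
        ["main", "wineventlog", "hadoop_vix"],  # assumed index names
        ["hadoop_vix"],                          # assumed virtual index
    ))
```

With an explicit index list in place, accelerated data model searches should no longer fan out to the virtual indexes, which is the effect the poster describes.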

richgalloway
SplunkTrust

@pkeenan87 If your problem is resolved, please accept the answer to help future readers.

sduff_splunk
Splunk Employee

Have you confirmed that you can run MapReduce jobs from the search head via the command line, i.e., without going through Splunk?

Confirm that Hadoop is properly configured. My customer was missing yarn-site.xml and core-site.xml from the /opt/hadoop/etc/hadoop/ directory, and their deployment exhibited the same issue you are seeing.

http://docs.splunk.com/Documentation/HadoopConnect/1.2.5/DeployHadoopConnect/Setupcompressedfiletype...

Also refer to the following question on Stack Overflow: https://stackoverflow.com/questions/43425678/application-failed-2-times-due-to-am-container-exited-w... . Again, check that Hadoop actually works from the CLI, and check that all CLASSPATH and PATH variables are correct when running as the splunk user.
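
The checks above can be partly scripted. The following is a minimal sketch, assuming Python 3 is available on the search head and that it is run as the splunk user; the function names are made up for illustration, and the list of required files only covers the two named in this reply.

```python
import shutil
from pathlib import Path

# The two files reported missing in this reply; extend as needed.
REQUIRED_CONFIGS = ("core-site.xml", "yarn-site.xml")


def missing_hadoop_configs(conf_dir, required=REQUIRED_CONFIGS):
    """Return the required config file names that are absent from conf_dir."""
    conf = Path(conf_dir)
    return [name for name in required if not (conf / name).is_file()]


def hadoop_on_path():
    """Report whether a `hadoop` executable is found on the current PATH."""
    return shutil.which("hadoop") is not None
```

Calling missing_hadoop_configs("/opt/hadoop/etc/hadoop") returns an empty list when both files are present. This only confirms the files exist and that the binary resolves; it does not replace actually submitting a test job from the CLI.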

pkeenan87
Communicator

Here is the output from search.log:

10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - Caused by: java.io.IOException: Error while waiting for MapReduce job to complete, job_id=[!http://ip-172-29-24-220.ec2.internal:8088/cluster/app/application_1540315856620_1358 job_1540315856620_1358], state=FAILED, reason=Application application_1540315856620_1358 failed 2 times due to AM Container for appattempt_1540315856620_1358_000002 exited with exitCode: 1
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - For more detailed output, check application tracking page:http://ip-172-29-24-220.ec2.internal:8088/cluster/app/application_1540315856620_1358Then, click on links to logs of each attempt.
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - Diagnostics: Exception from container-launch.
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - Container id: container_1540315856620_1358_02_000001
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - Exit code: 1
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - Stack trace: ExitCodeException exitCode=1:
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at org.apache.hadoop.util.Shell.run(Shell.java:479)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at java.util.concurrent.FutureTask.run(FutureTask.java:266)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at java.lang.Thread.run(Thread.java:748)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - Container exited with a non-zero exit code 1
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - Failing this attempt. Failing the application.
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at com.splunk.mr.JobSubmitter.waitForCurrentJobToComplete(JobSubmitter.java:534)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at com.splunk.mr.JobSubmitter.startJobImpl(JobSubmitter.java:647)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - at com.splunk.mr.JobSubmitter.startJob(JobSubmitter.java:759)
10-24-2018 09:16:18.799 ERROR ERP.test_hadoop - ... 10 more
10-24-2018 09:16:18.799 INFO ERP.test_hadoop - SplunkMR - finishing, version=6.2 ...
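
When search.log is long, it can help to pull out just the YARN identifiers. The sketch below is one possible way to do that in Python, assuming the ERP error lines follow the layout shown above; the function name and regular expressions are illustrative and not part of Splunk.

```python
import re

# Patterns inferred from the log excerpt above, not from any Splunk spec.
APP_ID_RE = re.compile(r"application_\d+_\d+")
URL_RE = re.compile(r"https?://\S+?/cluster/app/application_\d+_\d+")
EXIT_RE = re.compile(r"exit\s?code[:=]?\s*(\d+)", re.IGNORECASE)


def summarize_erp_errors(lines):
    """Collect YARN application ids, tracking URLs and exit codes
    from ERP ERROR lines of a search.log."""
    app_ids, urls, exit_codes = set(), set(), set()
    for line in lines:
        if " ERROR ERP." not in line:
            continue  # skip INFO lines and anything not from an ERP process
        app_ids.update(APP_ID_RE.findall(line))
        urls.update(URL_RE.findall(line))
        exit_codes.update(int(code) for code in EXIT_RE.findall(line))
    return {
        "app_ids": sorted(app_ids),
        "tracking_urls": sorted(urls),
        "exit_codes": sorted(exit_codes),
    }
```

The URL pattern stops at the end of the application id, so it still works where the log fuses the URL with the following word ("..._1358Then"). If log aggregation is enabled on the cluster, the extracted id can be passed to `yarn logs -applicationId <id>` to fetch the container logs, which usually hold the real cause behind exit code 1.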
