<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Yarn Container exit code 143 in All Apps and Add-ons</title>
    <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317546#M38020</link>
    <description>&lt;P&gt;It looks as if you have some memory issues in the Hadoop nodes, so some of the jobs are being killed.&lt;/P&gt;</description>
    <pubDate>Tue, 11 Apr 2017 00:43:36 GMT</pubDate>
    <dc:creator>rdagan_splunk</dc:creator>
    <dc:date>2017-04-11T00:43:36Z</dc:date>
    <item>
      <title>Yarn Container exit code 143</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317541#M38015</link>
      <description>&lt;P&gt;We are running a 10-datanode Hortonworks HDP v2.5 cluster on Ubuntu 14.04. Whenever I run a large YARN job, the map task shows as &lt;STRONG&gt;SUCCEEDED&lt;/STRONG&gt; but with a note: "&lt;STRONG&gt;Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143&lt;/STRONG&gt;"&lt;/P&gt;
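
&lt;P&gt;For anyone reading the exit code above: by shell convention, an exit status above 128 usually means the process was ended by a signal, with the signal number equal to the status minus 128. Under that convention 143 maps to signal 15 (SIGTERM), which fits the "killed on request" wording. The snippet below is only a minimal sketch of that arithmetic, not part of the original logs; decode_exit_code is a hypothetical helper name, and it assumes the 128 + signal convention applies to this container.&lt;/P&gt;

```python
import signal


def decode_exit_code(code):
    """Split a shell-style exit code into a (kind, detail) pair.

    Assumption: codes from 129 upward follow the common
    "128 + signal number" convention used by POSIX shells.
    """
    if code in range(129, 256):
        signum = code - 128
        try:
            # Map the number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal known to this platform; report the raw number.
            name = "signal %d" % signum
        return ("signal", name)
    # Anything else is treated as an ordinary exit status.
    return ("exit", str(code))


# 143 - 128 = 15, which is SIGTERM on Linux.
print(decode_exit_code(143))
```

&lt;P&gt;A SIGTERM on its own only says that something asked the container to stop; it does not say why, so the ApplicationMaster and NodeManager logs are still needed to find the cause.&lt;/P&gt;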

&lt;P&gt;Can someone help me troubleshoot this?&lt;/P&gt;

&lt;P&gt;&lt;IMG src="http://i.imgur.com/CdUazDr.png" alt="alt text" /&gt;&lt;/P&gt;

&lt;P&gt;yarn-yarn-nodemanager-datanode.log&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2017-04-03 10:15:18,140 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(810)) - Start request for container_e10_1484675915702_18333_01_000003 by user root
2017-04-03 10:15:18,151 INFO  application.ApplicationImpl (ApplicationImpl.java:transition(304)) - Adding container_e10_1484675915702_18333_01_000003 to application application_1484675915702_18333
2017-04-03 10:15:18,153 INFO  container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e10_1484675915702_18333_01_000003 transitioned from NEW to LOCALIZING
2017-04-03 10:15:18,157 INFO  yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(184)) - Initializing container container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:18,157 INFO  yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(185)) - Initializing container container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:18,358 INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:handle(712)) - Created localizer for container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:18,406 INFO  localizer.ResourceLocalizationService (ResourceLocalizationService.java:writeCredentials(1194)) - Writing credentials to the nmPrivate file /grid/3/hadoop/yarn/local/nmPrivate/container_e10_1484675915702_18333_01_000003.tokens. Credentials list: 
2017-04-03 10:15:18,407 INFO  container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e10_1484675915702_18333_01_000003 transitioned from LOCALIZING to LOCALIZED
2017-04-03 10:15:18,458 INFO  container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e10_1484675915702_18333_01_000003 transitioned from LOCALIZED to RUNNING
2017-04-03 10:15:18,462 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:buildCommandExecutor(281)) - launchContainer: [bash, /grid/1/hadoop/yarn/local/usercache/root/appcache/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003/default_container_executor.sh]
2017-04-03 10:15:18,465 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:startLocalizer(126)) - Copying from /grid/3/hadoop/yarn/local/nmPrivate/container_e10_1484675915702_18333_01_000003.tokens to /grid/2/hadoop/yarn/local/usercache/root/appcache/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003.tokens
2017-04-03 10:15:20,998 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(375)) - Starting resource-monitoring for container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:21,144 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(464)) - Memory usage of ProcessTree 851 for container-id container_e10_1484675915702_18333_01_000003: 148.7 MB of 2 GB physical memory used; 2.1 GB of 4.2 GB virtual memory used
2017-04-03 10:15:24,293 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(464)) - Memory usage of ProcessTree 851 for container-id container_e10_1484675915702_18333_01_000003: 305.4 MB of 2 GB physical memory used; 2.4 GB of 4.2 GB virtual memory used
2017-04-03 10:15:24,734 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:stopContainerInternal(960)) - Stopping container with container Id: container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:24,734 INFO  container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e10_1484675915702_18333_01_000003 transitioned from RUNNING to KILLING
2017-04-03 10:15:24,734 INFO  launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(425)) - Cleaning up container container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:24,743 WARN  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(237)) - Exit code from container container_e10_1484675915702_18333_01_000003 is : 143
2017-04-03 10:15:24,756 INFO  container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e10_1484675915702_18333_01_000003 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL
2017-04-03 10:15:24,757 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(480)) - Deleting absolute path : /grid/1/hadoop/yarn/local/usercache/root/appcache/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:24,757 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(480)) - Deleting absolute path : /grid/2/hadoop/yarn/local/usercache/root/appcache/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:24,757 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(480)) - Deleting absolute path : /grid/3/hadoop/yarn/local/usercache/root/appcache/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:24,757 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(480)) - Deleting absolute path : /grid/0/hadoop/yarn/local/usercache/root/appcache/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:24,757 INFO  container.ContainerImpl (ContainerImpl.java:handle(1163)) - Container container_e10_1484675915702_18333_01_000003 transitioned from CONTAINER_CLEANEDUP_AFTER_KILL to DONE
2017-04-03 10:15:24,757 INFO  application.ApplicationImpl (ApplicationImpl.java:transition(347)) - Removing container_e10_1484675915702_18333_01_000003 from application application_1484675915702_18333
2017-04-03 10:15:24,757 INFO  logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:startContainerLogAggregation(512)) - Considering container container_e10_1484675915702_18333_01_000003 for log-aggregation
2017-04-03 10:15:24,758 INFO  yarn.YarnShuffleService (YarnShuffleService.java:stopContainer(190)) - Stopping container container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:24,758 INFO  yarn.YarnShuffleService (YarnShuffleService.java:stopContainer(191)) - Stopping container container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:26,338 INFO  nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:removeOrTrackCompletedContainersFromContext(553)) - Removed completed containers from NM context: [container_e10_1484675915702_18333_01_000003]
2017-04-03 10:15:27,294 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(390)) - Stopping resource-monitoring for container_e10_1484675915702_18333_01_000003
2017-04-03 10:15:34,491 INFO  logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:doContainerLogAggregation(567)) - Uploading logs for container container_e10_1484675915702_18333_01_000003. Current good log dirs are /grid/1/hadoop/yarn/log,/grid/2/hadoop/yarn/log,/grid/3/hadoop/yarn/log,/grid/0/hadoop/yarn/log
2017-04-03 10:15:34,495 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(489)) - Deleting path : /grid/1/hadoop/yarn/log/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003/syslog
2017-04-03 10:15:34,496 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(489)) - Deleting path : /grid/1/hadoop/yarn/log/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003/directory.info
2017-04-03 10:15:34,496 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(489)) - Deleting path : /grid/1/hadoop/yarn/log/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003/stdout
2017-04-03 10:15:34,496 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(489)) - Deleting path : /grid/1/hadoop/yarn/log/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003/stderr
2017-04-03 10:15:34,496 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:deleteAsUser(489)) - Deleting path : /grid/1/hadoop/yarn/log/application_1484675915702_18333/container_e10_1484675915702_18333_01_000003/launch_container.sh
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 10 Apr 2017 20:19:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317541#M38015</guid>
      <dc:creator>suarezry</dc:creator>
      <dc:date>2017-04-10T20:19:36Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Container exit code 143</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317542#M38016</link>
      <description>&lt;P&gt;Is this a Splunk problem, or is it that you are using Splunk to detect the problem?&lt;/P&gt;

&lt;P&gt;If it is actually a Hadoop/YARN problem, then this is not the forum for that question, although it is possible that someone here might know the answer.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Apr 2017 20:24:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317542#M38016</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2017-04-10T20:24:49Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Container exit code 143</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317543#M38017</link>
      <description>&lt;P&gt;Yes, it is a Splunk problem. I am running a Hadoop search using Splunk Analytics for Hadoop and I am getting this problem. I would like some suggestions on how to troubleshoot this.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Apr 2017 20:28:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317543#M38017</guid>
      <dc:creator>suarezry</dc:creator>
      <dc:date>2017-04-10T20:28:45Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Container exit code 143</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317544#M38018</link>
      <description>&lt;P&gt;Ah - that's helpful to know. Have you looked at the splunkd.log for any error messages?&lt;/P&gt;</description>
      <pubDate>Mon, 10 Apr 2017 20:56:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317544#M38018</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2017-04-10T20:56:45Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Container exit code 143</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317545#M38019</link>
      <description>&lt;P&gt;There is nothing sticking out in the search.log:&lt;BR /&gt;
&lt;A href="https://pastebin.com/rmBTBFcG"&gt;https://pastebin.com/rmBTBFcG&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 10 Apr 2017 22:04:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317545#M38019</guid>
      <dc:creator>suarezry</dc:creator>
      <dc:date>2017-04-10T22:04:05Z</dc:date>
    </item>
    <item>
      <title>Re: Yarn Container exit code 143</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317546#M38020</link>
      <description>&lt;P&gt;It looks as if you have some memory issues in the Hadoop nodes, so some of the jobs are being killed.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Apr 2017 00:43:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Yarn-Container-exit-code-143/m-p/317546#M38020</guid>
      <dc:creator>rdagan_splunk</dc:creator>
      <dc:date>2017-04-11T00:43:36Z</dc:date>
    </item>
  </channel>
</rss>

