Root cause

The CDH cluster's YARN configuration does not give containers enough memory for this job: as the log below shows, the ApplicationMaster container exceeded its 2 GB physical memory limit and was killed.

Solution

Adjust the configuration items below to match the cluster's actual resources. Note that the two yarn.app.mapreduce.am.* properties are per-job MapReduce settings and belong in mapred-site.xml despite the yarn. prefix in their names.

| Configuration file | Property | Description | Reference value |
| --- | --- | --- | --- |
| yarn-site.xml | yarn.nodemanager.resource.memory-mb | Physical memory that can be allocated to containers on a node | 52 × 2 = 104 GB |
| yarn-site.xml | yarn.scheduler.minimum-allocation-mb | Minimum physical memory a container may request (MiB) | 2 GB |
| yarn-site.xml | yarn.scheduler.maximum-allocation-mb | Maximum physical memory a container may request (MiB) | 52 × 2 = 104 GB |
| yarn-site.xml | yarn.nodemanager.vmem-pmem-ratio | Ratio of virtual to physical memory used when enforcing container memory limits | Defaults to 2.1; adjust to the actual workload |
| mapred-site.xml | yarn.app.mapreduce.am.resource.mb | Physical memory required by the MapReduce ApplicationMaster (MiB) | 2 × 2 = 4 GB |
| mapred-site.xml | yarn.app.mapreduce.am.command-opts | Java command-line options passed to the MapReduce ApplicationMaster | 0.8 × 2 × 2 = 3.2 GB |
| mapred-site.xml | mapreduce.map.memory.mb | Physical memory allocated to each map task of a job (MiB) | 2 GB |
| mapred-site.xml | mapreduce.reduce.memory.mb | Physical memory allocated to each reduce task of a job (MiB) | 2 × 2 = 4 GB |
| mapred-site.xml | mapreduce.map.java.opts | Java options (heap size) for map task processes | 0.8 × 2 = 1.6 GB |
| mapred-site.xml | mapreduce.reduce.java.opts | Java options (heap size) for reduce task processes | 0.8 × 2 × 2 = 3.2 GB |
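The reference values follow a common sizing rule of thumb: the JVM heap (the -Xmx value in the *.java.opts and command-opts settings) is about 0.8 × the container size, and the reduce and ApplicationMaster containers are 2 × the 2 GB map container. As a minimal sketch, assuming the 104 GB / 2 GB figures above (examples for this particular cluster, not universal values) converted to MiB, the entries would look like this:

```xml
<!-- yarn-site.xml -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>106496</value> <!-- 104 GB = 104 * 1024 MiB -->
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>   <!-- 2 GB -->
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>106496</value> <!-- same as the NodeManager total here -->
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>    <!-- default; raise it if jobs hit the virtual memory limit -->
  </property>
</configuration>

<!-- mapred-site.xml -->
<configuration>
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>4096</value>      <!-- 2 x the 2 GB minimum allocation -->
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx3276m</value> <!-- 0.8 * 4096 MiB -->
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>      <!-- 2 x the map container -->
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value> <!-- 0.8 * 2048 MiB -->
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3276m</value> <!-- 0.8 * 4096 MiB -->
  </property>
</configuration>
```

On CDH these settings are normally managed through Cloudera Manager rather than by editing the files directly, and the YARN and MapReduce services must be restarted for the changes to take effect.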

Exception log

The key line: the ApplicationMaster container of this Oozie-launched Hive job is using 2.1 GB against a 2 GB physical memory limit, so the NodeManager kills it (exit code 143, i.e. SIGTERM). Note also the 21.2 GB of 4.2 GB virtual memory used: 4.2 GB is the 2 GB container × the default vmem-pmem ratio of 2.1.

Application application_1543392650432_0855 failed 2 times due to AM Container for appattempt_1543392650432_0855_000002 exited with exitCode: -104
Failing this attempt.Diagnostics: [2018-12-01 14:57:17.762]Container [pid=31682,containerID=container_1543392650432_0855_02_000001] is running 120156160B beyond the 'PHYSICAL' memory limit. Current usage: 2.1 GB of 2 GB physical memory used; 21.2 GB of 4.2 GB virtual memory used. Killing container.
Dump of the process-tree for container_1543392650432_0855_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 1080 31768 31682 31682 (java) 2769 194 3968139264 299128 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dproc_jar -Djava.net.preferIPv4Stack=true -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties -Dyarn.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/lib/native -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop -Dhadoop.id.str=chenweidong -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/jars/hive-exec-2.1.1-cdh6.0.1.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -libjars file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-client.jar,file:/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/auxlib/hive-exec-2.1.1-cdh6.0.1-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-server.jar,file:/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/auxlib/hive-exec-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/lib/hive-hbase-handler-2.1.1-cdh6.0.1.jar -localtask -plan file:/tmp/yarn/cfb1d927-a086-4b93-af4b-9816f2dc9f49/hive_2018-12-01_14-56-08_101_7346185201514786308-1/-local-10010/plan.xml -jobconffile file:/tmp/yarn/cfb1d927-a086-4b93-af4b-9816f2dc9f49/hive_2018-12-01_14-56-08_101_7346185201514786308-1/-local-10011/jobconf.xml
|- 31682 31680 31682 31682 (bash) 0 0 11960320 344 /bin/bash -c /usr/java/jdk1.8.0_141-cloudera/bin/java -Dlog4j.configuration=container-log4j.properties -Dlog4j.debug=true -Dyarn.app.container.log.dir=/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001 -Dyarn.app.container.log.filesize=1048576 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dsubmitter.user=chenweidong org.apache.oozie.action.hadoop.LauncherAM 1>/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001/stdout 2>/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001/stderr
|- 31689 31682 31682 31682 (java) 355 28 14790037504 76787 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dlog4j.configuration=container-log4j.properties -Dlog4j.debug=true -Dyarn.app.container.log.dir=/yarn/container-logs/application_1543392650432_0855/container_1543392650432_0855_02_000001 -Dyarn.app.container.log.filesize=1048576 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dsubmitter.user=chenweidong org.apache.oozie.action.hadoop.LauncherAM
|- 31768 31756 31682 31682 (java) 1750 114 4003151872 176993 /usr/java/jdk1.8.0_141-cloudera/bin/java -Dproc_jar -Djava.net.preferIPv4Stack=true -Xmx2147483648 -Djava.net.preferIPv4Stack=true -Dlog4j.configurationFile=hive-log4j2.properties -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/bin/../conf/parquet-logging.properties -Dyarn.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/lib/native -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hadoop -Dhadoop.id.str=chenweidong -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/lib/hive-cli-2.1.1-cdh6.0.1.jar org.apache.hadoop.hive.cli.CliDriver --hiveconf hive.query.redaction.rules=/opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/conf/redaction-rules.json --hiveconf hive.exec.query.redactor.hooks=org.cloudera.hadoop.hive.ql.hooks.QueryRedactor --hiveconf hive.aux.jars.path=file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hive/lib/hive-hbase-handler-2.1.1-cdh6.0.1.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/auxlib/hive-exec-2.1.1-cdh6.0.1-core.jar,file:///opt/cloudera/parcels/CDH-6.0.1-1.cdh6.0.1.p0.590678/bin/../lib/hive/auxlib/hive-exec-core.jar -S -v -e
|- 31756 31689 31682 31682 (initialization_) 0 0 11960320 371 /bin/bash ./initialization_data_step2.sh 20181123 20181129 dwp_order_log_process
 
[2018-12-01 14:57:17.770]Container killed on request. Exit code is 143
[2018-12-01 14:57:17.778]Container exited with a non-zero exit code 143.
For more detailed output, check the application tracking page: https://master.prodcdh.com:8090/cluster/app/application_1543392650432_0855 Then click on links to logs of each attempt.
. Failing the application.

Further reading

https://stackoverflow.com/questions/21005643/container-is-running-beyond-memory-limits

https://yq.aliyun.com/articles/25470
