YARN Architecture
The fundamental idea of YARN is to split the functions of resource management and job scheduling/monitoring into separate daemons. The idea is to have a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is either a single job or a DAG of jobs.
The ResourceManager and the NodeManager form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The NodeManager is the per-machine framework agent that is responsible for containers: it monitors their resource usage (CPU, memory, disk, network) and reports it to the ResourceManager/Scheduler.
The per-application ApplicationMaster is, in effect, a framework-specific library. It is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.
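The division of labor described above can be illustrated with a small toy model. This is only a sketch for intuition, not the YARN API: the class names mirror the daemons, but every method, field, and number here is a hypothetical stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-machine agent: tracks its containers and reports their usage."""
    host: str
    containers: dict = field(default_factory=dict)  # container id -> memory in MB

    def launch(self, container_id: str, memory_mb: int) -> None:
        self.containers[container_id] = memory_mb

    def report(self) -> dict:
        # What this node would send to the ResourceManager.
        return {"host": self.host, "used_mb": sum(self.containers.values())}


@dataclass
class ResourceManager:
    """Global authority: keeps a cluster-wide view built from node reports."""
    usage_by_host: dict = field(default_factory=dict)

    def receive(self, report: dict) -> None:
        self.usage_by_host[report["host"]] = report["used_mb"]

    def cluster_used_mb(self) -> int:
        return sum(self.usage_by_host.values())


nm1, nm2 = NodeManager("node1"), NodeManager("node2")
rm = ResourceManager()

# An ApplicationMaster would ask the NodeManagers to launch its task containers.
nm1.launch("c1", 1024)
nm1.launch("c2", 2048)
nm2.launch("c3", 512)

for nm in (nm1, nm2):
    rm.receive(nm.report())

print(rm.cluster_used_mb())  # 3584
```

The point of the sketch is that the ResourceManager never inspects containers itself; it only sees what each NodeManager reports.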

The ResourceManager has two main components: Scheduler and ApplicationsManager.
The Scheduler is responsible for allocating resources to the various running applications, subject to the familiar constraints of capacities, queues, etc. The Scheduler is a pure scheduler in the sense that it performs no monitoring or tracking of application status. It also offers no guarantees about restarting tasks that fail because of application or hardware failures. The Scheduler performs its scheduling function based on the resource requirements of the applications; it does so using the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk, network, etc.
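To make the "pure scheduler" idea concrete, here is a minimal first-fit sketch. It assumes a container request is described only by memory and virtual cores; the `Resource` class and the `schedule` function are hypothetical names for illustration and do not reflect how YARN's schedulers are actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores

    def minus(self, other: "Resource") -> "Resource":
        return Resource(self.memory_mb - other.memory_mb, self.vcores - other.vcores)


def schedule(free: dict, request: Resource):
    """Return the first node that can host the request, or None.

    `free` maps node name -> free Resource and is updated in place.
    """
    for node, available in free.items():
        if request.fits_in(available):
            free[node] = available.minus(request)
            return node
    return None


free = {"node1": Resource(2048, 2), "node2": Resource(8192, 4)}
first = schedule(free, Resource(4096, 2))   # too big for node1
second = schedule(free, Resource(1024, 1))  # fits on node1
third = schedule(free, Resource(8192, 1))   # no node has 8 GB left
print(first, second, third)  # node2 node1 None
```

Note what the function does not do: it never checks whether a container is still alive and never retries a failed one. In this model, as in the description above, that work belongs elsewhere.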
The Scheduler has a pluggable policy that is responsible for partitioning the cluster resources among the various queues, applications, etc. The current schedulers, such as the CapacityScheduler and the FairScheduler, are examples of such plug-ins.
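A pluggable policy can be pictured as a function that the scheduler core calls without knowing which one it was given. The two policies below are deliberately simplified stand-ins, written under the assumption that only memory is being divided among queues; they are not the real CapacityScheduler or FairScheduler algorithms, and all names are hypothetical.

```python
from typing import Callable, Dict

# A policy maps (total cluster memory, per-queue demand) to per-queue grants.
Policy = Callable[[int, Dict[str, int]], Dict[str, int]]


def capacity_policy(shares: Dict[str, float]) -> Policy:
    """Each queue gets a fixed fraction of the cluster, capped by its demand."""
    def policy(total_mb: int, demand: Dict[str, int]) -> Dict[str, int]:
        return {q: min(d, int(total_mb * shares[q])) for q, d in demand.items()}
    return policy


def fair_policy(total_mb: int, demand: Dict[str, int]) -> Dict[str, int]:
    """Split the cluster evenly among queues that currently have demand."""
    active = [q for q, d in demand.items() if d > 0]
    share = total_mb // len(active) if active else 0
    return {q: min(d, share) for q, d in demand.items()}


def partition(total_mb: int, demand: Dict[str, int], policy: Policy) -> Dict[str, int]:
    # The scheduler core stays the same; only the policy changes.
    return policy(total_mb, demand)


demand = {"prod": 6000, "dev": 6000}
by_capacity = partition(8000, demand, capacity_policy({"prod": 0.75, "dev": 0.25}))
by_fairness = partition(8000, demand, fair_policy)
print(by_capacity)  # {'prod': 6000, 'dev': 2000}
print(by_fairness)  # {'prod': 4000, 'dev': 4000}
```

With the same cluster and the same demand, swapping the policy changes the outcome while `partition` itself stays untouched, which is the essence of a plug-in design.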
The ApplicationsManager is responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure. The per-application ApplicationMaster is responsible for negotiating appropriate resource containers from the Scheduler, tracking their status, and monitoring their progress.
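The accept, launch, and restart cycle can be sketched as a short loop. This is an illustrative model only: the `run_application` function, the `max_attempts` parameter, and the state strings are assumptions made for the example, not names taken from YARN.

```python
from typing import Callable, List


def run_application(start_am: Callable[[int], bool], max_attempts: int = 2) -> List[str]:
    """Accept a submission, start its ApplicationMaster, restart it on failure.

    `start_am(attempt)` stands in for allocating the first container and
    running the ApplicationMaster in it; it returns True on success.
    """
    log = ["ACCEPTED"]
    for attempt in range(1, max_attempts + 1):
        log.append(f"AM_ATTEMPT_{attempt}")
        if start_am(attempt):
            log.append("FINISHED")
            return log
    log.append("FAILED")
    return log


# An ApplicationMaster that crashes on its first attempt only.
flaky = run_application(lambda attempt: attempt > 1)
# An ApplicationMaster that never comes up.
broken = run_application(lambda attempt: False)

print(flaky)   # ['ACCEPTED', 'AM_ATTEMPT_1', 'AM_ATTEMPT_2', 'FINISHED']
print(broken)  # ['ACCEPTED', 'AM_ATTEMPT_1', 'AM_ATTEMPT_2', 'FAILED']
```

Only the ApplicationMaster container is restarted here. Recovering individual tasks is left to the ApplicationMaster, in line with the split of responsibilities described above.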
MapReduce in hadoop-2.x maintains API compatibility with the previous stable release (hadoop-1.x). This means that all MapReduce jobs should still run unchanged on top of YARN with just a recompile.