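This article describes the main work I have done during my past year at Hulu. It combines two popular open-source projects, Docker and YARN, and provides a flexible programming model. It currently supports a DAG programming model and will support a long-running service programming model.

With Voidbox, developers can easily write a distributed framework, with Docker as the execution engine and YARN as the cluster resource management system.

This article is also published on Hulu's official tech blog: http://tech.hulu.com/blog/2015/08/06/voidbox-docker-on-yarn/

CSDN online: http://huiyi.csdn.net/activity/closed?project_id=2332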

1. Voidbox Motivation

YARN is the distributed resource management system in Hadoop 2.0, which is able to schedule cluster resources for diverse high-level applications such as MapReduce and Spark. However, all existing frameworks on top of YARN are designed with the assumption of a specific system environment. How to support user applications with arbitrarily complex environment dependencies is still an open question. Docker gives the answer.

Docker is a very popular container virtualization technology. It provides a way to run almost any application isolated in a container. Docker is an open platform for developing, shipping, and running applications. Docker automates the deployment of any application as a lightweight, portable, self-sufficient container that will run virtually anywhere.

In order to combine the unique advantages of Docker and YARN, the Hulu engineering team developed Voidbox. Voidbox enables any application encapsulated in a Docker image to run on a YARN cluster alongside MapReduce and Spark. Voidbox brings the following benefits:

  • Easier creation of distributed applications
    • Voidbox handles the most common issues in a distributed computation system, such as cluster discovery, elastic resource allocation, task coordination and disaster recovery. With its well-designed interface, it is easy to implement a distributed application.
  • Simplified deployment
    • Without Voidbox, we need to create and maintain a dedicated VM for each application with a complex environment, even though the VM image is huge and hard to deploy. With Voidbox, we can easily get resources allocated and run the application right when we need it. The additional maintenance work is eliminated.
  • Improved cluster efficiency
    • Because we can deploy Spark/MR and all kinds of Voidbox applications from different departments together, we can maximize cluster usage.

Thus, YARN as a big data operating platform has been further consolidated and enhanced.

Voidbox supports the execution of Docker container-based DAG (Directed Acyclic Graph) tasks. Moreover, Voidbox provides several ways to submit applications, to meet the demands of both the production environment and the debugging environment. In addition, Voidbox can cooperate with Jenkins, GitLab and a private Docker Registry to set up a complete process of development, testing and automatic release.

2. Voidbox Architecture

2.1 YARN Architecture Overview

YARN enables multiple applications to share resources dynamically in a cluster. Here is the architecture of applications running in YARN cluster:

Figure 1. YARN Architecture

As shown in figure 1, a client submits a job to Resource Manager. Resource Manager performs its scheduling function according to the resource requirements of the application. Application Master is responsible for scheduling the application’s tasks and for managing the application’s lifecycle.

Functionality of each module:

  • Resource Manager: Responsible for resource management and scheduling in cluster.
  • NodeManager: Runs on the compute nodes in the cluster, takes care of task execution on the individual machine, collects information and keeps a heartbeat with Resource Manager.
  • Application Master: Takes care of requesting resources from YARN, then allocates resources to run tasks in Containers.
  • Container: An abstract notion which incorporates elements such as memory, CPU, disk and network.
  • HDFS: Distributed file system in YARN cluster.

2.2 Voidbox Architecture Design

In the Voidbox architecture, YARN is responsible for the cluster’s resource management. Docker acts as the task execution engine on top of the operating system, cooperating with Docker Registry. Voidbox translates user code into Docker container-based DAG tasks, applies for resources according to their requirements and handles the execution of the DAG.

Figure 2. Voidbox Architecture

As shown in figure 2, each box stands for one machine with several modules running inside. To make the architecture clearer, we divide the modules into three parts. The functionality of the Voidbox modules and Docker modules is as follows:

  • Voidbox Modules:
    • Voidbox Client: The client program. Through Voidbox Client, users can submit a Voidbox application, stop it, and so on. A Voidbox application contains several Docker jobs, and a Docker job contains one or more Docker tasks.
    • Voidbox Master: An application master in YARN. It takes care of requesting resources from YARN, then allocates resources to Docker tasks.
    • Voidbox Driver: Responsible for task scheduling of a single Voidbox application. Voidbox supports Docker container-based DAG task scheduling, and user code can be inserted between tasks. Voidbox Driver therefore has to schedule the DAG tasks in dependency order and execute the user’s code.
    • Voidbox Proxy: The bridge between YARN and Docker engine, responsible for relaying commands from YARN to Docker engine, such as starting or killing a Docker container.
    • State Server: Maintains the health status of the Docker engines and provides the list of machines which can run Docker containers, so that Voidbox Master can apply for resources more efficiently.
  • Docker Modules:
    • Docker Registry: Docker image storage, acting as an internal version control tool for Docker images.
    • Docker Engine: Docker container execution engine, which obtains the specified Docker image from Docker Registry and launches the Docker container.
    • Jenkins: Cooperates with GitLab. When the application code is updated, Jenkins takes care of automated testing, packaging, generating the Docker image and uploading it to Docker Registry, completing the automatic release process of the application.

2.3 Running Mode

Voidbox provides two application running modes: yarn-cluster mode and yarn-client mode.

In yarn-cluster mode, the control component and the resource management component both run in the YARN cluster. After we submit the Voidbox application, Voidbox Client can quit at any time without affecting the running application. This mode is intended for the production environment.

In yarn-client mode, the control component is running in Voidbox Client, and other components are in the cluster. Users can see much more detailed logs about the application’s status. When Voidbox Client quits, the application in cluster will exit too. So it’s more convenient for debugging.

Here we briefly introduce the implementation architecture of the two modes:

  • yarn-cluster mode

Figure 3. yarn-cluster mode

As shown in figure 3, Voidbox Master and Voidbox Driver are both running in the cluster. Voidbox Driver is responsible for controlling the logic and Voidbox Master takes care of application resource management.

  • yarn-client mode

Figure 4. yarn-client mode

As shown in figure 4, Voidbox Master is running in the cluster, and Voidbox Driver is running in Voidbox Client. Users can submit a Voidbox application from an IDE for debugging.

2.4 Running Procedure

Here is the procedure for submitting a Voidbox application, together with its lifecycle:

  1. Users write a Voidbox application with the Voidbox SDK and generate a Java archive, then submit it to the YARN cluster through Voidbox Client.
  2. After receiving the Voidbox application, Resource Manager allocates resources for Voidbox Master, then launches it.
  3. Voidbox Master starts Voidbox Driver, which decomposes the Voidbox application into several Docker jobs (a job contains one or more Docker tasks). Voidbox Driver calls the Voidbox Master interface to launch the Docker tasks on compute nodes.
  4. Voidbox Master requests resources from Resource Manager, and Resource Manager allocates some YARN containers according to the YARN cluster status. Voidbox Master launches Voidbox Proxy in a YARN container, and Voidbox Proxy is responsible for communicating with Docker engine to start the Docker container.
  5. The user’s Docker task runs in the Docker container, and its log is written to a local file. Users can see real-time application logs through the YARN Web Portal.
  6. After all Docker tasks are done, the logs are aggregated to HDFS, so users can still get the application logs from the history server.

2.5 Docker integrating with YARN in resource management

YARN acts as a uniform resource manager in the cluster, and is responsible for resource management on all machines. Docker, as a container engine, also has resource management functions. How to integrate the two is therefore particularly important.

In YARN, a user task can only run in a YARN container, while a Docker container can only be handled by Docker engine. A Docker container would thus escape the management of YARN and break YARN’s principle of unified management and scheduling, which creates a risk of resource leaks. In order to enable YARN to manage and schedule Docker containers, we need to build a proxy layer between YARN and Docker engine. This is why Voidbox Proxy is introduced. Through Voidbox Proxy, YARN can manage the container lifecycle, including start, stop and so on.

To understand Voidbox Proxy more clearly, take stopping a Voidbox application as an example. When a user kills a Voidbox application, YARN recycles all the resources of the application. At this point, YARN sends a kill signal to the related machines. The corresponding Voidbox Proxy catches the kill signal, then stops the Docker container in Docker engine to recycle its resources. So with the help of Voidbox Proxy, not only the YARN container but also the Docker container is stopped, which avoids resource leaks (a problem that exists in the open-source version; see YARN-1964).
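The kill path described above can be sketched roughly as follows. This is only an illustration, not Voidbox’s actual code: the class name `ProxySketch`, its methods and the container naming are hypothetical, and the sketch assumes the kill arrives at the proxy process as SIGTERM and that the `docker` command-line client is available on the machine.

```python
import signal
import subprocess


class ProxySketch:
    """Hypothetical stand-in for Voidbox Proxy: ties one Docker container to this process."""

    def __init__(self, container_name, image, command, run=subprocess.call):
        self.container_name = container_name
        self.image = image
        self.command = command
        # The command runner is injectable so the sketch can be exercised without Docker.
        self.run = run

    def start_command(self):
        # `docker run` stays in the foreground until the task inside the container exits.
        return ["docker", "run", "--rm", "--name", self.container_name,
                self.image, "sh", "-c", self.command]

    def stop_command(self):
        return ["docker", "stop", self.container_name]

    def on_kill(self, signum=None, frame=None):
        # The YARN container is being killed: stop the Docker container too,
        # so that it does not outlive the resources YARN has just recycled.
        return self.run(self.stop_command())

    def launch(self):
        # Assumption: the kill signal is delivered to this process as SIGTERM.
        signal.signal(signal.SIGTERM, self.on_kill)
        return self.run(self.start_command())
```

In this sketch, `ProxySketch("task-1", "centos", "echo Hello Voidbox").launch()` would block until the container exits, and a SIGTERM received in the meantime triggers `docker stop task-1`.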

3. Fault Tolerance

Although Docker has some stable releases, enterprise production environments run a variety of operating system and kernel versions, which introduces instability. The Voidbox fault-tolerance design therefore considers multiple levels to ensure high availability.

  • Voidbox Master fault tolerance
    • If Resource Manager finds that Voidbox Master has crashed, it notifies NodeManager to recycle all the YARN containers belonging to this Voidbox application, then restarts Voidbox Master.
  • Voidbox Proxy fault tolerance
    • If Voidbox Master finds that Voidbox Proxy has crashed, it recycles the Docker containers on behalf of Voidbox Proxy.
  • Docker container fault tolerance
    • Each Voidbox application can configure the maximum number of retries on failure. When a Docker container crashes, Voidbox Master decides what to do according to the exit code of the Docker container.
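As a rough illustration of container-level fault tolerance, the sketch below retries a task until it succeeds or the configured retry limit is reached. The function name `run_with_retries`, its parameters and the rule for which exit codes are retryable are assumptions made for this example; the article does not specify how Voidbox Master interprets individual exit codes.

```python
def run_with_retries(run_task, max_retries, retryable=lambda code: code != 0):
    """Run run_task() until it returns exit code 0 or the retries are exhausted.

    run_task:    callable that runs the task once and returns its exit code.
    max_retries: number of retries allowed after the first attempt.
    retryable:   hypothetical policy deciding whether an exit code deserves a retry.
    Returns a tuple (final_exit_code, attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        code = run_task()
        # Stop on success, when the retry budget is used up,
        # or when the policy says this exit code should not be retried.
        if code == 0 or attempts > max_retries or not retryable(code):
            return code, attempts
```

With `max_retries=2`, a task is attempted at most three times. A stricter policy can be passed in, for example `retryable=lambda code: code == 137` to retry only containers that were killed.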

4. Programming model

4.1 DAG Programming model

Voidbox provides a Docker container-based DAG programming model. A sample looks similar to this:

Figure 5. Docker container-based DAG programming model

As shown in figure 5, there are four jobs in this Voidbox application, and each job can configure its requirements for CPU, memory, Docker image, parallelism and so on. Job3 starts when job1 and job2 have both completed. Job1, job2 and job3 form a stage, so users can insert their own code after this stage is done, and finally start running job4.
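The scheduling order in this example can be sketched as follows. This is not the Voidbox SDK (which is Java-based and not shown in this article); the functions `run_stage` and `run_application` and the job names are hypothetical and only illustrate how a stage can be executed in dependency order, with independent jobs running in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stage(jobs, deps, run_job):
    """Run the jobs of one stage in dependency order.

    jobs:    list of job names in this stage.
    deps:    dict mapping a job name to the jobs it depends on.
    run_job: callable invoked once per job name.
    Returns the list of "waves"; jobs in the same wave ran in parallel.
    """
    done = set()
    waves = []
    remaining = set(jobs)
    with ThreadPoolExecutor() as pool:
        while remaining:
            # A job is ready once all of its dependencies have completed.
            ready = sorted(j for j in remaining if set(deps.get(j, ())) <= done)
            if not ready:
                raise ValueError("cycle or missing dependency among: %s" % sorted(remaining))
            list(pool.map(run_job, ready))  # wait for the whole wave to finish
            done.update(ready)
            waves.append(ready)
            remaining -= set(ready)
    return waves


def run_application(run_job, user_code):
    """The four-job example: a stage of job1..job3, then user code, then job4."""
    run_stage(["job1", "job2", "job3"], {"job3": ["job1", "job2"]}, run_job)
    user_code()  # user code inserted between the stage and job4
    run_stage(["job4"], {}, run_job)
```

Here `run_job` would be whatever launches the Docker tasks of one job. For the first stage, `run_stage` returns `[["job1", "job2"], ["job3"]]`, which matches the order described above.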

4.2 Shell mode to submit one task

In most cases, we would like to run a single Docker container-based task without programming. Voidbox therefore supports a shell mode to describe and submit a Docker container-based task; it is actually an implementation based on the DAG programming model. An example of using Voidbox in shell mode:

docker-submit.sh \
  -docker_image centos \
  -shell_command "echo Hello Voidbox" \
  -container_memory 1000 \
  -cpu_shares 2

The shell script above submits a task that runs “echo Hello Voidbox” in a container created from the Docker image named ‘centos’. The resource requirement is 1000 MB of memory and 2 virtual CPU cores.

5. Voidbox in Action

At present we can run Docker, MapReduce, Spark and other applications in the YARN cluster. Many short tasks already run on Voidbox within Hulu.

  • Automated testing process
    • Voidbox cooperates with Jenkins, GitLab and a private Docker Registry. When the application code is updated, Jenkins runs the automated tests, packages the program, regenerates the Docker image and pushes it to the private Docker Registry. This forms a process of development, testing and automatic release.
  • Complex tasks in parallel
    • Test Framework is used to run tests that check the availability of some components. The project is implemented in Ruby/Java and has complex dependencies, so we maintain two layers of Docker images: the first layer is a base image with the system software, and the second layer is the business level. We publish a Test Framework Docker image and use scheduling software to start the Voidbox application regularly. Thanks to Voidbox, we solve issues such as complex dependencies and running multiple tasks in parallel.
    • Facematch(link:http://tech.hulu.com/blog/2014/05/03/face-match-system-overview/) is a video analysis application. It is implemented in C and depends on many graphics libraries. It can be optimized with Voidbox: first we package the whole Facematch program into a Docker image, then write a Voidbox application to process multiple videos. Voidbox solves the problems of the complex machine environment and of parallelism control.
  • Building complex workflows
    • Some tasks depend on each other. For example, user behaviors must be loaded before they can be analyzed, so these two steps have a sequential dependency. The Voidbox container-based DAG programming model handles this case easily.

6. Differences from DockerContainerExecutor in YARN 2.6.0

  • DockerContainerExecutor(link:https://issues.apache.org/jira/browse/YARN-1964) was released in YARN 2.6.0 as an alpha version. It is not mature enough, and it is only an encapsulation layer above the default executor.
  • It is difficult for DockerContainerExecutor to coexist with other ContainerExecutors in one YARN cluster.
  • Voidbox features
    • DAG programming model
    • Configurable container-level fault tolerance
    • A variety of running modes, covering the development environment and the production environment
    • Sharing of YARN cluster resources with other Hadoop jobs
    • Graphical log view tool

7. Future work

  • Support more versions of YARN
    • Voidbox would like to support more versions besides YARN 2.6.0 in the future.
  • Voidbox Master fault tolerance: persist metadata to reduce the cost of a retry
    • Currently, if Voidbox Master crashes, YARN recycles the resources belonging to this Voidbox application and restarts Voidbox Master, which runs the tasks from the very beginning. It should not be necessary to affect tasks which are already done or still running. We might keep some metadata in the State Server to reduce the cost of a Voidbox Master failure.
  • Voidbox Master as a permanent service
    • Voidbox will support a long-running Voidbox Master that receives streaming tasks.
  • Support long-running services
    • Voidbox will support long-running services once Voidbox Master’s downtime no longer affects running tasks.
