客户端编程库:
所在jar包: org.apache.hadoop.yarn.client.YarnClient 使用方法:
1 定义一个YarnClient实例:
private YarnClient client;
2 构造一个Yarn客户端句柄并初始化
this.client = YarnClient.createYarnClient();
client.ini(conf)
3 启动Yarn
yarnClient.start()
4 获取一个新的application id
YarnClientApplication app=yarnClient.createApplication(); 注解:application id 封装在YarnCLientApplication里面了。
5 构造ApplicationSubmissionContext, 用以提交作业
ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
###################注解:一下会有很多set 属性的东东“程序名称,优先级,所在的队列啊,###################################
ApplicationId appId = appContext.getApplicationId();
appContext.setApplicationName(appName)
......
yarnClient.submitApplication(appContext);//将应用程序提交到ResouceManager上 这里通过步骤 2 中的conf 读取yarn-site.xml


ApplicationMaster编程酷

设计思路:
为用户暴露‘回调函数’用户只需要实现这些回调函数,当某种事情发生时,会调用对应的(用户实现的)回调函数
回调机制:
1 定义一个回调函数
2 提供函数实现的一方在初始化的时候,将回调函数的函数指针注册给调用者
3 当特殊条件或事件发生的时候,调用者使用函数指针调用回调函数对事件进行处理
回调机制好处:
可以把调用者和被调用者分开,调用者不关心谁是调用者。它只需知道存在一个具有特定原型和限制条件的被调用函数。


/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/ package org.apache.hadoop.yarn.client.api.async; import java.io.IOException;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger; import org.apache.hadoop.classification.InterfaceAudience.Private;
import org.apache.hadoop.classification.InterfaceAudience.Public;
import org.apache.hadoop.classification.InterfaceStability.Stable;
import org.apache.hadoop.service.AbstractService;
import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl;
import org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl;
import org.apache.hadoop.yarn.exceptions.YarnException; import com.google.common.annotations.VisibleForTesting; /**
* <code>AMRMClientAsync</code> handles communication with the ResourceManager
* and provides asynchronous updates on events such as container allocations and
* completions. It contains a thread that sends periodic heartbeats to the
* ResourceManager.
*
* It should be used by implementing a CallbackHandler:
* <pre>
* {@code
* class MyCallbackHandler implements AMRMClientAsync.CallbackHandler {
* public void onContainersAllocated(List<Container> containers) {
* [run tasks on the containers]
* }
*
* public void onContainersCompleted(List<ContainerStatus> statuses) {
* [update progress, check whether app is done]
* }
*
* public void onNodesUpdated(List<NodeReport> updated) {}
*
* public void onReboot() {}
* }
* }
* </pre>
*
* The client's lifecycle should be managed similarly to the following:
*
* <pre>
* {@code
* AMRMClientAsync asyncClient =
* createAMRMClientAsync(appAttId, 1000, new MyCallbackhandler());
* asyncClient.init(conf);
* asyncClient.start();
* RegisterApplicationMasterResponse response = asyncClient
* .registerApplicationMaster(appMasterHostname, appMasterRpcPort,
* appMasterTrackingUrl);
* asyncClient.addContainerRequest(containerRequest);
* [... wait for application to complete]
* asyncClient.unregisterApplicationMaster(status, appMsg, trackingUrl);
* asyncClient.stop();
* }
* </pre>
*/
@Public
@Stable
public abstract class AMRMClientAsync<T extends ContainerRequest>
extends AbstractService { protected final AMRMClient<T> client;
protected final CallbackHandler handler;
protected final AtomicInteger heartbeatIntervalMs = new AtomicInteger(); public static <T extends ContainerRequest> AMRMClientAsync<T>
createAMRMClientAsync(int intervalMs, CallbackHandler callbackHandler) {
return new AMRMClientAsyncImpl<T>(intervalMs, callbackHandler);
} public static <T extends ContainerRequest> AMRMClientAsync<T>
createAMRMClientAsync(AMRMClient<T> client, int intervalMs,
CallbackHandler callbackHandler) {
return new AMRMClientAsyncImpl<T>(client, intervalMs, callbackHandler);
} protected AMRMClientAsync(int intervalMs, CallbackHandler callbackHandler) {
this(new AMRMClientImpl<T>(), intervalMs, callbackHandler);
} @Private
@VisibleForTesting
protected AMRMClientAsync(AMRMClient<T> client, int intervalMs,
CallbackHandler callbackHandler) {
super(AMRMClientAsync.class.getName());
this.client = client;
this.heartbeatIntervalMs.set(intervalMs);
this.handler = callbackHandler;
} public void setHeartbeatInterval(int interval) {
heartbeatIntervalMs.set(interval);
} public abstract List<? extends Collection<T>> getMatchingRequests(
Priority priority,
String resourceName,
Resource capability); /**
* Registers this application master with the resource manager. On successful
* registration, starts the heartbeating thread.
* @throws YarnException
* @throws IOException
*/
public abstract RegisterApplicationMasterResponse registerApplicationMaster(
String appHostName, int appHostPort, String appTrackingUrl)
throws YarnException, IOException; /**
* Unregister the application master. This must be called in the end.
* @param appStatus Success/Failure status of the master
* @param appMessage Diagnostics message on failure
* @param appTrackingUrl New URL to get master info
* @throws YarnException
* @throws IOException
*/
public abstract void unregisterApplicationMaster(
FinalApplicationStatus appStatus, String appMessage, String appTrackingUrl)
throws YarnException, IOException; /**
* Request containers for resources before calling <code>allocate</code>
* @param req Resource request
*/
public abstract void addContainerRequest(T req); /**
* Remove previous container request. The previous container request may have
* already been sent to the ResourceManager. So even after the remove request
* the app must be prepared to receive an allocation for the previous request
* even after the remove request
* @param req Resource request
*/
public abstract void removeContainerRequest(T req); /**
* Release containers assigned by the Resource Manager. If the app cannot use
* the container or wants to give up the container then it can release them.
* The app needs to make new requests for the released resource capability if
* it still needs it. eg. it released non-local resources
* @param containerId
*/
public abstract void releaseAssignedContainer(ContainerId containerId); /**
* Get the currently available resources in the cluster.
* A valid value is available after a call to allocate has been made
* @return Currently available resources
*/
public abstract Resource getAvailableResources(); /**
* Get the current number of nodes in the cluster.
* A valid values is available after a call to allocate has been made
* @return Current number of nodes in the cluster
*/
public abstract int getClusterNodeCount(); public interface CallbackHandler { 注视:这里有一系列的回调函数 /**
* Called when the ResourceManager responds to a heartbeat with completed
* containers. If the response contains both completed containers and
* allocated containers, this will be called before containersAllocated.
*/
public void onContainersCompleted(List<ContainerStatus> statuses); /**
* Called when the ResourceManager responds to a heartbeat with allocated
* containers. If the response containers both completed containers and
* allocated containers, this will be called after containersCompleted.
*/
public void onContainersAllocated(List<Container> containers); /**
* Called when the ResourceManager wants the ApplicationMaster to shutdown
* for being out of sync etc. The ApplicationMaster should not unregister
* with the RM unless the ApplicationMaster wants to be the last attempt.
*/
public void onShutdownRequest(); /**
* Called when nodes tracked by the ResourceManager have changed in health,
* availability etc.
*/
public void onNodesUpdated(List<NodeReport> updatedNodes); public float getProgress(); /**
* Called when error comes from RM communications as well as from errors in
* the callback itself from the app. Calling
* stop() is the recommended action.
*
* @param e
*/
public void onError(Throwable e);
}
}

用户实现一个MyCallbackHandler,实现AMRMClientAsync.CallbackHandler接口:

class MyCallbackHandler implements AMRMClientAsync.CallbackHandler{

.....................

}

ApplicationMaster编程---AM 与 RM交互
引入jar包, org.apche.hadoop.yarn.client.api.async.AMRMClientAsync;
流程:
1 构造一个MyCallbackHandler对象
AMRMClientAsync.CallbackHandler allocListener = new MyCallbackHandler();
2 构造一个AMRMClientAsync句柄
asyncClient = AMRMClientAsync.createAMRMClientAsync(1000, allcoListener);
3 初始化并启动AMRMClientAsync
asyncClient.init(conf);//通过传入一个YarnConfiguration对象并进行初始化
asyncClient.start(); //启动asyncClient 4 ApplicationMaster向ResourceManager注册
RegisterApplicationMasterResponse reponse = asyncClient.registerApplicationMaster(appMasterHostname, appMasterRpcPort, appMasterTrackingUrl);
5 添加Container请求
asyncClient.addContainerRequest(containerRequest)
6 等待应用程序运行结束
asyncClient.unregisterApplicationMaster(status, appMsg, null); [反注册]
asyncCLient.stop()
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/ package org.apache.hadoop.yarn.client.api.async; import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentMap; import org.apache.hadoop.classification.InterfaceAudience.Private;
import org.apache.hadoop.classification.InterfaceAudience.Public;
import org.apache.hadoop.classification.InterfaceStability.Stable;
import org.apache.hadoop.service.AbstractService;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.api.records.Token;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl;
import org.apache.hadoop.yarn.client.api.impl.NMClientImpl;
import org.apache.hadoop.yarn.conf.YarnConfiguration; import com.google.common.annotations.VisibleForTesting; /**
* <code>NMClientAsync</code> handles communication with all the NodeManagers
* and provides asynchronous updates on getting responses from them. It
* maintains a thread pool to communicate with individual NMs where a number of
* worker threads process requests to NMs by using {@link NMClientImpl}. The max
* size of the thread pool is configurable through
* {@link YarnConfiguration#NM_CLIENT_ASYNC_THREAD_POOL_MAX_SIZE}.
*
* It should be used in conjunction with a CallbackHandler. For example
*
* <pre>
* {@code
* class MyCallbackHandler implements NMClientAsync.CallbackHandler {
* public void onContainerStarted(ContainerId containerId,
* Map<String, ByteBuffer> allServiceResponse) {
* [post process after the container is started, process the response]
* }
*
* public void onContainerStatusReceived(ContainerId containerId,
* ContainerStatus containerStatus) {
* [make use of the status of the container]
* }
*
* public void onContainerStopped(ContainerId containerId) {
* [post process after the container is stopped]
* }
*
* public void onStartContainerError(
* ContainerId containerId, Throwable t) {
* [handle the raised exception]
* }
*
* public void onGetContainerStatusError(
* ContainerId containerId, Throwable t) {
* [handle the raised exception]
* }
*
* public void onStopContainerError(
* ContainerId containerId, Throwable t) {
* [handle the raised exception]
* }
* }
* }
* </pre>
*
* The client's life-cycle should be managed like the following:
*
* <pre>
* {@code
* NMClientAsync asyncClient =
* NMClientAsync.createNMClientAsync(new MyCallbackhandler());
* asyncClient.init(conf);
* asyncClient.start();
* asyncClient.startContainer(container, containerLaunchContext);
* [... wait for container being started]
* asyncClient.getContainerStatus(container.getId(), container.getNodeId(),
* container.getContainerToken());
* [... handle the status in the callback instance]
* asyncClient.stopContainer(container.getId(), container.getNodeId(),
* container.getContainerToken());
* [... wait for container being stopped]
* asyncClient.stop();
* }
* </pre>
*/
@Public
@Stable
public abstract class NMClientAsync extends AbstractService { protected NMClient client;
protected CallbackHandler callbackHandler; public static NMClientAsync createNMClientAsync(
CallbackHandler callbackHandler) {
return new NMClientAsyncImpl(callbackHandler);
} protected NMClientAsync(CallbackHandler callbackHandler) {
this (NMClientAsync.class.getName(), callbackHandler);
} protected NMClientAsync(String name, CallbackHandler callbackHandler) {
this (name, new NMClientImpl(), callbackHandler);
} @Private
@VisibleForTesting
protected NMClientAsync(String name, NMClient client,
CallbackHandler callbackHandler) {
super(name);
this.setClient(client);
this.setCallbackHandler(callbackHandler);
} public abstract void startContainerAsync(
Container container, ContainerLaunchContext containerLaunchContext); public abstract void stopContainerAsync(
ContainerId containerId, NodeId nodeId); public abstract void getContainerStatusAsync(
ContainerId containerId, NodeId nodeId); public NMClient getClient() {
return client;
} public void setClient(NMClient client) {
this.client = client;
} public CallbackHandler getCallbackHandler() {
return callbackHandler;
} public void setCallbackHandler(CallbackHandler callbackHandler) {
this.callbackHandler = callbackHandler;
} /**
* <p>
* The callback interface needs to be implemented by {@link NMClientAsync}
* users. The APIs are called when responses from <code>NodeManager</code> are
* available.
* </p>
*
* <p>
* Once a callback happens, the users can chose to act on it in blocking or
* non-blocking manner. If the action on callback is done in a blocking
* manner, some of the threads performing requests on NodeManagers may get
* blocked depending on how many threads in the pool are busy.
* </p>
*
* <p>
* The implementation of the callback function should not throw the
* unexpected exception. Otherwise, {@link NMClientAsync} will just
* catch, log and then ignore it.
* </p>
*/
public static interface CallbackHandler {
/**
* The API is called when <code>NodeManager</code> responds to indicate its
* acceptance of the starting container request
* @param containerId the Id of the container
* @param allServiceResponse a Map between the auxiliary service names and
* their outputs
*/
void onContainerStarted(ContainerId containerId,
Map<String, ByteBuffer> allServiceResponse); /**
* The API is called when <code>NodeManager</code> responds with the status
* of the container
* @param containerId the Id of the container
* @param containerStatus the status of the container
*/
void onContainerStatusReceived(ContainerId containerId,
ContainerStatus containerStatus); /**
* The API is called when <code>NodeManager</code> responds to indicate the
* container is stopped.
* @param containerId the Id of the container
*/
void onContainerStopped(ContainerId containerId); /**
* The API is called when an exception is raised in the process of
* starting a container
*
* @param containerId the Id of the container
* @param t the raised exception
*/
void onStartContainerError(ContainerId containerId, Throwable t); /**
* The API is called when an exception is raised in the process of
* querying the status of a container
*
* @param containerId the Id of the container
* @param t the raised exception
*/
void onGetContainerStatusError(ContainerId containerId, Throwable t); /**
* The API is called when an exception is raised in the process of
* stopping a container
*
* @param containerId the Id of the container
* @param t the raised exception
*/
void onStopContainerError(ContainerId containerId, Throwable t); } } ApplicationMaster编程酷,AM与NM交互酷
用户实现一个MyCallbackHandler,实现NMClientAsync.CallbackHandler接口 class MyCallbackHandler implements NMClientAsync.CallbackHandler{
............
} 引入jar包:
org.apache.hadoop.yarn.client.api.async.NMClientAsync;
org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl 流程:
1 构造一个NMClientAsync句柄
NMClientAsync asyncClient = new NMClientAsyncImpl(new MyCallbackhandler())
2 初始化并启动 NMClientAsync
asyncClient.init(conf);//初始化ansyClient
asyncClient.start(); //启动asyncClient
3 构造Container
ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
...//设置ctx变量
4 启动Container
asyncClient.startContainerAsync(container, ctx);
5 获取container状态
asyncClient.getContainerStatusAsync(container.getId(), container.getNodeId(), container.getContainerToken());
6 停止Container
asyncClient.stopContainerAsync(container.getId(), container.getNodeId(), container.getContainerToken());
asyncClient.stop()



hadoop Yarn 编程API的更多相关文章

  1. Hadoop MapReduce编程 API入门系列之挖掘气象数据版本3(九)

    不多说,直接上干货! 下面,是版本1. Hadoop MapReduce编程 API入门系列之挖掘气象数据版本1(一) 下面是版本2. Hadoop MapReduce编程 API入门系列之挖掘气象数 ...

  2. Hadoop MapReduce编程 API入门系列之压缩和计数器(三十)

    不多说,直接上代码. Hadoop MapReduce编程 API入门系列之小文件合并(二十九) 生成的结果,作为输入源. 代码 package zhouls.bigdata.myMapReduce. ...

  3. Hadoop MapReduce编程 API入门系列之挖掘气象数据版本2(十)

    下面,是版本1. Hadoop MapReduce编程 API入门系列之挖掘气象数据版本1(一) 这篇博文,包括了,实际生产开发非常重要的,单元测试和调试代码.这里不多赘述,直接送上代码. MRUni ...

  4. Hadoop Yarn REST API未授权漏洞利用

    Hadoop Yarn REST API未授权漏洞利用 Hadoop是一个由Apache基金会所开发的分布式系统基础架构,YARN是hadoop系统上的资源统一管理平台,其主要作用是实现集群资源的统一 ...

  5. Hadoop Yarn REST API未授权漏洞利用挖矿分析

    欢迎大家前往腾讯云+社区,获取更多腾讯海量技术实践干货哦~ 一.背景情况 5月5日腾讯云安全曾针对攻击者利用Hadoop Yarn资源管理系统REST API未授权漏洞对服务器进行攻击,攻击者可以在未 ...

  6. Hadoop MapReduce编程 API入门系列之网页排序(二十八)

    不多说,直接上代码. Map output bytes=247 Map output materialized bytes=275 Input split bytes=139 Combine inpu ...

  7. Hadoop MapReduce编程 API入门系列之倒排索引(二十四)

    不多说,直接上代码. 2016-12-12 21:54:04,509 INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JV ...

  8. Hadoop MapReduce编程 API入门系列之二次排序(十六)

    不多说,直接上代码. -- ::, INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JVM Metrics with pr ...

  9. Hadoop MapReduce编程 API入门系列之最短路径(十五)

    不多说,直接上代码. ======================================= Iteration: 1= Input path: out/shortestpath/input. ...

随机推荐

  1. [TypeScript] Installing TypeScript and Running the TypeScript Compiler (tsc)

    This lesson shows you how to install TypeScript and run the TypeScript compiler against a .ts file f ...

  2. 条带深度 队列深度 NCQ IOPS

    http://blog.csdn.net/striping/article/details/17449653 IOPS 即I/O per second,即每秒进行读写(I/O)操作的次数,多用于数据库 ...

  3. MapReduce中的map个数

    在map阶段读取数据前,FileInputFormat会将输入文件分割成split.split的个数决定了map的个数.影响map个数(split个数)的主要因素有: 1) 文件的大小.当块(dfs. ...

  4. gson使用详解

    昨天读一篇文章,看到gson这个词,一开始还以为作者写错了,问了度娘之后才发现是我才疏学浅,于是大概了解了一下gson用法,总体来说还是很简单的. Gson.jar下载 JavaBean转json / ...

  5. 基于bootstrap的datetimepicker插件

    1.当时使用的资源地址:http://www.bootcss.com/p/bootstrap-datetimepicker/ 2.如何让时间只显示到日期,不显示具体时刻 控制显示精度的是datetim ...

  6. c# 值传递 引用传递

    以前一直误以为引用类型,在作为参数传递时,都是引用传递(类似于值传递中的ref),也就是说,把引用类型的变量作为参数传递给方法,在方法中修改该参数,会改变这个变量的值, 后来通过一些事例发现,上面的认 ...

  7. 第三篇:web之前端之JavaScript基础

    前端之JavaScript基础   前端之JavaScript基础 本节内容 JS概述 JS基础语法 JS循环控制 ECMA对象 BOM对象 DOM对象 1. JS概述 1.1. javascript ...

  8. SQL Server 游标

    结果集,结果集就是select查询之后返回的所有行数据的集合. 在关系数据库中,我们对于查询的思考是面向集合的.而游标打破了这一规则,游标使得我们思考方式变为逐行进行. 正常面向集合的思维方式是: 而 ...

  9. [网络] C# NetHelper网络通信编程类教程与源码下载

    点击下载 NetHelper.zip 主要功能如下所示 检查设置的IP地址是否正确,返回正确的IP地址 检查设置的端口号是否正确,返回正确的端口号 将字符串形式的IP地址转换成IPAddress对象 ...

  10. C#常用的关键字

    常用关键字有 this 1)当前类的对象 2)调用自己的构造函数 new base virtual interface abstract override parttial sealed return ...