Flink - Scheduler
How resources are allocated for a job:
submitJob builds the ExecutionGraph,
and eventually calls
executionGraph.scheduleForExecution(scheduler)
Next, in ExecutionGraph:
public void scheduleForExecution(SlotProvider slotProvider) throws JobException {
// simply take the vertices without inputs.
for (ExecutionJobVertex ejv : this.tasks.values()) {
if (ejv.getJobVertex().isInputVertex()) {
ejv.scheduleAll(slotProvider, allowQueuedScheduling);
}
}
Then, in ExecutionJobVertex:
public void scheduleAll(SlotProvider slotProvider, boolean queued) throws NoResourceAvailableException {
ExecutionVertex[] vertices = this.taskVertices;
// kick off the tasks
for (ExecutionVertex ev : vertices) {
ev.scheduleForExecution(slotProvider, queued);
}
}
Then, in ExecutionVertex:
public boolean scheduleForExecution(SlotProvider slotProvider, boolean queued) throws NoResourceAvailableException {
return this.currentExecution.scheduleForExecution(slotProvider, queued);
}
Finally, in Execution:
public boolean scheduleForExecution(SlotProvider slotProvider, boolean queued) throws NoResourceAvailableException {
final SlotSharingGroup sharingGroup = vertex.getJobVertex().getSlotSharingGroup();
final CoLocationConstraint locationConstraint = vertex.getLocationConstraint();
if (transitionState(CREATED, SCHEDULED)) {
ScheduledUnit toSchedule = locationConstraint == null ?
new ScheduledUnit(this, sharingGroup) :
new ScheduledUnit(this, sharingGroup, locationConstraint);
// IMPORTANT: To prevent leaks of cluster resources, we need to make sure that slots are returned
// in all cases where the deployment failed. we use many try {} finally {} clauses to assure that
final Future<SimpleSlot> slotAllocationFuture = slotProvider.allocateSlot(toSchedule, queued); // request the slot asynchronously
// IMPORTANT: We have to use the synchronous handle operation (direct executor) here so
// that we directly deploy the tasks if the slot allocation future is completed. This is
// necessary for immediate deployment.
final Future<Void> deploymentFuture = slotAllocationFuture.handle(new BiFunction<SimpleSlot, Throwable, Void>() {
@Override
public Void apply(SimpleSlot simpleSlot, Throwable throwable) {
if (simpleSlot != null) {
try {
deployToSlot(simpleSlot); // a slot was obtained, so deploy the task into it
} catch (Throwable t) {
try {
simpleSlot.releaseSlot();
} finally {
markFailed(t);
}
}
}
else {
markFailed(throwable);
}
return null;
}
});
return true;
}
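The allocate-then-deploy callback above can be tried in isolation. The following is a minimal sketch that uses the JDK's CompletableFuture instead of Flink's own Future type; the class DeployOnSlotSketch, the string "slots" and the events list are hypothetical stand-ins for SimpleSlot, deployToSlot and markFailed, not Flink APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the allocate-then-deploy step; none of these names are Flink APIs.
class DeployOnSlotSketch {
    // Records what happened, in place of deployToSlot(...) and markFailed(...).
    static final List<String> events = new ArrayList<>();

    static CompletableFuture<Void> scheduleWith(CompletableFuture<String> slotFuture) {
        // handle() runs for both outcomes: a slot arrived, or the allocation failed.
        return slotFuture.handle((slot, error) -> {
            if (slot != null) {
                events.add("deployed to " + slot);
            } else {
                events.add("failed: " + error.getMessage());
            }
            return null;
        });
    }

    public static void main(String[] args) {
        // Case 1: the slot is already there, so the callback runs right away.
        scheduleWith(CompletableFuture.completedFuture("slot-0"));

        // Case 2: the slot arrives later, the callback runs on completion.
        CompletableFuture<String> pending = new CompletableFuture<>();
        scheduleWith(pending);
        pending.complete("slot-1");

        // Case 3: the allocation fails, the failure branch runs.
        CompletableFuture<String> failing = new CompletableFuture<>();
        scheduleWith(failing);
        failing.completeExceptionally(new IllegalStateException("no resource"));

        System.out.println(events);
    }
}
```

Because handle is the synchronous variant, the callback runs on whichever thread completes the future, which is what makes immediate deployment possible when the slot is already available.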
This calls slotProvider.allocateSlot; the slotProvider here is the Scheduler:
@Override
public Future<SimpleSlot> allocateSlot(ScheduledUnit task, boolean allowQueued)
throws NoResourceAvailableException {
final Object ret = scheduleTask(task, allowQueued);
if (ret instanceof SimpleSlot) {
return FlinkCompletableFuture.completed((SimpleSlot) ret); // a SimpleSlot means the allocation already succeeded, so return a completed future
}
else if (ret instanceof Future) {
return (Future) ret; // a Future means no slot was free; the request is still pending and completes asynchronously
}
else {
throw new RuntimeException();
}
}
scheduleTask
/**
* Returns either a {@link SimpleSlot}, or a {@link Future}.
*/
private Object scheduleTask(ScheduledUnit task, boolean queueIfNoResource) throws NoResourceAvailableException {
final ExecutionVertex vertex = task.getTaskToExecute().getVertex();
final Iterable<TaskManagerLocation> preferredLocations = vertex.getPreferredLocations();
// true if preferredLocations is non-empty and the vertex may only be scheduled locally
final boolean forceExternalLocation = vertex.isScheduleLocalOnly() &&
preferredLocations != null && preferredLocations.iterator().hasNext();
synchronized (globalLock) { // global scheduler lock
SlotSharingGroup sharingUnit = task.getSlotSharingGroup();
if (sharingUnit != null) { // the task belongs to a SlotSharingGroup
// 1) === If the task has a slot sharing group, schedule with shared slots ===
final SlotSharingGroupAssignment assignment = sharingUnit.getTaskAssignment();
final CoLocationConstraint constraint = task.getLocationConstraint();
// get a slot from the group, if the group has one for us (and can fulfill the constraint)
final SimpleSlot slotFromGroup;
if (constraint == null) {
slotFromGroup = assignment.getSlotForTask(vertex); // allocate the slot through the SlotSharingGroupAssignment
}
else {
slotFromGroup = assignment.getSlotForTask(vertex, constraint);
}
SimpleSlot newSlot = null;
SimpleSlot toUse = null;
// the following needs to make sure any allocated slot is released in case of an error
try {
// check whether the slot from the group is already what we want.
// any slot that is local, or where the assignment was unconstrained is good!
if (slotFromGroup != null && slotFromGroup.getLocality() != Locality.NON_LOCAL) { // the group already has a local (or unconstrained) slot
updateLocalityCounters(slotFromGroup, vertex);
return slotFromGroup; // a suitable slot was found, return it
}
// the group did not have a local slot for us. see if we can get one (or a better one)
// our location preference is either determined by the location constraint, or by the
// vertex's preferred locations
final Iterable<TaskManagerLocation> locations;
final boolean localOnly;
if (constraint != null && constraint.isAssigned()) { // a co-location constraint is already assigned
locations = Collections.singleton(constraint.getLocation());
localOnly = true;
}
else {
locations = vertex.getPreferredLocationsBasedOnInputs(); // otherwise prefer the locations of the slots assigned to the input vertices
localOnly = forceExternalLocation;
}
// the group did not have a local slot for us. see if we can get one (or a better one)
newSlot = getNewSlotForSharingGroup(vertex, locations, assignment, constraint, localOnly); // try to allocate a new slot for the sharing group
if (slotFromGroup == null || !slotFromGroup.isAlive() || newSlot.getLocality() == Locality.LOCAL) { // no usable group slot, or the new slot is local: use newSlot
// if there is no slot from the group, or the new slot is local,
// then we use the new slot
if (slotFromGroup != null) {
slotFromGroup.releaseSlot();
}
toUse = newSlot; // use the newly allocated slot
}
else {
// both are available and usable. neither is local. in that case, we may
// as well use the slot from the sharing group, to minimize the number of
// instances that the job occupies
newSlot.releaseSlot();
toUse = slotFromGroup;
}
// if this is the first slot for the co-location constraint, we lock
// the location, because we are going to use that slot
if (constraint != null && !constraint.isAssigned()) {
constraint.lockLocation();
}
updateLocalityCounters(toUse, vertex);
}
// (the catch blocks of this try are omitted in this excerpt)
return toUse; // return the allocated slot
}
else { // no slot sharing: the simpler case
// 2) === schedule without hints and sharing ===
SimpleSlot slot = getFreeSlotForTask(vertex, preferredLocations, forceExternalLocation); // allocate a slot directly
if (slot != null) {
updateLocalityCounters(slot, vertex);
return slot; // a slot was obtained, return it
}
else {
// no resource available now, so queue the request
if (queueIfNoResource) { // queuing is allowed
CompletableFuture<SimpleSlot> future = new FlinkCompletableFuture<>();
this.taskQueue.add(new QueuedTask(task, future)); // park the task in the queue and return the future, which stands for the pending allocation
return future;
}
}
}
}
}
With a SlotSharingGroup
First, try to allocate a slot from the SlotSharingGroupAssignment:
slotFromGroup = assignment.getSlotForTask(vertex); see Flink – SlotSharingGroup.
If no local slot is found, try to allocate a new slot for the vertex:
newSlot = getNewSlotForSharingGroup(vertex, locations, assignment, constraint, localOnly); // try to allocate a new slot for the sharing group
protected SimpleSlot getNewSlotForSharingGroup(ExecutionVertex vertex,
Iterable<TaskManagerLocation> requestedLocations,
SlotSharingGroupAssignment groupAssignment,
CoLocationConstraint constraint,
boolean localOnly)
{
// we need potentially to loop multiple times, because there may be false positives
// in the set-with-available-instances
while (true) {
Pair<Instance, Locality> instanceLocalityPair = findInstance(requestedLocations, localOnly); // pick an instance based on the requested locations, preferring a local one
if (instanceLocalityPair == null) { // no instance is available, return null
// nothing is available
return null;
}
final Instance instanceToUse = instanceLocalityPair.getLeft();
final Locality locality = instanceLocalityPair.getRight();
try {
JobVertexID groupID = vertex.getJobvertexId();
// allocate a shared slot from the instance
SharedSlot sharedSlot = instanceToUse.allocateSharedSlot(vertex.getJobId(), groupAssignment); // request a SharedSlot from the instance
// if the instance has further available slots, re-add it to the set of available resources.
if (instanceToUse.hasResourcesAvailable()) { // the instance still has free slots: put it back into instancesWithAvailableResources so it can be used again
this.instancesWithAvailableResources.put(instanceToUse.getTaskManagerID(), instanceToUse);
}
if (sharedSlot != null) {
// add the shared slot to the assignment group and allocate a sub-slot:
// the SharedSlot is registered in the group's SlotSharingGroupAssignment,
// and a SimpleSlot held by that SharedSlot is returned
SimpleSlot slot = constraint == null ?
groupAssignment.addSharedSlotAndAllocateSubSlot(sharedSlot, locality, groupID) :
groupAssignment.addSharedSlotAndAllocateSubSlot(sharedSlot, locality, constraint);
if (slot != null) {
return slot;
}
else {
// could not add and allocate the sub-slot, so release shared slot
sharedSlot.releaseSlot();
}
}
}
catch (InstanceDiedException e) {
// the instance died it has not yet been propagated to this scheduler
// remove the instance from the set of available instances
removeInstance(instanceToUse);
} // if we failed to get a slot, fall through the loop
}
}
findInstance
private Pair<Instance, Locality> findInstance(Iterable<TaskManagerLocation> requestedLocations, boolean localOnly) {
// drain the queue of newly available instances
while (this.newlyAvailableInstances.size() > 0) { //BlockingQueue<Instance> newlyAvailableInstances
Instance queuedInstance = this.newlyAvailableInstances.poll();
if (queuedInstance != null) {
this.instancesWithAvailableResources.put(queuedInstance.getTaskManagerID(), queuedInstance); // Map<ResourceID, Instance> instancesWithAvailableResources
}
}
// if nothing is available at all, return null
if (this.instancesWithAvailableResources.isEmpty()) {
return null;
}
Iterator<TaskManagerLocation> locations = requestedLocations == null ? null : requestedLocations.iterator();
if (locations != null && locations.hasNext()) { // there are preferred locations: look for the matching Instance first
// we have a locality preference
while (locations.hasNext()) {
TaskManagerLocation location = locations.next();
if (location != null) {
Instance instance = instancesWithAvailableResources.remove(location.getResourceID()); // the Instance at this preferred location, if it has free resources
if (instance != null) {
return new ImmutablePair<Instance, Locality>(instance, Locality.LOCAL);
}
}
}
// no local instance available
if (localOnly) { // localOnly is set and no local instance was found above, so return null
return null;
}
else {
// take the first instance from the instances with resources
Iterator<Instance> instances = instancesWithAvailableResources.values().iterator();
Instance instanceToUse = instances.next();
instances.remove();
return new ImmutablePair<>(instanceToUse, Locality.NON_LOCAL); // no local instance was found, so take the first one and mark it NON_LOCAL
}
}
else {
// no location preference, so use some instance
Iterator<Instance> instances = instancesWithAvailableResources.values().iterator();
Instance instanceToUse = instances.next();
instances.remove();
return new ImmutablePair<>(instanceToUse, Locality.UNCONSTRAINED); // no preference: also take the first instance, with locality UNCONSTRAINED
}
}
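The lookup order in findInstance (preferred location first, then any instance unless localOnly is set) can be sketched with plain collections. In this simplified model the class FindInstanceSketch, its Match holder and the string ids are hypothetical; only the three Locality values mirror the names used above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the lookup order; instances are plain strings keyed by a resource id.
class FindInstanceSketch {
    enum Locality { LOCAL, NON_LOCAL, UNCONSTRAINED }

    // Hypothetical result holder, standing in for Pair<Instance, Locality>.
    static final class Match {
        final String instance;
        final Locality locality;

        Match(String instance, Locality locality) {
            this.instance = instance;
            this.locality = locality;
        }
    }

    // Returns the chosen instance and its locality, or null if nothing fits.
    // The chosen entry is removed from the map, as in the code above.
    static Match find(Map<String, String> available, List<String> preferredIds, boolean localOnly) {
        if (available.isEmpty()) {
            return null;
        }
        boolean hasPreference = preferredIds != null && !preferredIds.isEmpty();
        if (hasPreference) {
            for (String id : preferredIds) {
                String instance = available.remove(id);
                if (instance != null) {
                    return new Match(instance, Locality.LOCAL);
                }
            }
            if (localOnly) {
                return null; // only a local instance would do, and none was found
            }
        }
        // Fall back to the first instance that still has resources.
        Iterator<String> it = available.values().iterator();
        String first = it.next();
        it.remove();
        return new Match(first, hasPreference ? Locality.NON_LOCAL : Locality.UNCONSTRAINED);
    }

    public static void main(String[] args) {
        Map<String, String> available = new LinkedHashMap<>();
        available.put("tm-1", "instance-A");
        available.put("tm-2", "instance-B");
        Match m = find(available, Arrays.asList("tm-2"), false);
        System.out.println(m.instance + " " + m.locality);
    }
}
```

Removing the chosen instance from the map matters: the caller re-adds it only if it still has free slots after the allocation, which is how the scheduler above keeps instancesWithAvailableResources accurate.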
Instance.allocateSharedSlot
public SharedSlot allocateSharedSlot(JobID jobID, SlotSharingGroupAssignment sharingGroupAssignment)
throws InstanceDiedException
{
synchronized (instanceLock) {
if (isDead) {
throw new InstanceDiedException(this);
}
Integer nextSlot = availableSlots.poll(); //Queue<Integer> availableSlots;
if (nextSlot == null) {
return null;
}
else {
SharedSlot slot = new SharedSlot(
jobID, this, location, nextSlot, taskManagerGateway, sharingGroupAssignment);
allocatedSlots.add(slot); //Set<Slot> allocatedSlots
return slot;
}
}
}
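If the newly allocated slot is local, newSlot is used. If it is not, and the SlotSharingGroup already has a non-local slot, the existing slot is used instead; there is no need to occupy a new one, so newSlot must be released.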
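The choice between the group's slot and a freshly allocated one can be written as a small decision function. This is a sketch under assumptions: SlotChoiceSketch, its Slot class and the Supplier parameter are hypothetical, the isAlive check is left out, and falling back to the group's slot when no new slot can be allocated is an assumption of this sketch rather than something shown in the excerpt.

```java
import java.util.function.Supplier;

// Hypothetical model of the slot choice inside a sharing group; not Flink code.
class SlotChoiceSketch {
    enum Locality { LOCAL, NON_LOCAL, UNCONSTRAINED }

    static final class Slot {
        final String name;
        final Locality locality;
        boolean released;

        Slot(String name, Locality locality) {
            this.name = name;
            this.locality = locality;
        }

        void release() {
            released = true;
        }
    }

    static Slot choose(Slot fromGroup, Supplier<Slot> allocateNew) {
        // A group slot that is LOCAL or UNCONSTRAINED is good enough: no new slot is requested.
        if (fromGroup != null && fromGroup.locality != Locality.NON_LOCAL) {
            return fromGroup;
        }
        Slot fresh = allocateNew.get();
        if (fresh == null) {
            // Assumption of this sketch: keep whatever the group offered (possibly null).
            return fromGroup;
        }
        if (fromGroup == null || fresh.locality == Locality.LOCAL) {
            // The new slot wins; give the group slot back if there was one.
            if (fromGroup != null) {
                fromGroup.release();
            }
            return fresh;
        }
        // Neither slot is local: keep the group slot and give the new one back.
        fresh.release();
        return fromGroup;
    }

    public static void main(String[] args) {
        Slot group = new Slot("group", Locality.NON_LOCAL);
        Slot fresh = new Slot("new", Locality.NON_LOCAL);
        Slot used = choose(group, () -> fresh);
        System.out.println("using " + used.name + ", new slot released: " + fresh.released);
    }
}
```

Whichever slot loses is released in the same step, so no slot stays allocated without a task.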
Without a SlotSharingGroup
This case simply calls
SimpleSlot slot = getFreeSlotForTask(vertex, preferredLocations, forceExternalLocation);
protected SimpleSlot getFreeSlotForTask(ExecutionVertex vertex,
Iterable<TaskManagerLocation> requestedLocations,
boolean localOnly) {
// we need potentially to loop multiple times, because there may be false positives
// in the set-with-available-instances
while (true) {
Pair<Instance, Locality> instanceLocalityPair = findInstance(requestedLocations, localOnly); // find a suitable instance
if (instanceLocalityPair == null){
return null;
}
Instance instanceToUse = instanceLocalityPair.getLeft();
Locality locality = instanceLocalityPair.getRight();
try {
SimpleSlot slot = instanceToUse.allocateSimpleSlot(vertex.getJobId()); // allocate a SimpleSlot
// if the instance has further available slots, re-add it to the set of available resources.
if (instanceToUse.hasResourcesAvailable()) {
this.instancesWithAvailableResources.put(instanceToUse.getTaskManagerID(), instanceToUse);
}
if (slot != null) {
slot.setLocality(locality);
return slot;
}
}
catch (InstanceDiedException e) {
// the instance died it has not yet been propagated to this scheduler
// remove the instance from the set of available instances
removeInstance(instanceToUse);
} // if we failed to get a slot, fall through the loop
}
}
The logic is essentially the same as for allocating a SharedSlot, except that it calls:
public SimpleSlot allocateSimpleSlot(JobID jobID) throws InstanceDiedException {
if (jobID == null) {
throw new IllegalArgumentException();
}
synchronized (instanceLock) {
if (isDead) {
throw new InstanceDiedException(this);
}
Integer nextSlot = availableSlots.poll();
if (nextSlot == null) {
return null;
}
else {
SimpleSlot slot = new SimpleSlot(jobID, this, location, nextSlot, taskManagerGateway);
allocatedSlots.add(slot);
return slot;
}
}
}
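Both allocateSharedSlot and allocateSimpleSlot follow the same bookkeeping: poll a slot number from a queue of free slots and record it as allocated, all under the instance lock. A minimal sketch of that bookkeeping follows; InstanceSlotPoolSketch and its methods are hypothetical names, slots are reduced to their numbers, and an IllegalStateException stands in for InstanceDiedException.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of one TaskManager's slots; not the Flink Instance class.
class InstanceSlotPoolSketch {
    private final Object instanceLock = new Object();
    private final Queue<Integer> availableSlots = new ArrayDeque<>();
    private final Set<Integer> allocatedSlots = new HashSet<>();
    private boolean dead;

    InstanceSlotPoolSketch(int numberOfSlots) {
        for (int i = 0; i < numberOfSlots; i++) {
            availableSlots.add(i);
        }
    }

    // Returns the next free slot number, or null when every slot is taken.
    Integer allocate() {
        synchronized (instanceLock) {
            if (dead) {
                throw new IllegalStateException("instance is dead");
            }
            Integer next = availableSlots.poll();
            if (next != null) {
                allocatedSlots.add(next);
            }
            return next;
        }
    }

    // Puts a slot back into the free queue; returns false if it was not allocated.
    boolean release(int slotNumber) {
        synchronized (instanceLock) {
            if (allocatedSlots.remove(slotNumber)) {
                availableSlots.add(slotNumber);
                return true;
            }
            return false;
        }
    }

    boolean hasResourcesAvailable() {
        synchronized (instanceLock) {
            return !dead && !availableSlots.isEmpty();
        }
    }

    void markDead() {
        synchronized (instanceLock) {
            dead = true;
        }
    }

    public static void main(String[] args) {
        InstanceSlotPoolSketch pool = new InstanceSlotPoolSketch(2);
        System.out.println(pool.allocate() + ", " + pool.allocate() + ", " + pool.allocate());
    }
}
```

Returning null instead of throwing when the pool is exhausted is what lets the scheduler loop shown earlier simply try the next instance.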
Instance
Where do the Instances in the Scheduler come from?
The Scheduler implements the InstanceListener interface:
newInstanceAvailable
@Override
public void newInstanceAvailable(Instance instance) {
// synchronize globally for instance changes
synchronized (this.globalLock) {
// check we do not already use this instance
if (!this.allInstances.add(instance)) { // the instance is already registered
throw new IllegalArgumentException("The instance is already contained.");
}
try {
// make sure we get notifications about slots becoming available
instance.setSlotAvailabilityListener(this); // register as SlotAvailabilityListener so the scheduler is notified when a slot becomes ready
// store the instance in the by-host-lookup
String instanceHostName = instance.getTaskManagerLocation().getHostname();
Set<Instance> instanceSet = allInstancesByHost.get(instanceHostName); // HashMap<String, Set<Instance>> allInstancesByHost
if (instanceSet == null) {
instanceSet = new HashSet<Instance>();
allInstancesByHost.put(instanceHostName, instanceSet);
}
instanceSet.add(instance);
// add it to the available resources and let potential waiters know
this.instancesWithAvailableResources.put(instance.getTaskManagerID(), instance); // Map<ResourceID, Instance> instancesWithAvailableResources
// add all slots as available
for (int i = 0; i < instance.getNumberOfAvailableSlots(); i++) { // fires newSlotAvailable once per slot
newSlotAvailable(instance);
}
}
catch (Throwable t) {
LOG.error("Scheduler could not add new instance " + instance, t);
removeInstance(instance);
}
}
}
When is newInstanceAvailable called?
JobManager
case msg @ RegisterTaskManager(
resourceId,
connectionInfo,
hardwareInformation,
numberOfSlots) =>
val instanceID = instanceManager.registerTaskManager(
taskManagerGateway,
connectionInfo,
hardwareInformation,
numberOfSlots)
InstanceManager
public InstanceID registerTaskManager(
TaskManagerGateway taskManagerGateway,
TaskManagerLocation taskManagerLocation,
HardwareDescription resources,
int numberOfSlots) {
synchronized (this.lock) {
InstanceID instanceID = new InstanceID();
Instance host = new Instance(
taskManagerGateway,
taskManagerLocation,
instanceID,
resources,
numberOfSlots);
// notify all listeners (for example the scheduler)
notifyNewInstance(host);
return instanceID;
}
}
private void notifyNewInstance(Instance instance) {
synchronized (this.instanceListeners) {
for (InstanceListener listener : this.instanceListeners) {
try {
listener.newInstanceAvailable(instance);
}
catch (Throwable t) {
LOG.error("Notification of new instance availability failed.", t);
}
}
}
}
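The notification loop above catches failures per listener, so one misbehaving listener cannot prevent the others (for example the scheduler) from hearing about a new instance. A minimal sketch of that pattern, with the hypothetical class InstanceNotifierSketch, java.util.function.Consumer in place of InstanceListener, and a failure counter in place of the error log:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical listener registry; Consumer<String> stands in for InstanceListener.
class InstanceNotifierSketch {
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private int failures;

    void addListener(Consumer<String> listener) {
        synchronized (listeners) {
            listeners.add(listener);
        }
    }

    // Notifies every listener; a failure in one does not stop the others.
    void notifyNewInstance(String instance) {
        synchronized (listeners) {
            for (Consumer<String> listener : listeners) {
                try {
                    listener.accept(instance);
                } catch (RuntimeException e) {
                    failures++; // the code above logs the error here instead
                }
            }
        }
    }

    int failureCount() {
        synchronized (listeners) {
            return failures;
        }
    }

    public static void main(String[] args) {
        InstanceNotifierSketch notifier = new InstanceNotifierSketch();
        notifier.addListener(i -> { throw new IllegalStateException("broken listener"); });
        notifier.addListener(i -> System.out.println("scheduler saw " + i));
        notifier.notifyNewInstance("tm-1");
        System.out.println("failures: " + notifier.failureCount());
    }
}
```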
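The Scheduler also implements SlotAvailabilityListener,
so newSlotAvailable is called on it.
The logic only checks whether a task is waiting for a slot: when a new slot becomes ready, it completes the future of the queued task.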
@Override
public void newSlotAvailable(final Instance instance) {
// WARNING: The asynchrony here is necessary, because we cannot guarantee the order
// of lock acquisition (global scheduler, instance) and otherwise lead to potential deadlocks:
//
// -> The scheduler needs to grab them (1) global scheduler lock
// (2) slot/instance lock
// -> The slot releasing grabs (1) slot/instance (for releasing) and
// (2) scheduler (to check whether to take a new task item)
//
// that leads with a high probability to deadlocks, when scheduling fast
this.newlyAvailableInstances.add(instance);
Futures.future(new Callable<Object>() {
@Override
public Object call() throws Exception {
handleNewSlot();
return null;
}
}, executionContext);
}
private void handleNewSlot() {
synchronized (globalLock) {
Instance instance = this.newlyAvailableInstances.poll();
if (instance == null || !instance.hasResourcesAvailable()) {
// someone else took it
return;
}
QueuedTask queued = taskQueue.peek(); // the task waiting at the head of the queue, if any
// the slot was properly released, we can allocate a new one from that instance
if (queued != null) {
ScheduledUnit task = queued.getTask();
ExecutionVertex vertex = task.getTaskToExecute().getVertex();
try {
SimpleSlot newSlot = instance.allocateSimpleSlot(vertex.getJobId()); // allocate a SimpleSlot from the instance
if (newSlot != null) {
// success, remove from the task queue and notify the future
taskQueue.poll();
if (queued.getFuture() != null) {
try {
queued.getFuture().complete(newSlot); // complete the task's future: a slot is available, no need to keep waiting
}
catch (Throwable t) {
LOG.error("Error calling allocation future for task " + vertex.getSimpleName(), t);
task.getTaskToExecute().fail(t);
}
}
}
}
catch (InstanceDiedException e) {
if (LOG.isDebugEnabled()) {
LOG.debug("Instance " + instance + " was marked dead asynchronously.");
}
removeInstance(instance);
}
}
else { // no task is queued: just put the instance into instancesWithAvailableResources
this.instancesWithAvailableResources.put(instance.getTaskManagerID(), instance);
}
}
}
Besides being called when a new instance registers, newSlotAvailable is also called from Instance.returnAllocatedSlot, that is, whenever an allocated slot is released.
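The interplay between queued requests and newly available slots can be reduced to two queues. The sketch below is a simplified model, not the Scheduler itself: QueuedSlotRequestSketch and its members are hypothetical, slots are plain integers, and the JDK's CompletableFuture replaces Flink's future type. Unlike the code above, which hands the work to an executor to avoid lock-order deadlocks, this sketch completes the future after leaving the lock so that callbacks never run while the lock is held.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of "queue the request, complete it when a slot shows up".
class QueuedSlotRequestSketch {
    private final Object globalLock = new Object();
    private final Queue<Integer> freeSlots = new ArrayDeque<>();
    private final Queue<CompletableFuture<Integer>> waiting = new ArrayDeque<>();

    // Returns a completed future if a slot is free, otherwise a pending one.
    CompletableFuture<Integer> allocateSlot() {
        synchronized (globalLock) {
            Integer slot = freeSlots.poll();
            if (slot != null) {
                return CompletableFuture.completedFuture(slot);
            }
            CompletableFuture<Integer> future = new CompletableFuture<>();
            waiting.add(future);
            return future;
        }
    }

    // Called when a slot is registered or released.
    void newSlotAvailable(int slot) {
        CompletableFuture<Integer> waiter;
        synchronized (globalLock) {
            waiter = waiting.poll();
            if (waiter == null) {
                freeSlots.add(slot); // nobody is waiting, keep the slot for later
                return;
            }
        }
        // Complete outside the lock so that dependent callbacks do not run while holding it.
        waiter.complete(slot);
    }

    public static void main(String[] args) {
        QueuedSlotRequestSketch scheduler = new QueuedSlotRequestSketch();
        CompletableFuture<Integer> request = scheduler.allocateSlot();
        System.out.println("done before slot arrives: " + request.isDone());
        scheduler.newSlotAvailable(7);
        System.out.println("slot received: " + request.join());
    }
}
```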