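Preface

The earlier article in this series on Flink's fault-tolerance mechanism gave a basic introduction to how checkpoints work. This article analyzes the checkpoint process from the source-code side. It only covers how a checkpoint is scheduled and tries to make the overall logic clear; the implementation details are not worked out here and are left for a later article. If you spot a mistake, please point it out in the comments.

This article is based on Flink 1.9.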

1. Parameter settings

1.1 Common checkpoint parameters:

 StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

 env.enableCheckpointing(10000);   // interval in ms; checkpointing is disabled by default

 env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);  // default is EXACTLY_ONCE

 env.getCheckpointConfig().setMinPauseBetweenCheckpoints(5000);  // default is 0; capped at 1 year

 env.getCheckpointConfig().setCheckpointTimeout(150000);  // default is 10 min

 env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);  // default is 1

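The defaults for these parameters can be found in CheckpointConfig.java in flink-streaming-java*.jar, which also exposes further settings. The configured values reach the runtime layer through jobGraph.setSnapshotSettings(settings), called from the private configureCheckpointing() method of StreamingJobGraphGenerator.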

1.2 Parameter analysis

This section looks at the relationship between baseInterval, which is set by enableCheckpointing(), and minPauseBetweenCheckpoints. Both are defined in the source as follows:

     /** The base checkpoint interval. Actual trigger time may be affected by the

     * max concurrent checkpoints and minimum-pause values */

     // the checkpoint trigger period; the actual trigger time is also affected by maxConcurrentCheckpointAttempts and minPauseBetweenCheckpointsNanos

     private final long baseInterval;

     /** The min time(in ns) to delay after a checkpoint could be triggered. Allows to

      * enforce minimum processing time between checkpoint attempts */

     // the minimum time that must pass between two checkpoint attempts, even when a checkpoint could otherwise be triggered

     private final long minPauseBetweenCheckpointsNanos;

When baseInterval < minPauseBetweenCheckpoints, CheckpointCoordinator.java adjusts the interval as follows:

     // it does not make sense to schedule checkpoints more often then the desired

     // time between checkpoints

     long baseInterval = chkConfig.getCheckpointInterval();

     if (baseInterval < minPauseBetweenCheckpoints) {

         baseInterval = minPauseBetweenCheckpoints;

     }

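So although checkpoints are configured to trigger periodically, the actual trigger time also depends on minPauseBetweenCheckpoints and maxConcurrentCheckpointAttempts. If maxConcurrentCheckpointAttempts is 1, a new checkpoint has to wait for the running one to finish even when its trigger time has arrived.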
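The interaction of the three settings can be sketched in plain Java. This is a simplified model, not Flink's code: the class CheckpointTimingSketch and the helper mayTrigger are hypothetical, times are in milliseconds (Flink tracks the pause in nanoseconds), and only the clamp in effectiveInterval mirrors the excerpt above.

```java
/** Simplified model of the timing rules; not the actual CheckpointCoordinator code. */
class CheckpointTimingSketch {

    /** Same clamp as the excerpt: never schedule more often than the minimum pause allows. */
    public static long effectiveInterval(long baseIntervalMs, long minPauseMs) {
        return Math.max(baseIntervalMs, minPauseMs);
    }

    /**
     * Hypothetical helper: decides whether a periodic tick at nowMs may start a checkpoint.
     * It refuses when the concurrency limit is reached or when the minimum pause since the
     * last completed checkpoint has not yet elapsed.
     */
    public static boolean mayTrigger(long nowMs, long lastCompletionMs, long minPauseMs,
                                     int pendingCheckpoints, int maxConcurrent) {
        if (pendingCheckpoints >= maxConcurrent) {
            return false;
        }
        return nowMs - lastCompletionMs >= minPauseMs;
    }

    public static void main(String[] args) {
        System.out.println(effectiveInterval(10_000, 5_000));          // 10000
        System.out.println(effectiveInterval(1_000, 5_000));           // 5000
        System.out.println(mayTrigger(25_000, 18_000, 5_000, 1, 1));   // false: one is still pending
    }
}
```

With the values from section 1.1 (interval 10000 ms, pause 5000 ms, one concurrent checkpoint), the interval stays at 10000 ms, and a tick is skipped whenever the previous checkpoint is still pending.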

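2. The checkpoint call path

After the JobGraph is submitted to the Dispatcher, the Dispatcher runs createJobManagerRunner and then startJobManagerRunner; see the createJobManagerRunner(...) method of the Dispatcher class.

2.1 The createJobManagerRunner phase

This phase creates a JobManagerRunner instance. The checkpoint-related part is that a listener is registered to watch the job status.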

     //#JobManagerRunner.java

     public JobManagerRunner(...) throws Exception {

         //..........

         // make sure we cleanly shut down out JobManager services if initialization fails

         try {

             //..........

             // load the JobGraph and libraries, set up leader election, etc.

             // now start the JobManager

             this.jobMasterService = jobMasterFactory.createJobMasterService(jobGraph, this, userCodeLoader);

         }

         catch (Throwable t) {

             //......

         }

     }

     // createJobMasterService() in DefaultJobMasterServiceFactory creates a new JobMaster object

     //#JobMaster.java

     public JobMaster(...) throws Exception {

         //........

         // the constructor mainly checks arguments and creates the slotPool, the slotPool's scheduler, and so on

         // create the scheduler

         this.schedulerNG = createScheduler(jobManagerJobMetricGroup);

         //......

     }

The key statement when creating the scheduler is:

     //#LegacyScheduler.java, in the LegacyScheduler() constructor

     // create the ExecutionGraph

     this.executionGraph = createAndRestoreExecutionGraph(jobManagerJobMetricGroup, checkNotNull(shuffleMaster), checkNotNull(partitionTracker));

 　　

     private ExecutionGraph createAndRestoreExecutionGraph(

         JobManagerJobMetricGroup currentJobManagerJobMetricGroup,

         ShuffleMaster<?> shuffleMaster,

         PartitionTracker partitionTracker) throws Exception {

         ExecutionGraph newExecutionGraph = createExecutionGraph(currentJobManagerJobMetricGroup, shuffleMaster, partitionTracker);

         final CheckpointCoordinator checkpointCoordinator = newExecutionGraph.getCheckpointCoordinator();

         if (checkpointCoordinator != null) {

             // check whether we find a valid checkpoint

             // if no state was restored from a checkpoint, try to restore from a savepoint

             //......

         }

         return newExecutionGraph;

     }

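The call chain ends in buildGraph() of ExecutionGraphBuilder, the core class for building the ExecutionGraph. That method mainly builds the ExecutionGraph and sets up checkpointing. Its core code is:

     //..............

     // the core method that builds the ExecutionGraph; it will be analyzed in detail in a later article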

     executionGraph.attachJobGraph(sortedTopology);

     //.......................

     // enableCheckpointing() sets up the CheckpointCoordinator

     executionGraph.enableCheckpointing(

         chkConfig,

         triggerVertices,

         ackVertices,

         confirmVertices,

         hooks,

         checkpointIdCounter,

         completedCheckpoints,

         rootBackend,

         checkpointStatsTracker);

enableCheckpointing() mainly creates the manager that handles checkpoint failures and sets up CheckpointCoordinator, the core class for checkpointing.

     //#ExecutionGraph.java

     public void enableCheckpointing(

             CheckpointCoordinatorConfiguration chkConfig,

             List<ExecutionJobVertex> verticesToTrigger,

             List<ExecutionJobVertex> verticesToWaitFor,

             List<ExecutionJobVertex> verticesToCommitTo,

             List<MasterTriggerRestoreHook<?>> masterHooks,

             CheckpointIDCounter checkpointIDCounter,

             CompletedCheckpointStore checkpointStore,

             StateBackend checkpointStateBackend,

             CheckpointStatsTracker statsTracker) {

         // the job must be in the CREATED state

         checkState(state == JobStatus.CREATED, "Job must be in CREATED state");

         checkState(checkpointCoordinator == null, "checkpointing already enabled");

         // collect the tasks for each checkpoint role: trigger, wait for (ack), and commit to (confirm)

         ExecutionVertex[] tasksToTrigger = collectExecutionVertices(verticesToTrigger);

         ExecutionVertex[] tasksToWaitFor = collectExecutionVertices(verticesToWaitFor);

         ExecutionVertex[] tasksToCommitTo = collectExecutionVertices(verticesToCommitTo);

         checkpointStatsTracker = checkNotNull(statsTracker, "CheckpointStatsTracker");

         // checkpoint failure manager: decides what to do next when a checkpoint fails, based on the configuration

         CheckpointFailureManager failureManager = new CheckpointFailureManager(

             chkConfig.getTolerableCheckpointFailureNumber(),

             new CheckpointFailureManager.FailJobCallback() {

                 @Override

                 public void failJob(Throwable cause) {

                     getJobMasterMainThreadExecutor().execute(() -> failGlobal(cause));

                 }

                 @Override

                 public void failJobDueToTaskFailure(Throwable cause, ExecutionAttemptID failingTask) {

                     getJobMasterMainThreadExecutor().execute(() -> failGlobalIfExecutionIsStillRunning(cause, failingTask));

                 }

             }

         );

         // create the coordinator that triggers and commits checkpoints and holds the state

         // CheckpointCoordinator is the core class for checkpointing

         checkpointCoordinator = new CheckpointCoordinator(

             jobInformation.getJobId(),

             chkConfig,

             tasksToTrigger,

             tasksToWaitFor,

             tasksToCommitTo,

             checkpointIDCounter,

             checkpointStore,

             checkpointStateBackend,

             ioExecutor,

             SharedStateRegistry.DEFAULT_FACTORY,

             failureManager);

         // register the master hooks on the checkpoint coordinator

         for (MasterTriggerRestoreHook<?> hook : masterHooks) {

             if (!checkpointCoordinator.addMasterHook(hook)) {

                 LOG.warn("Trying to register multiple checkpoint hooks with the name: {}", hook.getIdentifier());

             }

         }

         // checkpoint statistics

         checkpointCoordinator.setCheckpointStatsTracker(checkpointStatsTracker);

         // interval of max long value indicates disable periodic checkpoint,

         // the CheckpointActivatorDeactivator should be created only if the interval is not max value

         // an interval of Long.MAX_VALUE means periodic checkpointing is disabled

         if (chkConfig.getCheckpointInterval() != Long.MAX_VALUE) {

             // the periodic checkpoint scheduler is activated and deactivated as a result of

             // job status changes (running -> on, all other states -> off)

             // the checkpoint scheduler only runs while the job is in the RUNNING state

             // createActivatorDeactivator() creates a listener

             // registerJobStatusListener() adds that listener to the jobStatusListeners collection

             registerJobStatusListener(checkpointCoordinator.createActivatorDeactivator());

         }

     }

     //#CheckpointCoordinator.java

     // ------------------------------------------------------------------------

     //  job status listener that schedules / cancels periodic checkpoints

     // ------------------------------------------------------------------------

     // checkpointCoordinator.createActivatorDeactivator() creates the listener

     public JobStatusListener createActivatorDeactivator() {

         synchronized (lock) {

             if (shutdown) {

                 throw new IllegalArgumentException("Checkpoint coordinator is shut down");

             }

             if (jobStatusListener == null) {

                 jobStatusListener = new CheckpointCoordinatorDeActivator(this);

             }

             return jobStatusListener;

         }

     }

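This ends the createJobManagerRunner phase; the checkpoint configuration is now in place in the ExecutionGraph.

2.2 The startJobManagerRunner phase

In this phase, once leadership has been granted, startJobExecution is invoked. Only the classes and methods on the call path are listed here: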

     // in #JobManagerRunner.java

     //grantLeadership(...)==>verifyJobSchedulingStatusAndStartJobManager(...)

     //==>startJobMaster(...), whose core statement is

     startFuture = jobMasterService.start(new JobMasterId(leaderSessionId));

     // which in turn calls start()==>startJobExecution(...) in #JobMaster.java

startJobExecution() is a private method of the JobMaster class. Its code is:

     //----------------------------------------------------------------------------------------------

     // Internal methods

     //----------------------------------------------------------------------------------------------

     //-- job starting and stopping  -----------------------------------------------------------------

     private Acknowledge startJobExecution(JobMasterId newJobMasterId) throws Exception {

         validateRunsInMainThread();

         checkNotNull(newJobMasterId, "The new JobMasterId must not be null.");

         if (Objects.equals(getFencingToken(), newJobMasterId)) {

             log.info("Already started the job execution with JobMasterId {}.", newJobMasterId);

             return Acknowledge.get();

         }

         setNewFencingToken(newJobMasterId);

         // start the slotPool and request resources; see this method for the details of resource allocation

         startJobMasterServices();

         log.info("Starting execution of job {} ({}) under job master id {}.", jobGraph.getName(), jobGraph.getJobID(), newJobMasterId);

         // entry point for running the ExecutionGraph: checks that the job is in the CREATED state, then calls executionGraph.scheduleForExecution()

         resetAndStartScheduler();

         return Acknowledge.get();

     }

scheduleForExecution() is a method of the ExecutionGraph class that the LegacyScheduler calls. It proceeds as follows:

     public void scheduleForExecution() throws JobException {

         assertRunningInJobMasterMainThread();

         final long currentGlobalModVersion = globalModVersion;

         // before execution starts, switch the job state from CREATED to RUNNING

         // transitionState(...) calls notifyJobStatusChange(newState, error) to tell the listeners in jobStatusListeners about the state change

         if (transitionState(JobStatus.CREATED, JobStatus.RUNNING)) {

             // pick the scheduling strategy according to the job's schedule mode

             final CompletableFuture<Void> newSchedulingFuture = SchedulingUtils.schedule(

                 scheduleMode,

                 getAllExecutionVertices(),

                 this);

             //..............

         }

         else {

             throw new IllegalStateException("Job may only be scheduled from state " + JobStatus.CREATED);

         }

     }

     private void notifyJobStatusChange(JobStatus newState, Throwable error) {

         if (jobStatusListeners.size() > 0) {

             final long timestamp = System.currentTimeMillis();

             final Throwable serializedError = error == null ? null : new SerializedThrowable(error);

             for (JobStatusListener listener : jobStatusListeners) {

                 try {

                     listener.jobStatusChanges(getJobID(), newState, timestamp, serializedError);

                 } catch (Throwable t) {

                     LOG.warn("Error while notifying JobStatusListener", t);

                 }

             }

         }

     }

     //#CheckpointCoordinatorDeActivator.java

     public void jobStatusChanges(JobID jobId, JobStatus newJobStatus, long timestamp, Throwable error) {

         if (newJobStatus == JobStatus.RUNNING) {

             // start the checkpoint scheduler

             // the core method that starts checkpoint triggering

             coordinator.startCheckpointScheduler();

         } else {

             // anything else should stop the trigger for now

             coordinator.stopCheckpointScheduler();

         }

     }
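The listener wiring traced so far (register a listener while the job is CREATED, notify it on each state transition, switch the scheduler on only for RUNNING) can be modeled in a few lines of plain Java. This is a sketch for illustration only: every type below is a stand-in defined in the example, and the signatures are deliberately simpler than Flink's.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the activator/deactivator listener; all types here are stand-ins. */
class ActivatorDeactivatorSketch {

    public enum JobStatus { CREATED, RUNNING, FAILING, FINISHED }

    public interface JobStatusListener {
        void jobStatusChanges(JobStatus newStatus);
    }

    /** Stand-in for the coordinator: only records whether the periodic scheduler is on. */
    public static class Coordinator {
        public boolean schedulerRunning = false;

        public void startCheckpointScheduler() { schedulerRunning = true; }

        public void stopCheckpointScheduler() { schedulerRunning = false; }

        /** RUNNING switches the scheduler on; every other status switches it off. */
        public JobStatusListener createActivatorDeactivator() {
            return newStatus -> {
                if (newStatus == JobStatus.RUNNING) {
                    startCheckpointScheduler();
                } else {
                    stopCheckpointScheduler();
                }
            };
        }
    }

    /** Stand-in for the graph: keeps the listeners and notifies them on each transition. */
    public static class Graph {
        private final List<JobStatusListener> listeners = new ArrayList<>();
        private JobStatus state = JobStatus.CREATED;

        public void registerJobStatusListener(JobStatusListener listener) {
            listeners.add(listener);
        }

        public boolean transitionState(JobStatus expected, JobStatus next) {
            if (state != expected) {
                return false;
            }
            state = next;
            for (JobStatusListener listener : listeners) {
                listener.jobStatusChanges(next);
            }
            return true;
        }
    }

    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();
        Graph graph = new Graph();
        graph.registerJobStatusListener(coordinator.createActivatorDeactivator());
        graph.transitionState(JobStatus.CREATED, JobStatus.RUNNING);
        System.out.println("scheduler running: " + coordinator.schedulerRunning);
    }
}
```

The point of the indirection is that the coordinator never polls the job state; it reacts to transitions, so leaving RUNNING for any reason stops periodic triggering.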

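Next, a closer look at how startCheckpointScheduler() leads to a checkpoint being triggered.

startCheckpointScheduler() itself only schedules a periodic timer task; each run of that task calls triggerCheckpoint(...) in CheckpointCoordinator. With its comments, triggerCheckpoint(...) is fairly easy to follow, but it is too long to reproduce in full, so here is an outline of what it does, followed by its core code: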

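1) Check the preconditions for triggering a checkpoint. The trigger is rejected if, for example, the coordinator has been shut down, periodic checkpointing is disabled, or, when the checkpoint is not forced, the minimum pause between checkpoints has not elapsed or the limit on concurrent checkpoints would be exceeded.

2) Check that all tasks that must trigger the checkpoint and all tasks that must acknowledge it (ACK) are in the RUNNING state; otherwise an exception is thrown.

3) If all checks pass, run checkpointID = checkpointIdCounter.getAndIncrement(); to obtain a new checkpointID, then create a PendingCheckpoint. A PendingCheckpoint is a checkpoint that has been started but not yet confirmed; only once every task has acknowledged it is it turned into a CompletedCheckpoint.

4) Schedule a timer that cleans up failed checkpoints.

5) Define a timeout callback that cancels the checkpoint if it runs for too long without completing.

6) Fire the MasterHooks, through which users can define extra actions that extend checkpointing (for example preparing and cleaning up external resources).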
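The pending-to-completed transition in step 3 and the timeout in step 5 can be modeled as below. This is a minimal sketch with assumed names: PendingCheckpointSketch and its methods are hypothetical, tasks are identified by plain strings, and the real PendingCheckpoint also carries state handles, statistics and a completion future.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified model of a pending checkpoint that waits for task acknowledgements. */
class PendingCheckpointSketch {

    private final long checkpointId;
    private final Set<String> notYetAcknowledged;
    private boolean discarded = false;

    public PendingCheckpointSketch(long checkpointId, Set<String> tasksToWaitFor) {
        this.checkpointId = checkpointId;
        this.notYetAcknowledged = new HashSet<>(tasksToWaitFor);
    }

    public long getCheckpointId() {
        return checkpointId;
    }

    /** Records an acknowledgement; returns false for unknown, duplicate, or late acks. */
    public boolean acknowledge(String taskId) {
        if (discarded) {
            return false;
        }
        return notYetAcknowledged.remove(taskId);
    }

    /** True once every expected task has acknowledged; this is when it may be finalized. */
    public boolean isFullyAcknowledged() {
        return !discarded && notYetAcknowledged.isEmpty();
    }

    /** Models the timeout callback from step 5: the checkpoint is dropped. */
    public void abort() {
        discarded = true;
    }

    public static void main(String[] args) {
        AtomicLong idCounter = new AtomicLong(1);   // stands in for checkpointIdCounter
        PendingCheckpointSketch pending = new PendingCheckpointSketch(
                idCounter.getAndIncrement(), new HashSet<>(Arrays.asList("source-0", "sink-0")));
        pending.acknowledge("source-0");
        pending.acknowledge("sink-0");
        System.out.println(pending.getCheckpointId() + " complete: " + pending.isFullyAcknowledged());
    }
}
```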

The core code is:

     // send the messages to the tasks that trigger their checkpoint

     // iterate over the executions and trigger either a synchronous savepoint or a regular checkpoint

     for (Execution execution: executions) {

         if (props.isSynchronous()) {

             execution.triggerSynchronousSavepoint(checkpointID, timestamp, checkpointOptions, advanceToEndOfTime);

         } else {

             execution.triggerCheckpoint(checkpointID, timestamp, checkpointOptions);

         }

     }

Whether or not the trigger is synchronous, the call ends up in the private method triggerCheckpointHelper(...) of the Execution class:

     //Execution.java

     private void triggerCheckpointHelper(long checkpointId, long timestamp, CheckpointOptions checkpointOptions, boolean advanceToEndOfEventTime) {

         final CheckpointType checkpointType = checkpointOptions.getCheckpointType();

         if (advanceToEndOfEventTime && !(checkpointType.isSynchronous() && checkpointType.isSavepoint())) {

             throw new IllegalArgumentException("Only synchronous savepoints are allowed to advance the watermark to MAX.");

         }

         final LogicalSlot slot = assignedResource;

         if (slot != null) {

             // TaskManagerGateway is the component used to communicate with the TaskManager

             final TaskManagerGateway taskManagerGateway = slot.getTaskManagerGateway();

             taskManagerGateway.triggerCheckpoint(attemptId, getVertex().getJobId(), checkpointId, timestamp, checkpointOptions, advanceToEndOfEventTime);

         } else {

             LOG.debug("The execution has no slot assigned. This indicates that the execution is no longer running.");

         }

     }

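At this point the checkpointCoordinator has sent the checkpoint command to the TaskManager. The next section covers how the checkpoint runs in the TM.

2.3 Checkpointing in the TaskManager

When the TaskManager receives the trigger-checkpoint RPC, it starts generating checkpoint barriers. RpcTaskManagerGateway is the entry point for the message: its triggerCheckpoint(...) calls triggerCheckpoint(...) of the TaskExecutor, as follows:

     //RpcTaskManagerGateway.java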

     public void triggerCheckpoint(ExecutionAttemptID executionAttemptID, JobID jobId, long checkpointId, long timestamp, CheckpointOptions checkpointOptions, boolean advanceToEndOfEventTime) {

         taskExecutorGateway.triggerCheckpoint(

             executionAttemptID,

             checkpointId,

             timestamp,

             checkpointOptions,

             advanceToEndOfEventTime);

     }

     //TaskExecutor.java

     @Override

     public CompletableFuture<Acknowledge> triggerCheckpoint(

             ExecutionAttemptID executionAttemptID,

             long checkpointId,

             long checkpointTimestamp,

             CheckpointOptions checkpointOptions,

             boolean advanceToEndOfEventTime) {

         log.debug("Trigger checkpoint {}@{} for {}.", checkpointId, checkpointTimestamp, executionAttemptID);

         //...........

         if (task != null) {

             // core call: triggers generation of the barrier

             task.triggerCheckpointBarrier(checkpointId, checkpointTimestamp, checkpointOptions, advanceToEndOfEventTime);

             return CompletableFuture.completedFuture(Acknowledge.get());

         } else {

             final String message = "TaskManager received a checkpoint request for unknown task " + executionAttemptID + '.';

             //.........

         }

     }

triggerCheckpointBarrier(...) in the Task class creates an anonymous Runnable that performs the checkpoint and then runs that Runnable asynchronously:

     public void triggerCheckpointBarrier(

             final long checkpointID,

             final long checkpointTimestamp,

             final CheckpointOptions checkpointOptions,

             final boolean advanceToEndOfEventTime) {

         final AbstractInvokable invokable = this.invokable;

         // create a CheckpointMetaData, which only holds checkpointID and checkpointTimestamp

         final CheckpointMetaData checkpointMetaData = new CheckpointMetaData(checkpointID, checkpointTimestamp);

         if (executionState == ExecutionState.RUNNING && invokable != null) {

             //..............

             Runnable runnable = new Runnable() {

                 @Override

                 public void run() {

                     // set safety net from the task's context for checkpointing thread

                     LOG.debug("Creating FileSystem stream leak safety net for {}", Thread.currentThread().getName());

                     FileSystemSafetyNet.setSafetyNetCloseableRegistryForThread(safetyNetCloseableRegistry);

                     try {

                         // dispatches to a different implementation for SourceStreamTask and StreamTask

                         boolean success = invokable.triggerCheckpoint(checkpointMetaData, checkpointOptions, advanceToEndOfEventTime);

                         if (!success) {

                             checkpointResponder.declineCheckpoint(

                                     getJobID(), getExecutionId(), checkpointID,

                                     new CheckpointException("Task Name" + taskName, CheckpointFailureReason.CHECKPOINT_DECLINED_TASK_NOT_READY));

                         }

                     }

                     catch (Throwable t) {

                         if (getExecutionState() == ExecutionState.RUNNING) {

                             failExternally(new Exception(

                                 "Error while triggering checkpoint " + checkpointID + " for " +

                                     taskNameWithSubtask, t));

                         } else {

                             LOG.debug("Encountered error while triggering checkpoint {} for " +

                                 "{} ({}) while being not in state running.", checkpointID,

                                 taskNameWithSubtask, executionId, t);

                         }

                     } finally {

                         FileSystemSafetyNet.setSafetyNetCloseableRegistryForThread(null);

                     }

                 }

             };

             // run the Runnable asynchronously

             executeAsyncCallRunnable(

                     runnable,

                     String.format("Checkpoint Trigger for %s (%s).", taskNameWithSubtask, executionId));

         }

         else {

             LOG.debug("Declining checkpoint request for non-running task {} ({}).", taskNameWithSubtask, executionId);

             // send back a message that we did not do the checkpoint

             checkpointResponder.declineCheckpoint(jobId, executionId, checkpointID,

                     new CheckpointException("Task name with subtask : " + taskNameWithSubtask, CheckpointFailureReason.CHECKPOINT_DECLINED_TASK_NOT_READY));

         }

     }

For both SourceStreamTask and StreamTask, triggerCheckpoint ends up in triggerCheckpoint(...) of the StreamTask class, whose core statement is:

     //#StreamTask.java

     return performCheckpoint(checkpointMetaData, checkpointOptions, checkpointMetrics, advanceToEndOfEventTime);

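performCheckpoint(...) handles two cases:

1. If the task is running, the checkpoint can proceed, in three steps:

    1) Prepare for the checkpoint. Usually there is nothing to do and the checkpoint is simply accepted.

    2) Create the barrier and broadcast it downstream.

    3) Trigger the snapshot of this task's state.

2. If the task is not running, tell downstream tasks to cancel this checkpoint by sending a CancelCheckpointMarker, a message similar to a barrier.

The code is:

     //#StreamTask.java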

     private boolean performCheckpoint(

             CheckpointMetaData checkpointMetaData,

             CheckpointOptions checkpointOptions,

             CheckpointMetrics checkpointMetrics,

             boolean advanceToEndOfTime) throws Exception {

         //......

         synchronized (lock) {

             if (isRunning) {

                 if (checkpointOptions.getCheckpointType().isSynchronous()) {

                     syncSavepointLatch.setCheckpointId(checkpointId);

                     if (advanceToEndOfTime) {

                         advanceToEndOfEventTime();

                     }

                 }

                 // All of the following steps happen as an atomic step from the perspective of barriers and

                 // records/watermarks/timers/callbacks.

                 // We generally try to emit the checkpoint barrier as soon as possible to not affect downstream

                 // checkpoint alignments

                 // Step (1): Prepare the checkpoint, allow operators to do some pre-barrier work.

                 //           The pre-barrier work should be nothing or minimal in the common case.

                 operatorChain.prepareSnapshotPreBarrier(checkpointId);

                 // Step (2): Send the checkpoint barrier downstream

                 operatorChain.broadcastCheckpointBarrier(

                         checkpointId,

                         checkpointMetaData.getTimestamp(),

                         checkpointOptions);

                 // Step (3): Take the state snapshot. This should be largely asynchronous, to not

                 //           impact progress of the streaming topology

                 checkpointState(checkpointMetaData, checkpointOptions, checkpointMetrics);

                 return true;

             }

             else {

                 //.......

             }

         }

     }
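The ordering inside the lock is the important part of performCheckpoint(...): the barrier goes out before the state snapshot starts, and a task that is not running sends a cancel marker instead. The sketch below only records that ordering as strings; PerformCheckpointSketch and its event names are hypothetical and stand in for the operator chain and record writers.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the step ordering in performCheckpoint; it only records events. */
class PerformCheckpointSketch {

    public final List<String> events = new ArrayList<>();
    public volatile boolean running = true;
    private final Object lock = new Object();

    public boolean performCheckpoint(long checkpointId) {
        // All steps happen under one lock so that no record is processed between them.
        synchronized (lock) {
            if (running) {
                events.add("prepare:" + checkpointId);    // step 1: pre-barrier work
                events.add("barrier:" + checkpointId);    // step 2: broadcast barrier downstream
                events.add("snapshot:" + checkpointId);   // step 3: start the state snapshot
                return true;
            }
            // Not running: tell downstream tasks to give up on this checkpoint.
            events.add("cancel-marker:" + checkpointId);
            return false;
        }
    }

    public static void main(String[] args) {
        PerformCheckpointSketch task = new PerformCheckpointSketch();
        task.performCheckpoint(1L);
        task.running = false;
        task.performCheckpoint(2L);
        System.out.println(task.events);
    }
}
```

Emitting the barrier before the snapshot lets downstream tasks begin their own alignment while this task is still writing its state.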

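Next comes checkpointState(...).

checkpointState(...) eventually calls executeCheckpointing(), defined in StreamTask's inner class CheckpointingOperation. It creates an asynchronous AsyncCheckpointRunnable that reports the checkpoint as completed. The key code is:

     //#StreamTask.java, executeCheckpointing()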

     public void executeCheckpointing() throws Exception {

             startSyncPartNano = System.nanoTime();

             try {

                 // entry point for each StreamOperator's snapshotState; the behavior depends on the operator

                 for (StreamOperator<?> op : allOperators) {

                     checkpointStreamOperator(op);

                 }

                 //.........

                 // we are transferring ownership over snapshotInProgressList for cleanup to the thread, active on submit

                 AsyncCheckpointRunnable asyncCheckpointRunnable = new AsyncCheckpointRunnable(

                     owner,

                     operatorSnapshotsInProgress,

                     checkpointMetaData,

                     checkpointMetrics,

                     startAsyncPartNano);

                 owner.cancelables.registerCloseable(asyncCheckpointRunnable);

                 owner.asyncOperationsThreadPool.execute(asyncCheckpointRunnable);

                 //.........

             } catch (Exception ex) {

                 //.......

             }

         }

The run() method of AsyncCheckpointRunnable calls reportCompletedSnapshotStates(...) in StreamTask (for a stateless job the reported state is null), which in turn calls reportTaskStateSnapshots(...) of TaskStateManagerImpl to report the TM's checkpoint to the JM. The key code is:

     //TaskStateManagerImpl.java

     checkpointResponder.acknowledgeCheckpoint(

             jobId,

             executionAttemptID,

             checkpointId,

             checkpointMetrics,

             acknowledgedState);

It does this by making an RPC call to the corresponding method on the JobManager.
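The split between a synchronous part (each operator starts its snapshot) and an asynchronous part (the snapshots are finished on a thread pool and then acknowledged) can be sketched with java.util.concurrent. Everything here is an assumption made for illustration: the Responder interface, the string "state" and the runOnce helper are hypothetical, and the acknowledgement is a local callback rather than an RPC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Simplified model of the sync/async split when taking a task snapshot. */
class AsyncSnapshotSketch {

    /** Hypothetical callback standing in for the checkpoint responder. */
    public interface Responder {
        void acknowledge(long checkpointId, List<String> state);
    }

    public static Future<?> executeCheckpointing(long checkpointId, List<String> operators,
                                                 ExecutorService asyncPool, Responder responder) {
        // Synchronous part: every operator hands back a deferred snapshot.
        List<Callable<String>> snapshotsInProgress = new ArrayList<>();
        for (String operator : operators) {
            snapshotsInProgress.add(() -> operator + "@" + checkpointId);
        }
        // Asynchronous part: finish the snapshots off the main thread, then acknowledge.
        return asyncPool.submit(() -> {
            List<String> finished = new ArrayList<>();
            for (Callable<String> snapshot : snapshotsInProgress) {
                finished.add(snapshot.call());
            }
            responder.acknowledge(checkpointId, finished);
            return null;
        });
    }

    /** Convenience wrapper: runs one checkpoint and returns the acknowledged state. */
    public static List<String> runOnce(long checkpointId, List<String> operators) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        List<String> acknowledged = new ArrayList<>();
        try {
            executeCheckpointing(checkpointId, operators, pool,
                    (id, state) -> acknowledged.addAll(state)).get();
            return acknowledged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("async snapshot failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce(7L, Arrays.asList("source", "map")));
    }
}
```

The processing thread returns as soon as the deferred snapshots are collected; only the pool thread waits for them, which keeps the synchronous pause short.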

2.4 Handling the checkpoint in the JobManager

acknowledgeCheckpoint(...) of the RpcCheckpointResponder class delivers the checkpoint acknowledgement. The call path after it, and the core method involved, are:

     //#JobMaster: acknowledgeCheckpoint==>

     //#LegacyScheduler: acknowledgeCheckpoint==>

     //#CheckpointCoordinator: receiveAcknowledgeMessage(...)==>

     //completePendingCheckpoint(checkpoint);

     //<p>Important: This method should only be called in the checkpoint lock scope

     private void completePendingCheckpoint(PendingCheckpoint pendingCheckpoint) throws CheckpointException {

         final long checkpointId = pendingCheckpoint.getCheckpointId();

         final CompletedCheckpoint completedCheckpoint;

         // As a first step to complete the checkpoint, we register its state with the registry

         Map<OperatorID, OperatorState> operatorStates = pendingCheckpoint.getOperatorStates();

         sharedStateRegistry.registerAll(operatorStates.values());

         try {

             try {

                 // finalize the checkpoint

                 completedCheckpoint = pendingCheckpoint.finalizeCheckpoint();

                 failureManager.handleCheckpointSuccess(pendingCheckpoint.getCheckpointId());

             }

             catch (Exception e1) {

                 // abort the current pending checkpoint if we fails to finalize the pending checkpoint.

                 if (!pendingCheckpoint.isDiscarded()) {

                     failPendingCheckpoint(pendingCheckpoint, CheckpointFailureReason.FINALIZE_CHECKPOINT_FAILURE, e1);

                 }

                 throw new CheckpointException("Could not finalize the pending checkpoint " + checkpointId + '.',

                     CheckpointFailureReason.FINALIZE_CHECKPOINT_FAILURE, e1);

             }

             // the pending checkpoint must be discarded after the finalization

             Preconditions.checkState(pendingCheckpoint.isDiscarded() && completedCheckpoint != null);

             try {

                 // add the new checkpoint and, if needed (completedCheckpoints.size() > maxNumberOfCheckpointsToRetain), remove the oldest ones

                 completedCheckpointStore.addCheckpoint(completedCheckpoint);

             } catch (Exception exception) {

                 // we failed to store the completed checkpoint. Let's clean up

                 executor.execute(new Runnable() {

                     @Override

                     public void run() {

                         try {

                             completedCheckpoint.discardOnFailedStoring();

                         } catch (Throwable t) {

                             LOG.warn("Could not properly discard completed checkpoint {}.", completedCheckpoint.getCheckpointID(), t);

                         }

                     }

                 });

                 throw new CheckpointException("Could not complete the pending checkpoint " + checkpointId + '.',

                     CheckpointFailureReason.FINALIZE_CHECKPOINT_FAILURE, exception);

             }

         } finally {

             pendingCheckpoints.remove(checkpointId);

             triggerQueuedRequests();

         }

         rememberRecentCheckpointId(checkpointId);

         // drop those pending checkpoints that are at prior to the completed one

         // drop the still-pending checkpoints that were started before this one

         dropSubsumedCheckpoints(checkpointId);

         // record the time when this was completed, to calculate

         // the 'min delay between checkpoints'

         lastCheckpointCompletionNanos = System.nanoTime();

         LOG.info("Completed checkpoint {} for job {} ({} bytes in {} ms).", checkpointId, job,

             completedCheckpoint.getStateSize(), completedCheckpoint.getDuration());

         if (LOG.isDebugEnabled()) {

             StringBuilder builder = new StringBuilder();

             builder.append("Checkpoint state: ");

             for (OperatorState state : completedCheckpoint.getOperatorStates().values()) {

                 builder.append(state);

                 builder.append(", ");

             }

             // Remove last two chars ", "

             builder.setLength(builder.length() - 2);

             LOG.debug(builder.toString());

         }

         // send the "notify complete" call to all vertices

         final long timestamp = completedCheckpoint.getTimestamp();

         // notify all operators (in the TMs) that this checkpoint has completed

         for (ExecutionVertex ev : tasksToCommitTo) {

             Execution ee = ev.getCurrentExecutionAttempt();

             if (ee != null) {

                 ee.notifyCheckpointComplete(checkpointId, timestamp);

             }

         }

     }
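Two bookkeeping steps in completePendingCheckpoint(...) are easy to model in isolation: the completed-checkpoint store keeps only the newest N entries, and pending checkpoints older than the one that just completed are dropped. The sketch below uses assumed names (CompletedStoreSketch, dropSubsumed) and plain ids in place of checkpoint objects; it also ignores the fact that some pending checkpoints, such as savepoints, are not subsumed in Flink.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Simplified model of checkpoint retention and subsumption, using ids only. */
class CompletedStoreSketch {

    private final int maxRetained;
    private final Deque<Long> completed = new ArrayDeque<>();

    public CompletedStoreSketch(int maxRetained) {
        if (maxRetained < 1) {
            throw new IllegalArgumentException("must retain at least one checkpoint");
        }
        this.maxRetained = maxRetained;
    }

    /** Adds the newest checkpoint and returns the ids evicted to stay within the limit. */
    public List<Long> addCheckpoint(long checkpointId) {
        completed.addLast(checkpointId);
        List<Long> evicted = new ArrayList<>();
        while (completed.size() > maxRetained) {
            evicted.add(completed.removeFirst());
        }
        return evicted;
    }

    /** Removes every pending checkpoint with an id below completedId; returns those ids. */
    public static List<Long> dropSubsumed(TreeMap<Long, String> pending, long completedId) {
        SortedMap<Long, String> older = pending.headMap(completedId);   // strictly smaller keys
        List<Long> dropped = new ArrayList<>(older.keySet());
        older.clear();                                                   // clears the backing map too
        return dropped;
    }

    public static void main(String[] args) {
        CompletedStoreSketch store = new CompletedStoreSketch(2);
        store.addCheckpoint(1L);
        store.addCheckpoint(2L);
        System.out.println("evicted: " + store.addCheckpoint(3L));
    }
}
```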

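This completes the walkthrough of the overall checkpoint flow. It is best read together with the underlying principles. The three references below are all well written and worth reading if you have time.

Ref:

[1] https://www.jianshu.com/p/a40a1b92f6a2

[2] https://www.cnblogs.com/bethunebtj/p/9168274.html

[3] https://blog.csdn.net/qq475781638/article/details/92698301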
