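Flink Source Code Reading (2): Checkpoint Source Code Analysis
Preface
The article "Flink Principles: Fault Tolerance" gave a basic introduction to the checkpoint mechanism. This article analyzes the checkpoint process from the source code. It covers only how a checkpoint is scheduled and aims to make the overall logic clear; the implementation details are not worked through here and are left for a later article. If you spot a mistake, please leave a comment.
This article is based on Flink 1.9.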
1. Parameter settings
1.1 Common checkpoint parameters:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(10000); // checkpointing is disabled by default
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE); // default is EXACTLY_ONCE
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(5000); // default is 0; the upper bound is 1 year
env.getCheckpointConfig().setCheckpointTimeout(150000); // default is 10 min
env.getCheckpointConfig().setMaxConcurrentCheckpoints(1); // default is 1
The default values of these parameters can be found in CheckpointConfig.java in flink-streaming-java*.jar. The configured values are passed to the runtime layer by jobGraph.setSnapshotSettings(settings), which is called in the private configureCheckpointing() method of StreamingJobGraphGenerator. See these classes for more settings.
1.2 Parameter analysis
This section focuses on the relationship between baseInterval, which is set by enableCheckpointing(), and minPauseBetweenCheckpoints. The definitions in the source code are:
/** The base checkpoint interval. Actual trigger time may be affected by the
* max concurrent checkpoints and minimum-pause values */
// Checkpoint trigger period; the actual trigger time also depends on
// maxConcurrentCheckpointAttempts and minPauseBetweenCheckpointsNanos
private final long baseInterval;
/** The min time(in ns) to delay after a checkpoint could be triggered. Allows to
* enforce minimum processing time between checkpoint attempts */
// Minimum pause between two checkpoints, enforced even when a checkpoint could otherwise be triggered
private final long minPauseBetweenCheckpointsNanos;
When baseInterval < minPauseBetweenCheckpoints, CheckpointCoordinator.java handles it as follows:
// it does not make sense to schedule checkpoints more often then the desired
// time between checkpoints
long baseInterval = chkConfig.getCheckpointInterval();
if (baseInterval < minPauseBetweenCheckpoints) {
baseInterval = minPauseBetweenCheckpoints;
}
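So although checkpoints are configured to trigger periodically, the actual trigger time also depends on minPauseBetweenCheckpoints and maxConcurrentCheckpointAttempts. If maxConcurrentCheckpointAttempts is 1, a checkpoint whose trigger time has arrived still has to wait until the running checkpoint finishes.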
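The interplay of the two values can be sketched in a few lines. This is a simplified model, not Flink code: the class EffectiveIntervalSketch and both of its methods are hypothetical, and the second method assumes at most one concurrent checkpoint.

```java
/** Hypothetical helper, not part of Flink: mirrors the interval clamping shown above. */
class EffectiveIntervalSketch {

    /** Returns the interval actually used for periodic scheduling, in milliseconds. */
    static long effectiveInterval(long baseIntervalMs, long minPauseMs) {
        // Scheduling more often than the minimum pause makes no sense,
        // so the larger of the two values wins.
        return Math.max(baseIntervalMs, minPauseMs);
    }

    /**
     * Earliest time the next checkpoint may start, assuming only one checkpoint
     * may run at a time (a simplification made for this sketch).
     */
    static long earliestNextTrigger(long lastTriggerMs, long lastCompletionMs,
                                    long baseIntervalMs, long minPauseMs) {
        long periodic = lastTriggerMs + effectiveInterval(baseIntervalMs, minPauseMs);
        long afterPause = lastCompletionMs + minPauseMs;
        return Math.max(periodic, afterPause);
    }

    public static void main(String[] args) {
        System.out.println(effectiveInterval(10_000, 5_000));
        System.out.println(effectiveInterval(3_000, 5_000));
        System.out.println(earliestNextTrigger(0, 8_000, 10_000, 5_000));
    }
}
```

With the settings from section 1.1 (interval 10000 ms, minimum pause 5000 ms), a checkpoint that takes 8 seconds pushes the next trigger past the nominal 10 second mark, because the pause is counted from completion.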
2. Checkpoint call flow
After the JobGraph is submitted to the Dispatcher, the Dispatcher calls createJobManagerRunner and startJobManagerRunner; see the createJobManagerRunner(...) method of the Dispatcher class.
2.1 The createJobManagerRunner phase
This phase creates a JobManagerRunner instance. The part relevant to checkpointing is that a listener is registered to watch the job status.
//#JobManagerRunner.java
public JobManagerRunner(...) throws Exception {
//..........
// make sure we cleanly shut down out JobManager services if initialization fails
try {
//..........
// load the JobGraph and libraries, set up leader election, etc.
// now start the JobManager
this.jobMasterService = jobMasterFactory.createJobMasterService(jobGraph, this, userCodeLoader);
}
catch (Throwable t) {
//......
}
}
// createJobMasterService() in DefaultJobMasterServiceFactory creates a new JobMaster object
//#JobMaster.java
public JobMaster(...) throws Exception {
//........
// this constructor mainly validates arguments, creates the slotPool and the slotPool's scheduler, and so on
// create a scheduler
this.schedulerNG = createScheduler(jobManagerJobMetricGroup);
//......
}
The core statement when creating the scheduler is:
//#LegacyScheduler.java, in the LegacyScheduler() constructor
// create the ExecutionGraph
this.executionGraph = createAndRestoreExecutionGraph(jobManagerJobMetricGroup, checkNotNull(shuffleMaster), checkNotNull(partitionTracker));
private ExecutionGraph createAndRestoreExecutionGraph(
JobManagerJobMetricGroup currentJobManagerJobMetricGroup,
ShuffleMaster<?> shuffleMaster,
PartitionTracker partitionTracker) throws Exception {
ExecutionGraph newExecutionGraph = createExecutionGraph(currentJobManagerJobMetricGroup, shuffleMaster, partitionTracker);
final CheckpointCoordinator checkpointCoordinator = newExecutionGraph.getCheckpointCoordinator();
if (checkpointCoordinator != null) {
// check whether we find a valid checkpoint
// if no state was restored from a checkpoint, check whether it can be restored from a savepoint
//......
}
return newExecutionGraph;
}
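The call chain then reaches buildGraph() in ExecutionGraphBuilder, the core class for building the ExecutionGraph. This method mainly builds the ExecutionGraph and sets up checkpointing. Its core code is:
//..............
// the core method that builds the ExecutionGraph; it will be analyzed in detail in a later article
executionGraph.attachJobGraph(sortedTopology);
//.......................
// enableCheckpointing() sets up the CheckpointCoordinator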
executionGraph.enableCheckpointing(
chkConfig,
triggerVertices,
ackVertices,
confirmVertices,
hooks,
checkpointIdCounter,
completedCheckpoints,
rootBackend,
checkpointStatsTracker);
enableCheckpointing() mainly creates the manager that handles checkpoint failures and sets up CheckpointCoordinator, the core class for checkpointing.
//#ExecutionGraph.java
public void enableCheckpointing(
CheckpointCoordinatorConfiguration chkConfig,
List<ExecutionJobVertex> verticesToTrigger,
List<ExecutionJobVertex> verticesToWaitFor,
List<ExecutionJobVertex> verticesToCommitTo,
List<MasterTriggerRestoreHook<?>> masterHooks,
CheckpointIDCounter checkpointIDCounter,
CompletedCheckpointStore checkpointStore,
StateBackend checkpointStateBackend,
CheckpointStatsTracker statsTracker) {
// the job must be in the CREATED state
checkState(state == JobStatus.CREATED, "Job must be in CREATED state");
checkState(checkpointCoordinator == null, "checkpointing already enabled");
// the tasks to trigger, the tasks to wait for acknowledgements from, and the tasks to notify on completion
ExecutionVertex[] tasksToTrigger = collectExecutionVertices(verticesToTrigger);
ExecutionVertex[] tasksToWaitFor = collectExecutionVertices(verticesToWaitFor);
ExecutionVertex[] tasksToCommitTo = collectExecutionVertices(verticesToCommitTo);
checkpointStatsTracker = checkNotNull(statsTracker, "CheckpointStatsTracker");
// checkpoint failure manager: when a checkpoint fails, it decides the next step based on the configuration
CheckpointFailureManager failureManager = new CheckpointFailureManager(
chkConfig.getTolerableCheckpointFailureNumber(),
new CheckpointFailureManager.FailJobCallback() {
@Override
public void failJob(Throwable cause) {
getJobMasterMainThreadExecutor().execute(() -> failGlobal(cause));
}
@Override
public void failJobDueToTaskFailure(Throwable cause, ExecutionAttemptID failingTask) {
getJobMasterMainThreadExecutor().execute(() -> failGlobalIfExecutionIsStillRunning(cause, failingTask));
}
}
);
// create the coordinator that triggers and commits checkpoints and holds the state
// CheckpointCoordinator is the core class for checkpointing
checkpointCoordinator = new CheckpointCoordinator(
jobInformation.getJobId(),
chkConfig,
tasksToTrigger,
tasksToWaitFor,
tasksToCommitTo,
checkpointIDCounter,
checkpointStore,
checkpointStateBackend,
ioExecutor,
SharedStateRegistry.DEFAULT_FACTORY,
failureManager);
// register the master hooks on the checkpoint coordinator
for (MasterTriggerRestoreHook<?> hook : masterHooks) {
if (!checkpointCoordinator.addMasterHook(hook)) {
LOG.warn("Trying to register multiple checkpoint hooks with the name: {}", hook.getIdentifier());
}
}
// checkpoint statistics
checkpointCoordinator.setCheckpointStatsTracker(checkpointStatsTracker);
// interval of max long value indicates disable periodic checkpoint,
// the CheckpointActivatorDeactivator should be created only if the interval is not max value
if (chkConfig.getCheckpointInterval() != Long.MAX_VALUE) {
// the periodic checkpoint scheduler is activated and deactivated as a result of
// job status changes (running -> on, all other states -> off)
// the checkpoint scheduler is started only while the job is in the RUNNING state
// createActivatorDeactivator() creates a listener
// registerJobStatusListener() adds the listener to the jobStatusListeners collection
registerJobStatusListener(checkpointCoordinator.createActivatorDeactivator());
}
}
//#CheckpointCoordinator.java
// ------------------------------------------------------------------------
// job status listener that schedules / cancels periodic checkpoints
// ------------------------------------------------------------------------
// checkpointCoordinator.createActivatorDeactivator() creates the listener
public JobStatusListener createActivatorDeactivator() {
synchronized (lock) {
if (shutdown) {
throw new IllegalArgumentException("Checkpoint coordinator is shut down");
}
if (jobStatusListener == null) {
jobStatusListener = new CheckpointCoordinatorDeActivator(this);
}
return jobStatusListener;
}
}
This ends the createJobManagerRunner phase; the checkpoint configuration in the ExecutionGraph is now in place.
2.2 The startJobManagerRunner phase
In this phase, once leadership is granted, startJobExecution is invoked. Only the classes and methods on the call path are listed here:
//#JobManagerRunner.java
//grantLeadership(...)==>verifyJobSchedulingStatusAndStartJobManager(...)
//==>startJobMaster(...), whose core code is
startFuture = jobMasterService.start(new JobMasterId(leaderSessionId));
// which in turn calls start()==>startJobExecution(...) in JobMaster.java
startJobExecution() is a private method of the JobMaster class. Its code is analyzed below:
//----------------------------------------------------------------------------------------------
// Internal methods
//----------------------------------------------------------------------------------------------
//-- job starting and stopping -----------------------------------------------------------------
private Acknowledge startJobExecution(JobMasterId newJobMasterId) throws Exception {
validateRunsInMainThread();
checkNotNull(newJobMasterId, "The new JobMasterId must not be null.");
if (Objects.equals(getFencingToken(), newJobMasterId)) {
log.info("Already started the job execution with JobMasterId {}.", newJobMasterId);
return Acknowledge.get();
}
setNewFencingToken(newJobMasterId);
// start the slotPool and request resources; see this method for the details of resource requests
startJobMasterServices();
log.info("Starting execution of job {} ({}) under job master id {}.", jobGraph.getName(), jobGraph.getJobID(), newJobMasterId);
// entry point for executing the ExecutionGraph: checks that the job is in the CREATED state, then calls executionGraph.scheduleForExecution()
resetAndStartScheduler();
return Acknowledge.get();
}
The scheduleForExecution() method of ExecutionGraph, which LegacyScheduler invokes, schedules the job as follows:
public void scheduleForExecution() throws JobException {
assertRunningInJobMasterMainThread();
final long currentGlobalModVersion = globalModVersion;
// before execution the job state switches from CREATED to RUNNING;
// transitionState(...) calls notifyJobStatusChange(newState, error) to notify the listeners in jobStatusListeners of the change
if (transitionState(JobStatus.CREATED, JobStatus.RUNNING)) {
// choose a scheduling strategy according to the schedule mode
final CompletableFuture<Void> newSchedulingFuture = SchedulingUtils.schedule(
scheduleMode,
getAllExecutionVertices(),
this);
//..............
}
else {
throw new IllegalStateException("Job may only be scheduled from state " + JobStatus.CREATED);
}
}
private void notifyJobStatusChange(JobStatus newState, Throwable error) {
if (jobStatusListeners.size() > 0) {
final long timestamp = System.currentTimeMillis();
final Throwable serializedError = error == null ? null : new SerializedThrowable(error);
for (JobStatusListener listener : jobStatusListeners) {
try {
listener.jobStatusChanges(getJobID(), newState, timestamp, serializedError);
} catch (Throwable t) {
LOG.warn("Error while notifying JobStatusListener", t);
}
}
}
}
//#CheckpointCoordinatorDeActivator.java
public void jobStatusChanges(JobID jobId, JobStatus newJobStatus, long timestamp, Throwable error) {
if (newJobStatus == JobStatus.RUNNING) {
// start the checkpoint scheduler
// the core method that starts checkpoint triggering
coordinator.startCheckpointScheduler();
} else {
// anything else should stop the trigger for now
coordinator.stopCheckpointScheduler();
}
}
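The listener mechanism above can be reduced to a small self-contained sketch. All types here (ActivatorSketch, Status, StatusListener, Coordinator, Job) are simplified stand-ins invented for illustration and are not Flink classes; the sketch only shows that the scheduler is switched on when the job enters RUNNING and off on any other state.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration; none of these types are Flink classes.
class ActivatorSketch {

    enum Status { CREATED, RUNNING, FAILING }

    interface StatusListener {
        void statusChanged(Status newStatus);
    }

    /** Stand-in for the coordinator: only records whether the scheduler is on. */
    static class Coordinator {
        boolean schedulerRunning;
        void startScheduler() { schedulerRunning = true; }
        void stopScheduler() { schedulerRunning = false; }
    }

    /** Stand-in for the execution graph: holds the status and its listeners. */
    static class Job {
        private final List<StatusListener> listeners = new ArrayList<>();
        private Status status = Status.CREATED;

        void register(StatusListener listener) { listeners.add(listener); }

        /** Switches state only if the current state matches, then notifies all listeners. */
        boolean transition(Status expected, Status next) {
            if (status != expected) {
                return false;
            }
            status = next;
            for (StatusListener listener : listeners) {
                listener.statusChanged(next);
            }
            return true;
        }
    }

    /** Returns the scheduler flag before RUNNING, while RUNNING, and after leaving RUNNING. */
    static boolean[] demo() {
        Coordinator coordinator = new Coordinator();
        Job job = new Job();
        job.register(s -> {
            if (s == Status.RUNNING) {
                coordinator.startScheduler();
            } else {
                coordinator.stopScheduler();
            }
        });
        boolean before = coordinator.schedulerRunning;
        job.transition(Status.CREATED, Status.RUNNING);
        boolean whileRunning = coordinator.schedulerRunning;
        job.transition(Status.RUNNING, Status.FAILING);
        boolean after = coordinator.schedulerRunning;
        return new boolean[] {before, whileRunning, after};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```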
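Next, the core method that triggers checkpoints, startCheckpointScheduler(), is analyzed in detail.
startCheckpointScheduler() schedules a periodic timer task (ScheduledTrigger) that calls triggerCheckpoint(...) of CheckpointCoordinator. With its comments the code is fairly easy to follow, but it is too long to reproduce in full, so here is an outline of what it does, followed by the core code:
1) Check the preconditions for triggering a checkpoint: for example, the coordinator has been shut down, periodic checkpointing is disabled, or, when the checkpoint is not forced, the minimum pause between checkpoints has not elapsed or the maximum number of concurrent checkpoints is exceeded.
2) Check that all tasks that must be triggered and all tasks that must acknowledge the checkpoint are running; otherwise throw an exception.
3) If all checks pass, execute checkpointID = checkpointIdCounter.getAndIncrement(); to generate a new checkpointID, then create a PendingCheckpoint. A PendingCheckpoint is a checkpoint that has been started but not yet confirmed; only after every task has acknowledged it is it converted into a CompletedCheckpoint.
4) Schedule a timer to clean up failed checkpoints.
5) Define a timeout callback that cancels the checkpoint if it runs for a long time without completing.
6) Fire the MasterHooks, through which users can define extra operations that extend checkpointing (such as preparing and cleaning up external resources).
The core code is: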
// send the messages to the tasks that trigger their checkpoint
// iterate over the executions and trigger either a synchronous savepoint or a regular checkpoint
for (Execution execution: executions) {
if (props.isSynchronous()) {
execution.triggerSynchronousSavepoint(checkpointID, timestamp, checkpointOptions, advanceToEndOfTime);
} else {
execution.triggerCheckpoint(checkpointID, timestamp, checkpointOptions);
}
}
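Step 3 of the outline, where a pending checkpoint becomes complete only after every expected task has acknowledged it, can be illustrated with a minimal sketch. PendingCheckpointSketch is a hypothetical class written for this article; it is not Flink's PendingCheckpoint, and task identifiers are plain strings here as an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified model of a pending checkpoint; not the Flink class. */
class PendingCheckpointSketch {
    private final long checkpointId;
    private final Set<String> notYetAcknowledged;

    PendingCheckpointSketch(long checkpointId, Set<String> tasksToWaitFor) {
        this.checkpointId = checkpointId;
        // Copy the set so later changes by the caller do not affect this checkpoint.
        this.notYetAcknowledged = new HashSet<>(tasksToWaitFor);
    }

    long getCheckpointId() {
        return checkpointId;
    }

    /** Returns true if the task was still outstanding, false for unknown or duplicate acks. */
    boolean acknowledge(String taskId) {
        return notYetAcknowledged.remove(taskId);
    }

    /** Only a fully acknowledged checkpoint may be turned into a completed one. */
    boolean isFullyAcknowledged() {
        return notYetAcknowledged.isEmpty();
    }

    public static void main(String[] args) {
        AtomicLong idCounter = new AtomicLong(1);
        PendingCheckpointSketch pending = new PendingCheckpointSketch(
                idCounter.getAndIncrement(),
                new HashSet<>(Arrays.asList("source-0", "sink-0")));
        pending.acknowledge("source-0");
        System.out.println(pending.isFullyAcknowledged());
        pending.acknowledge("sink-0");
        System.out.println(pending.isFullyAcknowledged());
    }
}
```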
Whether or not the checkpoint is triggered synchronously, the call ends up in the private triggerCheckpointHelper(...) method of the Execution class:
//Execution.java
private void triggerCheckpointHelper(long checkpointId, long timestamp, CheckpointOptions checkpointOptions, boolean advanceToEndOfEventTime) {
final CheckpointType checkpointType = checkpointOptions.getCheckpointType();
if (advanceToEndOfEventTime && !(checkpointType.isSynchronous() && checkpointType.isSavepoint())) {
throw new IllegalArgumentException("Only synchronous savepoints are allowed to advance the watermark to MAX.");
}
final LogicalSlot slot = assignedResource;
if (slot != null) {
// TaskManagerGateway is the component used to communicate with the TaskManager
final TaskManagerGateway taskManagerGateway = slot.getTaskManagerGateway();
taskManagerGateway.triggerCheckpoint(attemptId, getVertex().getJobId(), checkpointId, timestamp, checkpointOptions, advanceToEndOfEventTime);
} else {
LOG.debug("The execution has no slot assigned. This indicates that the execution is no longer running.");
}
}
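At this point the CheckpointCoordinator has sent the checkpoint command to the TaskManager. The next section focuses on how the TaskManager (TM) executes the checkpoint.
2.3 Checkpointing in the TaskManager
After the TaskManager receives the RPC that triggers a checkpoint, it starts generating the checkpoint barrier. RpcTaskManagerGateway is the message entry point; its triggerCheckpoint(...) calls triggerCheckpoint(...) of TaskExecutor, as follows: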
//RpcTaskManagerGateway.java
public void triggerCheckpoint(ExecutionAttemptID executionAttemptID, JobID jobId, long checkpointId, long timestamp, CheckpointOptions checkpointOptions, boolean advanceToEndOfEventTime) {
taskExecutorGateway.triggerCheckpoint(
executionAttemptID,
checkpointId,
timestamp,
checkpointOptions,
advanceToEndOfEventTime);
}
//TaskExecutor.java
@Override
public CompletableFuture<Acknowledge> triggerCheckpoint(
ExecutionAttemptID executionAttemptID,
long checkpointId,
long checkpointTimestamp,
CheckpointOptions checkpointOptions,
boolean advanceToEndOfEventTime) {
log.debug("Trigger checkpoint {}@{} for {}.", checkpointId, checkpointTimestamp, executionAttemptID);
//...........
if (task != null) {
// core method: triggers generation of the barrier
task.triggerCheckpointBarrier(checkpointId, checkpointTimestamp, checkpointOptions, advanceToEndOfEventTime);
return CompletableFuture.completedFuture(Acknowledge.get());
} else {
final String message = "TaskManager received a checkpoint request for unknown task " + executionAttemptID + '.';
//.........
}
}
In triggerCheckpointBarrier(...) of the Task class, an anonymous Runnable is created to perform the checkpoint and is then executed asynchronously:
public void triggerCheckpointBarrier(
final long checkpointID,
final long checkpointTimestamp,
final CheckpointOptions checkpointOptions,
final boolean advanceToEndOfEventTime) {
final AbstractInvokable invokable = this.invokable;
// create a CheckpointMetaData, which has only two properties: checkpointID and checkpointTimestamp
final CheckpointMetaData checkpointMetaData = new CheckpointMetaData(checkpointID, checkpointTimestamp);
if (executionState == ExecutionState.RUNNING && invokable != null) {
//..............
Runnable runnable = new Runnable() {
@Override
public void run() {
// set safety net from the task's context for checkpointing thread
LOG.debug("Creating FileSystem stream leak safety net for {}", Thread.currentThread().getName());
FileSystemSafetyNet.setSafetyNetCloseableRegistryForThread(safetyNetCloseableRegistry);
try {
// SourceStreamTask and StreamTask dispatch to different methods here
boolean success = invokable.triggerCheckpoint(checkpointMetaData, checkpointOptions, advanceToEndOfEventTime);
if (!success) {
checkpointResponder.declineCheckpoint(
getJobID(), getExecutionId(), checkpointID,
new CheckpointException("Task Name" + taskName, CheckpointFailureReason.CHECKPOINT_DECLINED_TASK_NOT_READY));
}
}
catch (Throwable t) {
if (getExecutionState() == ExecutionState.RUNNING) {
failExternally(new Exception(
"Error while triggering checkpoint " + checkpointID + " for " +
taskNameWithSubtask, t));
} else {
LOG.debug("Encountered error while triggering checkpoint {} for " +
"{} ({}) while being not in state running.", checkpointID,
taskNameWithSubtask, executionId, t);
}
} finally {
FileSystemSafetyNet.setSafetyNetCloseableRegistryForThread(null);
}
}
};
// run the Runnable asynchronously
executeAsyncCallRunnable(
runnable,
String.format("Checkpoint Trigger for %s (%s).", taskNameWithSubtask, executionId));
}
else {
LOG.debug("Declining checkpoint request for non-running task {} ({}).", taskNameWithSubtask, executionId);
// send back a message that we did not do the checkpoint
checkpointResponder.declineCheckpoint(jobId, executionId, checkpointID,
new CheckpointException("Task name with subtask : " + taskNameWithSubtask, CheckpointFailureReason.CHECKPOINT_DECLINED_TASK_NOT_READY));
}
}
The triggerCheckpoint call of both SourceStreamTask and StreamTask ends up in triggerCheckpoint(...) of the StreamTask class, whose core code is:
//#StreamTask.java
return performCheckpoint(checkpointMetaData, checkpointOptions, checkpointMetrics, advanceToEndOfEventTime);
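performCheckpoint(...) handles two cases:
1. If the task is running, the checkpoint can proceed, in three steps:
1) Prepare for the checkpoint. Usually nothing needs to be done and the checkpoint is simply accepted.
2) Generate the barrier and broadcast it downstream.
3) Trigger the snapshot of this task's state.
2. If the task is not running, tell the downstream tasks to cancel this checkpoint by sending a CancelCheckpointMarker, a message similar to a barrier.
The code is: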
//#StreamTask.java
private boolean performCheckpoint(
CheckpointMetaData checkpointMetaData,
CheckpointOptions checkpointOptions,
CheckpointMetrics checkpointMetrics,
boolean advanceToEndOfTime) throws Exception {
//......
synchronized (lock) {
if (isRunning) {
if (checkpointOptions.getCheckpointType().isSynchronous()) {
syncSavepointLatch.setCheckpointId(checkpointId);
if (advanceToEndOfTime) {
advanceToEndOfEventTime();
}
}
// All of the following steps happen as an atomic step from the perspective of barriers and
// records/watermarks/timers/callbacks.
// We generally try to emit the checkpoint barrier as soon as possible to not affect downstream
// checkpoint alignments
// Step (1): Prepare the checkpoint, allow operators to do some pre-barrier work.
// The pre-barrier work should be nothing or minimal in the common case.
operatorChain.prepareSnapshotPreBarrier(checkpointId);
// Step (2): Send the checkpoint barrier downstream
operatorChain.broadcastCheckpointBarrier(
checkpointId,
checkpointMetaData.getTimestamp(),
checkpointOptions);
// Step (3): Take the state snapshot. This should be largely asynchronous, to not
// impact progress of the streaming topology
checkpointState(checkpointMetaData, checkpointOptions, checkpointMetrics);
return true;
}
else {
//.......
}
}
}
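The ordering in performCheckpoint(...) (prepare, then barrier, then snapshot, all under one lock, with a cancel marker on the other branch) can be made visible with a small sketch that only records events. PerformCheckpointSketch and its event strings are hypothetical, and returning false on the non-running branch is an assumption of this sketch, since that branch is elided in the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the step ordering; it records events instead of doing real work. */
class PerformCheckpointSketch {
    private final Object lock = new Object();
    private final List<String> events = new ArrayList<>();
    private boolean running = true;

    void stop() {
        synchronized (lock) {
            running = false;
        }
    }

    List<String> events() {
        synchronized (lock) {
            return new ArrayList<>(events);
        }
    }

    boolean performCheckpoint(long checkpointId) {
        // Holding one lock makes the three steps atomic with respect to other callers.
        synchronized (lock) {
            if (running) {
                events.add("prepare " + checkpointId);   // step 1: pre-barrier work
                events.add("barrier " + checkpointId);   // step 2: emit the barrier early
                events.add("snapshot " + checkpointId);  // step 3: snapshot the state
                return true;
            } else {
                // Not running: tell downstream to cancel (assumed to return false).
                events.add("cancel-marker " + checkpointId);
                return false;
            }
        }
    }

    public static void main(String[] args) {
        PerformCheckpointSketch task = new PerformCheckpointSketch();
        task.performCheckpoint(1);
        task.stop();
        task.performCheckpoint(2);
        System.out.println(task.events());
    }
}
```

The barrier is recorded before the snapshot on purpose: emitting it first keeps downstream barrier alignment from waiting on this task's snapshot.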
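Next comes the checkpointState(...) step.
checkpointState(...) eventually calls executeCheckpointing() in StreamTask.java (in its inner class CheckpointingOperation), which creates an asynchronous AsyncCheckpointRunnable that reports the completed checkpoint. The key code is: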
//#StreamTask.java, executeCheckpointing()
public void executeCheckpointing() throws Exception {
startSyncPartNano = System.nanoTime();
try {
// entry point that calls snapshotState on each StreamOperator; the behavior depends on the operator
for (StreamOperator<?> op : allOperators) {
checkpointStreamOperator(op);
}
//.........
// we are transferring ownership over snapshotInProgressList for cleanup to the thread, active on submit
AsyncCheckpointRunnable asyncCheckpointRunnable = new AsyncCheckpointRunnable(
owner,
operatorSnapshotsInProgress,
checkpointMetaData,
checkpointMetrics,
startAsyncPartNano);
owner.cancelables.registerCloseable(asyncCheckpointRunnable);
owner.asyncOperationsThreadPool.execute(asyncCheckpointRunnable);
//.........
} catch (Exception ex) {
//.......
}
}
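Inside the run() method of AsyncCheckpointRunnable, reportCompletedSnapshotStates(...) in StreamTask.java is called (for a stateless job the reported state is null), which in turn calls reportTaskStateSnapshots(...) of TaskStateManagerImpl to report the TM's checkpoint to the JM. The key code is: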
//TaskStateManagerImpl.java
checkpointResponder.acknowledgeCheckpoint(
jobId,
executionAttemptID,
checkpointId,
checkpointMetrics,
acknowledgedState);
This reports the event by remotely invoking the corresponding JobManager method over RPC.
2.4 Checkpoint handling in the JobManager
acknowledgeCheckpoint(...) of the RpcCheckpointResponder class delivers the checkpoint acknowledgement. The subsequent call chain and the core method involved are:
//#JobMaster: acknowledgeCheckpoint==>
//#LegacyScheduler: acknowledgeCheckpoint==>
//#CheckpointCoordinator: receiveAcknowledgeMessage(...)==>
//completePendingCheckpoint(checkpoint);
//<p>Important: This method should only be called in the checkpoint lock scope
private void completePendingCheckpoint(PendingCheckpoint pendingCheckpoint) throws CheckpointException {
final long checkpointId = pendingCheckpoint.getCheckpointId();
final CompletedCheckpoint completedCheckpoint;
// As a first step to complete the checkpoint, we register its state with the registry
Map<OperatorID, OperatorState> operatorStates = pendingCheckpoint.getOperatorStates();
sharedStateRegistry.registerAll(operatorStates.values());
try {
try {
// finalize the checkpoint
completedCheckpoint = pendingCheckpoint.finalizeCheckpoint();
failureManager.handleCheckpointSuccess(pendingCheckpoint.getCheckpointId());
}
catch (Exception e1) {
// abort the current pending checkpoint if we fails to finalize the pending checkpoint.
if (!pendingCheckpoint.isDiscarded()) {
failPendingCheckpoint(pendingCheckpoint, CheckpointFailureReason.FINALIZE_CHECKPOINT_FAILURE, e1);
}
throw new CheckpointException("Could not finalize the pending checkpoint " + checkpointId + '.',
CheckpointFailureReason.FINALIZE_CHECKPOINT_FAILURE, e1);
}
// the pending checkpoint must be discarded after the finalization
Preconditions.checkState(pendingCheckpoint.isDiscarded() && completedCheckpoint != null);
try {
// add the new checkpoint and, if necessary (completedCheckpoints.size() > maxNumberOfCheckpointsToRetain), remove old ones
completedCheckpointStore.addCheckpoint(completedCheckpoint);
} catch (Exception exception) {
// we failed to store the completed checkpoint. Let's clean up
executor.execute(new Runnable() {
@Override
public void run() {
try {
completedCheckpoint.discardOnFailedStoring();
} catch (Throwable t) {
LOG.warn("Could not properly discard completed checkpoint {}.", completedCheckpoint.getCheckpointID(), t);
}
}
});
throw new CheckpointException("Could not complete the pending checkpoint " + checkpointId + '.',
CheckpointFailureReason.FINALIZE_CHECKPOINT_FAILURE, exception);
}
} finally {
pendingCheckpoints.remove(checkpointId);
triggerQueuedRequests();
}
rememberRecentCheckpointId(checkpointId);
// drop those pending checkpoints that are at prior to the completed one
dropSubsumedCheckpoints(checkpointId);
// record the time when this was completed, to calculate
// the 'min delay between checkpoints'
lastCheckpointCompletionNanos = System.nanoTime();
LOG.info("Completed checkpoint {} for job {} ({} bytes in {} ms).", checkpointId, job,
completedCheckpoint.getStateSize(), completedCheckpoint.getDuration());
if (LOG.isDebugEnabled()) {
StringBuilder builder = new StringBuilder();
builder.append("Checkpoint state: ");
for (OperatorState state : completedCheckpoint.getOperatorStates().values()) {
builder.append(state);
builder.append(", ");
}
// Remove last two chars ", "
builder.setLength(builder.length() - 2);
LOG.debug(builder.toString());
}
// send the "notify complete" call to all vertices
final long timestamp = completedCheckpoint.getTimestamp();
// notify all operators (in the TMs) that this checkpoint has completed
for (ExecutionVertex ev : tasksToCommitTo) {
Execution ee = ev.getCurrentExecutionAttempt();
if (ee != null) {
ee.notifyCheckpointComplete(checkpointId, timestamp);
}
}
}
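The retention rule mentioned in the comment on completedCheckpointStore.addCheckpoint(...) (add the new checkpoint, then drop old ones beyond the retention limit) can be sketched as follows. RetainedCheckpointsSketch is a hypothetical class that stores only checkpoint IDs; it is not Flink's CompletedCheckpointStore and discards no actual state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of "keep only the newest N completed checkpoints". */
class RetainedCheckpointsSketch {
    private final int maxRetained;
    private final ArrayDeque<Long> completed = new ArrayDeque<>();

    RetainedCheckpointsSketch(int maxRetained) {
        if (maxRetained < 1) {
            throw new IllegalArgumentException("must retain at least one checkpoint");
        }
        this.maxRetained = maxRetained;
    }

    /** Adds the newest checkpoint and evicts the oldest ones beyond the limit. */
    void addCheckpoint(long checkpointId) {
        completed.addLast(checkpointId);
        while (completed.size() > maxRetained) {
            completed.removeFirst();
        }
    }

    /** Returns the retained checkpoint IDs, oldest first. */
    List<Long> retainedIds() {
        return new ArrayList<>(completed);
    }

    public static void main(String[] args) {
        RetainedCheckpointsSketch store = new RetainedCheckpointsSketch(2);
        store.addCheckpoint(1);
        store.addCheckpoint(2);
        store.addCheckpoint(3);
        System.out.println(store.retainedIds());
    }
}
```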
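This completes the analysis of the overall checkpoint flow. It is best read together with the underlying principles. The three references below are well written and worth reading if you have time.
Ref:
[1] https://www.jianshu.com/p/a40a1b92f6a2
[2] https://www.cnblogs.com/bethunebtj/p/9168274.html
[3] https://blog.csdn.net/qq475781638/article/details/92698301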