Flink 1.7 Checkpoint Source Code Analysis
Initializing the state classes
// org.apache.flink.streaming.runtime.tasks.StreamTask#initializeState
private void initializeState() throws Exception {
    StreamOperator<?>[] allOperators = operatorChain.getAllOperators();
    for (StreamOperator<?> operator : allOperators) {
        if (null != operator) {
            operator.initializeState();
        }
    }
}
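For orientation: initializeState() is called from StreamTask#invoke, before the operators are opened and before any records flow. A rough sketch of that ordering (simplified from memory of the Flink 1.7 source; the real method also handles locking, cancellation, and cleanup):

public void invoke() throws Exception {
    init();             // task-specific setup, e.g. wiring up input gates
    initializeState();  // create or restore the state backends for all chained operators
    openAllOperators(); // open() every operator; user functions can access state from here on
    run();              // process records; checkpoint barriers arrive during this phase
}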
operator.initializeState() resolves to org.apache.flink.streaming.api.operators.AbstractStreamOperator#initializeState(). Every stream operator class extends this class, and none of them overrides this method.
public final void initializeState() throws Exception {
    // The state backends are created in here — this call is the important part.
    final StreamOperatorStateContext context =
        streamTaskStateManager.streamOperatorStateContext(
            getOperatorID(),
            getClass().getSimpleName(),
            this,
            keySerializer,
            streamTaskCloseableRegistry,
            metrics);
    ...
streamTaskStateManager.streamOperatorStateContext(......) resolves to org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl#streamOperatorStateContext:
......
// -------------- Keyed State Backend — the key part for checkpointing --------------
keyedStatedBackend = keyedStatedBackend(
    keySerializer,
    operatorIdentifierText,
    prioritizedOperatorSubtaskStates,
    streamTaskCloseableRegistry,
    metricGroup);

// -------------- Operator State Backend — the key part for checkpointing --------------
operatorStateBackend = operatorStateBackend(
    operatorIdentifierText,
    prioritizedOperatorSubtaskStates,
    streamTaskCloseableRegistry);
......
At its core, keyedStatedBackend() ends up calling org.apache.flink.streaming.api.operators.BackendRestorerProcedure#attemptCreateAndRestore:
private T attemptCreateAndRestore(Collection<S> restoreState) throws Exception {
    ......
    // create a new, empty backend.
    final T backendInstance = instanceSupplier.get();
    // attempt to restore from snapshot (or null if no state was checkpointed).
    backendInstance.restore(restoreState);
    ......
}
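The surrounding logic of BackendRestorerProcedure is a create-then-restore loop: build an empty backend from a supplier, feed it a set of restore handles, and fall back to the next prioritized state alternative if that fails. A minimal self-contained sketch of the idea (hypothetical names, not the actual Flink class):

import java.util.Collection;
import java.util.List;
import java.util.function.Supplier;

// Toy create-and-restore loop: try each prioritized set of state handles
// in order until one backend restores successfully.
final class RestorerSketch<B extends AutoCloseable, S> {

    B createAndRestore(Supplier<B> instanceSupplier,
                       List<? extends Collection<S>> alternativesByPriority) throws Exception {
        Exception lastFailure = null;
        for (Collection<S> restoreState : alternativesByPriority) {
            B backend = instanceSupplier.get();  // create a new, empty backend
            try {
                restore(backend, restoreState);  // attempt to restore from the snapshot
                return backend;
            } catch (Exception e) {
                backend.close();                 // dispose the failed instance and try the next alternative
                lastFailure = e;
            }
        }
        throw new IllegalStateException("all restore alternatives failed", lastFailure);
    }

    private void restore(B backend, Collection<S> state) throws Exception {
        // stand-in for backendInstance.restore(restoreState)
    }
}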
backendInstance.restore(restoreState) resolves to org.apache.flink.runtime.state.DefaultOperatorStateBackend#restore:
// registeredOperatorStates is the core object here
...
PartitionableListState<?> listState = registeredOperatorStates.get(restoredSnapshot.getName());
if (null == listState) {
    listState = new PartitionableListState<>(restoredMetaInfo);
    // Key point: this is where the restored snapshot state gets registered
    //********************************************************************
    registeredOperatorStates.put(listState.getStateMetaInfo().getName(), listState);
    //********************************************************************
} else {
    // TODO with eager state registration in place, check here for serializer migration strategies
}
...
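The important takeaway: restore() pre-populates registeredOperatorStates, so when user code later asks for a state by name it receives the restored instance rather than an empty one. A toy illustration of that get-or-create registry (an assumed simplification, not Flink's real classes):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy registry mimicking registeredOperatorStates: restore() seeds the map,
// a later lookup by name returns the seeded (restored) instance.
final class StateRegistrySketch {
    private final Map<String, List<String>> registeredStates = new HashMap<>();

    void restore(String name, List<String> restoredElements) {
        // same shape as DefaultOperatorStateBackend#restore: register the restored state
        registeredStates.put(name, new ArrayList<>(restoredElements));
    }

    List<String> getListState(String name) {
        // get-or-create: a previously restored state wins over a fresh empty one
        return registeredStates.computeIfAbsent(name, k -> new ArrayList<>());
    }

    public static void main(String[] args) {
        StateRegistrySketch backend = new StateRegistrySketch();
        backend.restore("offsets", List.of("partition-0:42"));
        System.out.println(backend.getListState("offsets")); // prints [partition-0:42], not []
    }
}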
Everything above is the initialization path. triggerCheckpoint, covered next, is what fires periodically to actually perform a checkpoint.
Periodically snapshotting the state classes
org.apache.flink.runtime.checkpoint.CheckpointCoordinator#triggerCheckpoint(long, boolean)
......
// send the messages to the tasks that trigger their checkpoint
// (presumably this is the step that remotely triggers the checkpoint; the state data files are generated downstream of it)
for (Execution execution : executions) {
    execution.triggerCheckpoint(checkpointID, timestamp, checkpointOptions);
}
......
execution.triggerCheckpoint resolves to org.apache.flink.runtime.executiongraph.Execution#triggerCheckpoint:
/**
 * Trigger a new checkpoint on the task of this execution.
 *
 * @param checkpointId of the checkpoint to trigger
 * @param timestamp of the checkpoint to trigger
 * @param checkpointOptions of the checkpoint to trigger
 */
public void triggerCheckpoint(long checkpointId, long timestamp, CheckpointOptions checkpointOptions) {
    ......
    final LogicalSlot slot = assignedResource;
    if (slot != null) {
        final TaskManagerGateway taskManagerGateway = slot.getTaskManagerGateway();
        taskManagerGateway.triggerCheckpoint(attemptId, getVertex().getJobId(), checkpointId, timestamp, checkpointOptions);
    }
    ......
}
taskManagerGateway.triggerCheckpoint(......) ultimately resolves to org.apache.flink.runtime.taskexecutor.TaskExecutor#triggerCheckpoint:
@Override
public CompletableFuture<Acknowledge> triggerCheckpoint(
        ExecutionAttemptID executionAttemptID, long checkpointId, long checkpointTimestamp, CheckpointOptions checkpointOptions) {
    ......
    final Task task = taskSlotTable.getTask(executionAttemptID);
    if (task != null) {
        task.triggerCheckpointBarrier(checkpointId, checkpointTimestamp, checkpointOptions);
        return CompletableFuture.completedFuture(Acknowledge.get());
    }
    ......
}
task.triggerCheckpointBarrier(......) resolves to org.apache.flink.runtime.taskmanager.Task#triggerCheckpointBarrier:
/**
 * Calls the invokable to trigger a checkpoint.
 * This is where checkpoint execution really starts — effectively the entry point. It ends up calling
 * org.apache.flink.streaming.runtime.tasks.StreamTask#triggerCheckpoint, and the AsyncCheckpointRunnable
 * task is executed from in there.
 *
 * @param checkpointID The ID identifying the checkpoint.
 * @param checkpointTimestamp The timestamp associated with the checkpoint.
 * @param checkpointOptions Options for performing this checkpoint.
 */
public void triggerCheckpointBarrier(
        final long checkpointID,
        long checkpointTimestamp,
        final CheckpointOptions checkpointOptions) {

    final AbstractInvokable invokable = this.invokable;
    final CheckpointMetaData checkpointMetaData = new CheckpointMetaData(checkpointID, checkpointTimestamp);

    if (executionState == ExecutionState.RUNNING && invokable != null) {
        // build a local closure
        final String taskName = taskNameWithSubtask;
        final SafetyNetCloseableRegistry safetyNetCloseableRegistry =
            FileSystemSafetyNet.getSafetyNetCloseableRegistryForThread();

        Runnable runnable = new Runnable() {
            @Override
            public void run() {
                // set safety net from the task's context for checkpointing thread
                LOG.debug("Creating FileSystem stream leak safety net for {}", Thread.currentThread().getName());
                FileSystemSafetyNet.setSafetyNetCloseableRegistryForThread(safetyNetCloseableRegistry);

                try {
                    boolean success = invokable.triggerCheckpoint(checkpointMetaData, checkpointOptions);
                    ......
                }
                ......
            }
        };
        // create a single-threaded executor and submit the runnable to it
        executeAsyncCallRunnable(runnable, String.format("Checkpoint Trigger for %s (%s).", taskNameWithSubtask, executionId));
    }
}
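executeAsyncCallRunnable lazily creates a single-threaded executor and hands the trigger runnable to it, so triggering a checkpoint never blocks the task's main processing thread. A minimal sketch of the pattern (simplified; the real method also re-checks the task's execution state):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy version of Task#executeAsyncCallRunnable: a lazily created
// single-thread executor runs async calls such as the checkpoint trigger.
final class AsyncCallSketch {
    private ExecutorService asyncCallDispatcher; // created on first use

    synchronized void executeAsyncCall(Runnable runnable, String callName) {
        if (asyncCallDispatcher == null) {
            asyncCallDispatcher = Executors.newSingleThreadExecutor(
                r -> new Thread(r, callName)); // name the thread after the call for debugging
        }
        asyncCallDispatcher.submit(runnable);
    }
}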
invokable.triggerCheckpoint(......) ultimately runs through the following call chain:
org.apache.flink.streaming.runtime.tasks.StreamTask#triggerCheckpoint
org.apache.flink.streaming.runtime.tasks.StreamTask#performCheckpoint
// we can do a checkpoint

// All of the following steps happen as an atomic step from the perspective of barriers and
// records/watermarks/timers/callbacks.
// We generally try to emit the checkpoint barrier as soon as possible to not affect downstream
// checkpoint alignments

// Step (1): Prepare the checkpoint, allow operators to do some pre-barrier work.
//           The pre-barrier work should be nothing or minimal in the common case.
operatorChain.prepareSnapshotPreBarrier(checkpointMetaData.getCheckpointId());

// Step (2): Send the checkpoint barrier downstream. The barrier carries the checkpointOptions;
//           note from debugging: no state data files are produced at this step yet.
operatorChain.broadcastCheckpointBarrier(
    checkpointMetaData.getCheckpointId(),
    checkpointMetaData.getTimestamp(),
    checkpointOptions);

// Step (3): Take the state snapshot. This should be largely asynchronous, to not
//           impact progress of the streaming topology
checkpointState(checkpointMetaData, checkpointOptions, checkpointMetrics);
checkpointState(......) ultimately calls org.apache.flink.streaming.runtime.tasks.StreamTask.CheckpointingOperation#executeCheckpointing(). The critical part starts here:
......
// invoke the snapshot methods, including the user-defined ones
for (StreamOperator<?> op : allOperators) { // each kind of operator maps to a different subclass
    checkpointStreamOperator(op);
}

// the actual data files are produced later — this runnable appears to only produce metadata;
// where the data files get written is tracked down below
// we are transferring ownership over snapshotInProgressList for cleanup to the thread, active on submit
AsyncCheckpointRunnable asyncCheckpointRunnable = new AsyncCheckpointRunnable(
    owner,
    operatorSnapshotsInProgress,
    checkpointMetaData,
    checkpointMetrics,
    startAsyncPartNano);

owner.cancelables.registerCloseable(asyncCheckpointRunnable);
owner.asyncOperationsThreadPool.submit(asyncCheckpointRunnable);
......
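So executeCheckpointing() splits into a synchronous phase (checkpointStreamOperator, shown next, collects an OperatorSnapshotFutures per operator on the task thread) and an asynchronous phase (AsyncCheckpointRunnable materializes those futures and acknowledges the checkpoint). A heavily condensed toy model of that split (hypothetical names):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;

// Toy sync/async checkpoint split: the task thread collects futures,
// a background thread resolves them and acknowledges the checkpoint.
final class CheckpointingSketch {
    private final ExecutorService asyncOperationsThreadPool = Executors.newCachedThreadPool();

    void executeCheckpointing(long checkpointId, Map<String, Callable<String>> operatorSnapshots) {
        // synchronous phase: start every operator's snapshot, keep the futures
        Map<String, FutureTask<String>> inProgress = new HashMap<>();
        operatorSnapshots.forEach((id, snapshot) -> {
            FutureTask<String> future = new FutureTask<>(snapshot);
            inProgress.put(id, future);
            asyncOperationsThreadPool.submit(future);
        });

        // asynchronous phase: wait for all snapshots, then acknowledge
        asyncOperationsThreadPool.submit(() -> {
            try {
                for (FutureTask<String> f : inProgress.values()) {
                    f.get(); // blocks until that operator's state is materialized
                }
                System.out.println("acknowledge checkpoint " + checkpointId);
            } catch (Exception e) {
                System.out.println("decline checkpoint " + checkpointId);
            }
        });
    }
}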
checkpointStreamOperator(op):
private void checkpointStreamOperator(StreamOperator<?> op) throws Exception {
    if (null != op) {
        // this call is the core: it produces the snapshot futures for the operator
        OperatorSnapshotFutures snapshotInProgress = op.snapshotState(
            checkpointMetaData.getCheckpointId(),
            checkpointMetaData.getTimestamp(),
            checkpointOptions,
            storageLocation);
        operatorSnapshotsInProgress.put(op.getOperatorID(), snapshotInProgress);
    }
}
op.snapshotState() is the core; it calls org.apache.flink.streaming.api.operators.AbstractStreamOperator#snapshotState(long, long, org.apache.flink.runtime.checkpoint.CheckpointOptions, org.apache.flink.runtime.state.CheckpointStreamFactory).
Note that op is always some concrete subclass: some classes extend AbstractStreamOperator directly, others extend AbstractUdfStreamOperator (which itself extends AbstractStreamOperator). So when snapshotState(snapshotContext) is invoked below, which implementation runs depends on the subclass: either org.apache.flink.streaming.api.operators.AbstractStreamOperator#snapshotState(org.apache.flink.runtime.state.StateSnapshotContext)
or org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator#snapshotState.
AbstractStreamOperator has 94 implementing classes and AbstractUdfStreamOperator has 42.
@Override
public final OperatorSnapshotFutures snapshotState(long checkpointId, long timestamp, CheckpointOptions checkpointOptions,
        CheckpointStreamFactory factory) throws Exception {

    OperatorSnapshotFutures snapshotInProgress = new OperatorSnapshotFutures();

    try (StateSnapshotContextSynchronousImpl snapshotContext = new StateSnapshotContextSynchronousImpl(
            checkpointId,
            timestamp,
            factory,
            keyGroupRange,
            getContainingTask().getCancelables())) {

        // Operators extending AbstractUdfStreamOperator invoke the user's snapshot method here;
        // operators extending AbstractStreamOperator directly run the base implementation, which does very little.
        snapshotState(snapshotContext);

        // At this point the user snapshot methods have run, i.e. the data currently held in the
        // state classes is final. What follows is how those state classes are reached and their
        // contents written to disk.
        snapshotInProgress.setKeyedStateRawFuture(snapshotContext.getKeyedStateStreamFuture());
        snapshotInProgress.setOperatorStateRawFuture(snapshotContext.getOperatorStateStreamFuture());

        // this is where the operator state data files get produced
        if (null != operatorStateBackend) {
            System.out.println(Thread.currentThread().getName() + "::state data is written to files from here"); // debug line added while tracing
            snapshotInProgress.setOperatorStateManagedFuture(
                operatorStateBackend.snapshot(checkpointId, timestamp, factory, checkpointOptions));
        }

        // this is where the keyed state data files get produced
        if (null != keyedStateBackend) {
            snapshotInProgress.setKeyedStateManagedFuture(
                keyedStateBackend.snapshot(checkpointId, timestamp, factory, checkpointOptions));
        }
    }

    return snapshotInProgress;
}
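For operators that wrap a user function, the snapshotState(snapshotContext) call above dispatches to AbstractUdfStreamOperator#snapshotState, which is where control crosses from the framework into user code. From memory of the Flink 1.7 source it looks roughly like this (paraphrased; check the actual class for details):

@Override
public void snapshotState(StateSnapshotContext context) throws Exception {
    super.snapshotState(context); // the AbstractStreamOperator part (e.g. timer state)
    // invokes the user function's CheckpointedFunction#snapshotState, if it implements the interface
    StreamingFunctionUtils.snapshotFunctionState(context, getOperatorStateBackend(), userFunction);
}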
operatorStateBackend.snapshot(checkpointId, timestamp, factory, checkpointOptions) resolves to org.apache.flink.runtime.state.DefaultOperatorStateBackend#snapshot. The answer lies below:
public RunnableFuture<SnapshotResult<OperatorStateHandle>> snapshot(
        long checkpointId,
        long timestamp,
        @Nonnull CheckpointStreamFactory streamFactory,
        @Nonnull CheckpointOptions checkpointOptions) throws Exception {

    long syncStartTime = System.currentTimeMillis();

    // This is the crucial spot: if you want to know how the state classes inside
    // user functions are reached, it is right here.
    RunnableFuture<SnapshotResult<OperatorStateHandle>> snapshotRunner =
        snapshotStrategy.snapshot(checkpointId, timestamp, streamFactory, checkpointOptions);

    snapshotStrategy.logSyncCompleted(streamFactory, syncStartTime);
    return snapshotRunner;
}
Which snapshotStrategy.snapshot(checkpointId, timestamp, streamFactory, checkpointOptions) implementation runs depends on the state backend the user configured. The default path is org.apache.flink.runtime.state.DefaultOperatorStateBackend.DefaultOperatorStateBackendSnapshotStrategy#snapshot (DefaultOperatorStateBackendSnapshotStrategy is an inner class of DefaultOperatorStateBackend):
public RunnableFuture<SnapshotResult<OperatorStateHandle>> snapshot(......) throws IOException {

    // The state data apparently lives in the registeredOperatorStates object. The steps below
    // simply write that data to files; the interesting question is where registeredOperatorStates
    // gets populated in the first place.
    //************ key object: registeredOperatorStates
    final Map<String, PartitionableListState<?>> registeredOperatorStatesDeepCopies =
        new HashMap<>(registeredOperatorStates.size());
    final Map<String, BackendWritableBroadcastState<?, ?>> registeredBroadcastStatesDeepCopies =
        new HashMap<>(registeredBroadcastStates.size());

    ClassLoader snapshotClassLoader = Thread.currentThread().getContextClassLoader();
    try {
        // eagerly create deep copies of the list and the broadcast states (if any)
        // in the synchronous phase, so that we can use them in the async writing.

        // entry.getValue() is the state class itself; the deep copies go into the new maps
        if (!registeredOperatorStates.isEmpty()) {
            for (Map.Entry<String, PartitionableListState<?>> entry : registeredOperatorStates.entrySet()) {
                PartitionableListState<?> listState = entry.getValue();
                if (null != listState) {
                    listState = listState.deepCopy();
                }
                registeredOperatorStatesDeepCopies.put(entry.getKey(), listState);
            }
        }

        // broadcast state
        if (!registeredBroadcastStates.isEmpty()) {
            for (Map.Entry<String, BackendWritableBroadcastState<?, ?>> entry : registeredBroadcastStates.entrySet()) {
                BackendWritableBroadcastState<?, ?> broadcastState = entry.getValue();
                if (null != broadcastState) {
                    broadcastState = broadcastState.deepCopy();
                }
                registeredBroadcastStatesDeepCopies.put(entry.getKey(), broadcastState);
            }
        }
    }

    // the state data files are generated inside this callable
    AsyncSnapshotCallable<SnapshotResult<OperatorStateHandle>> snapshotCallable =
        new AsyncSnapshotCallable<SnapshotResult<OperatorStateHandle>>() {

            @Override
            protected SnapshotResult<OperatorStateHandle> callInternal() throws Exception {
                ......
                // get the registered operator state infos ...
                List<StateMetaInfoSnapshot> operatorMetaInfoSnapshots =
                    new ArrayList<>(registeredOperatorStatesDeepCopies.size());
                for (Map.Entry<String, PartitionableListState<?>> entry :
                        registeredOperatorStatesDeepCopies.entrySet()) {
                    operatorMetaInfoSnapshots.add(entry.getValue().getStateMetaInfo().snapshot());
                }

                // ... get the registered broadcast operator state infos ...
                List<StateMetaInfoSnapshot> broadcastMetaInfoSnapshots =
                    new ArrayList<>(registeredBroadcastStatesDeepCopies.size());
                for (Map.Entry<String, BackendWritableBroadcastState<?, ?>> entry :
                        registeredBroadcastStatesDeepCopies.entrySet()) {
                    broadcastMetaInfoSnapshots.add(entry.getValue().getStateMetaInfo().snapshot());
                }

                // ... write them all in the checkpoint stream ...
                DataOutputView dov = new DataOutputViewStreamWrapper(localOut);
                OperatorBackendSerializationProxy backendSerializationProxy =
                    new OperatorBackendSerializationProxy(operatorMetaInfoSnapshots, broadcastMetaInfoSnapshots);
                backendSerializationProxy.write(dov);

                // ... and then go for the states ...
                ......
            }
        };

    final FutureTask<SnapshotResult<OperatorStateHandle>> task =
        snapshotCallable.toAsyncSnapshotFutureTask(closeStreamOnCancelRegistry);

    if (!asynchronousSnapshots) {
        task.run();
    }

    return task;
}
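The strategy is the classic two-phase snapshot: synchronously take a cheap deep copy of the registered states (so the task thread can resume processing immediately), then serialize the copy in a background thread. For a synchronous backend the caller simply runs the returned task itself, which is exactly what the if (!asynchronousSnapshots) task.run(); above does. A self-contained toy version of the pattern (an assumed simplification):

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableFuture;

// Toy two-phase snapshot: deep-copy synchronously, write asynchronously.
final class SnapshotStrategySketch {
    private final Map<String, List<String>> registeredStates = new HashMap<>();

    RunnableFuture<Path> snapshot(long checkpointId, boolean asynchronous) {
        // synchronous phase: deep-copy so the task thread may keep mutating its state
        Map<String, List<String>> deepCopies = new HashMap<>();
        registeredStates.forEach((name, state) -> deepCopies.put(name, new ArrayList<>(state)));

        FutureTask<Path> writeTask = new FutureTask<>(() -> {
            // asynchronous phase: serialize the frozen copy to a checkpoint file
            Path file = Files.createTempFile("chk-" + checkpointId + "-", ".snapshot");
            Files.write(file, deepCopies.toString().getBytes());
            return file;
        });
        if (!asynchronous) {
            writeTask.run(); // a synchronous backend completes the future on the spot
        }
        return writeTask;    // an asynchronous caller submits this to a thread pool
    }
}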
From the above we can see that the state classes all end up in the registeredOperatorStatesDeepCopies map, copied out of registeredOperatorStates. The reason user code can update the data in those state classes at all is that it gets hold of them like this:
public void initializeState(FunctionInitializationContext context) throws Exception {
    ......
    checkpointedState = context.getOperatorStateStore().getListState(descriptor);
    ......
}
which calls org.apache.flink.runtime.state.DefaultOperatorStateBackend#getListState(org.apache.flink.api.common.state.ListStateDescriptor):
/**
 * @Description: When the state object is returned to the user, it is also put into the map
 *               so that it can later be written out to files.
 * @Author: intsmaze
 * @Date: 2019/1/18
 */
private <S> ListState<S> getListState(
        ListStateDescriptor<S> stateDescriptor,
        OperatorStateHandle.Mode mode) throws StateMigrationException {

    String name = Preconditions.checkNotNull(stateDescriptor.getName());

    @SuppressWarnings("unchecked")
    PartitionableListState<S> previous = (PartitionableListState<S>) accessedStatesByName.get(name);
    if (previous != null) {
        checkStateNameAndMode(
            previous.getStateMetaInfo().getName(),
            name,
            previous.getStateMetaInfo().getAssignmentMode(),
            mode);
        return previous;
    }

    ......
    PartitionableListState<S> partitionableListState = (PartitionableListState<S>) registeredOperatorStates.get(name);
    if (null == partitionableListState) {
        // no restored state for the state name; simply create new state holder
        partitionableListState = new PartitionableListState<>(
            new RegisteredOperatorStateBackendMetaInfo<>(
                name,
                partitionStateSerializer,
                mode));

        // The state object registered here goes into registeredOperatorStates — the very same map
        // that the snapshot method of DefaultOperatorStateBackendSnapshotStrategy reads from.
        //************************************************************
        registeredOperatorStates.put(name, partitionableListState);
    }
    ......
    accessedStatesByName.put(name, partitionableListState);
    return partitionableListState;
}
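To close the loop, here is the user-facing side: a function implementing CheckpointedFunction obtains its ListState in initializeState(), and by the time snapshotState() fires, registeredOperatorStates already holds a reference to that very state object — which is what the snapshot strategy deep-copies and writes out. A minimal example against the public Flink API (the descriptor and field names are illustrative):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;

// Minimal stateful function: counts records and survives failures via operator state.
public class CountingMapper implements MapFunction<String, String>, CheckpointedFunction {

    private transient ListState<Long> checkpointedCount; // lives in the operator state backend
    private long count;

    @Override
    public String map(String value) {
        count++;
        return value;
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        // getListState() registers the state in registeredOperatorStates (see above);
        // after a recovery it already contains the restored value
        checkpointedCount = context.getOperatorStateStore()
            .getListState(new ListStateDescriptor<>("count", Long.class));
        for (Long restored : checkpointedCount.get()) {
            count = restored;
        }
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // called from AbstractUdfStreamOperator#snapshotState via StreamingFunctionUtils
        checkpointedCount.clear();
        checkpointedCount.add(count);
    }
}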