Flink -- Keyed State
/* <pre>{@code
* DataStream<MyType> stream = ...;
* KeyedStream<MyType> keyedStream = stream.keyBy("id");
*
* keyedStream.map(new RichMapFunction<MyType, Tuple2<MyType, Long>>() {
*
*     // Handle to the per-key counter; assigned in open().
*     private ValueState<Long> state;
*
*     public void open(Configuration cfg) {
*         state = getRuntimeContext().getState(
*                 new ValueStateDescriptor<Long>("count", LongSerializer.INSTANCE, 0L));
*     }
*
*     public Tuple2<MyType, Long> map(MyType value) throws Exception {
*         long count = state.value() + 1;
*         state.update(count); // store the new count, not the input record
*         return new Tuple2<>(value, count);
*     }
* });
* }</pre>
*/
Before keyed state can be used it has to be initialized. Taking ValueState as the example:
state = getRuntimeContext().getState(new ValueStateDescriptor<Long>("count", LongSerializer.INSTANCE, 0L));
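1. Every state needs an identifier, a ValueStateDescriptor, which carries a unique name, the value type (a Class, or a TypeSerializer as in the call above), and a default value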
public ValueStateDescriptor(String name, Class<T> typeClass, T defaultValue)
2. getState registers the keyed state with the state backend
StreamingRuntimeContext
public <T> ValueState<T> getState(ValueStateDescriptor<T> stateProperties) {
    KeyedStateStore keyedStateStore = checkPreconditionsAndGetKeyedStateStore(stateProperties);
    stateProperties.initializeSerializerUnlessSet(getExecutionConfig());
    return keyedStateStore.getState(stateProperties);
}
This delegates to keyedStateStore.getState(stateProperties).
KeyedStateStore is essentially a thin wrapper around KeyedStateBackend:
public class DefaultKeyedStateStore implements KeyedStateStore {
    private final KeyedStateBackend<?> keyedStateBackend;
    private final ExecutionConfig executionConfig;

    @Override
    public <T> ValueState<T> getState(ValueStateDescriptor<T> stateProperties) {
        try {
            stateProperties.initializeSerializerUnlessSet(executionConfig);
            return getPartitionedState(stateProperties);
        } catch (Exception e) {
            throw new RuntimeException("Error while getting state", e);
        }
    }
The call finally ends up in keyedStateBackend:
    private <S extends State> S getPartitionedState(StateDescriptor<S, ?> stateDescriptor) throws Exception {
        return keyedStateBackend.getPartitionedState(
                VoidNamespace.INSTANCE,
                VoidNamespaceSerializer.INSTANCE,
                stateDescriptor);
    }
AbstractKeyedStateBackend
public <N, S extends State> S getPartitionedState(
        final N namespace,
        final TypeSerializer<N> namespaceSerializer,
        final StateDescriptor<S, ?> stateDescriptor) throws Exception {

    final S state = getOrCreateKeyedState(namespaceSerializer, stateDescriptor);
    final InternalKvState<N> kvState = (InternalKvState<N>) state;

    return state;
}
getOrCreateKeyedState
public <N, S extends State, V> S getOrCreateKeyedState(
        final TypeSerializer<N> namespaceSerializer,
        StateDescriptor<S, V> stateDescriptor) throws Exception {

    InternalKvState<?> existing = keyValueStatesByName.get(stateDescriptor.getName());
    if (existing != null) {
        @SuppressWarnings("unchecked")
        S typedState = (S) existing;
        return typedState; // already present in keyValueStatesByName: return it directly
    }

    // create a new blank key/value state
    S state = stateDescriptor.bind(new StateBinder() {
        @Override
        public <T> ValueState<T> createValueState(ValueStateDescriptor<T> stateDesc) throws Exception {
            return AbstractKeyedStateBackend.this.createValueState(namespaceSerializer, stateDesc);
        }
    });

    InternalKvState<N> kvState = (InternalKvState<N>) state;
    keyValueStatesByName.put(stateDescriptor.getName(), kvState); // register the newly created state in keyValueStatesByName
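The lookup-then-register pattern of getOrCreateKeyedState can be illustrated without Flink on the classpath. What follows is only a minimal sketch, not Flink code: StateRegistrySketch, getOrCreate and createdCount are hypothetical names, and a plain Supplier stands in for stateDescriptor.bind(...).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for the backend's name-to-state registry; not Flink code. */
final class StateRegistrySketch {
    // Plays the role of keyValueStatesByName: one state object per descriptor name.
    private final Map<String, Object> statesByName = new HashMap<>();
    private int created = 0;

    /** Returns the state registered under name, creating it via factory on first use. */
    @SuppressWarnings("unchecked")
    <S> S getOrCreate(String name, Supplier<S> factory) {
        Object existing = statesByName.get(name);
        if (existing != null) {
            return (S) existing; // already registered: reuse it
        }
        S state = factory.get();       // stands in for stateDescriptor.bind(...)
        statesByName.put(name, state); // register the new state under its name
        created++;
        return state;
    }

    /** Number of states that were actually created (not merely looked up). */
    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        StateRegistrySketch registry = new StateRegistrySketch();
        StringBuilder first = registry.getOrCreate("count", StringBuilder::new);
        StringBuilder second = registry.getOrCreate("count", StringBuilder::new);
        System.out.println(first == second);         // true: the same instance is reused
        System.out.println(registry.createdCount()); // 1
    }
}
```

Because the registry is keyed by name alone, two descriptors that share a name resolve to the same state object, which is why the descriptor name has to be unique within an operator.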
3. Reading and writing a ValueState: value() and update()
Here is how ValueState is implemented:
HeapValueState
public class HeapValueState<K, N, V>
        extends AbstractHeapState<K, N, V, ValueState<V>, ValueStateDescriptor<V>>
        implements InternalValueState<N, V> {

    /**
     * Creates a new key/value state for the given hash map of key/value pairs.
     *
     * @param stateDesc The state identifier for the state. This contains name
     *                  and can create a default state value.
     * @param stateTable The state table to use in this key/value state. May contain initial state.
     */
    public HeapValueState(
            ValueStateDescriptor<V> stateDesc,
            StateTable<K, N, V> stateTable,
            TypeSerializer<K> keySerializer,
            TypeSerializer<N> namespaceSerializer) {
        super(stateDesc, stateTable, keySerializer, namespaceSerializer);
    }

    @Override
    public V value() {
        final V result = stateTable.get(currentNamespace);

        if (result == null) {
            return stateDesc.getDefaultValue();
        }

        return result;
    }

    @Override
    public void update(V value) {
        if (value == null) {
            clear();
            return;
        }

        stateTable.put(currentNamespace, value);
    }
}
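The contract visible in value() and update() above (fall back to the default when nothing is stored, treat update(null) as clear()) can be reproduced in a few lines. This is a hedged, single-slot sketch rather than Flink code: ValueSlotSketch is a hypothetical name, and it deliberately ignores keys and namespaces.

```java
/** Hypothetical single-slot model of the value()/update() contract; not Flink code. */
final class ValueSlotSketch<V> {
    private final V defaultValue; // what the descriptor would supply
    private V stored;             // null means "nothing stored"

    ValueSlotSketch(V defaultValue) {
        this.defaultValue = defaultValue;
    }

    /** Mirrors value(): fall back to the default when nothing is stored. */
    V value() {
        return stored == null ? defaultValue : stored;
    }

    /** Mirrors update(): a null value is treated as clear(). */
    void update(V value) {
        if (value == null) {
            clear();
            return;
        }
        stored = value;
    }

    void clear() {
        stored = null;
    }

    public static void main(String[] args) {
        ValueSlotSketch<Long> count = new ValueSlotSketch<>(0L);
        count.update(count.value() + 1);   // default 0 plus 1
        System.out.println(count.value()); // 1
        count.update(null);                // same effect as clear()
        System.out.println(count.value()); // 0
    }
}
```

The default value is what lets the counter example at the top call state.value() + 1 on the very first record of a key without a null check.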
Both operations go through the StateTable:
CopyOnWriteStateTable
@Override
public S get(N namespace) {
    return get(keyContext.getCurrentKey(), namespace);
}

@Override
public boolean containsKey(N namespace) {
    return containsKey(keyContext.getCurrentKey(), namespace);
}

@Override
public void put(N namespace, S state) {
    put(keyContext.getCurrentKey(), namespace, state);
}
So the table does not store a bare value; it stores the relation (key, namespace, value).
The key is obtained from keyContext.getCurrentKey(),
and keyContext is the KeyedStateBackend itself.
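To make the (key, namespace, value) relation concrete, here is a minimal sketch of a two-level table. It is an assumption-laden illustration, not Flink's CopyOnWriteStateTable: StateTableSketch is a hypothetical name, it uses nested HashMaps, and the "window-A"/"window-B" namespaces are made-up sample data.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-level table keyed by (key, namespace); not Flink's CopyOnWriteStateTable. */
final class StateTableSketch<K, N, S> {
    private final Map<K, Map<N, S>> table = new HashMap<>();

    /** Stores state under the given (key, namespace) pair. */
    void put(K key, N namespace, S state) {
        table.computeIfAbsent(key, k -> new HashMap<>()).put(namespace, state);
    }

    /** Returns the state for (key, namespace), or null if none is stored. */
    S get(K key, N namespace) {
        Map<N, S> byNamespace = table.get(key);
        return byNamespace == null ? null : byNamespace.get(namespace);
    }

    boolean containsKey(K key, N namespace) {
        Map<N, S> byNamespace = table.get(key);
        return byNamespace != null && byNamespace.containsKey(namespace);
    }

    public static void main(String[] args) {
        StateTableSketch<String, String, Long> table = new StateTableSketch<>();
        table.put("user-1", "window-A", 3L);
        table.put("user-1", "window-B", 7L);
        table.put("user-2", "window-A", 1L);
        System.out.println(table.get("user-1", "window-A")); // 3
        System.out.println(table.get("user-1", "window-B")); // 7
        System.out.println(table.get("user-2", "window-B")); // null
    }
}
```

For plain keyed state obtained through getState, the namespace is always VoidNamespace.INSTANCE (as seen in DefaultKeyedStateStore above), so only the key dimension varies.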
When StreamInputProcessor.processInput handles a record, it calls
streamOperator.setKeyContextElement1(record);
which sets the current key on the KeyedStateBackend.
This is why every operation on state is isolated per key.
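The "set the current key first, then run the user function" sequence can be sketched as follows. This is a simplified model under stated assumptions, not Flink code: KeyContextSketch, setCurrentKey and countPerKey are hypothetical names, the key is a plain String, and a HashMap replaces the state table.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical miniature of "set the current key, then run user code"; not Flink code. */
final class KeyContextSketch {
    private String currentKey; // plays the role of keyContext.getCurrentKey()
    private final Map<String, Long> counts = new HashMap<>();

    /** Stands in for setKeyContextElement1(record): called once per incoming record. */
    void setCurrentKey(String key) {
        this.currentKey = key;
    }

    // The state API seen by user code takes no key parameter at all.
    long value() {
        return counts.getOrDefault(currentKey, 0L);
    }

    void update(long value) {
        counts.put(currentKey, value);
    }

    /** Processes records in the order described above: key first, user function second. */
    static Map<String, Long> countPerKey(String... recordKeys) {
        KeyContextSketch backend = new KeyContextSketch();
        for (String key : recordKeys) {
            backend.setCurrentKey(key);          // done by the runtime
            backend.update(backend.value() + 1); // done by the user function
        }
        return backend.counts;
    }

    public static void main(String[] args) {
        Map<String, Long> result = countPerKey("a", "b", "a", "a");
        System.out.println(result.get("a")); // 3
        System.out.println(result.get("b")); // 1
    }
}
```

The user-facing value() and update() never mention a key; the isolation comes entirely from the runtime switching currentKey before each record is processed.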