When are Kafka offsets committed in Flink?
https://ci.apache.org/projects/flink/flink-docs-release-1.6/internals/stream_checkpointing.html
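Short answer: with checkpointing enabled (the usual production setup), Flink's Kafka consumer commits offsets back to Kafka only after a checkpoint completes. The callback below, from Flink's FlinkKafkaConsumerBase, is where that happens: once the JobManager confirms the checkpoint with the given checkpointId, the source looks up the offsets it snapshotted for that checkpoint in pendingOffsetsToCommit and hands them to the fetcher to commit.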
@Override
public final void notifyCheckpointComplete(long checkpointId) throws Exception {
    if (!running) {
        LOG.debug("notifyCheckpointComplete() called on closed source");
        return;
    }

    final AbstractFetcher<?, ?> fetcher = this.kafkaFetcher;
    if (fetcher == null) {
        LOG.debug("notifyCheckpointComplete() called on uninitialized source");
        return;
    }

    if (offsetCommitMode == OffsetCommitMode.ON_CHECKPOINTS) {
        // only one commit operation must be in progress
        if (LOG.isDebugEnabled()) {
            LOG.debug("Committing offsets to Kafka/ZooKeeper for checkpoint " + checkpointId);
        }

        try {
            final int posInMap = pendingOffsetsToCommit.indexOf(checkpointId);
            if (posInMap == -1) {
                LOG.warn("Received confirmation for unknown checkpoint id {}", checkpointId);
                return;
            }

            @SuppressWarnings("unchecked")
            Map<KafkaTopicPartition, Long> offsets =
                (Map<KafkaTopicPartition, Long>) pendingOffsetsToCommit.remove(posInMap);

            // remove older checkpoints in map
            for (int i = 0; i < posInMap; i++) {
                pendingOffsetsToCommit.remove(0);
            }

            if (offsets == null || offsets.size() == 0) {
                LOG.debug("Checkpoint state was empty.");
                return;
            }

            fetcher.commitInternalOffsetsToKafka(offsets, offsetCommitCallback);
        } catch (Exception e) {
            if (running) {
                throw e;
            }
            // else ignore exception if we are no longer running
        }
    }
}
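For context, here is a minimal setup sketch of my own (not from the Flink docs) that puts the consumer into this mode. The topic name, bootstrap address, and group id are placeholders, and the connector class (FlinkKafkaConsumer011 here) varies by Flink/Kafka version:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class CommitOnCheckpointExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpointing must be enabled; otherwise the consumer falls back to
        // KAFKA_PERIODIC (Kafka auto commit) or DISABLED.
        env.enableCheckpointing(5000);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "my-group");                // placeholder

        FlinkKafkaConsumer011<String> consumer =
            new FlinkKafkaConsumer011<>("my-topic", new SimpleStringSchema(), props);
        // true is the default; with checkpointing on, this selects ON_CHECKPOINTS,
        // so offsets reach Kafka only via notifyCheckpointComplete() above.
        consumer.setCommitOffsetsOnCheckpoints(true);

        env.addSource(consumer).print();
        env.execute("commit-on-checkpoint");
    }
}

Note that Flink does not use these committed offsets for its own recovery; it restores from its checkpointed state. The commit back to Kafka mainly serves monitoring and consumer-group tooling.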
/**
 * The offset commit mode represents the behaviour of how offsets are externally committed
 * back to Kafka brokers / ZooKeeper.
 *
 * <p>The exact value of this is determined at runtime in the consumer subtasks.
 */
@Internal
public enum OffsetCommitMode {

    /** Completely disable offset committing. */
    DISABLED,

    /** Commit offsets back to Kafka only when checkpoints are completed. */
    ON_CHECKPOINTS,

    /** Commit offsets periodically back to Kafka, using the auto commit functionality of internal Kafka clients. */
    KAFKA_PERIODIC;
}
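The mode is picked per subtask at runtime from three flags: whether checkpointing is enabled, whether the user left setCommitOffsetsOnCheckpoints(true), and whether the Kafka properties enable auto commit. The sketch below paraphrases the decision logic of Flink's OffsetCommitModes.fromConfiguration utility (reconstructed from memory of the 1.x source; verify against your Flink version):

// Sketch of the runtime resolution of OffsetCommitMode.
public static OffsetCommitMode fromConfiguration(
        boolean enableAutoCommit,
        boolean enableCommitOnCheckpoint,
        boolean enableCheckpointing) {

    if (enableCheckpointing) {
        // checkpointing enabled: commit on checkpoints only if the user asked for it
        return enableCommitOnCheckpoint ? OffsetCommitMode.ON_CHECKPOINTS : OffsetCommitMode.DISABLED;
    } else {
        // checkpointing disabled: defer to Kafka's own periodic auto commit, if enabled
        return enableAutoCommit ? OffsetCommitMode.KAFKA_PERIODIC : OffsetCommitMode.DISABLED;
    }
}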
/**
 * Commits the given partition offsets to the Kafka brokers (or to ZooKeeper for
 * older Kafka versions). This method is only ever called when the offset commit mode of
 * the consumer is {@link OffsetCommitMode#ON_CHECKPOINTS}.
 *
 * <p>The given offsets are the internal checkpointed offsets, representing
 * the last processed record of each partition. Version-specific implementations of this method
 * need to hold the contract that the given offsets must be incremented by 1 before
 * committing them, so that committed offsets to Kafka represent "the next record to process".
 *
 * @param offsets The offsets to commit to Kafka (implementations must increment offsets by 1 before committing).
 * @param commitCallback The callback that the user should trigger when a commit request completes or fails.
 * @throws Exception This method forwards exceptions.
 */
public final void commitInternalOffsetsToKafka(
        Map<KafkaTopicPartition, Long> offsets,
        @Nonnull KafkaCommitCallback commitCallback) throws Exception {
    // Ignore sentinels. They might appear here if the snapshot started before
    // the sentinel values were replaced by actual offsets.
    doCommitInternalOffsetsToKafka(filterOutSentinels(offsets), commitCallback);
}
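To make the "+1" contract concrete, here is an illustrative helper (my sketch, not the verbatim Flink fetcher code) showing the conversion a version-specific implementation performs before calling the Kafka client:

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

final class OffsetConversionSketch {
    // Illustration only: convert Flink's checkpointed offsets ("last processed
    // record") into the offsets Kafka expects ("next record to process").
    static Map<TopicPartition, OffsetAndMetadata> toKafkaCommitMap(Map<KafkaTopicPartition, Long> offsets) {
        Map<TopicPartition, OffsetAndMetadata> commitMap = new HashMap<>();
        for (Map.Entry<KafkaTopicPartition, Long> entry : offsets.entrySet()) {
            commitMap.put(
                new TopicPartition(entry.getKey().getTopic(), entry.getKey().getPartition()),
                new OffsetAndMetadata(entry.getValue() + 1)); // +1 per the contract above
        }
        return commitMap;
    }
}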
/**
* Invoking this method makes all buffered records immediately available to send (even if <code>linger.ms</code> is
* greater than 0) and blocks on the completion of the requests associated with these records. The post-condition
* of <code>flush()</code> is that any previously sent record will have completed (e.g. <code>Future.isDone() == true</code>).
* A request is considered completed when it is successfully acknowledged
* according to the <code>acks</code> configuration you have specified or else it results in an error.
* <p>
* Other threads can continue sending records while one thread is blocked waiting for a flush call to complete,
* however no guarantee is made about the completion of records sent after the flush call begins.
* <p>
* This method can be useful when consuming from some input system and producing into Kafka. The <code>flush()</code> call
* gives a convenient way to ensure all previously sent messages have actually completed.
* <p>
* This example shows how to consume from one Kafka topic and produce to another Kafka topic:
* <pre>
* {@code
 * for (ConsumerRecord<String, String> record : consumer.poll(100))
 *     producer.send(new ProducerRecord("my-topic", record.key(), record.value()));
 * producer.flush();
 * consumer.commitSync();
* }
* </pre>
*
* Note that the above example may drop records if the produce request fails. If we want to ensure that this does not occur
* we need to set <code>retries=<large_number></code> in our config.
* </p>
* <p>
* Applications don't need to call this method for transactional producers, since the {@link #commitTransaction()} will
 * flush all buffered records before performing the commit. This ensures that all the {@link #send(ProducerRecord)}
* calls made since the previous {@link #beginTransaction()} are completed before the commit.
* </p>
*
* @throws InterruptException If the thread is interrupted while blocked
*/
@Override
public void flush() {
log.trace("Flushing accumulated records in producer.");
this.accumulator.beginFlush();
this.sender.wakeup();
try {
this.accumulator.awaitFlushCompletion();
} catch (InterruptedException e) {
throw new InterruptException("Flush interrupted.", e);
}
}
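On the producing side, this blocking flush is what Flink's Kafka sink leans on for at-least-once delivery: with flush-on-checkpoint enabled, the sink flushes inside the state snapshot, so a checkpoint can only complete once every record sent before it has been acknowledged according to acks. A simplified sketch of that shape (assumed structure, not the verbatim FlinkKafkaProducerBase source):

import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.kafka.clients.producer.KafkaProducer;

// Simplified sketch: flush pending records while taking the checkpoint snapshot,
// so a successful checkpoint implies the records are durable in Kafka.
abstract class FlushingSinkSketch<K, V> {
    protected KafkaProducer<K, V> producer;   // assumed field
    protected boolean flushOnCheckpoint = true;

    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        if (flushOnCheckpoint) {
            producer.flush(); // blocks until in-flight requests complete (or error out)
            // In the real connector, a checkErroneous()-style helper then rethrows
            // any asynchronous send failure so the checkpoint fails rather than
            // silently losing data.
        }
    }
}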