Flink 中的kafka何时commit?
https://ci.apache.org/projects/flink/flink-docs-release-1.6/internals/stream_checkpointing.html
@Override
publicfinalvoidnotifyCheckpointComplete(longcheckpointId)throwsException{
if(!running){
LOG.debug("notifyCheckpointComplete()calledonclosedsource");
return;
}
finalAbstractFetcher<?,?>fetcher=this.kafkaFetcher;
if(fetcher==null){
LOG.debug("notifyCheckpointComplete()calledonuninitializedsource");
return;
}
if(offsetCommitMode==OffsetCommitMode.ON_CHECKPOINTS){
//onlyonecommitoperationmustbeinprogress
if(LOG.isDebugEnabled()){
LOG.debug("CommittingoffsetstoKafka/ZooKeeperforcheckpoint"+checkpointId);
}
try{
finalintposInMap=pendingOffsetsToCommit.indexOf(checkpointId);
if(posInMap==-1){
LOG.warn("Receivedconfirmationforunknowncheckpointid{}",checkpointId);
return;
}
@SuppressWarnings("unchecked")
Map<KafkaTopicPartition,Long>offsets=
(Map<KafkaTopicPartition,Long>)pendingOffsetsToCommit.remove(posInMap);
//removeoldercheckpointsinmap
for(inti=0;i<posInMap;i++){
pendingOffsetsToCommit.remove(0);
}
if(offsets==null||offsets.size()==0){
LOG.debug("Checkpointstatewasempty.");
return;
}
fetcher.commitInternalOffsetsToKafka(offsets,offsetCommitCallback);
}catch(Exceptione){
if(running){
throwe;
}
//elseignoreexceptionifwearenolongerrunning
}
}
}
/**
*Theoffsetcommitmoderepresentsthebehaviourofhowoffsetsareexternallycommitted
*backtoKafkabrokers/Zookeeper.
*
*<p>Theexactvalueofthisisdeterminedatruntimeintheconsumersubtasks.
*/
@Internal
publicenumOffsetCommitMode{
/**Completelydisableoffsetcommitting.*/
DISABLED,
/**CommitoffsetsbacktoKafkaonlywhencheckpointsarecompleted.*/
ON_CHECKPOINTS,
/**CommitoffsetsperiodicallybacktoKafka,usingtheautocommitfunctionalityofinternalKafkaclients.*/
KAFKA_PERIODIC;
}
/**
*CommitsthegivenpartitionoffsetstotheKafkabrokers(ortoZooKeeperfor
*olderKafkaversions).Thismethodisonlyevercalledwhentheoffsetcommitmodeof
*theconsumeris{@linkOffsetCommitMode#ON_CHECKPOINTS}.
*
*<p>Thegivenoffsetsaretheinternalcheckpointedoffsets,representing
*thelastprocessedrecordofeachpartition.Version-specificimplementationsofthismethod
*needtoholdthecontractthatthegivenoffsetsmustbeincrementedby1before
*committingthem,sothatcommittedoffsetstoKafkarepresent"thenextrecordtoprocess".
*
*@paramoffsetsTheoffsetstocommittoKafka(implementationsmustincrementoffsetsby1beforecommitting).
*@paramcommitCallbackThecallbackthattheusershouldtriggerwhenacommitrequestcompletesorfails.
*@throwsExceptionThismethodforwardsexceptions.
*/
publicfinalvoidcommitInternalOffsetsToKafka(
Map<KafkaTopicPartition,Long>offsets,
@NonnullKafkaCommitCallbackcommitCallback)throwsException{
//Ignoresentinels.Theymightappearhereifsnapshothasstartedbeforeactualoffsetsvalues
//replacedsentinels
doCommitInternalOffsetsToKafka(filterOutSentinels(offsets),commitCallback);
}
/**
* Invoking this method makes all buffered records immediately available to send (even if <code>linger.ms</code> is
* greater than 0) and blocks on the completion of the requests associated with these records. The post-condition
* of <code>flush()</code> is that any previously sent record will have completed (e.g. <code>Future.isDone() == true</code>).
* A request is considered completed when it is successfully acknowledged
* according to the <code>acks</code> configuration you have specified or else it results in an error.
* <p>
* Other threads can continue sending records while one thread is blocked waiting for a flush call to complete,
* however no guarantee is made about the completion of records sent after the flush call begins.
* <p>
* This method can be useful when consuming from some input system and producing into Kafka. The <code>flush()</code> call
* gives a convenient way to ensure all previously sent messages have actually completed.
* <p>
* This example shows how to consume from one Kafka topic and produce to another Kafka topic:
* <pre>
* {@code
* for(ConsumerRecord<String, String> record: consumer.poll(100))
* producer.send(new ProducerRecord("my-topic", record.key(), record.value());
* producer.flush();
* consumer.commit();
* }
* </pre>
*
* Note that the above example may drop records if the produce request fails. If we want to ensure that this does not occur
* we need to set <code>retries=<large_number></code> in our config.
* </p>
* <p>
* Applications don't need to call this method for transactional producers, since the {@link #commitTransaction()} will
* flush all buffered records before performing the commit. This ensures that all the the {@link #send(ProducerRecord)}
* calls made since the previous {@link #beginTransaction()} are completed before the commit.
* </p>
*
* @throws InterruptException If the thread is interrupted while blocked
*/
@Override
public void flush() {
log.trace("Flushing accumulated records in producer.");
this.accumulator.beginFlush();
this.sender.wakeup();
try {
this.accumulator.awaitFlushCompletion();
} catch (InterruptedException e) {
throw new InterruptException("Flush interrupted.", e);
}
}
Flink 中的kafka何时commit?的更多相关文章
- 在flink中使用jackson JSONKeyValueDeserializationSchema反序列化Kafka消息报错解决
在做支付订单宽表的场景,需要关联的表比较多而且支付有可能要延迟很久,这种情况下不太适合使用Flink的表Join,想到的另外一种解决方案是消费多个Topic的数据,再根据订单号进行keyBy,再在逻辑 ...
- flink⼿手动维护kafka偏移量量
flink对接kafka,官方模式方式是自动维护偏移量 但并没有考虑到flink消费kafka过程中,如果出现进程中断后的事情! 如果此时,进程中段: 1:数据可能丢失 从获取了了数据,但是在执⾏行行 ...
- Flink中的Time
戳更多文章: 1-Flink入门 2-本地环境搭建&构建第一个Flink应用 3-DataSet API 4-DataSteam API 5-集群部署 6-分布式缓存 7-重启策略 8-Fli ...
- spark streaming中维护kafka偏移量到外部介质
spark streaming中维护kafka偏移量到外部介质 以kafka偏移量维护到redis为例. redis存储格式 使用的数据结构为string,其中key为topic:partition, ...
- Apache Flink中的广播状态实用指南
感谢英文原文作者:https://data-artisans.com/blog/a-practical-guide-to-broadcast-state-in-apache-flink 不过,原文最近 ...
- Flink学习(二)Flink中的时间
摘自Apache Flink官网 最早的streaming 架构是storm的lambda架构 分为三个layer batch layer serving layer speed layer 一.在s ...
- 《从0到1学习Flink》—— Flink 中几种 Time 详解
前言 Flink 在流程序中支持不同的 Time 概念,就比如有 Processing Time.Event Time 和 Ingestion Time. 下面我们一起来看看这几个 Time: Pro ...
- 《从0到1学习Flink》—— 介绍Flink中的Stream Windows
前言 目前有许多数据分析的场景从批处理到流处理的演变, 虽然可以将批处理作为流处理的特殊情况来处理,但是分析无穷集的流数据通常需要思维方式的转变并且具有其自己的术语(例如,"windowin ...
- Flink 从0到1学习 —— Flink 中如何管理配置?
前言 如果你了解 Apache Flink 的话,那么你应该熟悉该如何像 Flink 发送数据或者如何从 Flink 获取数据.但是在某些情况下,我们需要将配置数据发送到 Flink 集群并从中接收一 ...
随机推荐
- java第三节 面向对象(上)
//第三讲 //面向对象(上) /* 理解面向对象的概念 面向过程 在一个结构体中定义窗体的大小,位置,颜色,背景等属性,对窗口操作的函数窗口本身的定义没有任何关系 如HideWindow, Move ...
- centos6.5关闭ipv6
万境归空,道法自然 1.在/etc/modprobe.d/目录下增加一个新的配置文件ipv6.conf cat << EOF > /etc/modprobe.d/ipv6.confa ...
- 【highstock】按时间(zoom)让它去访问服务器呢?
$(function () { /** * Load new data depending on the selected min and max */ function afterSetExtrem ...
- ajax done和always区别
jQuery中Ajax有done和always这两个回调方法:done:成功时执行,异常时不会执行.always:不论成功与否都会执行.
- ubuntu下安装迅雷thunder
迅雷是windows xp下必装的下载工具,作为一款跨协议的下载软件,迅雷的下载速度极其强悍. 那么在ubuntu下能否安装迅雷呢? 到ubuntu中文论坛逛了一圈,发现有现成的wine-thunde ...
- 使用T-SQL语句操作视图
转自:使用T-SQL语句操作视图 提示:只能查看,删除,创建视图,不能对数据进行增,删,改操作. use StuManageDB go --判断视图是否存在 if exists(Select * fr ...
- 给我一对公钥和私钥,我就能破解此RSA
RSA密码系统如果暴露了一套公钥和私钥,那么这套密码系统就全部失效了.因为根据公钥和私钥可以完成大整数的分解.暴露了两个质数. 记公钥为e,私钥为d,因为ed%phi=1,所以就得到了一个k=ed-1 ...
- JavaScript 浏览器对象模型 (BOM)
浏览器对象模型 (BOM) 使 JavaScript 有能力与浏览器“对话”. 浏览器对象模型 (BOM) 浏览器对象模型(Browser Object Model)尚无正式标准. 由于现代浏览器已经 ...
- window 10 企业版激活
一. 用管理员权限打开CMD.EXE 接着输入以下命令: slmgr /ipk NPPR9-FWDCX-D2C8J-H872K-2YT43 弹出窗口提示:“成功的安装了产品密钥”. 继续输入以下命令: ...
- OpenCV 学习笔记03 threshold函数
opencv-python 4.0.1 简介:该函数是对数组中的每一个元素(each array element)应用固定级别阈值(Applies a fixed-level threshold) ...