Kafka producer: sending messages

KafkaProducer creates a KafkaThread to run the Sender.run method.

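1. The entry point for sending a message is KafkaProducer#doSend, but it only appends the message to batches:

The Kafka producer sends messages in batches. The RecordAccumulator class has a field ConcurrentMap<TopicPartition, Deque<ProducerBatch>> batches,
and KafkaProducer#doSend puts the current message into a ProducerBatch. It then calls Sender#wakeup to try to wake up the blocked I/O thread.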
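The append step can be pictured with a much smaller model. The sketch below is not Kafka's code: MiniAccumulator is a hypothetical class, a partition is just a String, a batch is a List<String>, and a batch is limited by record count rather than by bytes. The rule that the sender only needs waking when a batch fills up or a new batch is created is an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified stand-in for RecordAccumulator (illustration only).
class MiniAccumulator {
    private final int batchCapacity; // max records per batch (assumption: count-based, not byte-based)
    private final ConcurrentMap<String, Deque<List<String>>> batches = new ConcurrentHashMap<>();

    MiniAccumulator(int batchCapacity) {
        this.batchCapacity = batchCapacity;
    }

    /**
     * Appends a record to the last batch of the partition's deque.
     * Returns true when the caller should wake the sender thread:
     * either a new batch was created or the current batch is now full.
     */
    boolean append(String partition, String record) {
        Deque<List<String>> deque = batches.computeIfAbsent(partition, p -> new ArrayDeque<>());
        synchronized (deque) { // the deque itself is the lock, as in the drain code
            List<String> last = deque.peekLast();
            boolean newBatch = false;
            if (last == null || last.size() >= batchCapacity) {
                last = new ArrayList<>();
                deque.addLast(last);
                newBatch = true;
            }
            last.add(record);
            return newBatch || last.size() >= batchCapacity;
        }
    }

    /** Number of batches currently queued for the partition. */
    int batchCount(String partition) {
        Deque<List<String>> deque = batches.get(partition);
        if (deque == null) {
            return 0;
        }
        synchronized (deque) {
            return deque.size();
        }
    }
}
```

With a capacity of 3, the first append creates a batch and asks for a wake-up, the second does not, and the third fills the batch and asks again.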

2. Taking data out of batches and sending it. The entry point is Sender.run, and the main logic can be summarised in 3 steps:

2.1 RecordAccumulator#drain takes the data out

// Take at most one ProducerBatch from each partition
public Map<Integer, List<ProducerBatch>> drain(Cluster cluster,
                                               Set<Node> nodes,
                                               int maxSize,
                                               long now) {
    if (nodes.isEmpty())
        return Collections.emptyMap();
    Map<Integer, List<ProducerBatch>> batches = new HashMap<>();
    for (Node node : nodes) {
        int size = 0;
        // Get the partitions led by this node
        List<PartitionInfo> parts = cluster.partitionsForNode(node.id());
        List<ProducerBatch> ready = new ArrayList<>();
        /* to make starvation less likely this loop doesn't start at 0 */
        int start = drainIndex = drainIndex % parts.size();
        // Iterate over each partition
        do {
            PartitionInfo part = parts.get(drainIndex);
            TopicPartition tp = new TopicPartition(part.topic(), part.partition());
            // Only proceed if the partition has no in-flight batches.
            if (!muted.contains(tp)) {
                Deque<ProducerBatch> deque = getDeque(tp);
                if (deque != null) {
                    synchronized (deque) {
                        ProducerBatch first = deque.peekFirst();
                        if (first != null) {
                            boolean backoff = first.attempts() > 0 && first.waitedTimeMs(now) < retryBackoffMs;
                            // Only drain the batch if it is not during backoff period.
                            if (!backoff) {
                                if (size + first.estimatedSizeInBytes() > maxSize && !ready.isEmpty()) {
                                    // there is a rare case that a single batch size is larger than the request size due
                                    // to compression; in this case we will still eventually send this batch in a single
                                    // request
                                    break;
                                } else {
                                    ProducerIdAndEpoch producerIdAndEpoch = null;
                                    boolean isTransactional = false;
                                    if (transactionManager != null) {
                                        if (!transactionManager.isSendToPartitionAllowed(tp))
                                            break;
                                        producerIdAndEpoch = transactionManager.producerIdAndEpoch();
                                        if (!producerIdAndEpoch.isValid())
                                            // we cannot send the batch until we have refreshed the producer id
                                            break;
                                        isTransactional = transactionManager.isTransactional();
                                        if (!first.hasSequence() && transactionManager.hasUnresolvedSequence(first.topicPartition))
                                            // Don't drain any new batches while the state of previous sequence numbers
                                            // is unknown. The previous batches would be unknown if they were aborted
                                            // on the client after being sent to the broker at least once.
                                            break;
                                        int firstInFlightSequence = transactionManager.firstInFlightSequence(first.topicPartition);
                                        if (firstInFlightSequence != RecordBatch.NO_SEQUENCE && first.hasSequence()
                                                && first.baseSequence() != firstInFlightSequence)
                                            // If the queued batch already has an assigned sequence, then it is being
                                            // retried. In this case, we wait until the next immediate batch is ready
                                            // and drain that. We only move on when the next in line batch is complete (either successfully
                                            // or due to a fatal broker error). This effectively reduces our
                                            // in flight request count to 1.
                                            break;
                                    }
                                    ProducerBatch batch = deque.pollFirst();
                                    if (producerIdAndEpoch != null && !batch.hasSequence()) {
                                        // If the batch already has an assigned sequence, then we should not change the producer id and
                                        // sequence number, since this may introduce duplicates. In particular,
                                        // the previous attempt may actually have been accepted, and if we change
                                        // the producer id and sequence here, this attempt will also be accepted,
                                        // causing a duplicate.
                                        //
                                        // Additionally, we update the next sequence number bound for the partition,
                                        // and also have the transaction manager track the batch so as to ensure
                                        // that sequence ordering is maintained even if we receive out of order
                                        // responses.
                                        batch.setProducerState(producerIdAndEpoch, transactionManager.sequenceNumber(batch.topicPartition), isTransactional);
                                        transactionManager.incrementSequenceNumber(batch.topicPartition, batch.recordCount);
                                        log.debug("Assigned producerId {} and producerEpoch {} to batch with base sequence " +
                                                        "{} being sent to partition {}", producerIdAndEpoch.producerId,
                                                producerIdAndEpoch.epoch, batch.baseSequence(), tp);
                                        transactionManager.addInFlightBatch(batch);
                                    }
                                    batch.close();
                                    size += batch.records().sizeInBytes();
                                    ready.add(batch);
                                    batch.drained(now);
                                }
                            }
                        }
                    }
                }
            }
            this.drainIndex = (this.drainIndex + 1) % parts.size();
        } while (start != drainIndex);
        batches.put(node.id(), ready);
    }
    return batches;
}
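Stripped of backoff, muting and transaction handling, the core of drain is a round-robin walk that takes the first batch of each partition until the request is full. The sketch below isolates only that part. MiniDrain is a hypothetical class, a batch is modelled as a String, and its length stands in for the batch size in bytes.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical reduction of the drain loop (illustration only).
class MiniDrain {
    // Persists across calls so the next drain resumes where the last one stopped.
    private int drainIndex = 0;

    /** Takes at most one batch per partition until adding another would exceed maxSize. */
    List<String> drain(List<Deque<String>> partitions, int maxSize) {
        List<String> ready = new ArrayList<>();
        if (partitions.isEmpty()) {
            return ready;
        }
        int size = 0;
        int start = drainIndex = drainIndex % partitions.size();
        do {
            Deque<String> deque = partitions.get(drainIndex);
            String first = deque.peekFirst();
            if (first != null) {
                // A single oversized batch is still sent when nothing else is ready.
                if (size + first.length() > maxSize && !ready.isEmpty()) {
                    break; // drainIndex is not advanced, so this partition goes first next time
                }
                ready.add(deque.pollFirst());
                size += first.length();
            }
            drainIndex = (drainIndex + 1) % partitions.size();
        } while (start != drainIndex);
        return ready;
    }
}
```

Because drainIndex is kept between calls and is not advanced on break, the partition whose batch did not fit is the first one visited on the next call, which is what keeps the later partitions from starving.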

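2.2 NetworkClient.send

The send here is not the actual network send. The ProduceRequest is first serialized into a Send object and added to the head of inFlightRequests, then the selector's send is called, which in fact is KafkaChannel.setSend():

Send send = request.toSend(nodeId, header);
this.inFlightRequests.add(inFlightRequest);
selector.send(inFlightRequest.send);

One NetworkSend object corresponds to one ProduceRequest and contains one or more ProducerBatch instances, so a single network send carries multiple batches. This is one of the reasons for Kafka's high throughput.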

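2.3 NetworkClient.poll

This is where the actual network send happens.

Selector#pollSelectionKeys handles network read and write events. Sending a message is a write event; responses that are read are stored in Selector#completedReceives.
When the producer sends with acks = -1 or 1, the request expects a response; with acks = 0 it does not.
NetworkClient#handleCompletedSends removes requests that do not expect a response from inFlightRequests.
NetworkClient#handleCompletedReceives processes the responses.
The acks value is fixed once the producer is configured, so either all of a producer's produce requests expect a response or none of them do.
New requests are added at the head, and a received response corresponds to the oldest request, that is, the one at the tail.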
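The head/tail bookkeeping described above can be sketched with a plain Deque. MiniInFlight is a hypothetical class and its method names are chosen for illustration; a request is just a String here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the per-connection in-flight queue (illustration only).
class MiniInFlight {
    private final Deque<String> requests = new ArrayDeque<>();

    /** A newly sent request goes to the head of the queue. */
    void add(String request) {
        requests.addFirst(request);
    }

    /** A response arrived: it belongs to the oldest request, which sits at the tail. */
    String completeNext() {
        return requests.pollLast();
    }

    /** The request just written expects no response (acks = 0): drop it from the head. */
    String completeLastSent() {
        return requests.pollFirst();
    }

    int size() {
        return requests.size();
    }
}
```

After add("r1"), add("r2"), add("r3"), a call to completeNext() returns "r1" because responses come back in the order the requests were sent, while completeLastSent() returns "r3", the most recent one.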

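3. Main classes
KafkaProducer: the API class exposed directly to users; Sender: mainly manages ProducerBatch
NetworkClient: a ProducerBatch is an object and must be serialized before it can be sent over the network; this class manages connections and sits closer to the I/O layer
Selector: a wrapper around the Java NIO Selector
KafkaChannel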

4. ByteBuffer

// Using ByteBuffer
// A newly allocated ByteBuffer starts in write mode
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferDemo {
    public static void main(String[] args) {
        // capacity = 512, limit = 512, position = 0
        ByteBuffer buffer = ByteBuffer.allocate(512);
        buffer.put((byte) 'h');
        buffer.put((byte) 'e');
        buffer.put((byte) 'l');
        buffer.put((byte) 'l');
        buffer.put((byte) 'o');
        // flip() switches to read mode: limit = position, position = 0
        buffer.flip();
        // Number of bytes available to read
        int len = buffer.remaining();
        byte[] dst = new byte[len];
        buffer.get(dst);
        System.out.println(new String(dst, StandardCharsets.UTF_8));
        // Conclusion: a heap ByteBuffer is just a wrapper around a byte[]
    }
}

// SocketChannel
// Writing output:
// SocketChannel#write(java.nio.ByteBuffer)
// Reading input:
// SocketChannel#read(java.nio.ByteBuffer)
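A minimal sketch of the write and read calls follows. To stay runnable without a server it uses a java.nio.channels.Pipe instead of a real SocketChannel, which is a substitution made for this example; the helper methods are written against WritableByteChannel and ReadableByteChannel, which SocketChannel also implements. ChannelRoundTrip and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class (illustration only).
class ChannelRoundTrip {
    /** write() may accept only part of the buffer, so loop until it is empty. */
    static void writeFully(WritableByteChannel ch, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            ch.write(buf);
        }
    }

    /** Reads up to `expected` bytes, then flips the buffer to decode what was read. */
    static String readString(ReadableByteChannel ch, int expected) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(expected);
        while (buf.hasRemaining()) {
            if (ch.read(buf) < 0) {
                break; // end of stream
            }
        }
        buf.flip(); // switch from write mode to read mode
        return StandardCharsets.UTF_8.decode(buf).toString();
    }

    /** Sends msg through an in-process pipe and returns what comes out the other end. */
    static String roundTrip(String msg) {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                byte[] bytes = msg.getBytes(StandardCharsets.UTF_8);
                writeFully(sink, ByteBuffer.wrap(bytes));
                return readString(source, bytes.length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The write loop matters more on a non-blocking SocketChannel, where a single write call can return after sending only part of the buffer.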
