一、doSend()方法

Kafka中的每一条消息都对应一个ProducerRecord对象。


public class ProducerRecord<K, V> { private final String topic;
private final Integer partition;
private final Headers headers;
private final K key;
private final V value;
private final Long timestamp; }

doSend() 源码如下:


private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
TopicPartition tp = null;
try {
throwIfProducerClosed();
// first make sure the metadata for the topic is available
ClusterAndWaitTime clusterAndWaitTime; // 1、获取集群的元数据
try {
clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
} catch (KafkaException e) {
if (metadata.isClosed())
throw new KafkaException("Producer closed while send in progress", e);
throw e;
}
long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
Cluster cluster = clusterAndWaitTime.cluster; //2、 Key、Value 序列化
byte[] serializedKey;
try {
serializedKey = keySerializer.serialize(record.topic(), record.headers(), record.key());
} catch (ClassCastException cce) {
throw new SerializationException("Can't convert key of class " + record.key().getClass().getName() +
" to class " + producerConfig.getClass(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG).getName() +
" specified in key.serializer", cce);
}
byte[] serializedValue;
try {
serializedValue = valueSerializer.serialize(record.topic(), record.headers(), record.value());
} catch (ClassCastException cce) {
throw new SerializationException("Can't convert value of class " + record.value().getClass().getName() +
" to class " + producerConfig.getClass(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG).getName() +
" specified in value.serializer", cce);
} // 3、分区器
int partition = partition(record, serializedKey, serializedValue, cluster);
// 对topic和分区进行封装,封装后的结果 topic-partition ,[return topic + "-" + partition;]
tp = new TopicPartition(record.topic(), partition); setReadOnly(record.headers());
Header[] headers = record.headers().toArray(); int serializedSize = AbstractRecords.estimateSizeInBytesUpperBound(apiVersions.maxUsableProduceMagic(),
compressionType, serializedKey, serializedValue, headers);
ensureValidRecordSize(serializedSize);
long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp();
log.trace("Sending record {} with callback {} to topic {} partition {}", record, callback, record.topic(), partition);
// producer callback will make sure to call both 'callback' and interceptor callback
Callback interceptCallback = new InterceptorCallback<>(callback, this.interceptors, tp); if (transactionManager != null && transactionManager.isTransactional())
transactionManager.maybeAddPartitionToTransaction(tp); // 4、将数据放到accumulator中
RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey,
serializedValue, headers, interceptCallback, remainingWaitMs); // 5、如果batch已经满了,唤醒 Sender线程发送数据
if (result.batchIsFull || result.newBatchCreated) {
log.trace("Waking up the sender since topic {} partition {} is either full or getting a new batch", record.topic(), partition);
this.sender.wakeup();
} return result.future; } catch (Exception e) {
// ...
}
}

获取集群元数据

doSend()中相关的源码


try {
clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
} catch (KafkaException e) {
if (metadata.isClosed())
throw new KafkaException("Producer closed while send in progress", e);
throw e;
}
long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
Cluster cluster = clusterAndWaitTime.cluster;

Cluster类定义

public final class Cluster {

    private final boolean isBootstrapConfigured;
// broker 节点列表
private final List<Node> nodes;
private final Set<String> unauthorizedTopics;
private final Set<String> invalidTopics;
private final Set<String> internalTopics; // controller所在节点
private final Node controller; // 记录TopicPartition和PartitionInfo的管理
// TopicPartition中定义了partition、topic两个字段
// PartitionInfo记录详细的partition信息,定义了leader节点的位置、以Node数组的形式定义了 副本节点的位置、ISR、offline副本
private final Map<TopicPartition, PartitionInfo> partitionsByTopicPartition; //Topic 与 分区 的映射关系
private final Map<String, List<PartitionInfo>> partitionsByTopic;
private final Map<String, List<PartitionInfo>> availablePartitionsByTopic;
private final Map<Integer, List<PartitionInfo>> partitionsByNode;
private final Map<Integer, Node> nodesById;
private final ClusterResource clusterResource; }

调试获得clusterAndWaitTime对象的显示结果

二、partition()方法

当Record没有被指定分区时,会调用partiton方法,分配到分区。

传入的参数是,record的topic、k-v序列化、以及集群信息。


public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) { // 从集群中获取Topic对应的分区信息
List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
int numPartitions = partitions.size(); //判断key是否为空
if (keyBytes == null) {
int nextValue = nextValue(topic);
List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
if (availablePartitions.size() > 0) {
int part = Utils.toPositive(nextValue) % availablePartitions.size();
return availablePartitions.get(part).partition();
} else {
// no partitions are available, give a non-available partition
return Utils.toPositive(nextValue) % numPartitions;
}
} else {
// 若key不为空,则对key进行hash,然后对分区数取余
// hash the keyBytes to choose a partition
return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
}
}

三、append()方法

在调用KafkaProducer构造方法的时候,会创建Accumulator对象。

RecordAccumulator

RecordAccumulator主要用来缓存消息以便Sender线程可以批量发送数据,减少网络传输的资源消耗,以提升性能。

RecordAccumulator内部每个分区都维护一个Deque,其中ProducerBatch包含一至多个ProducerRecord。

消息写入缓存(RecordAccumulator)时,消息(ProducerRecord)会追加到双端队列(Deque)尾部,追加的过程是,找到分区对应的Deque,然后取出Deque最后一个ProducerBatch,看能否追加,若追加不了,则新建一个ProducerBatch,将消息追加到ProducerBatch中。ProducerBatch的大小,由参数batch.size来确定。

Sender读取消息时,会从双端队列头部读取消息。

RecordAccumulator类定义


public final class RecordAccumulator { private final Logger log;
private volatile boolean closed;
private final AtomicInteger flushesInProgress;
private final AtomicInteger appendsInProgress;
private final int batchSize;
private final CompressionType compression;
private final int lingerMs;
private final long retryBackoffMs;
private final int deliveryTimeoutMs;
private final BufferPool free;
private final Time time;
private final ApiVersions apiVersions; // 在RecordAccumulator的构造方法中实例化 this.batches = new CopyOnWriteMap<>();
private final ConcurrentMap<TopicPartition, Deque<ProducerBatch>> batches; private final IncompleteBatches incomplete; // The following variables are only accessed by the sender thread, so we don't need to protect them.
private final Map<TopicPartition, Long> muted;
private int drainIndex;
private final TransactionManager transactionManager;
private long nextBatchExpiryTimeMs = Long.MAX_VALUE; }

调试结果

append()方法



public RecordAppendResult append(TopicPartition tp,
long timestamp,
byte[] key,
byte[] value,
Header[] headers,
Callback callback,
long maxTimeToBlock) throws InterruptedException {
// We keep track of the number of appending thread to make sure we do not miss batches in
// abortIncompleteBatches().
appendsInProgress.incrementAndGet();
ByteBuffer buffer = null;
if (headers == null) headers = Record.EMPTY_HEADERS;
try { // check if we have an in-progress batch
// 从上文中我们知道batches的结构是ConcurrentMap<TopicPartition, Deque<ProducerBatch>>
// 从batches中检查是否存在这个tp,如果有则返回Deque,若没有则创建
// 这里的Deque是ArrayDeque
Deque<ProducerBatch> dq = getOrCreateDeque(tp); synchronized (dq) {
if (closed)
throw new KafkaException("Producer closed while send in progress");
RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
if (appendResult != null)
return appendResult;
} // 下面一大步是将record中的信息转成buffer对象,进而封装成ProducerBatch对象,再放到dq中。
// we don't have an in-progress record batch try to allocate a new batch
byte maxUsableMagic = apiVersions.maxUsableProduceMagic();
int size = Math.max(this.batchSize, AbstractRecords.estimateSizeInBytesUpperBound(maxUsableMagic, compression, key, value, headers));
log.trace("Allocating a new {} byte message buffer for topic {} partition {}", size, tp.topic(), tp.partition());
buffer = free.allocate(size, maxTimeToBlock);
synchronized (dq) {
// Need to check if producer is closed again after grabbing the dequeue lock.
if (closed)
throw new KafkaException("Producer closed while send in progress"); RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
if (appendResult != null) {
// Somebody else found us a batch, return the one we waited for! Hopefully this doesn't happen often...
return appendResult;
} MemoryRecordsBuilder recordsBuilder = recordsBuilder(buffer, maxUsableMagic);
ProducerBatch batch = new ProducerBatch(tp, recordsBuilder, time.milliseconds());
FutureRecordMetadata future = Utils.notNull(batch.tryAppend(timestamp, key, value, headers, callback, time.milliseconds())); dq.addLast(batch);
incomplete.add(batch); // Don't deallocate this buffer in the finally block as it's being used in the record batch
buffer = null;
return new RecordAppendResult(future, dq.size() > 1 || batch.isFull(), true);
}
} finally {
if (buffer != null)
free.deallocate(buffer);
appendsInProgress.decrementAndGet();
}
}

四、BufferPool

ByteBuffer的创建和释放比较消耗资源,所以Kafka客户端使用BufferPool来实现ByteBuffer的复用。


public class BufferPool { private final long totalMemory;
private final int poolableSize;
private final ReentrantLock lock;
private final Deque<ByteBuffer> free;
// ...
}

五、Sender

run()方法

 /**
* Run a single iteration of sending
*
* @param now The current POSIX time in milliseconds
*/
void run(long now) {
// 省略前面的代码
long pollTimeout = sendProducerData(now);
client.poll(pollTimeout, now);
}

sendProducerData()方法


private long sendProducerData(long now) { // 获取Kafka集群元数据
Cluster cluster = metadata.fetch(); // get the list of partitions with data ready to send
// 获取准备发送数据的partition列表
// ReadyCheckResult的成员变量有Set<Node> readyNodes(准备发送的节点),Set<String> unknownLeaderTopics
RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now); // if there are any partitions whose leaders are not known yet, force metadata update
// 如果有不知道leader节点在哪的partition,强制更新元数据
// 向元数据中增加topic,并且进行更新
if (!result.unknownLeaderTopics.isEmpty()) {
// The set of topics with unknown leader contains topics with leader election pending as well as
// topics which may have expired. Add the topic again to metadata to ensure it is included
// and request metadata update, since there are messages to send to the topic.
for (String topic : result.unknownLeaderTopics)
this.metadata.add(topic);
this.metadata.requestUpdate();
} // remove any nodes we aren't ready to send to
// 移除所有不准备发送的节点
Iterator<Node> iter = result.readyNodes.iterator();
long notReadyTimeout = Long.MAX_VALUE;
while (iter.hasNext()) {
Node node = iter.next();
if (!this.client.ready(node, now)) {
iter.remove();
notReadyTimeout = Math.min(notReadyTimeout, this.client.connectionDelay(node, now));
}
} // create produce requests
// 创建一个produce 请求
Map<Integer, List<ProducerBatch>> batches = this.accumulator.drain(cluster, result.readyNodes,
this.maxRequestSize, now);
if (guaranteeMessageOrder) {
// Mute all the partitions drained
for (List<ProducerBatch> batchList : batches.values()) {
for (ProducerBatch batch : batchList)
this.accumulator.mutePartition(batch.topicPartition);
}
} List<ProducerBatch> expiredBatches = this.accumulator.expiredBatches(this.requestTimeout, now);
// Reset the producer id if an expired batch has previously been sent to the broker. Also update the metrics
// for expired batches. see the documentation of @TransactionState.resetProducerId to understand why
// we need to reset the producer id here.
if (!expiredBatches.isEmpty())
log.trace("Expired {} batches in accumulator", expiredBatches.size());
for (ProducerBatch expiredBatch : expiredBatches) {
failBatch(expiredBatch, -1, NO_TIMESTAMP, expiredBatch.timeoutException(), false);
if (transactionManager != null && expiredBatch.inRetry()) {
// This ensures that no new batches are drained until the current in flight batches are fully resolved.
transactionManager.markSequenceUnresolved(expiredBatch.topicPartition);
}
} sensors.updateProduceRequestMetrics(batches); // If we have any nodes that are ready to send + have sendable data, poll with 0 timeout so this can immediately
// loop and try sending more data. Otherwise, the timeout is determined by nodes that have partitions with data
// that isn't yet sendable (e.g. lingering, backing off). Note that this specifically does not include nodes
// with sendable data that aren't ready to send since they would cause busy looping.
long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
if (!result.readyNodes.isEmpty()) {
log.trace("Nodes with data ready to send: {}", result.readyNodes);
// if some partitions are already ready to be sent, the select time would be 0;
// otherwise if some partition already has some data accumulated but not ready yet,
// the select time will be the time difference between now and its linger expiry time;
// otherwise the select time will be the time difference between now and the metadata expiry time;
pollTimeout = 0;
}
sendProduceRequests(batches, now); return pollTimeout;
}

参考文档

kafka源码编译

Kafka 源码解析之 Producer 发送模型(一)

Apache Kafka源码剖析

详解Kafka Producer

Kafka(四) —— KafkaProducer源码阅读的更多相关文章

  1. 36 网络相关函数(四)——live555源码阅读(四)网络

    36 网络相关函数(四)——live555源码阅读(四)网络 36 网络相关函数(四)——live555源码阅读(四)网络 简介 7)createSocket创建socket方法 8)closeSoc ...

  2. 【原】AFNetworking源码阅读(四)

    [原]AFNetworking源码阅读(四) 本文转载请注明出处 —— polobymulberry-博客园 1. 前言 上一篇还遗留了很多问题,包括AFURLSessionManagerTaskDe ...

  3. 【原】SDWebImage源码阅读(四)

    [原]SDWebImage源码阅读(四) 本文转载请注明出处 —— polobymulberry-博客园 1. 前言 SDWebImage中主要实现了NSURLConnectionDataDelega ...

  4. 40 网络相关函数(八)——live555源码阅读(四)网络

    40 网络相关函数(八)——live555源码阅读(四)网络 40 网络相关函数(八)——live555源码阅读(四)网络 简介 15)writeSocket向套接口写数据 TTL的概念 函数send ...

  5. 39 网络相关函数(七)——live555源码阅读(四)网络

    39 网络相关函数(七)——live555源码阅读(四)网络 39 网络相关函数(七)——live555源码阅读(四)网络 简介 14)readSocket从套接口读取数据 recv/recvfrom ...

  6. 38 网络相关函数(六)——live555源码阅读(四)网络

    38 网络相关函数(六)——live555源码阅读(四)网络 38 网络相关函数(六)——live555源码阅读(四)网络 简介 12)makeSocketNonBlocking和makeSocket ...

  7. 37 网络相关函数(五)——live555源码阅读(四)网络

    37 网络相关函数(五)——live555源码阅读(四)网络 37 网络相关函数(五)——live555源码阅读(四)网络 简介 10)MAKE_SOCKADDR_IN构建sockaddr_in结构体 ...

  8. 35 网络相关函数(三)——live555源码阅读(四)网络

    35 网络相关函数(三)——live555源码阅读(四)网络 35 网络相关函数(三)——live555源码阅读(四)网络 简介 5)NoReuse不重用地址类 6)initializeWinsock ...

  9. 34 网络相关函数(二)——live555源码阅读(四)网络

    34 网络相关函数(二)——live555源码阅读(四)网络 34 网络相关函数(二)——live555源码阅读(四)网络 2)socketErr 套接口错误 3)groupsockPriv函数 4) ...

随机推荐

  1. crunch制作字典

    安装 安装crunch sudo apt-get install crunch 语法 crunch <min> max<max> <characterset> -t ...

  2. head 与 tail

    head head [-n] 数字『文件』 显示前面n行 例如 head -n 3 test 显示 test 文件的前 3 行,也可以写作 head -3 test 比较有趣的是 -n 后面的数字,可 ...

  3. Android存储及getCacheDir()、getFilesDir()、getExternalFilesDir()、getExternalCacheDir()区别

    存储介绍 Android系统分为内部存储和外部存储,内部存储是手机系统自带的存储,一般空间都比较小,外部存储一般是SD卡的存储,空间一般都比较大,但不一定可用或者剩余空间可能不足.一般我们存储内容都会 ...

  4. 升级tinyhttpd-0.1.0,让其支持网页显示图像

    tinyhttpd是学习http协议非常好的工具,但是由于其过于简单,不支持在网页上显示图片,所以我改了一些代码,让tinyhttpd可以现实图像,供新手一起学习和熟悉http协议,ubuntu14. ...

  5. Python学习笔记-字符串与注释

    OSX从Pthon 2.7升级到3.8 1,官网下载最新安装包安装 2,执行认证文件command 3,执行设置path command 4,设置默认python 在terminal里运行open   ...

  6. 得到List<HashTable>里面的list然后取list的某一项

    //得到List<HashTable>里面的listUnFix然后取listUnFix判断tempfix里面得值 List<Hashtable> list = new List ...

  7. python系列:二、Urllib库的高级用法

    1.设置Headers 有些网站不会同意程序直接用上面的方式进行访问,如果识别有问题,那么站点根本不会响应,所以为了完全模拟浏览器的工作,我们需要设置一些Headers 的属性. 打开我们的浏览器,调 ...

  8. angular 升级到angular8 以及报错信息解决

    1.升级全局angular-cli npm install -g @angular/cli@latest 2.升级项目内 angular-cli (在需要升级的项目中运行) npm i @angula ...

  9. vue 修改对象或数组的属性

  10. Mybatis框架-@Param注解

    回顾一下上一个小demo中存在的问题,是是根据用户的id修改用户的密码,我们只是修改了用户的密码,结果我们的在写接口方法的时候掺入的参数确实一个User对象,这样让别人看到我们的代码真的是很难读懂啊! ...