Kafka consumer partition rebalance algorithm

When reposting, please cite the original address: http://www.cnblogs.com/dongxiao-yang/p/6238029.html

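I recently needed to study in detail how partitions are computed during a Kafka rebalance. The explanations I found online were fairly obscure and hard to follow, so digging through the source code myself turned out to be simpler.

The code for the Kafka rebalance calculation is as follows: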

class RangeAssignor() extends PartitionAssignor with Logging {

  def assign(ctx: AssignmentContext) = {
    val valueFactory = (topic: String) => new mutable.HashMap[TopicAndPartition, ConsumerThreadId]
    val partitionAssignment =
      new Pool[String, mutable.Map[TopicAndPartition, ConsumerThreadId]](Some(valueFactory))
    for (topic <- ctx.myTopicThreadIds.keySet) {
      val curConsumers = ctx.consumersForTopic(topic)
      val curPartitions: Seq[Int] = ctx.partitionsForTopic(topic)

      // base share per consumer thread, and how many threads receive one extra partition
      val nPartsPerConsumer = curPartitions.size / curConsumers.size
      val nConsumersWithExtraPart = curPartitions.size % curConsumers.size

      info("Consumer " + ctx.consumerId + " rebalancing the following partitions: " + curPartitions +
        " for topic " + topic + " with consumers: " + curConsumers)

      for (consumerThreadId <- curConsumers) {
        val myConsumerPosition = curConsumers.indexOf(consumerThreadId)
        assert(myConsumerPosition >= 0)
        // first partition index and number of partitions for this consumer thread
        val startPart = nPartsPerConsumer * myConsumerPosition + myConsumerPosition.min(nConsumersWithExtraPart)
        val nParts = nPartsPerConsumer + (if (myConsumerPosition + 1 > nConsumersWithExtraPart) 0 else 1)

        /**
         * Range-partition the sorted partitions to consumers for better locality.
         * The first few consumers pick up an extra partition, if any.
         */
        if (nParts <= 0)
          warn("No broker partitions consumed by consumer thread " + consumerThreadId + " for topic " + topic)
        else {
          for (i <- startPart until startPart + nParts) {
            val partition = curPartitions(i)
            info(consumerThreadId + " attempting to claim partition " + partition)
            // record the partition ownership decision
            val assignmentForConsumer = partitionAssignment.getAndMaybePut(consumerThreadId.consumer)
            assignmentForConsumer += (TopicAndPartition(topic, partition) -> consumerThreadId)
          }
        }
      }
    }
    // (the remainder of assign is not shown in this excerpt)
  }
}

  def getPartitionsForTopics(topics: Seq[String]): mutable.Map[String, Seq[Int]] = {
    getPartitionAssignmentForTopics(topics).map { topicAndPartitionMap =>
      val topic = topicAndPartitionMap._1
      val partitionMap = topicAndPartitionMap._2
      debug("partition assignment of /brokers/topics/%s is %s".format(topic, partitionMap))
      (topic -> partitionMap.keys.toSeq.sortWith((s,t) => s < t))
    }
  }

  def getConsumersPerTopic(group: String, excludeInternalTopics: Boolean) : mutable.Map[String, List[ConsumerThreadId]] = {
    val dirs = new ZKGroupDirs(group)
    val consumers = getChildrenParentMayNotExist(dirs.consumerRegistryDir)
    val consumersPerTopicMap = new mutable.HashMap[String, List[ConsumerThreadId]]
    for (consumer <- consumers) {
      val topicCount = TopicCount.constructTopicCount(group, consumer, this, excludeInternalTopics)
      for ((topic, consumerThreadIdSet) <- topicCount.getConsumerThreadIdsPerTopic) {
        for (consumerThreadId <- consumerThreadIdSet)
          consumersPerTopicMap.get(topic) match {
            case Some(curConsumers) => consumersPerTopicMap.put(topic, consumerThreadId :: curConsumers)
            case _ => consumersPerTopicMap.put(topic, List(consumerThreadId))
          }
      }
    }
    for ( (topic, consumerList) <- consumersPerTopicMap )
      consumersPerTopicMap.put(topic, consumerList.sortWith((s,t) => s < t))
    consumersPerTopicMap
  }

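The calculation is done mainly by the lines of assign above that compute nPartsPerConsumer, nConsumersWithExtraPart, startPart and nParts. As an example, take a topic with ten partitions and, in the same group, three consumers whose consumer ids are aaa, ccc and bbb.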

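1 The last two code snippets show that both the consumer id list and the partition list are already sorted when they are fetched, so

curConsumers=(aaa,bbb,ccc)

curPartitions=(0,1,2,3,4,5,6,7,8,9)

2 Compute the base share per consumer (integer division) and the remainder

nPartsPerConsumer=10/3 =3

nConsumersWithExtraPart=10%3 =1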
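The inputs of the calculation can be reproduced outside Kafka with plain collections. The following is only a minimal sketch: the object name RebalanceInputs and the unsorted input values are made up for illustration, and plain strings stand in for ConsumerThreadId. It applies the same sortWith comparison and the same division and modulo as the source above.

```scala
object RebalanceInputs {
  // Hypothetical, deliberately unsorted inputs matching the example in the text.
  val consumers: List[String] = List("aaa", "ccc", "bbb")
  val partitions: Seq[Int] = Seq(3, 1, 0, 2, 9, 8, 7, 6, 5, 4)

  // Both lists are sorted before any arithmetic is done.
  val curConsumers: List[String] = consumers.sortWith((s, t) => s < t)
  val curPartitions: Seq[Int] = partitions.sortWith((s, t) => s < t)

  // Integer division gives the base share per consumer; the remainder is the
  // number of consumers at the front of the list that get one extra partition.
  val nPartsPerConsumer: Int = curPartitions.size / curConsumers.size
  val nConsumersWithExtraPart: Int = curPartitions.size % curConsumers.size
}
```

Because every consumer sorts the same two lists in the same way, each one can derive the same positions independently without talking to the others.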

3 Assume the current client id is aaa

myConsumerPosition= curConsumers.indexOf(aaa) =0

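4 Compute the partition range

startPart= 3*0+0.min(1) = 0

nParts = 3+(if (0 + 1 > 1) 0 else 1)=3+1=4

So aaa owns the partition range [0,4), that is, the first four partitions 0, 1, 2, 3.

In the same way, bbb has myConsumerPosition=1, so startPart=3*1+1.min(1)=4 and nParts=3+0=3, which gives the three middle partitions 4, 5, 6.

ccc has myConsumerPosition=2, so startPart=3*2+2.min(1)=7 and nParts=3+0=3, which gives the last three partitions 7, 8, 9.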
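The whole walkthrough can be checked with a small standalone function. This is a sketch under stated assumptions, not Kafka's own code: RangeAssignSketch and its assign method are hypothetical names, consumer ids are plain unique strings, only a single topic is handled, and the result is an ordinary Map instead of Kafka's Pool. The startPart and nParts expressions are copied from the source above.

```scala
object RangeAssignSketch {
  // Returns, for one topic, the partitions each consumer would own.
  // Assumes consumer ids are unique (duplicates would collapse in the Map).
  def assign(consumers: Seq[String], partitions: Seq[Int]): Map[String, Seq[Int]] = {
    require(consumers.nonEmpty, "at least one consumer is needed")
    val curConsumers = consumers.sorted
    val curPartitions = partitions.sorted
    val nPartsPerConsumer = curPartitions.size / curConsumers.size
    val nConsumersWithExtraPart = curPartitions.size % curConsumers.size

    curConsumers.zipWithIndex.map { case (consumer, myConsumerPosition) =>
      // Same arithmetic as in the RangeAssignor excerpt.
      val startPart = nPartsPerConsumer * myConsumerPosition +
        myConsumerPosition.min(nConsumersWithExtraPart)
      val nParts = nPartsPerConsumer +
        (if (myConsumerPosition + 1 > nConsumersWithExtraPart) 0 else 1)
      // slice yields an empty Seq when nParts is 0, i.e. the consumer gets nothing.
      consumer -> curPartitions.slice(startPart, startPart + nParts)
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val result = assign(Seq("aaa", "ccc", "bbb"), 0 to 9)
    result.toSeq.sortBy(_._1).foreach { case (consumer, parts) =>
      println(s"$consumer -> ${parts.mkString(",")}")
    }
    // prints:
    // aaa -> 0,1,2,3
    // bbb -> 4,5,6
    // ccc -> 7,8,9
  }
}
```

With more consumers than partitions, the trailing consumers get nParts = 0 and an empty range. That is the case in which the original code logs the "No broker partitions consumed" warning.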
