Apache Kafka Source Code Analysis - autoLeaderRebalanceEnable

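The broker configuration includes auto.leader.rebalance.enable (default false in the version analyzed here).

So how is the leader actually rebalanced?

First, when the controller starts up, it starts a scheduler: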

if (config.autoLeaderRebalanceEnable) { // when auto leader rebalance is enabled, partition leaders that moved because a broker died are moved back
  info("starting the partition rebalance scheduler")
  autoRebalanceScheduler.startup()
  autoRebalanceScheduler.schedule("partition-rebalance-thread", checkAndTriggerPartitionRebalance,
    5, config.leaderImbalanceCheckIntervalSeconds, TimeUnit.SECONDS)
}
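To make the periodic trigger concrete, here is a minimal sketch of a fixed-rate check built directly on java.util.concurrent. It assumes the scheduler behaves like a fixed-rate executor. RebalanceSchedulerSketch and runPeriodically are hypothetical names used only for illustration, not Kafka APIs.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object RebalanceSchedulerSketch {
  // Hypothetical helper: runs `check` at a fixed rate after an initial delay
  // and waits until it has run `runs` times (or 5 seconds have passed).
  // Returns true if all runs were observed before the timeout.
  def runPeriodically(initialDelayMs: Long, periodMs: Long, runs: Int)(check: () => Unit): Boolean = {
    val executor = Executors.newSingleThreadScheduledExecutor()
    val latch = new CountDownLatch(runs)
    val task = new Runnable {
      def run(): Unit = { check(); latch.countDown() }
    }
    try {
      executor.scheduleAtFixedRate(task, initialDelayMs, periodMs, TimeUnit.MILLISECONDS)
      latch.await(5, TimeUnit.SECONDS)
    } finally {
      // Stop the background thread so the sketch does not keep the JVM alive.
      executor.shutdownNow()
    }
  }
}
```

In the controller the task would be checkAndTriggerPartitionRebalance and the period would come from config.leaderImbalanceCheckIntervalSeconds. The sketch uses milliseconds only so that it finishes quickly.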

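The scheduler periodically runs

checkAndTriggerPartitionRebalance

This function finds every partition whose leader has moved away from its preferred replica, i.e.

topicsNotInPreferredReplica

and, if the imbalance ratio exceeds the threshold, automatically triggers a leader rebalance that moves the leader back to the preferred replica.

The key is to understand what preferred replicas are: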

preferredReplicasForTopicsByBrokers =
  controllerContext.partitionReplicaAssignment.filterNot(p => deleteTopicManager.isTopicQueuedUpForDeletion(p._1.topic)).groupBy {
    case(topicAndPartition, assignedReplicas) => assignedReplicas.head
  }

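partitionReplicaAssignment: mutable.Map[TopicAndPartition, Seq[Int]]

A TopicAndPartition uniquely identifies a partition by topic name and partition id. The Seq[Int] holds broker ids and says which brokers host the replicas of that partition.

Topics queued for deletion are filtered out of the partition replica assignment, and the remaining entries are grouped by assignedReplicas.head, i.e. by the first broker id in the Seq.

In other words, the preferred replica of a partition is by default the first replica assigned to it.

The result of the groupBy maps each broker to all the partitions that should have that broker as leader, i.e.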

case(leaderBroker, topicAndPartitionsForBroker)
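The grouping step can be tried out on plain immutable collections. The following is a minimal sketch: the case class is a simplified stand-in for Kafka's TopicAndPartition, and groupByPreferredLeader and topicsBeingDeleted are hypothetical names introduced here. It assumes every partition has at least one assigned replica.

```scala
object PreferredReplicaSketch {
  // Simplified stand-in for Kafka's TopicAndPartition.
  final case class TopicAndPartition(topic: String, partition: Int)

  // Groups the assignment by preferred replica (the head of each
  // assigned-replica list), skipping topics queued for deletion.
  def groupByPreferredLeader(
      assignment: Map[TopicAndPartition, Seq[Int]],
      topicsBeingDeleted: Set[String]
  ): Map[Int, Map[TopicAndPartition, Seq[Int]]] =
    assignment
      .filterNot { case (tp, _) => topicsBeingDeleted.contains(tp.topic) }
      .groupBy { case (_, assignedReplicas) => assignedReplicas.head }
}
```

Each key of the result is a broker id, and each value holds the partitions for which that broker is the preferred leader, matching the (leaderBroker, topicAndPartitionsForBroker) pairs above.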

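Next, pick out the partitions whose current leader is not the preferred replica, i.e. those whose leadership has moved.

This is simple: compare against the leader in leaderAndIsr. If the two differ, the leader has moved.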

topicsNotInPreferredReplica =
  topicAndPartitionsForBroker.filter {
    case(topicPartition, replicas) => {
      controllerContext.partitionLeadershipInfo.contains(topicPartition) &&
      controllerContext.partitionLeadershipInfo(topicPartition).leaderAndIsr.leader != leaderBroker
    }
  }

A rebalance is triggered only when the imbalanceRatio of a broker exceeds 10% (the default of leader.imbalance.per.broker.percentage):

imbalanceRatio = totalTopicPartitionsNotLedByBroker.toDouble / totalTopicPartitionsForBroker
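As a worked example of the ratio, the sketch below computes it for one broker from plain collections. Partitions are identified by strings for brevity. imbalanceRatio, shouldRebalance and thresholdPercent are hypothetical names, and the strict "greater than" comparison is an assumption based on the description above.

```scala
object ImbalanceRatioSketch {
  // Fraction of the partitions preferring `broker` that are currently led
  // by some other broker. Partitions with no known leader are not counted
  // as moved, mirroring the `contains` check in the filter above.
  def imbalanceRatio(
      broker: Int,
      preferredPartitions: Set[String],
      currentLeaders: Map[String, Int]
  ): Double = {
    val notLedByBroker =
      preferredPartitions.count(p => currentLeaders.get(p).exists(_ != broker))
    notLedByBroker.toDouble / preferredPartitions.size
  }

  // True when the ratio is strictly above the percentage threshold.
  def shouldRebalance(ratio: Double, thresholdPercent: Int): Boolean =
    ratio > thresholdPercent.toDouble / 100
}
```

For instance, if broker 1 is preferred for four partitions and exactly one of them is currently led by broker 2, the ratio is 1 / 4 = 0.25, which is above a 10% threshold.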

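The migration of each partition works as follows.

First, the preferred broker must be alive, and no partition may currently be undergoing reassignment or preferred replica election. In other words, these operations cannot run in parallel, because running a reassignment at the same time would easily cause conflicts.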

// do this check only if the broker is live and there are no partitions being reassigned currently
// and preferred replica election is not in progress
if (controllerContext.liveBrokerIds.contains(leaderBroker) &&
    controllerContext.partitionsBeingReassigned.size == 0 &&
    controllerContext.partitionsUndergoingPreferredReplicaElection.size == 0 &&
    !deleteTopicManager.isTopicQueuedUpForDeletion(topicPartition.topic) &&
    controllerContext.allTopics.contains(topicPartition.topic)) {
  onPreferredReplicaElection(Set(topicPartition), true)

onPreferredReplicaElection

This again goes through partitionStateMachine to change the state of the partition:

partitionStateMachine.handleStateChanges(partitions, OnlinePartition, preferredReplicaPartitionLeaderSelector)

partitionStateMachine is analyzed separately. Here it is enough to know that the partition state transition is OnlinePartition -> OnlinePartition,

with preferredReplicaPartitionLeaderSelector as the leaderSelector strategy.

PreferredReplicaPartitionLeaderSelector

The strategy is simple: switch the leader to the preferred replica.

  def selectLeader(topicAndPartition: TopicAndPartition, currentLeaderAndIsr: LeaderAndIsr): (LeaderAndIsr, Seq[Int]) = {
    val assignedReplicas = controllerContext.partitionReplicaAssignment(topicAndPartition)
    val preferredReplica = assignedReplicas.head  // the first replica in the AR is the preferred one
    // check if preferred replica is the current leader
    val currentLeader = controllerContext.partitionLeadershipInfo(topicAndPartition).leaderAndIsr.leader
    if (currentLeader == preferredReplica) { // nothing to do if the current leader is already the preferred replica
      throw new LeaderElectionNotNeededException("Preferred replica %d is already the current leader for partition %s" .format(preferredReplica, topicAndPartition))
    } else {
      info("Current leader %d for partition %s is not the preferred replica.".format(currentLeader, topicAndPartition) + " Trigerring preferred replica leader election")
      // check if preferred replica is not the current leader and is alive and in the isr
      if (controllerContext.liveBrokerIds.contains(preferredReplica) && currentLeaderAndIsr.isr.contains(preferredReplica)) { // the broker of the preferred replica must be alive and in the isr
        (new LeaderAndIsr(preferredReplica, currentLeaderAndIsr.leaderEpoch + 1, currentLeaderAndIsr.isr, currentLeaderAndIsr.zkVersion + 1), assignedReplicas) // build the new leaderAndIsr
      } else {
        throw new StateChangeFailedException("Preferred replica %d for partition ".format(preferredReplica) +
          "%s is either not alive or not in the isr. Current leader and ISR: [%s]".format(topicAndPartition, currentLeaderAndIsr))
      }
    }
  }
}
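The decision logic of the selector above can be exercised in isolation. This is a sketch rather than the Kafka implementation: LeaderAndIsr here is a simplified stand-in case class, selectPreferredLeader is a hypothetical name, and the two exceptions are replaced by Left values so the outcome can be inspected without a controller context.

```scala
object PreferredLeaderSelectorSketch {
  // Simplified stand-in for Kafka's LeaderAndIsr.
  final case class LeaderAndIsr(leader: Int, leaderEpoch: Int, isr: List[Int], zkVersion: Int)

  // Left = reason why the leader is not changed; Right = the new state.
  // Assumes assignedReplicas is non-empty.
  def selectPreferredLeader(
      assignedReplicas: Seq[Int],
      current: LeaderAndIsr,
      liveBrokers: Set[Int]
  ): Either[String, LeaderAndIsr] = {
    val preferred = assignedReplicas.head
    if (current.leader == preferred)
      Left("election not needed")
    else if (liveBrokers.contains(preferred) && current.isr.contains(preferred))
      // Same ISR, new leader, leader epoch and zkVersion each bumped by one.
      Right(current.copy(
        leader = preferred,
        leaderEpoch = current.leaderEpoch + 1,
        zkVersion = current.zkVersion + 1))
    else
      Left("preferred replica is not alive or not in the isr")
  }
}
```

With assigned replicas Seq(1, 2, 3), current leader 2 and all three brokers alive and in the ISR, the sketch returns a state led by broker 1. If broker 1 is dead or has dropped out of the ISR, the leader is left unchanged.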
