The broker configuration includes the setting auto.leader.rebalance.enable (false in the version analyzed here).

So how does this leader rebalance actually work?

First, when the controller starts up, it starts a scheduler:

// If autoLeaderRebalance is enabled, partition leaders that moved away
// because a broker died need to be moved back.
if (config.autoLeaderRebalanceEnable) {
  info("starting the partition rebalance scheduler")
  autoRebalanceScheduler.startup()
  autoRebalanceScheduler.schedule("partition-rebalance-thread", checkAndTriggerPartitionRebalance,
    5, config.leaderImbalanceCheckIntervalSeconds, TimeUnit.SECONDS)
}
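
To make the scheduling pattern concrete, here is a minimal sketch of "run a check after an initial delay, then repeat at a fixed interval". It uses the JDK's ScheduledExecutorService rather than Kafka's own scheduler class, and the names RebalanceSchedulerSketch and runPeriodically are made up for this illustration. Milliseconds are used instead of seconds only so the example finishes quickly.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object RebalanceSchedulerSketch {
  // Runs `check` after `initialDelayMs`, then every `periodMs`,
  // and returns once it has been invoked at least `runs` times.
  // `check` is a hypothetical stand-in for checkAndTriggerPartitionRebalance.
  def runPeriodically(initialDelayMs: Long, periodMs: Long, runs: Int)(check: () => Unit): Unit = {
    val executor = Executors.newSingleThreadScheduledExecutor()
    val remaining = new CountDownLatch(runs)
    val task = new Runnable {
      override def run(): Unit = { check(); remaining.countDown() }
    }
    executor.scheduleAtFixedRate(task, initialDelayMs, periodMs, TimeUnit.MILLISECONDS)
    remaining.await()      // block until the check has run `runs` times
    executor.shutdownNow() // stop the background thread
  }
}
```

In the controller the task is never stopped this way; it keeps running for as long as the broker remains the controller.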

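The scheduler periodically runs

checkAndTriggerPartitionRebalance

The logic of this function is to find every partition whose leader has moved away from its preferred replica, that is,

topicsNotInPreferredReplica

and, if the imbalance ratio threshold is exceeded, to trigger a leader rebalance automatically, moving the leader back to the preferred replica.

The key is to understand what preferred replicas are.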

preferredReplicasForTopicsByBrokers =
  controllerContext.partitionReplicaAssignment.filterNot(p => deleteTopicManager.isTopicQueuedUpForDeletion(p._1.topic)).groupBy {
    case(topicAndPartition, assignedReplicas) => assignedReplicas.head
  }

partitionReplicaAssignment: mutable.Map[TopicAndPartition, Seq[Int]]

A TopicAndPartition uniquely identifies a partition by topic name and partition id. The Seq[Int] holds broker ids and tells us which brokers host the replicas of that partition.

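Topics queued for deletion are filtered out of the partition replica assignment, and the remainder is grouped by assignedReplicas.head, that is, by the first broker id in the Seq.

In other words, the preferred replica of each partition is by default the first replica that was assigned.

The result of the groupBy is a map from each broker to all the partitions that should have that broker as their leader, that is,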

case(leaderBroker, topicAndPartitionsForBroker)
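
The grouping step can be tried out on a small in-memory map. The sketch below is only an illustration: TopicAndPartition is a simplified stand-in for Kafka's class, topicsQueuedForDeletion replaces the call to deleteTopicManager, and the function name groupByPreferredLeader is made up. It assumes every partition has at least one assigned replica, since head fails on an empty Seq.

```scala
object PreferredReplicaGrouping {
  // Simplified stand-in for Kafka's TopicAndPartition.
  final case class TopicAndPartition(topic: String, partition: Int)

  // Drops partitions of topics queued for deletion, then groups the rest
  // by the first broker id in their assigned replica list (the preferred replica).
  def groupByPreferredLeader(
      assignment: Map[TopicAndPartition, Seq[Int]],
      topicsQueuedForDeletion: Set[String]): Map[Int, Map[TopicAndPartition, Seq[Int]]] =
    assignment
      .filterNot { case (tp, _) => topicsQueuedForDeletion.contains(tp.topic) }
      .groupBy { case (_, assignedReplicas) => assignedReplicas.head }

  val example: Map[Int, Map[TopicAndPartition, Seq[Int]]] = groupByPreferredLeader(
    Map(
      TopicAndPartition("orders", 0) -> Seq(1, 2, 3),
      TopicAndPartition("orders", 1) -> Seq(2, 3, 1),
      TopicAndPartition("logs", 0)   -> Seq(1, 3, 2),
      TopicAndPartition("old", 0)    -> Seq(3, 1, 2)),
    topicsQueuedForDeletion = Set("old"))
}
```

Here broker 1 is the preferred leader for orders-0 and logs-0, broker 2 for orders-1, and the partition of the topic being deleted does not appear at all.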

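Next, find the partitions in each group whose current leader is not the preferred replica, that is, those whose leader has moved.

This is simple: compare against the leader recorded in leaderAndIsr. If the two differ, the leader has moved.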

topicsNotInPreferredReplica =
  topicAndPartitionsForBroker.filter {
    case(topicPartition, replicas) => {
      controllerContext.partitionLeadershipInfo.contains(topicPartition) &&
      controllerContext.partitionLeadershipInfo(topicPartition).leaderAndIsr.leader != leaderBroker
    }
  }

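A rebalance is triggered for a broker only when its imbalanceRatio exceeds the configured threshold, 10% by default (leader.imbalance.per.broker.percentage):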

imbalanceRatio = totalTopicPartitionsNotLedByBroker.toDouble / totalTopicPartitionsForBroker
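
As a worked example of the filter and the ratio together, the sketch below computes the ratio for one broker from plain collections. The object ImbalanceCheck, its function names, and the simplified TopicAndPartition are assumptions made for illustration; the 10% default mirrors the threshold mentioned above.

```scala
object ImbalanceCheck {
  // Simplified stand-in for Kafka's TopicAndPartition.
  final case class TopicAndPartition(topic: String, partition: Int)

  // Fraction of the partitions preferring `preferredLeader` that are
  // currently led by some other broker. Partitions with no known leader
  // are not counted as moved, matching the contains(...) && != filter.
  def imbalanceRatio(
      preferredLeader: Int,
      partitionsForBroker: Set[TopicAndPartition],
      currentLeaders: Map[TopicAndPartition, Int]): Double = {
    val notLedByPreferred = partitionsForBroker.count { tp =>
      currentLeaders.get(tp).exists(_ != preferredLeader)
    }
    // Guard added for this sketch; a groupBy result never has an empty group.
    if (partitionsForBroker.isEmpty) 0.0
    else notLedByPreferred.toDouble / partitionsForBroker.size
  }

  // True when the ratio exceeds the threshold, given as a percentage.
  def shouldRebalance(ratio: Double, thresholdPercent: Int = 10): Boolean =
    ratio > thresholdPercent.toDouble / 100

  // Four partitions prefer broker 1; one of them is currently led by broker 2.
  val partitions: Set[TopicAndPartition] = (0 until 4).map(TopicAndPartition("orders", _)).toSet
  val leaders: Map[TopicAndPartition, Int] =
    partitions.map(tp => tp -> (if (tp.partition == 3) 2 else 1)).toMap
  val ratio: Double = imbalanceRatio(1, partitions, leaders)
}
```

With one of four partitions displaced the ratio is 1/4, which is above a 10% threshold, so this broker would qualify for a rebalance.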

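The move is then carried out partition by partition.

First, the preferred broker must be alive, and no partition may currently be undergoing reassignment or preferred replica election. In other words, these operations are not run concurrently, because electing a leader while a reassignment is in progress could easily conflict. The topic must also still exist and must not be queued for deletion.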

// do this check only if the broker is live and there are no partitions being reassigned currently
// and preferred replica election is not in progress
if (controllerContext.liveBrokerIds.contains(leaderBroker) &&
    controllerContext.partitionsBeingReassigned.size == 0 &&
    controllerContext.partitionsUndergoingPreferredReplicaElection.size == 0 &&
    !deleteTopicManager.isTopicQueuedUpForDeletion(topicPartition.topic) &&
    controllerContext.allTopics.contains(topicPartition.topic)) {
  onPreferredReplicaElection(Set(topicPartition), true)
}

onPreferredReplicaElection

Here again, the partition's state is changed through the partitionStateMachine:

partitionStateMachine.handleStateChanges(partitions, OnlinePartition, preferredReplicaPartitionLeaderSelector)

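The partitionStateMachine is analyzed separately. For now it is enough to know that the partition's state transition here is OnlinePartition -> OnlinePartition,

and that preferredReplicaPartitionLeaderSelector is used as the leader selection strategy.

PreferredReplicaPartitionLeaderSelector

The strategy is simple: switch the leader to the preferred replica.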

def selectLeader(topicAndPartition: TopicAndPartition, currentLeaderAndIsr: LeaderAndIsr): (LeaderAndIsr, Seq[Int]) = {
  val assignedReplicas = controllerContext.partitionReplicaAssignment(topicAndPartition)
  // take the first replica of the assigned replicas (AR) as the preferred replica
  val preferredReplica = assignedReplicas.head
  // check if preferred replica is the current leader
  val currentLeader = controllerContext.partitionLeadershipInfo(topicAndPartition).leaderAndIsr.leader
  if (currentLeader == preferredReplica) {
    // nothing to do if the current leader is already the preferred replica
    throw new LeaderElectionNotNeededException("Preferred replica %d is already the current leader for partition %s"
      .format(preferredReplica, topicAndPartition))
  } else {
    info("Current leader %d for partition %s is not the preferred replica.".format(currentLeader, topicAndPartition) +
      " Trigerring preferred replica leader election")
    // check if preferred replica is not the current leader and is alive and in the isr
    if (controllerContext.liveBrokerIds.contains(preferredReplica) && currentLeaderAndIsr.isr.contains(preferredReplica)) {
      // build the new LeaderAndIsr with the preferred replica as leader
      (new LeaderAndIsr(preferredReplica, currentLeaderAndIsr.leaderEpoch + 1, currentLeaderAndIsr.isr,
        currentLeaderAndIsr.zkVersion + 1), assignedReplicas)
    } else {
      throw new StateChangeFailedException("Preferred replica %d for partition ".format(preferredReplica) +
        "%s is either not alive or not in the isr. Current leader and ISR: [%s]".format(topicAndPartition, currentLeaderAndIsr))
    }
  }
}
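
The three outcomes of the selector (election not needed, success, failure) can be exercised in isolation with a simplified model. This sketch is not Kafka's code: the LeaderAndIsr case class is a cut-down stand-in, the controller context is replaced by explicit parameters, and the exceptions are swapped for an Either whose Left carries the reason. The name selectPreferredLeader is made up for this illustration.

```scala
object PreferredLeaderSelectorSketch {
  // Cut-down stand-in for Kafka's LeaderAndIsr.
  final case class LeaderAndIsr(leader: Int, leaderEpoch: Int, isr: List[Int], zkVersion: Int)

  // Returns Right(new state) when leadership can move to the preferred replica,
  // Left(reason) where the original code throws an exception.
  def selectPreferredLeader(
      assignedReplicas: Seq[Int],
      current: LeaderAndIsr,
      liveBrokerIds: Set[Int]): Either[String, LeaderAndIsr] = {
    val preferred = assignedReplicas.head // first assigned replica is the preferred one
    if (current.leader == preferred)
      Left(s"election not needed: replica $preferred is already the leader")
    else if (liveBrokerIds.contains(preferred) && current.isr.contains(preferred))
      // same ISR, new leader, epoch and version each bumped by one
      Right(current.copy(
        leader = preferred,
        leaderEpoch = current.leaderEpoch + 1,
        zkVersion = current.zkVersion + 1))
    else
      Left(s"replica $preferred is either not alive or not in the isr")
  }
}
```

For instance, with assigned replicas Seq(1, 2, 3), current leader 2, and ISR List(1, 2), the call succeeds only if broker 1 is in liveBrokerIds. If broker 1 has dropped out of the ISR, the election fails and leadership stays where it is.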
