kafka高可用探究

众所周知 kafka 的 topic 可以使用 --replication-factor 数和 partitions 数来保证服务的高可用性

问题发现

但在最近的运维过程中,3台集群的kafka,副本与分区都为3,有其中一台 broker 挂了导致整个集群成了不可用状态,消费者消费不到信息,这是为什么呢?

查了很多资料后发现是kafka本身的 topic __consumer_offsets 搞的鬼。

问题分析

在高版本的kakfa中,消费者的offset偏移量会保存在kafka自身一个叫做__consumer_offsets的topic中,由于这个topic是由kafka本身默认创建,所以副本数会配置文件中指定的默认副本数,一般为1。

查看副本分区情况一般为:

./kafka-topics.sh --zookeeper localhost:2181 --describe __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 2 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 3 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 6 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 7 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 8 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 9 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 11 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 12 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 13 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 14 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 15 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 16 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 17 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 18 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 19 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 20 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 21 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 22 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 23 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 24 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 25 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 26 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 27 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 28 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 29 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 30 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 31 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 32 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 33 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 34 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 35 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 36 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 37 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 38 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 39 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 40 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 41 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 42 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 43 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 44 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 45 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 46 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 47 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 48 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 49 Leader: 1 Replicas: 1 Isr: 1

50个分区,每个分区1个副本。

50个分区是遍布在3台broker的,这就导致如果有其中一台broker服务挂了,在其broker的所有Partition将不能正常使用,就导致此Partition的消费者不知道自己的offset偏移量,就导致无法正常消费。

问题解决

方法1

由于现在kafka已经开始正常提供服务,所以只能动态修改:

先准备分区副本规划 json 文件

vim /data/vfan/consumer.json

{
"version": 1,
"partitions": [
{
"topic": "__consumer_offsets",
"partition": 0,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 1,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 2,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 3,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 4,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 5,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 6,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 7,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 8,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 9,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 10,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 11,
"replicas": [
3,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 12,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 13,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 14,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 15,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 16,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 17,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 18,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 19,
"replicas": [
3,
2,
1
]
},

{
"topic": "__consumer_offsets",
"partition": 20,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 21,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 22,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 23,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 24,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 25,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 26,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 27,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 28,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 29,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 30,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 31,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 32,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 33,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 34,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 35,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 36,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 37,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 38,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 39,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 40,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 41,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 42,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 43,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 44,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 45,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 46,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 47,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 48,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 49,
"replicas": [
3,
2,
1
]
}
]
}
各 replicas 所在的 broker id可以自定义修改,但不能有重复的broker

开始执行变更

./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /data/vfan/consumer.json --execute

校验变更是否完成

./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /data/vfan/consumer.json --verify

检查变更后效果

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 1 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 2 Leader: 3 Replicas: 3,2,1 Isr: 2,3,1
Topic: __consumer_offsets Partition: 3 Leader: 1 Replicas: 1,2,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 4 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 5 Leader: 3 Replicas: 3,2,1 Isr: 2,3,1
Topic: __consumer_offsets Partition: 6 Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 7 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 8 Leader: 3 Replicas: 3,2,1 Isr: 2,1,3
Topic: __consumer_offsets Partition: 9 Leader: 1 Replicas: 1,2,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 10 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 11 Leader: 3 Replicas: 3,1,2 Isr: 2,3,1
Topic: __consumer_offsets Partition: 12 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 13 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 14 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 15 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 16 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 17 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 18 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 19 Leader: 3 Replicas: 3,2,1 Isr: 1,3,2
Topic: __consumer_offsets Partition: 20 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 21 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 22 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 23 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
Topic: __consumer_offsets Partition: 24 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 25 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 26 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 27 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 28 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 29 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 30 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 31 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 32 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 33 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 34 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 35 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
Topic: __consumer_offsets Partition: 36 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 37 Leader: 3 Replicas: 3,2,1 Isr: 1,3,2
Topic: __consumer_offsets Partition: 38 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 39 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 40 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 41 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
Topic: __consumer_offsets Partition: 42 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 43 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 44 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 45 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 46 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 47 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
Topic: __consumer_offsets Partition: 48 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 49 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3

副本数已经成为三个并分布在三个broker中,实现高可用。

方法2

直接在kafka服务启动前,修改系统创建topic默认副本分区参数

num.partitions=3 ;当topic不存在系统自动创建时的分区数
default.replication.factor=3 ;当topic不存在系统自动创建时的副本数
offsets.topic.replication.factor=3 ;表示kafka的内部topic consumer_offsets副本数,默认为1

设置完毕后,启动 zk kafka,随后测试生产 消费

## 生产
./kafka-console-producer.sh --broker-list localhost:9092 --topic test

## 消费,--from-beginning参数表示从头开始
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning

查看系统生成的topic 分区及副本数

## test
./kafka-topics.sh --zookeeper localhost:2181 --describe --topic test
Topic:test PartitionCount:3 ReplicationFactor:3 Configs:

## __consumer_offsets
./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer

系统自动生成的topic也都已实现高可用

kafka高可用探究的更多相关文章

  1. Kafka 高可用设计

    Kafka 高可用设计 2016-02-28 杜亦舒 Kafka在早期版本中,并不提供高可用机制,一旦某个Broker宕机,其上所有Partition都无法继续提供服务,甚至发生数据丢失对于分布式系统 ...

  2. Kafka高可用环境搭建

    Apache Kafka是分布式发布-订阅消息系统,在 kafka官网上对 kafka 的定义:一个分布式发布-订阅消息传递系统. 它最初由LinkedIn公司开发,Linkedin于2010年贡献给 ...

  3. Kafka —— 基于 ZooKeeper 搭建 Kafka 高可用集群

    一.Zookeeper集群搭建 为保证集群高可用,Zookeeper集群的节点数最好是奇数,最少有三个节点,所以这里搭建一个三个节点的集群. 1.1 下载 & 解压 下载对应版本Zookeep ...

  4. Kafka 学习之路(二)—— 基于ZooKeeper搭建Kafka高可用集群

    一.Zookeeper集群搭建 为保证集群高可用,Zookeeper集群的节点数最好是奇数,最少有三个节点,所以这里搭建一个三个节点的集群. 1.1 下载 & 解压 下载对应版本Zookeep ...

  5. Kafka 系列(二)—— 基于 ZooKeeper 搭建 Kafka 高可用集群

    一.Zookeeper集群搭建 为保证集群高可用,Zookeeper 集群的节点数最好是奇数,最少有三个节点,所以这里搭建一个三个节点的集群. 1.1 下载 & 解压 下载对应版本 Zooke ...

  6. 入门大数据---基于Zookeeper搭建Kafka高可用集群

    一.Zookeeper集群搭建 为保证集群高可用,Zookeeper 集群的节点数最好是奇数,最少有三个节点,所以这里搭建一个三个节点的集群. 1.1 下载 & 解压 下载对应版本 Zooke ...

  7. mysql高可用探究 MMM高可用mysql方案

    1    MMM高可用mysql方案 1.1  方案简介 MMM即Master-Master Replication Manager for MySQL(mysql主主复制管理器)关于mysql主主复 ...

  8. Kafka高可用实现原理

    数据存储格式 Kafka的高可靠性的保障来源于其健壮的副本(replication)策略.一个Topic可以分成多个Partition,而一个Partition物理上由多个Segment组成. Seg ...

  9. Kafka高可用实现

    数据存储格式 Kafka的高可靠性的保障来源于其健壮的副本(replication)策略.一个Topic可以分成多个Partition,而一个Partition物理上由多个Segment组成. Seg ...

随机推荐

  1. 【springboot】过滤器、监听器、拦截器,Aspect切片

    转自: https://blog.csdn.net/cp026la/article/details/86501019 简介: 本章介绍拦截器.过滤器.切片对请求拦截的使用与区别,以及监听器在 spri ...

  2. WPF QQ群发助手

    一.界面如下

  3. 天翼云安装jdk(注意有坑)

    1.下载jdk8 查看Linux位数,到oracle官网下载对应的jdk ① sudo uname --m  确认32位还是64位 ② https://www.oracle.com/technetwo ...

  4. Git中使用.gitignore忽略文件的推送

    转载自:https://blog.csdn.net/lk142500/article/details/82869018 windows下可以用另存为生成gitignore 文件 1 简介 在使用Git ...

  5. Servlet常见问题

    时间:2016-12-6 23:18 java.lang.ClassNotFoundException: org.apache.commons.collections.FastHashMapcommo ...

  6. 《redis 5设计与源码分析》:第二章 简单动态字符串

    介绍 简单动态字符串(Simple Dynamic Strings, SDS)是Redis的基本数据结构之一,用于存储字符串和整型数据.它的特点是:方便扩容.二进制安全. 二进制安全 在C语言中,用& ...

  7. Win10安装gcc

    使用MinGW安装gcc 1.下载MinGW,地址 https://sourceforge.net/projects/mingw/files/ ,选择Download mingw-get-setup. ...

  8. Shell中的运算

    1.运算方式及运算符号 2.SHELL 中常用的运算命令 3.相关操作演示 1.用脚本写一个10秒倒计时 脚本的执行: 2.编写脚本,1分10秒的倒计时 执行脚本: 3.编写脚本,制作一个计算器 脚本 ...

  9. 从零开始实现简单 RPC 框架 9:网络通信之心跳与重连机制

    一.心跳 什么是心跳 在 TPC 中,客户端和服务端建立连接之后,需要定期发送数据包,来通知对方自己还在线,以确保 TPC 连接的有效性.如果一个连接长时间没有心跳,需要及时断开,否则服务端会维护很多 ...

  10. shell中的引号

    单引号: 所见即所得 原封不动输出 双引号: 与单引号类似 特殊符号进行解析 ( $ $() `` ! ) 无引号: 与双引号类似 支持通配符( {} * ) 反引号: 优先执行 优先执行里面的命令, ...