Why do Kafka consumers connect to zookeeper, and producers get metadata from brokers?


|
Why is it that consumers connect to zookeeper to retrieve the partition locations? And kafka producers have to connect to one of the brokers to retrieve metadata. My point is, what exactly is the use of zookeeper when every broker already has all the necessary metadata to tell producers the location to send their messages? Couldn't the brokers send this same information to the consumers? I can understand why brokers have the metadata, to not have to make a connection to zookeeper each time a new message is sent to them. Is there a function that zookeeper has that I'm missing? I'm finding it hard to think of a reason why zookeeper is really needed within a kafka cluster. |
|||
|
First of all, zookeeper is needed only for high level consumer. The main reason zookeeper is needed for a high level consumer is to track consumed offsets and handle load balancing. Now in more detail. Regarding offset tracking, imagine following scenario: you start a consumer, consume 100 messages and shut the consumer down. Next time you start your consumer you'll probably want to resume from your last consumed offset (which is 100), and that means you have to store the maximum consumed offset somewhere. Here's where zookeeper kicks in: it stores offsets for every group/topic/partition. So this way next time you start your consumer it may ask "hey zookeeper, what's the offset I should start consuming from?". Kafka is actually moving towards being able to store offsets not only in zookeeper, but in other storages as well (for now only Regarding load balancing, the amount of messages produced can be quite large to be handled by 1 machine and you'll probably want to add computing power at some point. Lets say you have a topic with 100 partitions and to handle this amount of messages you have 10 machines. There are several questions that arise here actually:
And again, here's where zookeeper kicks in: it tracks all consumers in group and each high level consumer is subscribed for changes in this group. The point is that when a consumer appears or disappears, zookeeper notifies all consumers and triggers rebalance so that they split partitions near-equally (e.g. to balance load). This way it guarantees if one of consumer dies others will continue processing partitions that were owned by this consumer. |
|||||
|


Why do Kafka consumers connect to zookeeper, and producers get metadata from brokers?的更多相关文章
- kafka集群和zookeeper集群的部署,kafka的java代码示例
来自:http://doc.okbase.net/QING____/archive/19447.html 也可参考: http://blog.csdn.net/21aspnet/article/det ...
- 使用不同的namespace让不同的kafka/Storm连接同一个zookeeper
背景介绍: 需要部署2个kafka独立环境,但是只有一个zookeeper集群. 需要部署2个独立的storm环境,但是只有一个zookeeper集群. ----------------------- ...
- org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeo ...
- Unable to connect to zookeeper server within timeout: 5000
错误 严重: StandardWrapper.Throwable org.springframework.beans.factory.BeanCreationException: Error crea ...
- CentOS7 搭建Kafka(一)zookeeper篇
CentOS7 搭建Kafka(一)zookeeper篇 近几年当红小生Kafka备受各路英雄好汉追捧,一点不比老前辈RabbitMQ和ActiveMQ差,因为流行,所以你就得学啊:我这么懒,肯定是不 ...
- Caused by: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 5000
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'brandControl ...
- apache kafka系列之在zookeeper中存储结构
1.topic注册信息 /brokers/topics/[topic] : 存储某个topic的partitions所有分配信息 Schema: { "version": ...
- Kafka学习之(五)搭建kafka集群之Zookeeper集群搭建
Zookeeper是一种在分布式系统中被广泛用来作为:分布式状态管理.分布式协调管理.分布式配置管理.和分布式锁服务的集群.kafka增加和减少服务器都会在Zookeeper节点上触发相应的事件kaf ...
- kafka集群与zookeeper集群 配置过程
Kafka的集群配置一般有三种方法,即 (1)Single node – single broker集群: (2)Single node – multiple broker集群: (3)Mult ...
随机推荐
- 阿里云IoT
阿里云IoT: https://iot.aliyun.com/
- C#异步线程
对需要同时进行的操作进行异步线程处理:例如在一个button按钮点击事件中同时进行两种事件操作private void button_Click(object sender, EventArgs e) ...
- spring_03ApplicationContext三种经常用到的实现
1.ClassPathXmlApplicationContext从类路径加载 ApplicationContext ac=new ClassPathXmlApplicationContext(&quo ...
- Java学习笔记之——内部类
内部类 形式:把一个类定义在一个类的内部. 分为:成员内部类和匿名内部类重点掌握 a) 成员内部类 b) 静态成员内部类 c) 匿名内部类 d) 局部内部类 (1)成员内部类: Java的语言是面向对 ...
- 史上最全python面试题详解(三)(附带详细答案(关注、持续更新))
38.面向对象深度优先和广度优先是什么? 39.面向对象中super的作用? 40.是否使用过functools中的函数?其作用是什么? Python自带的 functools 模块提供了一些常用的高 ...
- 深入理解SpringCloud与微服务构建
旭日Follow_24 的CSDN 博客 ,全文地址请点击: https://blog.csdn.net/xuri24/article/details/81742534 目录 一.SpringClou ...
- JAVA 多线程(4)
接着3说: 一.String常量池 先回顾 java 的基本数据类型: 变量就是申请内存来存储值.也就是说,当创建变量的时候,需要在内存中申请空间. 内存管理系统根据变量的类型为变量分配存储空间,分配 ...
- jQuery 练习:取出数组字典的值, 静态对话框, clone方法应用
jQuery 中文API文档 http://jquery.cuishifeng.cn/ jQuery 取出数组字典的值 <head> <meta charset="UTF- ...
- 将HTML页面自动保存为PDF文件并上传的两种方式(一)-前端(react)方式
一.业务场景 公司的样本检测报告以React页面的形式生成,已调整为A4大小的样式并已实现分页,业务上需要将这个网页生成PDF文件,并上传到服务器,后续会将这个文件发送给客户(这里不考虑). 二.原来 ...
- IIS搭建Web服务器,外网可以访问,但无法加载视频
错误提示如下: 可能原因: IIS的MIME中未注册MP4.ogg.webm相关类型,导致IIS无法识别 解决方法: 在IIS中注册MP4.ogg.webm类型,以下以MP4为例,ogg和webm以此 ...
