Why is it that consumers connect to zookeeper to retrieve the partition locations? And kafka producers have to connect to one of the brokers to retrieve metadata.

My point is, what exactly is the use of zookeeper when every broker already has all the necessary metadata to tell producers the location to send their messages? Couldn't the brokers send this same information to the consumers?

I can understand why brokers have the metadata, to not have to make a connection to zookeeper each time a new message is sent to them. Is there a function that zookeeper has that I'm missing? I'm finding it hard to think of a reason why zookeeper is really needed within a kafka cluster.

asked Jan 13 '15 at 8:49
Luckl507

8516
 

2 Answers

First of all, zookeeper is needed only for high level consumer. SimpleConsumer does not require zookeeper to work.

The main reason zookeeper is needed for a high level consumer is to track consumed offsets and handle load balancing.

Now in more detail.

Regarding offset tracking, imagine following scenario: you start a consumer, consume 100 messages and shut the consumer down. Next time you start your consumer you'll probably want to resume from your last consumed offset (which is 100), and that means you have to store the maximum consumed offset somewhere. Here's where zookeeper kicks in: it stores offsets for every group/topic/partition. So this way next time you start your consumer it may ask "hey zookeeper, what's the offset I should start consuming from?". Kafka is actually moving towards being able to store offsets not only in zookeeper, but in other storages as well (for now only zookeeper and kafka offset storages are available and i'm not sure kafka storage is fully implemented).

Regarding load balancing, the amount of messages produced can be quite large to be handled by 1 machine and you'll probably want to add computing power at some point. Lets say you have a topic with 100 partitions and to handle this amount of messages you have 10 machines. There are several questions that arise here actually:

  • how should these 10 machines divide partitions between each other?
  • what happens if one of machines die?
  • what happens if you want to add another machine?

And again, here's where zookeeper kicks in: it tracks all consumers in group and each high level consumer is subscribed for changes in this group. The point is that when a consumer appears or disappears, zookeeper notifies all consumers and triggers rebalance so that they split partitions near-equally (e.g. to balance load). This way it guarantees if one of consumer dies others will continue processing partitions that were owned by this consumer.

answered Jan 13 '15 at 9:46
serejja

10k22749
 
1  
Thanks for the answer, this clears it up, it's what i guessed but i couldn't find it anywhere. I also just read that version 0.9 the consumers will no longer use zookeeper, and it is only used by the brokers for leader election etc. – Luckl507 Jan 13 '15 at 9:56 

With kafka 0.9+ the new Consumer API was introduced. New consumers do not need connection to Zookeeper since group balancing is provided by kafka itself.

Why do Kafka consumers connect to zookeeper, and producers get metadata from brokers?的更多相关文章

  1. kafka集群和zookeeper集群的部署,kafka的java代码示例

    来自:http://doc.okbase.net/QING____/archive/19447.html 也可参考: http://blog.csdn.net/21aspnet/article/det ...

  2. 使用不同的namespace让不同的kafka/Storm连接同一个zookeeper

    背景介绍: 需要部署2个kafka独立环境,但是只有一个zookeeper集群. 需要部署2个独立的storm环境,但是只有一个zookeeper集群. ----------------------- ...

  3. org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within

    org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeo ...

  4. Unable to connect to zookeeper server within timeout: 5000

    错误 严重: StandardWrapper.Throwable org.springframework.beans.factory.BeanCreationException: Error crea ...

  5. CentOS7 搭建Kafka(一)zookeeper篇

    CentOS7 搭建Kafka(一)zookeeper篇 近几年当红小生Kafka备受各路英雄好汉追捧,一点不比老前辈RabbitMQ和ActiveMQ差,因为流行,所以你就得学啊:我这么懒,肯定是不 ...

  6. Caused by: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 5000

    org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'brandControl ...

  7. apache kafka系列之在zookeeper中存储结构

    1.topic注册信息 /brokers/topics/[topic] : 存储某个topic的partitions所有分配信息 Schema:   {    "version": ...

  8. Kafka学习之(五)搭建kafka集群之Zookeeper集群搭建

    Zookeeper是一种在分布式系统中被广泛用来作为:分布式状态管理.分布式协调管理.分布式配置管理.和分布式锁服务的集群.kafka增加和减少服务器都会在Zookeeper节点上触发相应的事件kaf ...

  9. kafka集群与zookeeper集群 配置过程

    Kafka的集群配置一般有三种方法,即 (1)Single node – single broker集群: (2)Single node – multiple broker集群:    (3)Mult ...

随机推荐

  1. [转]docker 部署 mysql + phpmyadmin 3种方法

    本文转自:https://blog.csdn.net/Gekkoou/article/details/80897309 方法1: link # 创建容器 test-mysql (千万别用 mysql: ...

  2. 【转载】 Sqlserver查看数据库死锁的SQL语句

    在Sqlsever数据库中,有时候操作数据库过程中会进行锁表操作,在锁表操作的过程中,有时候会出现死锁的情况出现,这时候可以使用SQL语句来查询数据库死锁情况,主要通过系统数据库Master数据库来查 ...

  3. 在CentOS中部署.Net Core2.1网站

    作为一个刚接触linux的新手,在安装环境的时候,折腾了不少时间,写下一篇总结帖,帮助下新人吧~ 做完后再回来看步骤,也很简单,也就以下几步: 1.安装.Net Core环境 2.安装nginx实现端 ...

  4. C# winform自动更新 (附 demo下载)

    随着需求的变化,如果Server每次更新出新的内容,Client都要重新安装的话. 太过于复杂化.  所以自动更新是很有必要的. 一..NET自带的更新方式    以服务器端为主  (自动更新,微软爸 ...

  5. c# 创建,加载,修改XML文档

    using System.Xml.Linq; static void Main(string[] args) { XDocument xDocument = new XDocument(new XEl ...

  6. nginx配置反向代理和负载均衡

    一.反向代理 说明:应该有一个nginx服务器有多个应用服务器(可以是tomcat),本文使用一台虚拟机,安装一个nginx,多个tomcat,来模拟 upstream tomcats{ server ...

  7. CSS概念【记录】

    1.CSS语法 2.@规则 3.注释 4.层叠 5.优先级 6.继承 7.值 8.块格式化上下文 9.盒模型 10.层叠上下文 11.可替换元素 12.外边距合并 13.包含块 14.视觉格式化模型 ...

  8. Spring学习之旅(四)Spring工作原理再探

    上篇博文对Spring的工作原理做了个大概的介绍,想看的同学请出门左转.今天详细说几点. (一)Spring IoC容器及其实例化与使用 Spring IoC容器负责Bean的实例化.配置和组装工作有 ...

  9. java基础(五) String性质深入解析

    引言   本文将讲解String的几个性质. 一.String的不可变性   对于初学者来说,很容易误认为String对象是可以改变的,特别是+链接时,对象似乎真的改变了.然而,String对象一经创 ...

  10. SAP生产机该不该开放Debuger权限

    前段时间公司定制系统在调用SAP RFC接口的时候报错了,看错误消息一时半会儿也不知道是哪里参数数据错误,就想着进到SAP系统里面对这个接口做远程Debuger,跟踪一下参数变量的变化,结果发现根本就 ...