ZooKeeper is generally used for coordination tasks such as distributed locking. It is not suited to distributed storage, because each node (called a znode) can hold at most 1 MB of data.

The main roles in ZooKeeper are client, leader, and learner. Learners are further divided into observers and followers.

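The client initiates requests. A follower receives client requests and returns the results to the client, and it also takes part in voting.

The leader initiates votes, decides their outcome, and updates the system state.

An observer does not vote. It only synchronizes state from the leader. It can accept client connections and forwards write requests to the leader. Observers exist to scale the system out and raise read throughput without adding to the cost of voting.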

ZooKeeper's data model closely resembles a traditional file system. The difference is that every node in ZooKeeper can itself hold data, up to 1 MB.

ZooKeeper is mainly used to store coordination data, such as status information, configuration, and location information.

Because ZooKeeper keeps its data in memory, it can achieve high throughput and low latency.

Reference: the official ZooKeeper wiki. An excerpt from its overview follows.

ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical name space of data registers (we call these registers znodes), much like a file system. Unlike normal file systems ZooKeeper provides its clients with high throughput,
low latency, highly available, strictly ordered access to the znodes. The performance aspects of ZooKeeper allow it to be used in large distributed systems. The reliability aspects prevent it from becoming the single point of failure in big systems. Its strict
ordering allows sophisticated synchronization primitives to be implemented at the client.

The name space provided by ZooKeeper is much like that of a standard file system. A name is a sequence of path elements separated by a slash ("/"). Every znode in ZooKeeper's name space is identified by a path. And every znode has a parent whose path is a prefix
of the znode with one less element; the exception to this rule is root ("/") which has no parent. Also, exactly like standard file systems, a znode cannot be deleted if it has any children.

The main differences between ZooKeeper and standard file systems are that every znode can have data associated with it (every file can also be a directory and vice-versa) and znodes are limited to the amount of data that they can have. ZooKeeper was designed
to store coordination data: status information, configuration, location information, etc. This kind of meta-information is usually measured in kilobytes, if not bytes. ZooKeeper has a built-in sanity check of 1M, to prevent it from being used as a large data
store, but in general it is used to store much smaller pieces of data.
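
The namespace rules quoted above (slash-separated paths, every znode having a parent, no deleting a znode that has children, and a cap on data size) can be illustrated with a small in-memory model. This is only a sketch: `ZNodeTree`, its methods, and the `MAX_DATA_BYTES` constant are hypothetical names chosen for illustration, not the ZooKeeper client API, and the limit is assumed to be exactly 1 MiB as a stand-in for the "1M" sanity check.

```python
MAX_DATA_BYTES = 1024 * 1024  # assumed stand-in for the "1M" sanity check


class ZNodeTree:
    """Toy in-memory model of a znode namespace (not the real ZooKeeper API)."""

    def __init__(self):
        # Map each path to its data; the root always exists.
        self.data = {"/": b""}

    @staticmethod
    def parent(path):
        # The parent path is the path with its last element removed.
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def children(self, path):
        # Direct children are the nodes whose parent is `path`.
        return sorted(p for p in self.data if p != "/" and self.parent(p) == path)

    def create(self, path, data=b""):
        if path in self.data:
            raise KeyError(f"node exists: {path}")
        if self.parent(path) not in self.data:
            raise KeyError(f"no parent for: {path}")
        if len(data) > MAX_DATA_BYTES:
            raise ValueError("data exceeds the per-znode limit")
        self.data[path] = data

    def delete(self, path):
        if path == "/":
            raise ValueError("cannot delete the root")
        if self.children(path):
            raise ValueError(f"node has children: {path}")
        del self.data[path]


tree = ZNodeTree()
tree.create("/app", b"status=ok")           # a znode holds data...
tree.create("/app/config", b"timeout=30")   # ...and can also have children
print(tree.children("/app"))
```

Unlike a file system, `/app` here is both a "file" (it carries data) and a "directory" (it has a child), which is the main difference the excerpt points out.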

The service itself is replicated over a set of machines that comprise the service. These machines maintain an in-memory image of the data tree along with a transaction logs and snapshots in a persistent store. Because the data is kept in-memory, ZooKeeper is
able to get very high throughput and low latency numbers. The downside to an in-memory database is that the size of the database that ZooKeeper can manage is limited by memory. This limitation is further reason to keep the amount of data stored in znodes small.

The servers that make up the ZooKeeper service must all know about each other. As long as a majority of the servers are available, the ZooKeeper service will be available. Clients must also know the list of servers. The clients create a handle to the ZooKeeper
service using this list of servers.
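
The majority rule in the paragraph above comes down to simple arithmetic. The sketch below is illustrative only: the function names are made up here, and `ensemble_size` is assumed to count voting servers only (the leader and followers), since observers do not vote.

```python
def quorum_size(ensemble_size):
    """Smallest number of voting servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many voting servers can be down while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


def is_available(ensemble_size, servers_up):
    """The service is available as long as a majority of servers is up."""
    return servers_up >= quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this rule an ensemble of 4 tolerates only one failure, the same as an ensemble of 3, which is why odd ensemble sizes are the usual choice.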

Clients only connect to a single ZooKeeper server. The client maintains a TCP connection through which it sends requests, gets responses, gets watch events, and sends heartbeats. If the TCP connection to the server breaks, the client will connect to a different
server. When a client first connects to the ZooKeeper service, the first ZooKeeper server will setup a session for the client. If the client needs to connect to another server, this session will get reestablished with the new server.

Read requests sent by a ZooKeeper client are processed locally at the ZooKeeper server to which the client is connected. If the read request registers a watch on a znode, that watch is also tracked locally at the ZooKeeper server. Write requests are forwarded
to other ZooKeeper servers and go through consensus before a response is generated. Sync requests are also forwarded to another server, but do not actually go through consensus. Thus, the throughput of read requests scales with the number of servers and the
throughput of write requests decreases with the number of servers.

Order is very important to ZooKeeper; almost bordering on obsessive–compulsive disorder. All updates are totally ordered. ZooKeeper actually stamps each update with a number that reflects this order. We call this number the zxid (ZooKeeper Transaction Id).
Each update will have a unique zxid. Reads (and watches) are ordered with respect to updates. Read responses will be stamped with the last zxid processed by the server that services the read.
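
The excerpt does not describe how a zxid is laid out. As background, the ZooKeeper implementation uses a 64-bit number whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. Under that assumption, the total order on updates can be sketched as plain integer comparison. The helper names `make_zxid` and `split_zxid` are hypothetical and exist only for this illustration.

```python
COUNTER_MASK = 0xFFFFFFFF  # low 32 bits: counter within an epoch


def make_zxid(epoch, counter):
    """Pack an epoch and a counter into one 64-bit zxid."""
    return (epoch << 32) | (counter & COUNTER_MASK)


def split_zxid(zxid):
    """Unpack a zxid into its (epoch, counter) pair."""
    return zxid >> 32, zxid & COUNTER_MASK


first = make_zxid(1, 5)
second = make_zxid(1, 6)
after_new_leader = make_zxid(2, 0)

# Comparing zxids as integers orders updates first by epoch, then by counter.
print(first < second < after_new_leader)
print(split_zxid(after_new_leader))
```

Because the epoch occupies the high bits, any update issued under a newer leader sorts after every update from an older epoch, regardless of the counter values.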
