原创链接:https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab+vs.+Paxos

Is Zab just a special implementation of Paxos?

No, Zab is a different protocol than Paxos, although it shares with it some key aspects, as for example:

  • A leader proposes values to the followers
  • Leaders wait for acknowledgements from a quorum of followers before considering a proposal committed (learned)
  • Proposals include epoch numbers, which are similar to ballot numbers in Paxos

The main conceptual difference between Zab and Paxos is that it is primarily designed for primary-backup systems, like Zookeeper, rather than for state machine replication.

What is the difference between primary-backup and state machine replication?

A state machine is a software component that processes a sequence of requests. For every processed request, it can modify its internal state and produce a reply. A state machine is deterministic in the sense that, given two runs where it receives the same sequence of requests, it always makes the same internal state transitions and produces the same replies.

A state machine replication system is a client-sever system ensuring that each state machine replica executes the same sequence of client requests, even if these requests are submitted concurrently by clients and received in different orders by the replicas. Replicas agree on the execution order of client requests using a consensus algorithm like Paxos. Client requests that are sent concurrently and overlap in time can be executed in any order. If a leader fails, a new leader that executes recovery is free to arbitrarily reorder any uncommitted request since it is not yet completed.

In the case of primary-backup systems, such as Zookeeper, replicas agree on the application order of incremental (delta) state updates, which are generated by a primary replica and sent to its followers. Unlike client requests, state updates must be applied in the exact original generation order of the primary, starting from the original initial state of the primary. If a primary fails, a new primary that executes recovery cannot arbitrarily reorder uncommitted state updates, or apply them starting from a different initial state.

In conclusion, agreement on state updates (for primary-backup systems) requires stricter ordering guarantees than agreement on client requests (for state machine replication systems).

What are the implications for agreement algorithms?

Paxos can be used for primary-backup replication by letting the primary be the leader. The problem with Paxos is that, if a primary concurrently proposes multiple state updates and fails, the new primary may apply uncommitted updates in an incorrect order. An example is presented in our DSN 2011 paper(Figure 1). In the example, a replica should only apply the state update B after applying A. The example shows that, using Paxos, a new primary and its follows may apply B after C, reaching an incorrect state that has not been reached by any of the previous primaries.

A workaround to this problem using Paxos is to sequentially agree on state updates: a primary proposes a state update only after it commits all previous state updates. Since there is at most one uncommitted update at a time, a new primary cannot incorrectly reorder updates. This approach, however, results in poor performance.

Zab does not need this workaround. Zab replicas can concurrently agree on the order of multiple state updates without harming correctness. This is achieved by adding one more synchronization phase during recovery compared to Paxos, and by using a different numbering of instances based on zxids.

Want to know more?

Have a look at our DSN 2011 paper, or contact us!

(转载)Zab vs. Paxos的更多相关文章

  1. ZAB与Paxos算法的联系与区别

    ZAB协议并不是Paxos算法的一个典型实现,在讲解ZAB和Paxos之间的区别之前,我们首先来看下两者的联系. 两者都存在一个类似于Leader进程的角色,由其负责协调多个Follow进程的运行. ...

  2. 3. ZAB与Paxos算法的联系与区别。

    转自:https://blog.csdn.net/en_joker/article/details/78665809 ZAB协议并不是Paxos算法的一个典型实现,在讲解ZAB和Paxos之间的区别之 ...

  3. ZAB 和 Paxos 算法的联系与区别?

    相同点: 1.两者都存在一个类似于 Leader 进程的角色,由其负责协调多个 Follower 进程的运行 2.Leader 进程都会等待超过半数的 Follower 做出正确的反馈后,才会将一个提 ...

  4. ZAB协议和Paxos算法

    前言在上一篇文章Paxos算法浅析中主要介绍了Paxos一致性算法应用的场景,以及对协议本身的介绍:Google Chubby是一个分布式锁服务,其底层一致性实现就是以Paxos算法为基础的:但这篇文 ...

  5. paxos(chubby) vs zab(Zookeeper)

    参考: Zookeeper的一致性协议:Zab Chubby&Zookeeper原理及在分布式环境中的应用 Paxos vs. Viewstamped Replication vs. Zab ...

  6. ZAB协议与Paxos算法

    ZooKeeper并没有直接采用Paxos算法,而是采用一种被称为ZAB(ZooKeeper Atomic Broadcast)的一致性协议 ZooKeeper是一个典型的分布式数据一致性的解决方案, ...

  7. Paxos、ZAB、RAFT协议

    这三个都是分布式一致性协议,ZAB基于Paxos修改后用于ZOOKEEPER协议,RAFT协议出现在ZAB协议之后,与ZAB差不多,也有很大区别. 1. Paxos 分布式节点分为3种角色, Prop ...

  8. Zookeeper协议篇-Paxos算法与ZAB协议

    前言 可以自行去学习一下Zookeeper中的系统模型,节点特性,权限认证以及事件通知Watcher机制相关知识,本篇主要学习Zookeeper一致性算法和满足分布式协调的Zab协议 Paxos算法 ...

  9. 分布式技术专题-分布式协议算法-带你彻底认识Paxos算法、Zab协议和Raft协议的原理和本质

    内容简介指南 Paxo算法指南 Zab算法指南 Raft算法指南 Paxo算法指南 Paxos算法的背景 [Paxos算法]是莱斯利·兰伯特(Leslie Lamport)1990年提出的一种基于消息 ...

随机推荐

  1. position属性详解

    内容: 1.position属性介绍 2.position属性分类 3.relative相对定位 4.absolute绝对定位 5.relative和absolute联合使用进行定位 6.fixed固 ...

  2. 关于json 转换BigDecimal精度丢失问题

    今天在转换一个关于金额字段发现一个关于json转换的bug  目前尚未深入观察 问题: 如果金钱为bigdecimal json转换后不会丢失精度 但是通过@responsebody 返回到前端后发现 ...

  3. 技术思维VS管理思维

    以下为技术思维与管理思维的不同 在日常的工作中,会出现身兼两职 开发和项目经理 的情况,在此就要学会游刃有余的切换角色,方能一人分身二角 角色转换本质上是思维转换.思维决定一个人的行为,项目经理不像项 ...

  4. rsyslog编译依赖问题解决

    bit-32-centos6.4测试loganalyzer+mysql+rsyslog   web界面中央日志分析系统.1,报json-c错误wget http://cloud.github.com/ ...

  5. 理解无偏估计(unbiased estimation)

    判断一个估计量“好坏”,至少可以从以下三个方面来考虑: 无偏估计 有效性 一致性 参考内容: 如何理解无偏估计量?https://www.matongxue.com/madocs/808.html 衡 ...

  6. LSTM(Long Short-Term Memory)长短期记忆网络

    1. 摘要 对于RNN解决了之前信息保存的问题,例如,对于阅读一篇文章,RNN网络可以借助前面提到的信息对当前的词进行判断和理解,这是传统的网络是不能做到的.但是,对于RNN网络存在长期依赖问题,比如 ...

  7. Requests库入门

    安装: $ pip install requests Response对象的一些基本属性: Response.status_code 请求的返回状态,正常为200 Response.text 页面的字 ...

  8. oracle跟踪sql语句

    oracle跟踪sql语句 select * from v$sql 查询客户端电脑名称的ID select terminal, SID,SERIAL#  from v$session where  ( ...

  9. IP Editor IP控件

    HWND hIpEdit; void __fastcall TForm2::FormCreate(TObject *Sender) { hIpEdit = CreateWindow(WC_IPADDR ...

  10. time 时间内置模块3种形态的转化

    import time print(time.time())  #获得时间戳 1526998642.877814 print(time.sleep(2))  #停止2秒 print(time.gmti ...