CAP理论中, P(partition tolerance, 分区容错性)的合理解释

在CAP理论中, 对partition tolerance分区容错性的解释一般指的是分布式网络中部分网络不可用时, 系统依然正常对外提供服务, 而传统的系统设计中往往将这个放在最后一位. 这篇文章对这个此进行了分析和重新定义, 并说明了在不同规模分布式系统中的重要性.

The ‘CAP’ theorem is a hot topic in the design of distributed data storage systems. However, it’s often widely misused. In this post I hope to highlight why the common ‘consistency, availability and partition tolerance: pick two’ formulation is inadequate for distributed systems. In fact, the lesson of the theorem is that the choice is almost always between sequential consistency and high availability.

It’s very common to invoke the ‘CAP theorem’ when designing, or talking about designing, distributed data storage systems. The theorem, as commonly stated, gives system designers a choice between three competing guarantees:

Consistency – roughly meaning that all clients of a data store get responses to requests that ‘make sense’. For example, if Client A writes 1 then 2 to location X, Client B cannot read 2 followed by 1.
Availability – all operations on a data store eventually return successfully. We say that a data store is ‘available’ for, e.g. write operations.
Partition tolerance – if the network stops delivering messages between two sets of servers, will the system continue to work correctly?

This is often summarised as a single sentence: “consistency, availability, partition tolerance. Pick two.”. Short, snappy and useful.

At least, that’s the conventional wisdom. Many modern distributed data stores, including those often caught under the ‘NoSQL’ net, pride themselves on offering availability and partition tolerance over strong consistency; the reasoning being that short periods of application misbehavior are less problematic than short periods of unavailability. Indeed, Dr. Michael Stonebraker posted an article on the ACM’s blog bemoaning the preponderance of systems that are choosing the ‘AP’ data point, and that consistency and availability are the two to choose. However for the vast majority of systems, I contend that the choice is almost always between consistency and availability, and unavoidably so.

Dr. Stonebraker’s central thesis is that, since partitions are rare, we might simply sacrifice ‘partition-tolerance’ in favour of sequential consistency and availability – a model that is well suited to traditional transactional data processing and the maintainance of the good old ACID invariants of most relational databases. I want to illustrate why this is a misinterpretation of the CAP theorem.

We first need to get exactly what is meant by ‘partition tolerance’ straight. Dr. Stonebraker asserts that a system is partition tolerant if processing can continue in both partitions in the case of a network failure.

“If there is a network failure that splits the processing nodes into two groups that cannot talk to each other, then the goal would be to allow processing to continue in both subgroups.”

This is actually a very strong partition tolerance requirement. Digging into the history of the CAP theorem reveals some divergence from this definition.

Seth Gilbert and Professor Nancy Lynch provided both a formalisation and a proof of the CAP theorem in their 2002 SIGACT paper. We should defer to their definition of partition tolerance – if we are going to invoke CAP as a mathematical truth, we should formalize our foundations, otherwise we are building on very shaky ground. Gilbert and Lynch define partition tolerance as follows:

“The network will be allowed to lose arbitrarily many messages sent from one node to another”

网络允许节点间通讯时丢失任意多的消息

Note that Gilbert and Lynch’s definition isn’t a property of a distributed application, but a property of the network in which it executes. This is often misunderstood: partition tolerance is not something we have a choice about designing into our systems. If you have a partition in your network, you lose either consistency (because you allow updates to both sides of the partition) or you lose availability (because you detect the error and shutdown the system until the error condition is resolved). Partition tolerance means simply developing a coping strategy by choosing which of the other system properties to drop. This is the real lesson of the CAP theorem – if you have a network that may drop messages, then you cannot have both availability and consistency, you must choose one. We should really be writing Possibility of Network Partitions => not(availability and consistency), but that’s not nearly so snappy.

Dr. Stonebraker’s definition of partition tolerance is actually a measure of availability – if a write may go to either partition, will it eventually be responded to? This is a very meaningful question for systems distributed across many geographic locations, but for the LAN case it is less common to have two partitions available for writes. However, it is encompassed by the requirement for availability that we already gave – if your system is available for writes at all times, then it is certainly available for writes during a network partition.

So what causes partitions? Two things, really. The first is obvious – a network failure, for example due to a faulty switch, can cause the network to partition. The other is less obvious, but fits with the definition from Gilbert and Lynch: machine failures, either hard or soft. In an asynchronous network, i.e. one where processing a message could take unbounded time, it is impossible to distinguish between machine failures and lost messages. Therefore a single machine failure partitions it from the rest of the network. A correlated failure of several machines partitions them all from the network. Not being able to receive a message is the same as the network not delivering it. In the face of sufficiently many machine failures, it is still impossible to maintain availability and consistency, not because two writes may go to separate partitions, but because the failure of an entire ‘quorum’ of servers may render some recent writes unreadable.

This is why defining P as ‘allowing partitioned groups to remain available’ is misleading – machine failures are partitions, almost tautologously, and by definition cannot be available while they are failed. Yet, Dr. Stonebraker says that he would suggest choosing CA rather than P. This feels rather like we are invited to both have our cake and eat it. Not ‘choosing’ P is analogous to building a network that will never experience multiple correlated failures. This is unreasonable for a distributed system – precisely for all the valid reasons that are laid out in the CACM post about correlated failures, OS bugs and cluster disasters – so what a designer has to do is to decide between maintaining consistency and availability. Dr. Stonebraker tells us to choose consistency, in fact, because availability will unavoidably be impacted by large failure incidents. This is a legitimate design choice, and one that the traditional RDBMS lineage of systems has explored to its fullest, but it implicitly protects us neither from availability problems stemming from smaller failure incidents, nor from the high cost of maintaining sequential consistency.

When the scale of a system increases to many hundreds or thousands of machines, writing in such a way to allow consistency in the face of potential failures can become very expensive (you have to write to one more machine than failures you are prepared to tolerate at once). This kind of nuance is not captured by the CAP theorem: consistency is often much more expensive in terms of throughput or latency to maintain than availability. Systems such as ZooKeeper are explicitly sequentially consistent because there are few enough nodes in a cluster that the cost of writing to quorum is relatively small. The Hadoop Distributed File System (HDFS) also chooses consistency – three failed datanodes can render a file’s blocks unavailable if you are unlucky. Both systems are designed to work in real networks, however, where partitions and failures will occur*, and when they do both systems will become unavailable, having made their choice between consistency and availability. That choice remains the unavoidable reality for distributed data stores.

From: http://blog.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/ .

下面说说我对CAP的理解:
1. A, 可用性, 主要是在高负载下的可用性, 以及低延迟响应. 这个在当前的系统设计中是排在第一位的, 尽量保证服务不会失去响应
2. C. 一致性, 强一致性, 或是时序一致性, 或是滞后的最终一致性. 分别代表了系统需要保障A和P的能力时, 在一致性上的妥协.
3. P. 容错性, 在节点间通信失败时保证系统不受影响. 对容错的要求提高会降低对可用性或一致性的期望, 要么停止系统用于错误恢复, 要么继续服务但是降低一致性

CAP理论中, P(partition tolerance, 分区容错性)的合理解释的更多相关文章

CAP理论中的P到底是个什么意思
在CAP理论中,C代表一致性,A代表可用性(在一定时间内,用户的请求都会得到应答),P代表分区容错.这里分区容错到底是指数据上的多个备份还是说其它的 ? 我感觉分布式系统中,CAP理论应该是C和A存在 ...
从CAP理论中分析Eureka与zookeeper的区别
著名的CAP理论指出,一个分布式系统不可能同时满足C(一致性).A(可用性)和P(分区容错性).由于分区容错性在是分布式系统中必须要保证的,因此我们只能在A和C之间进行权衡.在此Zookeeper保证 ...
CAP Confusion: Problems with ‘partition tolerance’
by Henry Robinson, April 26, 2010 The 'CAP' theorem is a hot topic in the design of distributed data ...
转载：分布式系统的CAP理论
原文转载Hollis原创文章:http://www.hollischuang.com/archives/666 2000年7月,加州大学伯克利分校的Eric Brewer教授在ACM PODC会议上提 ...
分布式领域CAP理论
分布式领域CAP理论,Consistency(一致性), 数据一致更新,所有数据变动都是同步的Availability(可用性), 好的响应性能Partition tolerance(分区容错性) 可 ...
数据库ACID和CAP理论
1.ACID是RDBMS的理论基石: A原子(Atomiclty )事务原子性: C一致(Consistency)插入一张表数据,会影响其它(索引/其它表)等一致. I ...
CAP理论和Base理论
CAP理论 Consistency(一致性), 数据一致更新,所有数据变动都是同步的 Availability(可用性), 好的响应性能 Partition tolerance(分区容错性) 可靠性, ...
分布式系统的CAP理论
一.CAP理论概述一个分布式系统最多只能同时满足一致性(Consistency).可用性(Availability)和分区容错性(Partition tolerance)这三项中的两项. 二.CAP ...
【D】分布式系统的CAP理论
2000年7月,加州大学伯克利分校的Eric Brewer教授在ACM PODC会议上提出CAP猜想.2年后,麻省理工学院的Seth Gilbert和Nancy Lynch从理论上证明了CAP.之后, ...

随机推荐

ORM(Object-Relational Mapping 对象关系映射)如何实现（转）
原文链接:http://blog.163.com/hzd_love/blog/static/13199988120107891854473/ 1.什么是ORM ORM的全称是Object Relati ...
GraphX中Pregel单源点最短路径（转）
原文链接:GraphX中Pregel单源点最短路径 GraphX中的单源点最短路径例子,使用的是类Pregel的方式. 核心部分是三个函数: 1.节点处理消息的函数 vprog: (VertexId ...
cesium and three.js【转】
https://blog.csdn.net/zhishiqu/article/details/79077883 这是威尔逊Muktar关于整合Three.js与铯的客人帖子.Three.js是一个轻量 ...
SharePoint 2013 基于表单 Membership 的身份验证
其实关于SharePoint 2013 表单身份验证网上已经有很多了,比如SharePoint 2013自定义Providers在基于表单的身份验证(Forms-Based-Authenticatio ...
硬链接(hard link)和符号连接(symbolic link)
inode ====== 在Linux系统中,内核为每一个新创建的文件分配一个inode,每个文件都有一个惟一的inode号,我们可以将inode简单理解成一个指针,它永远指向本文件的具体存储位置.文 ...
C# 简单日志文本输出
第一种直接文件IO流写日志文件 using System.IO; public static void WriteLog(string strLog) { string sFilePath=&qu ...
vue嵌套路由总结
嵌套路由就是在一个被路由过来的页面下可以继续使用路由,嵌套也就是路由中的路由的意思. 比如在vue中,我们如果不使用嵌套路由,那么只有一个<router-view>,但是如果使用,那么在一 ...
sql-获取指定年份指定月份的天数
declare @年月 varchar(6) set @年月= '201803' --查询2015年2月有多少天 select day(dateadd(month,1,@年月+ '01 ')-1)
ASP.NET MVC 重命名[命名空间]而导致的错误及发现的ASP.NET MVC Bug一枚
使用VS2012新建了一个Asp.net mvc5的项目,并把项目的命名空间名称更改了(Src更改为UXXXXX),然后就导致了以下错误刚开始以后是项目的属性中的命名空间没有更改过来的问题,但我在重 ...
Direct2D教程II——绘制基本图形和线型（StrokeStyle）的设置详解
目前,在博客园上,相对写得比较好的两个关于Direct2D的教程系列,分别是万一的Direct2D系列和zdd的Direct2D系列.有兴趣的网友可以去看看.本系列也是介绍Direct2D的教程,是基 ...

CAP理论中, P(partition tolerance, 分区容错性)的合理解释

CAP理论中, P(partition tolerance, 分区容错性)的合理解释的更多相关文章

随机推荐

热门专题