The Many Faces of Consistency
The notion of consistency is used across different computer science disciplines from distributed systems to database systems to computer architecture. It turns out that consistency can mean quite different things across these disciplines, depending on who uses it and in what context it appears. We identify two broad types of consistency, state consistency and operation consistency, which differ fundamentally in meaning and scope. We explain how these types map to the many examples of consistency in each discipline.
1 Introduction
Consistency is an important consideration in computer systems that share and replicate data. Whereas early computing systems had private data exclusively, shared data has become increasingly common as computers have evolved from calculating machines to tools of information exchange. Shared data occurs in many types of systems, from distributed systems to database systems to multiprocessor systems. For example, in distributed systems, users across the network share files (e.g., source code), network names (e.g., DNS entries), data blobs (e.g., images in a key-value store), or system metadata (e.g., configuration information). In database systems, users share tables containing account information, product descriptions, flight bookings, and seat assignments. Within a computer, processor cores share cache lines and physical memory.
In addition to sharing, computer systems increasingly replicate data within and across components. In distributed systems, each site may hold a local replica of files, network names, blobs, or system metadata; these replicas, called caches, increase performance of the system. Database systems also replicate rows or tables for speed or to tolerate disasters. Within a computer, parts of memory are replicated at various points in the cache hierarchy (L1, L2, L3 caches), again for speed. We use the term replica broadly to mean any copy of the data maintained by the system.
In all these systems, data sharing and replication raise a fundamental question: what should happen if a client modifies some data items and simultaneously, or within a short time, another client reads or modifies the same items, possibly at a different replica?
This question does not have a single answer that is right in every context. A consistency property governs the possible outcomes by limiting how data can change or what clients can observe in each case. For example, with DNS, a change to a domain may not be visible for hours; the only guarantee is that updates will be seen eventually, an example of a property called eventual consistency [23]. But with flight seat assignments, updates must be immediate and mutually exclusive, to ensure that no two passengers receive the same seat, an example of a strong type of consistency provided by serializability [5]. Other consistency properties include causal consistency [13], read-my-writes [21], bounded staleness [1], continuous consistency [1, 25], release consistency [10], fork consistency [16], epsilon serializability [18], and more.
Consistency is important because developers must understand the answer to the above fundamental question. This is especially true when the clients interacting with the system are not humans but other computer programs that must be coded to deal with all possible outcomes.
In this article, we examine many examples of how consistency is used across three computer science disciplines: distributed systems, database systems, and computer architecture. We find that the use of consistency varies significantly across these disciplines. To bring some clarity, we identify two fundamentally different types of consistency: state consistency and operation consistency. State consistency concerns the state of the system and establishes constraints on the allowable relationships between different data items or different replicas of the same items. For instance, state consistency might require that two replicas store the same value when updates are not outstanding. Operation consistency concerns operations on the system and establishes constraints on what results they may return. For instance, operation consistency might require that a read of a file reflects the contents of the most recent write on that file. State consistency tends to be simpler and application dependent, while operation consistency tends to be more complex and application agnostic. Both types of consistency are important and, in our opinion, our communities should more clearly disentangle them.
While this article discusses different forms of consistency, it focuses on the semantics of consistency rather than the mechanisms of consistency. Semantics refer to what consistency properties the system provides, while mechanisms refer to how the system enforces those properties. Semantics and mechanisms are closely related but it is important to understand the former without needing to understand the latter.
The rest of this article is organized as follows. We first explain the abstract system model and terminology used throughout the article in Section 2. We present the two types of consistency and their various embodiments in Section 3. We indicate how these consistency types occur across different disciplines in Section 4.
2 Abstract model
We consider a setting with multiple clients that submit operations to be executed by the system. Clients could be human users, computer programs, or other systems that do not concern us. Operations might include simple read and write, read-modify-write, start and commit a transaction, and range queries. Operations typically act on data items, which could be blocks, files, key-value pairs, DNS entries, rows of tables, memory locations, and so on.
The system has a state, which includes the current values of the data items. In some cases, we are interested in the consistency of client caches and other replicas. In these cases, the caches and other replicas are considered to be part of the system and the system state includes their contents.
An operation execution is not instantaneous; rather, it starts when a client submits the operation, and it finishes when the client obtains its response from the system. If the operation execution returns no response, then it finishes when the system is no longer actively processing it.
Operations are distinct from operation executions. Operations are static and a system has relatively few of them, such as read and write. Operation executions, on the other hand, are dynamic and numerous. A client can execute the same operation many times, but each operation execution is unique. While technically we should separate operations from operation executions, we often blur the distinction when it is clear from the context (e.g., we might say that the read operation finishes, rather than the execution of the read operation finishes).
3 Two types of consistency
We are interested in what happens when shared and replicated data is accessed concurrently or nearly concurrently by many clients. Generally speaking, consistency places constraints on the allowable outcomes of operations, according to the needs of the application. We now define two broad types of consistency. One places constraints on the state, the other on the results of operations.
3.1 State consistency
State consistency pertains to the state of the system; it consists of properties that users expect the state to satisfy despite concurrent access and the existence of multiple replicas. State consistency is also applicable when data can be corrupted by errors (crashes, bit flips, bugs, etc.), but this is not the focus of this article.
State consistency can be of many subcategories, based on how the properties of state are expressed. We explain these subcategories next.
3.1.1 Invariants
The simplest subcategory of state consistency is one defined by an invariant—a predicate on the state that must evaluate to true. For instance, in a concurrent program, a singly linked list must not contain cycles. In a multiprocessor system, if the local caches of two processors keep a value for some address, it must be the same value. In a social network, if user x is a friend of user y then y is a friend of x. In a photo sharing application, if a photo album includes an image then the image’s owner is the album.
In database systems, two important examples are uniqueness constraints and referential integrity. A uniqueness constraint on a column of a table requires that each value appearing in that column must occur in at most one row. This property is crucial for the primary keys of tables.
Referential integrity concerns a table that refers to keys of another table. Databases may store relations between tables by including keys of a table within columns in another table. Referential integrity requires that the included keys are indeed keys in the first table. For instance, in a bank database, suppose that an accounts table includes a column for the account owner, which is a user id; meanwhile, the user id is the primary key in a users table, which has detailed information for each user. A referential integrity constraint requires that user ids in the accounts table must indeed exist in the users table.
Another example of state consistency based on invariants is mutual consistency, used in distributed systems that are replicated using techniques such as primary-backup [2]. Mutual consistency requires that replicas have the same state when there are no outstanding updates. During updates, replicas may diverge temporarily since the updates are not applied simultaneously at all replicas.
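Because invariants are predicates on state, they can be evaluated against a snapshot of that state. The following is a minimal Python sketch of the three database and replication invariants described above. It is not taken from the article: the function names, the representation of rows as dictionaries, and the column names in the usage example are illustrative assumptions.

```python
def satisfies_uniqueness(rows, column):
    """Uniqueness constraint: each value in `column` occurs in at most one row."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))


def satisfies_referential_integrity(referring_rows, foreign_key, referenced_rows, primary_key):
    """Every foreign-key value in the referring table exists as a key in the referenced table."""
    existing_keys = {row[primary_key] for row in referenced_rows}
    return all(row[foreign_key] in existing_keys for row in referring_rows)


def satisfies_mutual_consistency(replica_states):
    """All replicas hold the same state (meaningful only when no updates are outstanding)."""
    return all(state == replica_states[0] for state in replica_states)


# Hypothetical bank data: the account owned by user 3 refers to a user that does not exist.
users = [{"user_id": 1}, {"user_id": 2}]
accounts = [{"account_id": 10, "owner": 1}, {"account_id": 11, "owner": 3}]
print(satisfies_uniqueness(users, "user_id"))
print(satisfies_referential_integrity(accounts, "owner", users, "user_id"))
```

Each check inspects only the current state, never the history of operations that produced it.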
3.1.2 Error bounds
If the state contains numerical data, the consistency property could indicate a maximum deviation or error from the expected value. For instance, the values at two replicas may diverge by at most ε. In an internet-of-things system, the reported value of a sensor, such as a thermometer, must be within ε of the actual value being measured. This example relates the state of the system to the state of the world. Error bounds were first proposed within the database community [1] and the basic idea was later revived in the distributed systems community [25].
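An error bound can also be checked directly on a snapshot of the state. The sketch below is illustrative only; the function names and the choice of integer readings are assumptions made for the example.

```python
def replicas_within_bound(replica_values, epsilon):
    """True if any two replica values differ by at most epsilon."""
    return max(replica_values) - min(replica_values) <= epsilon


def sensor_within_bound(reported, actual, epsilon):
    """True if the reported sensor value is within epsilon of the measured quantity."""
    return abs(reported - actual) <= epsilon


# Hypothetical readings held by three replicas.
print(replicas_within_bound([20, 21, 22], epsilon=2))
print(replicas_within_bound([20, 21, 22], epsilon=1))
```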
3.1.3 Limits on proportion of violations
If there are many properties or invariants, it may be unrealistic to expect all of them to hold; instead, the system may require only that a high percentage of them hold. For instance, the system may require that at most one user’s invariants are violated in a pool of a million users; this could make sense if the system can compensate a small fraction of users for inconsistencies in their data.
3.1.4 Importance
Properties or invariants may be critical, important, advisable, desirable, or optional, where users expect only the critical properties to hold at all times. Developers can use more expensive and effective mechanisms for the more important invariants. For instance, when a user changes her password at a web site, the system might require all replicas of the user account to have the same password before acknowledging the change to the user. This property is implemented by contacting all replicas and waiting for replies, which can be an overly expensive mechanism for less important properties.
3.1.5 Eventual invariants
An invariant may need to hold only after some time has passed. For example, under eventual consistency, replicas need not be the same at all times, as long as they eventually become the same when updates stop occurring. This eventual property is appropriate because replicas may be updated in the background or using some anti-entropy mechanism, where it takes an indeterminate amount of time for a replica to receive and process an update. Eventual consistency was coined by the distributed systems community [23], though the database community previously proposed the idea of reconciling replicas that diverge during partitions [9].
State consistency is limited to properties on state, but in many cases clients care less about the state and more about the results that they obtain from the system. In other words, what matters is the behavior that clients observe from interacting with the system. These cases call for a different form of consistency, which we discuss next.
3.2 Operation consistency
Operation consistency pertains to the operation executions by clients; it consists of properties that indicate whether operations return acceptable results. These properties can tie together many operation executions, as shown in the examples below.
Operation consistency has subcategories, with different ways to define the consistency property. We explain these subcategories next.
3.2.1 Sequential equivalence
This subcategory defines the permitted operation results of a concurrent execution in terms of the permitted operation results in a sequential execution, that is, one in which operations are executed one at a time, without concurrency. More specifically, there must be a way to take the execution of all operations submitted by any subset of clients, and then reduce them to a sequential execution that is correct. The exact nature of the reduction depends on the specific consistency property. Technically, the notion of a correct sequential execution is system dependent, so it needs to be specified as well, but it is often obvious and therefore omitted.
We now give some examples of sequential equivalence.
Linearizability [12] is a strong form of consistency. Intuitively, the constraint is that each operation must appear to occur at an instantaneous point between its start and finish times, where execution at these instantaneous points forms a valid sequential execution. More precisely, we define a partial order < from the concurrent execution, as follows: op1 < op2 iff op1 finishes before op2 starts. There must exist a legal total order T of all operations with their results, such that (1) T is consistent with <, meaning that if op1 < op2 then op1 appears before op2 in T, and (2) T defines a correct sequential execution. Linearizability has been traditionally used to define the correct behavior of concurrent data structures; more recently, it has also been used in distributed systems.
Sequential consistency [14] is also a strong form of consistency, albeit weaker than linearizability. Intuitively, it requires that operations execute as if they were totally ordered in a way that respects the order in which each client issues operations. More precisely, we define a partial order < as follows: op1 < op2 iff both operations are executed by the same client and op1 finishes before op2 starts. There must exist a total order T such that (1) T is consistent with <, and (2) T defines a correct sequential execution. These conditions are similar to linearizability, except that < reflects just the local order of operations at each client. Sequential consistency is used to define a strongly consistent memory model of a computer, but it could also be used in the context of concurrent data structures.
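The two definitions above differ only in the partial order < that the total order T must respect. The following brute-force Python sketch makes that difference concrete for a single read/write register. It is an illustration under stated assumptions, not a practical checker: it tries every permutation (exponential cost), it assumes a register whose initial value is 0, and the `Op` record and all function names are hypothetical.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    client: str
    kind: str      # "write" or "read"
    value: int     # value written, or value returned by the read
    start: int     # time at which the client submits the operation
    finish: int    # time at which the client obtains the response


def is_correct_sequential(order, initial=0):
    """A sequential register execution is correct if every read returns the latest write."""
    current = initial
    for op in order:
        if op.kind == "write":
            current = op.value
        elif op.value != current:
            return False
    return True


def precedes(op1, op2, same_client_only):
    """The partial order <: op1 finishes before op2 starts (optionally within one client only)."""
    if same_client_only and op1.client != op2.client:
        return False
    return op1.finish < op2.start


def exists_total_order(history, same_client_only):
    """Search for a total order T that is consistent with < and is a correct sequential execution."""
    for order in permutations(history):
        consistent = all(
            not precedes(order[j], order[i], same_client_only)
            for i in range(len(order))
            for j in range(i + 1, len(order))
        )
        if consistent and is_correct_sequential(order):
            return True
    return False


def is_linearizable(history):
    return exists_total_order(history, same_client_only=False)


def is_sequentially_consistent(history):
    return exists_total_order(history, same_client_only=True)


# Client B reads the old value 0 after client A's write of 1 has already finished.
stale_read = [Op("A", "write", 1, 0, 1), Op("B", "read", 0, 2, 3)]
print(is_linearizable(stale_read), is_sequentially_consistent(stale_read))
```

In this history the stale read violates linearizability, because the write finishes before the read starts, yet it is allowed by sequential consistency, because the two operations belong to different clients and may be ordered read-then-write.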
The next examples pertain to systems that support transactions. Intuitively, a transaction is a bundle of one or more operations that must be executed as a whole. More precisely, there are special operations to start, commit, and abort transactions; and operations on data items are associated with a transaction. The system provides an isolation property, which ensures that transactions do not significantly interfere with one another. There are many isolation properties: serializability, strong session serializability, order-preserving serializability, snapshot isolation, read committed, repeatable reads, etc. All of these are forms of operation consistency, and several of them are of the sequential equivalence subcategory. Here are some examples, all of which are used in the context of database systems.
Serializability [5] intuitively guarantees that each transaction appears to execute in series. More precisely, serializability imposes a constraint on the operations in a system: the schedule corresponding to those operations must be equivalent to a serial schedule of transactions. The serial schedule is called a serialization of the schedule.
Strong session serializability [8] addresses an issue with serializability. Serializability allows transactions of the same client to be reordered, which can be undesirable at times. Strong session serializability imposes additional constraints on top of serializability. More precisely, each transaction is associated with a session, and the constraint is that serializability must hold (as defined above) and the serialization must respect the order of transactions within every session: if transaction T1 occurs before T2 in the same session, then T2 is not serialized before T1.
Order-preserving serializability [24], also called strict serializability [6, 17] or strong serializability [7], requires that the serialization order respect the real-time ordering of transactions. More precisely, the constraint is that serializability must hold and the serialization must satisfy the requirement that, if transaction T1 commits before T2 starts, then T2 is not serialized before T1.
3.2.2 Reference equivalence
Reference equivalence is a generalization of sequential equivalence. It defines the permitted operation results by requiring the concurrent execution to be equivalent to a given reference, where the notion of equivalence and the reference depend on the consistency property. We now give some examples for systems with transactions. These examples occur often in the context of database systems.
Snapshot isolation [4] requires that transactions behave identically to a certain reference implementation, that is, transactions must have the same outcome as in the reference implementation, and operations must return the same results. The reference implementation is as follows. When a transaction starts, it gets assigned a monotonic start timestamp. When the transaction reads data, it reads from a snapshot of the system as of the start timestamp. When a transaction T1 wishes to commit, the system obtains a monotonic commit timestamp and verifies whether there is some other transaction T2 such that (1) T2 updates some item that T1 also updates, and (2) T2 has committed with a commit timestamp between T1’s start and commit timestamp. If so, then T1 is aborted; otherwise, T1 is committed and all its updates are applied instantaneously as of the time of T1’s commit timestamp.
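The reference implementation described above can be sketched in a few lines of Python. This is a single-threaded illustration, not a production design: the class and method names are hypothetical, one shared counter stands in for the monotonic timestamps, and letting a transaction read its own buffered writes is an assumption that the description above does not spell out.

```python
import itertools


class SnapshotStore:
    def __init__(self):
        self.clock = itertools.count(1)   # source of monotonic timestamps
        self.versions = {}                # key -> list of (commit_ts, value), oldest first

    def begin(self):
        return Transaction(self, start_ts=next(self.clock))


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}                  # updates buffered until commit

    def read(self, key):
        if key in self.writes:            # assumption: a transaction sees its own writes
            return self.writes[key]
        # Read from the snapshot as of the start timestamp.
        visible = [value for ts, value in self.store.versions.get(key, [])
                   if ts <= self.start_ts]
        return visible[-1] if visible else None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        commit_ts = next(self.store.clock)
        # Abort if another transaction updated one of our items and committed
        # between our start timestamp and our commit timestamp.
        for key in self.writes:
            for ts, _ in self.store.versions.get(key, []):
                if self.start_ts < ts < commit_ts:
                    return False
        # Otherwise apply all updates as of the commit timestamp.
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((commit_ts, value))
        return True
```

For example, if two transactions start concurrently and both write the same key, the first to commit succeeds and the second aborts, while a third transaction that started before either commit keeps reading the older value from its snapshot.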
Interestingly, the next two properties are examples of reference equivalence where the reference is itself defined by another consistency property. This other property is in the sequential equivalence subcategory in the first example, and it is in the reference equivalence subcategory in the second example.
One-copy serializability [5] pertains to a replicated database system. The replicated system must behave like a reference system, which is a system that is not replicated and provides serializability.
One-copy snapshot isolation [15] also pertains to a replicated system. The requirement is that it must behave like a system that is not replicated and that provides snapshot isolation.
3.2.3 Read-write centric
The above subcategories of operation consistency apply to systems with arbitrary operations. The read-write centric subcategory applies to systems with two very specific operations: read and write. These systems are important because they include many types of storage systems, such as block storage systems, key-value storage systems, and processors accessing memory. By focusing on the two operations, this subcategory permits properties that directly evoke the semantics of the operations. In particular, a write operation returns no information other than an acknowledgment or error status, which has no consistency implications. Thus, the consistency properties focus on the results of reads. Common to these properties is the notion of a read seeing the values of a set of writes, as we now explain. Each read is affected by some writes in the system; if every write covers the entire data item, then writes overwrite each other and the read returns the value written by one of them. But if the writes update just part of a data item, the read returns a combination of the written values in some appropriate order. In either case, the crucial consideration is the set of writes that could have potentially affected the read, irrespective of whether the writes are partial or not; we say that the read sees those writes. This notion is used to define several known consistency properties, as we now exemplify.
Read-my-writes [21] requires that a read by a client sees at least all writes previously executed by the same client, in the order in which they were executed. This property is relevant when clients expect to observe their own writes, but can tolerate delays before observing the writes of others. Typically, read-my-writes is combined with another read-write consistency property, such as bounded staleness or operational eventual consistency, defined below. By combined we mean that the system must provide both read-my-writes and the other property. Read-my-writes was originally defined in the context of distributed systems [21], then used in computer architecture to define memory models [19].
Bounded staleness [1], intuitively, bounds the time it takes for writes to be seen by reads. More precisely, the property has a parameter δ, such that a read must see at least all writes that complete δ time before the read started. This property is relevant when inconsistencies are tolerable in the short term as defined by δ, or when time intervals smaller than δ are imperceptible by clients (e.g., δ is in the tens of milliseconds and clients are humans). Bounded staleness was originally defined in the context of database systems [1] and has been used more recently in the context of cloud distributed systems [20].
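As an illustration of bounded staleness, the sketch below checks a recorded history of a single data item against the parameter δ. It relies on simplifying assumptions that are not part of the definition: every write covers the whole item, writes do not overlap and are listed in the order they took effect, and written values are distinct, so that a read "sees" a write exactly when it returns the value of that write or of a later one. The function name and the tuple layout are hypothetical.

```python
def satisfies_bounded_staleness(writes, reads, delta):
    """writes: list of (finish_time, value) in write order, with distinct values.
    reads: list of (start_time, returned_value).
    Each read must see every write that completed at least delta before the read started."""
    position = {value: index for index, (_, value) in enumerate(writes)}
    for start_time, returned in reads:
        required = [index for index, (finish_time, _) in enumerate(writes)
                    if finish_time <= start_time - delta]
        if not required:
            continue  # no write is old enough to be mandatory for this read
        if returned not in position or position[returned] < max(required):
            return False
    return True


# Hypothetical history: "a" is written by time 1 and "b" by time 5; delta is 2.
writes = [(1, "a"), (5, "b")]
print(satisfies_bounded_staleness(writes, [(6, "a")], delta=2))
print(satisfies_bounded_staleness(writes, [(8, "a")], delta=2))
```

The read that starts at time 6 may still return "a", because "b" completed only one time unit earlier; the read that starts at time 8 may not.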
Operational eventual consistency is a variant of eventual consistency (a form of state consistency) defined using operation consistency. The requirement is that each write be eventually seen by all reads, and if clients stop executing writes then eventually every read returns the same latest value [22].
Cache coherence originates from computer architecture to define the correct behavior of a memory cache. Intuitively, cache coherence requires that reads and writes to an individual data item (a memory location) satisfy some properties. The properties vary across the literature. One possibility [11] is to require that, for each data item: (1) a read by some client returns the value of the previous write by that client, unless another client has written in between, (2) a read returns the value of a write by another client if the write and read are sufficiently separated in time and if no other write occurred in between, and (3) writes are serialized.
3.3 Discussion
We now compare state consistency and operation consistency in terms of their level of abstraction, complexity, power, and application dependence.
3.3.1 Level of abstraction
Operation consistency is an end-to-end property, because it deals with results that clients can observe directly. This is in contrast to state consistency, which deals with system state that clients observe indirectly by executing operations. In other words, operation consistency is at a higher level of abstraction than state consistency. As a result, a system might have significant state inconsistencies, but hide these inconsistencies externally to provide a strong form of operation consistency.
An interesting example is a storage system with three servers replicated using majority quorums [3], where (1) to write data, the system attaches a monotonic timestamp and stores the data at two (a majority of) servers, and (2) to read, the system fetches the data from two servers; if the servers return the same data, the system returns the data to the client; otherwise, the system picks the data with the highest timestamp, stores that data and its timestamp in another server (to ensure that two servers have the data), and returns the data to the client. This system violates mutual consistency, because when there are no outstanding operations, one of the servers deviates from the other two. However, this inconsistency is not observable in the results returned by reads, since a read filters out the inconsistent server by querying a majority. In fact, this storage system satisfies linearizability, one of the strongest forms of operation consistency.
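The three-server example can be simulated in a few lines to show state inconsistency coexisting with consistent read results. This is a single-process sketch under stated assumptions: the class and method names are hypothetical, a local counter stands in for the monotonic timestamps, the caller chooses which two servers form the quorum, and the "other server" that receives the write-back is taken to be the stale server among the two queried. It does not model concurrency or failures, so it illustrates the behavior but does not demonstrate linearizability.

```python
import itertools


class QuorumStore:
    """Three servers, each holding one (timestamp, data) pair for a single data item."""

    def __init__(self):
        self.servers = [(0, None)] * 3
        self.clock = itertools.count(1)   # assumption: one global monotonic counter

    def write(self, data, targets=(0, 1)):
        assert len(set(targets)) == 2, "a majority of three servers is two distinct servers"
        stamped = (next(self.clock), data)
        for server in targets:
            self.servers[server] = stamped

    def read(self, targets=(1, 2)):
        assert len(set(targets)) == 2, "a majority of three servers is two distinct servers"
        a, b = targets
        if self.servers[a] == self.servers[b]:
            return self.servers[a][1]
        # The replies differ: pick the data with the highest timestamp and
        # write it back so that two servers hold it before returning.
        newest = max(self.servers[a], self.servers[b], key=lambda pair: pair[0])
        self.servers[a] = self.servers[b] = newest
        return newest[1]

    def mutually_consistent(self):
        return all(state == self.servers[0] for state in self.servers)


store = QuorumStore()
store.write("v1", targets=(0, 1))
print(store.mutually_consistent())       # server 2 never received the write
print(store.read(targets=(0, 1)))
```

After the write completes there are no outstanding operations, yet server 2 still differs from the other two. Any pair of servers chosen for a read includes at least one server that holds the latest write, so the stale server never determines the result.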
3.3.2 Complexity
Operation consistency is more complex than state consistency. With state consistency, developers gain a direct understanding of what states they can expect from the system. Each property concerns specific data items that do not depend on the execution. As a result, state consistency is intuitive and simple to express and understand. Moreover, state consistency can be checked by analyzing a snapshot of the system state, which facilitates debugging.
操作一致性比状态一致性更复杂。 通过状态一致性,开发人员可以直接了解他们可以从系统中获得哪些状态。 每个属性都涉及不依赖于执行的特定数据项。 因此,状态一致性直观且易于表达和理解。 而且,可以通过分析系统状态快照来检查状态一致性,从而方便调试。
By contrast, operation consistency properties establish relations between operations that are spread over time and possibly over many clients, which creates complexity. This complexity makes operation consistency less intuitive and harder to understand, as can be observed from the examples in Section 3.2. Moreover, checking operation consistency requires analyzing an execution log, which complicates debugging.
相比之下,操作一致性属性在随时间分布并且可能分布在许多客户端的操作之间建立关系,这会产生复杂性。 这种复杂性使得操作一致性变得不那么直观并且更难以理解,从 3.2 节中的示例可以看出。 而且,检查操作一致性需要分析执行日志,这使得调试变得复杂。
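The difference in checking effort can be made concrete with a small sketch. Both functions and the log format are hypothetical, and the operation check is deliberately simplified: it assumes a log of completed, non-overlapping operations, whereas a real checker must also handle concurrent operations.

```python
def snapshot_is_mutually_consistent(replicas):
    """State check: inspect one snapshot; every replica must hold the same contents."""
    return all(r == replicas[0] for r in replicas[1:])


def first_stale_read(log):
    """Operation check: scan a log of (op, key, value) tuples in execution order.

    Returns the index of the first read that does not return the most recent
    preceding write to its key, or None if there is no such read.
    """
    latest = {}
    for index, (op, key, value) in enumerate(log):
        if op == "write":
            latest[key] = value
        elif op == "read" and latest.get(key) != value:
            return index
    return None


snapshot = [{"x": 1}, {"x": 1}, {"x": 2}]
log = [("write", "x", 1), ("read", "x", 1), ("write", "x", 2), ("read", "x", 1)]

state_ok = snapshot_is_mutually_consistent(snapshot)
stale_index = first_stale_read(log)
```

The state check needs only the current contents of the replicas, while the operation check must carry history (`latest`) across the whole log, even in this simplified form.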
3.3.3 Power
Operation consistency and state consistency have different powers. Operation consistency can see all operations in the system, which permits constraining the ordering and results of operations. If the system is deterministic, operation consistency properties can reconstruct the state of the system from the operations, and thereby indirectly constrain the state much like state consistency. But doing so is not generally possible when the system is non-deterministic (e.g., due to concurrency, timing, or external events).
操作一致性和状态一致性具有不同的权力。 操作一致性可以看到系统中的所有操作,这允许约束操作的顺序和结果。 如果系统是确定性的,操作一致性属性可以从操作中重建系统的状态,从而像状态一致性一样间接约束状态。 但当系统不确定时(例如,由于并发、计时或外部事件),这样做通常是不可能的。
State consistency, on the other hand, can see the entire state of the system, which permits constraining operations that might break the state. If the system records all its operations in its state, then state consistency can indirectly constrain the results of operations much like operation consistency.[^1] However, it is often prohibitive to record all operations, so this is only a theoretical capability.
另一方面,状态一致性可以看到系统的整个状态,这允许限制可能破坏状态的操作。 如果系统在其状态中记录其所有操作,则状态一致性可以间接约束操作的结果,就像操作一致性一样。[^1] 但是,记录所有操作通常是禁止的,因此这只是一种理论上的能力。
[^1]: It is even possible to constrain all operations of the entire execution, though enforcing such constraints would be hard. 甚至可以约束整个执行的所有操作,尽管强制执行此类约束会很困难。
3.3.4 Application dependence
State consistency tends to be application dependent, because the properties concern state, and the correct state of a system varies significantly from application to application. As a result, developers need to figure out the right properties for each system, which takes time and effort. Moreover, in some cases there are no general mechanisms to enforce state consistency and developers must write application code that is closely tied to each property. There are two noteworthy exceptions: mutual consistency and eventual consistency. These properties apply broadly to any replicated system, by referring to the replicated state irrespective of the application, and there are general replication mechanisms to enforce such properties.
状态一致性往往依赖于应用程序,因为属性涉及状态,并且系统的正确状态因应用程序而异。 因此,开发人员需要为每个系统找出正确的属性,这需要时间和精力。 此外,在某些情况下,没有通用机制来强制状态一致性,开发人员必须编写与每个属性紧密相关的应用程序代码。 有两个值得注意的例外:相互一致性和最终一致性。 这些属性通过引用复制状态而广泛适用于任何复制系统,而与应用程序无关,并且存在通用复制机制来强制执行这些属性。
Operation consistency is often application independent. It achieves application independence in two ways. First, some properties factor out the application-specific behavior, by reducing the behavior of the system under concurrent operations to behavior under sequential operations (as in the sequential equivalence subcategory), or behavior under a reference (as in the reference equivalence subcategory). Second, some properties focus on specific operations, such as read and write, that apply to many systems (as in the read-write centric subcategory). Theoretically, operation consistency can be highly application dependent, but this is not common. An example might be an email system accessible by many devices, where each operation (read, delete, move) might have different constraints on its response according to its semantics and the expectations of users.
操作一致性通常与应用程序无关。 它通过两种方式实现应用程序独立性。 首先,一些属性通过将并发操作下的系统行为减少为顺序操作下的行为(如顺序等效子类别中的行为)或引用下的行为(如引用等效子类别中的行为)来分解特定于应用程序的行为。 其次,某些属性专注于适用于许多系统的特定操作,例如读取和写入(如在以读写为中心的子类别中)。 理论上,操作一致性可能高度依赖于应用程序,但这并不常见。 一个示例可能是可由许多设备访问的电子邮件系统,其中每个操作(读取、删除、移动)可能根据其语义和用户的期望对其响应有不同的限制。
3.3.5 Which type to use?
To decide what type of consistency to use, we suggest taking a few things into consideration. First, think about the negation of consistency: what are the inconsistencies that must be avoided? If the answer is most easily described by an undesirable state (e.g., two replicas diverge), then use state consistency. If the answer is most easily described by an incorrect result to an operation (e.g., a read returns stale data), then use operation consistency.
要决定使用哪种类型的一致性,我们建议考虑一些事项。 首先,思考一致性的否定:哪些是必须避免的不一致? 如果答案最容易通过不良状态来描述(例如,两个副本存在分歧),则使用状态一致性。 如果答案最容易通过操作的错误结果来描述(例如,读取返回过时的数据),则使用操作一致性。
A second important consideration is application dependency. Many operation consistency and some state consistency properties are application independent (e.g., serializability, linearizability, mutual consistency, eventual consistency). We recommend trying to use such properties before defining an application-specific one, because the mechanisms to enforce them are well understood. If the system requires an application-specific property, and state and operation consistency are both natural choices, then we recommend using state consistency due to its simplicity.
第二个重要的考虑因素是应用程序依赖性。 许多操作一致性和一些状态一致性属性是独立于应用程序的(例如,可串行性、可线性性、相互一致性、最终一致性)。 我们建议在定义特定于应用程序的属性之前尝试使用此类属性,因为强制执行它们的机制很容易理解。 如果系统需要应用程序特定的属性,并且状态和操作一致性都是自然的选择,那么我们建议使用状态一致性,因为它很简单。
4 Consistency in different disciplines
不同学科的一致性
We now discuss what consistency means in each discipline, why it is relevant in that discipline, and how it relates to the two types of consistency in Section 3. We also point out concepts that are considered to be consistency in one discipline but not in another.
我们现在讨论一致性在每个学科中的含义,为什么它与该学科相关,以及它与第 3 节中的两种类型的一致性有何关系。我们还指出了在一个学科中被认为是一致性但在另一个学科中则不然的概念 。
4.1 Distributed systems
In distributed systems, consistency refers to either state or operation consistency. Early replication protocols focused on providing mutual consistency while many cloud distributed systems provide eventual consistency. These are examples of state consistency. Some systems aim at providing linearizability or various flavors of read-write centric consistency. These are examples of operation consistency.
在分布式系统中,一致性指的是状态一致性或操作一致性。 早期的复制协议侧重于提供相互一致性,而许多云分布式系统提供最终一致性。 这些是状态一致性的例子。 一些系统旨在提供线性化或各种以读写为中心的一致性。 这些都是操作一致性的例子。
Consistency is an important consideration in distributed systems because such systems face many concerns that preclude or hinder consistency: clients separated by a slow network, machines that fail, clients that disconnect from each other, scalability of the system to a large number of clients, and high availability. These concerns can make it hard to provide strong levels of consistency, because consistency requires client coordination that may not be possible. As a result, distributed systems may adopt weaker levels of consistency, chosen according to the needs of applications.
一致性是分布式系统中的一个重要考虑因素,因为此类系统面临许多阻碍或阻碍一致性的问题:客户端被慢速网络分隔、机器发生故障、客户端彼此断开连接、系统对大量客户端的可扩展性 和高可用性。 这些问题可能会使提供强级别的一致性变得困难,因为一致性需要客户端协调,而这可能是不可能的。 因此,分布式系统可能会采用根据应用程序的需要选择的较弱的一致性级别。
Cloud systems, an interesting type of distributed system, face all of the above concerns with intensity: the systems are geo-distributed (distributed around the globe) with significant latency separating data centers; machines fail often because there are many of them; clients disconnect from remote data centers due to problems or congestion in wide-area links; many clients are active and the system must serve all of them well; and the system must be available whenever possible since businesses lose money during downtime. Because of these challenges, cloud systems often resort to weak levels of consistency.
云系统是一种有趣的分布式系统,它面临着上述所有严重问题:系统是地理分布式的(分布在全球各地),数据中心之间存在显着的延迟; 机器经常出故障,因为数量太多; 由于广域链路出现问题或拥塞,客户端与远程数据中心断开连接; 许多客户都是活跃的,系统必须为所有客户提供良好的服务; 并且系统必须尽可能可用,因为企业在停机期间会损失资金。 由于这些挑战,云系统通常采用弱一致性级别。
4.2 Database systems
In database systems, consistency refers to state consistency. For example, consider the ACID acronym that describes the guarantees of transactions. The “C” stands for consistency, which in this case means that the database is always in a state that developers consider valid: the system must preserve invariants such as uniqueness constraints, referential integrity, and application-specific properties (e.g., x is a friend of y iff y is a friend of x). These are flavors of state consistency.
在数据库系统中,一致性是指状态一致性。 例如,考虑描述事务保证的 ACID 缩写。 “C”代表一致性,在这种情况下意味着数据库始终处于开发人员认为有效的状态:系统必须保留诸如唯一性约束、引用完整性和特定于应用程序的属性(例如,x 是 y 的朋友当且仅当 y 是 x 的朋友)。 这些都是状态一致性的体现。
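The three kinds of invariants mentioned above can be illustrated with SQLite through Python's standard `sqlite3` module. This is a sketch under assumptions: the `users` and `friends` tables and the helper names are hypothetical. The uniqueness and referential-integrity invariants are declared in the schema and enforced by the engine, while the friendship-symmetry property is checked by a query that application code would have to run.

```python
import sqlite3

# Autocommit mode keeps the sketch free of transaction bookkeeping.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)")
conn.execute("""
    CREATE TABLE friends (
        x INTEGER NOT NULL REFERENCES users(id),
        y INTEGER NOT NULL REFERENCES users(id),
        PRIMARY KEY (x, y)
    )
""")


def rejected(statement, params):
    """Return True if the engine refuses the statement to protect an invariant."""
    try:
        conn.execute(statement, params)
        return False
    except sqlite3.IntegrityError:
        return True


def asymmetric_friendships(connection):
    """Application-specific check: rows (x, y) that lack the matching row (y, x)."""
    return connection.execute("""
        SELECT f.x, f.y FROM friends AS f
        WHERE NOT EXISTS (SELECT 1 FROM friends AS g WHERE g.x = f.y AND g.y = f.x)
    """).fetchall()


conn.execute("INSERT INTO users VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO users VALUES (2, 'b@example.com')")

duplicate_email_rejected = rejected("INSERT INTO users VALUES (?, ?)", (3, "a@example.com"))
dangling_friend_rejected = rejected("INSERT INTO friends VALUES (?, ?)", (1, 99))

conn.execute("INSERT INTO friends VALUES (1, 2)")
before = asymmetric_friendships(conn)  # the invariant is temporarily broken
conn.execute("INSERT INTO friends VALUES (2, 1)")
after = asymmetric_friendships(conn)   # the invariant holds again
```

All three checks look only at the stored state, never at the history of operations that produced it.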
The “A” stands for atomicity and the “I” stands for isolation. Interestingly, atomicity and isolation are examples of operation consistency. Atomicity requires that a transaction either executes in its entirety or does not execute at all, while isolation requires that transactions appear to execute by themselves without much interference. There are many different levels of isolation (serializability, snapshot isolation, read committed, repeatable reads, etc.), but they all constrain the behavior of operations.
“A”代表原子性,“I”代表隔离。 有趣的是,原子性和隔离性是操作一致性的例子。 原子性要求事务要么完整执行,要么根本不执行,而隔离性则要求事务看起来自己执行而没有太多干扰。 有许多不同级别的隔离(可串行性、快照隔离、已提交读、可重复读等),但它们都限制了操作的行为。
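Atomicity can be illustrated with a small transfer example, again using SQLite through Python's `sqlite3` module. The `accounts` table and the `transfer` and `balances` helpers are hypothetical. The sketch shows only atomicity and says nothing about isolation levels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        name TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(amount):
    """Move money from alice to bob; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'", (amount,)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(150)         # the second update violates the CHECK constraint
after_failure = balances()     # the first update was rolled back as well
succeeded = transfer(30)
after_success = balances()
```

The failed transfer leaves no trace: the credit to `bob` that had already executed is undone together with the rejected debit.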
Although the database systems community separates transaction isolation from consistency and atomicity, in the distributed systems community, transaction isolation is seen as a form of consistency, while in the computer architecture community, a concept analogous to isolation is called atomicity. We do not know exactly why these terms have acquired different meanings across communities. But we suspect that a reason is that there are intertwined ideas across these concepts, which is something we try to identify and clarify in this article.
尽管数据库系统社区将事务隔离与一致性和原子性分开,但在分布式系统社区中,事务隔离被视为一致性的一种形式,而在计算机体系结构社区中,类似于隔离的概念称为原子性。 我们并不确切知道为什么这些术语在不同社区中具有不同的含义。 但我们怀疑其中一个原因是这些概念之间存在相互交织的想法,这是我们在本文中试图识别和澄清的内容。
Consistency is important in database systems because data is of primary concern; in fact, data could be even more important than the result of operations in such systems (e.g., operations can fail as long as data is not destroyed). Different types of consistency arise because of the different classes of invariants that exist in the database, each with its own enforcement mechanism. For example, uniqueness constraints are enforced by an index and checks in the execution engine; application-specific constraints are enforced by the application logic; and mutual consistency is enforced by the replication manager.
一致性在数据库系统中很重要,因为数据是首要关注的; 事实上,在此类系统中,数据可能比操作结果更重要(例如,只要数据不被破坏,操作就可能失败)。 不同类型的一致性是由于数据库中存在不同类别的不变量而产生的,每个不变量都有自己的执行机制。 例如,唯一性约束由执行引擎中的索引和检查强制执行; 应用程序特定的约束由应用程序逻辑强制执行; 相互一致性由复制管理器强制执行。
4.3 Computer architecture
In computer architecture, consistency refers to operation consistency. A similar concept called coherence is also a form of operation consistency. Consistency and coherence have a subtle difference. Consistency concerns the entire memory system; it constrains the behavior of reads and writes—called loads and stores—across all the memory locations; an example is the sequential consistency property. Coherence concerns the cache subsystem; it can be seen as consistency of the operation of the various caches responsible for a given memory location. Thus, coherence constrains the behavior of loads and stores to an individual memory location.
在计算机体系结构中,一致性是指操作的一致性。 称为一致性的类似概念也是操作一致性的一种形式。 一致性和连贯性有细微的差别。 一致性涉及整个内存系统; 它限制所有内存位置的读取和写入行为(称为加载和存储); 一个例子是顺序一致性属性。 一致性涉及缓存子系统; 它可以被视为负责给定内存位置的各种高速缓存的操作的一致性。 因此,一致性将加载和存储的行为限制在单个内存位置。
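To see why consistency must span several memory locations, consider a classic two-thread test in which each thread stores to one location and then loads from the other. The sketch below is illustrative (the thread programs and helper names are assumptions): it enumerates every interleaving that respects program order, which is what sequential consistency permits, and collects the possible load results.

```python
from itertools import combinations

# Each thread stores 1 to one location, then loads the other into a register.
THREAD_1 = [("store", "x", 1), ("load", "y", "r1")]
THREAD_2 = [("store", "y", 1), ("load", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each thread's program order."""
    total = len(a) + len(b)
    for positions in combinations(range(total), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in positions else next(ib) for i in range(total)]


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for op, location, arg in schedule:
        if op == "store":
            memory[location] = arg
        else:
            registers[arg] = memory[location]
    return registers["r1"], registers["r2"]


sc_outcomes = {run(schedule) for schedule in interleavings(THREAD_1, THREAD_2)}
```

Under these assumptions the outcome `(0, 0)` never appears, because some store always precedes both loads in any single interleaving. A property that constrains each location in isolation does not by itself rule out that outcome, since it relates accesses to `x` and accesses to `y`. This is why consistency is stated over all memory locations.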
Coherence and consistency are separated to permit a modular architecture of the system: a cache coherence protocol ensures the correct behavior of the caching subsystem, while the rest of the system ensures consistency across memory accesses without worrying about the cache subsystem.
一致性和一致性是分开的,以允许系统的模块化架构:缓存一致性协议确保缓存子系统的正确行为,而系统的其余部分确保内存访问之间的一致性,而无需担心缓存子系统。
Consistency and coherence arise as issues in computer architecture because computer systems increasingly have many cores or processors sharing access to a common memory: in such systems, there are concurrent operations on memory locations and data replication across many caches, which lead to problems of data sharing.
一致性和连贯性成为计算机体系结构中的问题,因为越来越多的计算机系统有许多核心或处理器共享对公共内存的访问:在这样的系统中,内存位置上存在并发操作,并且跨多个缓存进行数据复制,这会导致数据共享问题 。
5 Conclusion
Consistency is a concern that spans many disciplines, as we briefly described here. This concern stems from the rise of concurrency and replication across these disciplines, a trend that we expect to continue. Unfortunately, consistency is subtle and hard to grasp, and to make matters worse, it has different names and meanings across communities. We hope to have shed some light on this subject by identifying two broad and very different types of consistency—state consistency and operation consistency—that can be seen across the disciplines.
正如我们在这里简要描述的那样,一致性是一个跨越许多学科的问题。 这种担忧源于这些学科的并发和复制的兴起,我们预计这一趋势将持续下去。 不幸的是,一致性是微妙且难以掌握的,更糟糕的是,它在不同的社区中有不同的名称和含义。 我们希望通过确定跨学科可以看到的两种广泛且截然不同的一致性类型(状态一致性和操作一致性)来阐明这个主题。