Data Consistency in Distributed Systems

Replica consistency

Unfortunately the word β€œconsistency” means different things in different contexts.

What do we mean with Consistency in a distributed systems?

  • ACID? no
  • read-after-write consistency?
  • replication (replica should be consistent with other replicas)?
  • consistency model? many to choose from

Two-phase commit

Protocol to achieve an atomic commit in a distributed system.

Martin Kleppmann – Lecture Distributed Systems (University of Cambridge)-20240218203237483.webp

Challenge

ferreiraloff.etal.2023.antipodeenforcingcrossservice (pg. 1)

Although consistency is well-studied in the context of individual distributed datastores, new issues appear when an application uses multiple independent distributed datastores. In particular, a single end-to-end request can make multiple writes to multiple different datastores over the course of its execution; these writes are issued by the different services (and machines) traversed by the request. As a whole, the request establishes an implicit visibility ordering for its writes, which readers must respect if they are to be consistent

laigner.etal.2021.datamanagementmicroservices (pg. 1)

In particular, in the monolithic architectural style, transactions can be easily executed across modules, while, in microservices, it becomes necessary to break these transactions down due to the decomposition of the application into small parts.

Approaches

ferreiraloff.etal.2023.antipodeenforcingcrossservice (pg. 13)

Existing noteworthy systems that address cross-service consistency take one of the following approaches: they either wrap requests in other abstractions, resort to centralized coordination mechanisms, add transactions to the design, or propose an overall revision of the system architecture.

2-Phase-Commit

2-Phase Commit Protocol

Centralized coordination

ferreiraloff.etal.2023.antipodeenforcingcrossservice (pg. 13)

Traditionally, for multiple systems to interact, the use of a logically centralized coordination mechanism like Zookeeper [39] was the natural design choice. However, strongly consistent coordination systems introduce a performance bottleneck and go against microservice principles. In particular, the proponents of this class of architectures encourage the community to embrace eventual consistency due to its better performance and higher scalability [34].

Saga

Saga Pattern

Redesign

ferreiraloff.etal.2023.antipodeenforcingcrossservice (pg. 14)

There are also proposals that opt for a complete redesign of the application, often ending up merging different services – and their respective datastores – into a single one. A good example of this approach is Diamond [64], where a reactive application with two different datastores (distributed storage and notification service), was merged into a single service that provided both functionalities.

Antipode

Antipode is a system that enforces cross-service causal consistency for applications with requests that span multiple processes and interact with multiple datastores.

Research paper: @FerreiraLoff.etal.2023.Antipode