Data Replication (Part 1/2)

Introduction:

Replication is simply keeping a copy of our data on multiple nodes that are connected via a network. But why do we need replication?

Performance: When data is replicated across geographically distributed machines, users can read from the node closest to them, which reduces retrieval latency.
Availability: Machines fail for numerous reasons, and such failures cannot be avoided entirely. With replicas, the data remains accessible when one of the machines fails.
Scalability: As load grows, additional nodes can be added to the system and data can be replicated to them, so read traffic is spread across more machines. This allows us to scale our system horizontally.

Alongside these benefits, replication also introduces certain complexities:

Consistency: Keeping the data identical across all replicas at all times is difficult, because changes take time to propagate.
Node failures: Handling failures of the primary (leader) node and of the secondary (follower) nodes can be challenging.

Types of Replication

Data written to the primary node can be replicated to the secondary nodes either synchronously or asynchronously.

Synchronous replication

In synchronous replication, the primary node receives a write request, saves it to local storage, then sends the data change to all the secondary nodes and waits for the confirmation from all the secondary nodes that the data was successfully replicated before returning success to the client.

Synchronous replication provides the highest level of consistency, because every acknowledged write is already present on every node. The cost is write latency: every write operation must wait until the data is updated on all secondary nodes.

<aside> 📎 Synchronous replication - high consistency, low availability.

</aside>

Furthermore, if any of the secondary nodes fails, the primary node cannot acknowledge the client, so writes stall or fail until that node recovers. For these reasons, fully synchronous replication is rarely used in production-grade systems.
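The synchronous write path described in this section can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real replication protocol: the `Replica` and `SyncPrimary` classes, the `healthy` flag, and the use of `ConnectionError` to simulate a failed node are all assumptions made for the example.

```python
class Replica:
    """Hypothetical in-memory secondary node."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def apply(self, key, value):
        # Simulate a node failure by refusing the write.
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = value


class SyncPrimary:
    """Hypothetical primary that acknowledges only after every secondary confirms."""

    def __init__(self, secondaries):
        self.data = {}
        self.secondaries = secondaries

    def write(self, key, value):
        self.data[key] = value            # 1. save to local storage
        for replica in self.secondaries:  # 2. send the change to every secondary
            replica.apply(key, value)     # 3. wait for it to confirm (raises on failure)
        return "ok"                       # 4. only now acknowledge the client
        # Note: this sketch does not undo partial writes when a secondary fails;
        # a real system would have to deal with that case.


healthy = SyncPrimary([Replica("s1"), Replica("s2")])
print(healthy.write("x", 1))

degraded = SyncPrimary([Replica("s1"), Replica("s2", healthy=False)])
try:
    degraded.write("x", 2)
except ConnectionError as err:
    print("write not acknowledged:", err)
```

With two healthy secondaries the write is acknowledged once both hold the value. With one secondary down, the client never receives an acknowledgement, which is the availability problem described above.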

Asynchronous replication

In asynchronous replication, the primary node receives a write request, saves it to local storage, then sends the data change to all the secondary nodes and responds to the client without waiting for the secondary nodes to complete the replication.

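Asynchronous replication makes writes faster because the primary node does not wait for the replicas' acknowledgements, and the failure of any secondary node won't impact the system's availability. The trade-off is that secondaries can lag behind the primary, so a read from a secondary may return stale data.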

<aside> 📎 Asynchronous replication - high availability, low consistency.

</aside>
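The asynchronous write path can be sketched in the same spirit. Again, this is only an in-memory illustration under stated assumptions: `Replica`, `AsyncPrimary`, the `pending` queue, and the explicit `replicate_pending()` call (standing in for a background replication process) are hypothetical names chosen for the example.

```python
from collections import deque


class Replica:
    """Hypothetical in-memory secondary node."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class AsyncPrimary:
    """Hypothetical primary that acknowledges before secondaries have the change."""

    def __init__(self, secondaries):
        self.data = {}
        self.secondaries = secondaries
        self.pending = deque()  # changes not yet shipped to the secondaries

    def write(self, key, value):
        self.data[key] = value             # 1. save to local storage
        self.pending.append((key, value))  # 2. queue the change for replication
        return "ok"                        # 3. acknowledge the client immediately

    def replicate_pending(self):
        # Stand-in for a background process that ships queued changes.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.secondaries:
                replica.apply(key, value)


secondary = Replica("s1")
primary = AsyncPrimary([secondary])

primary.write("x", 1)
stale = secondary.data.get("x")  # None: the change has not reached the secondary yet
primary.replicate_pending()
fresh = secondary.data.get("x")  # 1: the secondary has caught up
```

The window between `write` returning and `replicate_pending` running is where the low consistency comes from: the client has been told the write succeeded, yet a read from the secondary still sees the old state.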