What is replication?
Keeping the same data on multiple machines that are connected via a network
What are the reasons one may want to replicate data?
What are the three main approaches to replicating changes between nodes?
What is a replica?
A node/server that stores a copy of the database.
Every write needs to be processed by every replica, otherwise the replicas no longer contain the same data.
What is leader-based replication?
What are synchronous and asynchronous replication?
Synchronous: The leader waits for the follower to confirm it has received the write before reporting success to the user
Asynchronous: The leader sends the write to the follower but does not wait for a response
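The difference between the two modes can be illustrated with a minimal in-memory sketch. The Leader and Follower classes, the pending queue and the flush method are hypothetical names made up for this illustration; a real system would send writes over a network rather than call methods directly.

```python
from collections import deque


class Follower:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value
        return True  # acknowledgement sent back to the leader


class Leader:
    def __init__(self, sync_follower, async_follower):
        self.data = {}
        self.sync_follower = sync_follower
        self.async_follower = async_follower
        self.pending = deque()  # writes not yet delivered to the async follower

    def write(self, key, value):
        self.data[key] = value
        # Synchronous: wait for the follower's acknowledgement before returning.
        acked = self.sync_follower.apply(key, value)
        # Asynchronous: queue the write and return without waiting for a reply.
        self.pending.append((key, value))
        return acked

    def flush(self):
        # Stands in for the network eventually delivering the queued writes.
        while self.pending:
            key, value = self.pending.popleft()
            self.async_follower.apply(key, value)


sync_f, async_f = Follower(), Follower()
leader = Leader(sync_f, async_f)
ok = leader.write("x", 1)
```

Right after write returns, sync_f already holds the value, while async_f only sees it once flush runs. That gap is the replication lag an asynchronous follower can exhibit.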
What are the disadvantages of synchronous replication?
What are the advantages of synchronous replication?
What are the advantages of asynchronous replication?
How can we add new followers in leader-based replication?
How can we handle node outages for followers in leader-based replication?
How can we handle node outages for the leader in leader-based replication?
What is statement-based replication?
What are the potential pitfalls of statement-based replication?
What is write-ahead log shipping?
What are the disadvantages of write-ahead log shipping?
What is logical (row-based) log replication?
What is trigger-based replication?
Lets you register custom application code that is automatically executed when a data change (write transaction) occurs in a database system.
This custom application code or external process can then replicate the data change to another system
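As a rough sketch of the idea, the example below uses SQLite triggers through Python's sqlite3 module. The table names (users, changelog), the trigger name and the replicate function are assumptions made for this illustration, and only inserts are captured. The trigger records each new row in a changelog table, and a separate function plays the role of the external process that copies those changes to another database.

```python
import sqlite3

# Source database: a trigger logs every insert into a changelog table.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE changelog (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER,
        name TEXT
    );
    CREATE TRIGGER users_insert AFTER INSERT ON users
    BEGIN
        INSERT INTO changelog (user_id, name) VALUES (NEW.id, NEW.name);
    END;
""")

# Target database: the system the changes are replicated to.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")


def replicate(source, target, last_seq=0):
    """Copy changelog entries newer than last_seq to the target.

    Returns the sequence number of the last entry that was applied.
    """
    rows = source.execute(
        "SELECT seq, user_id, name FROM changelog WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, user_id, name in rows:
        target.execute(
            "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
            (user_id, name),
        )
        last_seq = seq
    target.commit()
    return last_seq


source.execute("INSERT INTO users (name) VALUES ('alice')")
source.commit()
position = replicate(source, target)
```

Passing the returned position back in as last_seq on the next call lets the process pick up only the changes it has not yet applied.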
What is read-after-write or read-your-write consistency?
What are monotonic reads?
A guarantee that if a user makes several reads in sequence, they won’t read older data after having previously read newer data
e.g. a user reads from one follower and sees 2 comments, then reads from another follower with more lag and sees only the 1st comment
Can be implemented by making sure users always read from the same replica
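One possible way to pin a user to a replica is to pick the replica from a hash of the user ID. This is a minimal sketch; the replica_for function and the replica names are hypothetical.

```python
import hashlib


def replica_for(user_id, replicas):
    """Pick a replica deterministically from the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]


replicas = ["follower-a", "follower-b", "follower-c"]
first = replica_for("user-1234", replicas)
second = replica_for("user-1234", replicas)  # same replica as the first read
```

Because the choice depends only on the user ID and the replica list, repeated reads by the same user go to the same follower. If that follower fails, the user's reads have to be rerouted to another replica.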
What are consistent prefix reads?
A guarantee that if a sequence of writes happens in a certain order, anyone reading those writes will see them appear in the same order
If the database always applies writes in the same order, reads always see a consistent prefix, so this is mainly a problem for partitioned databases
What is a multi-leader configuration?
There are multiple leaders in the database topology; each leader accepts writes and also acts as a follower to the other leaders
The benefits rarely outweigh the added complexity
What are some disadvantages of multi-leader replication?
The same data may be concurrently modified in two different datacenters
Those write conflicts must be resolved
Describe a simple write conflict in a multi-leader database
A wiki page is being simultaneously edited by two users
User 1 changes the title from A to B
User 2 changes the title from A to C
Each user's change is successfully applied to their local leader, but when the change is asynchronously replicated a conflict is detected
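The wiki example can be simulated with a small sketch. The Leader class here is hypothetical, and detecting the conflict by comparing the title the writer originally saw with the receiving leader's current title is an assumption made for illustration, not how any particular database does it.

```python
class Leader:
    def __init__(self, name, title):
        self.name = name
        self.title = title
        self.outbox = []  # changes waiting to be replicated to the other leader

    def local_write(self, new_title):
        # Record the title the writer saw so the other leader can check it.
        change = {"old": self.title, "new": new_title, "origin": self.name}
        self.title = new_title
        self.outbox.append(change)

    def receive(self, change):
        # Conflict: the sender expected a title this leader no longer has.
        if self.title != change["old"]:
            return f"conflict: expected {change['old']!r}, found {self.title!r}"
        self.title = change["new"]
        return "applied"


leader1 = Leader("dc1", "A")
leader2 = Leader("dc2", "A")
leader1.local_write("B")  # user 1 changes the title from A to B
leader2.local_write("C")  # user 2 changes the title from A to C

# Replication happens only after both local writes have already succeeded.
result_at_2 = leader2.receive(leader1.outbox[0])
result_at_1 = leader1.receive(leader2.outbox[0])
```

Both local writes succeed, and both leaders report a conflict when the other's change arrives. The two leaders are left with different titles until the conflict is resolved.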