Distributed Transactions (L10) Flashcards

(23 cards)

1
Q

What is a “Transaction” in the context of databases?

A

A group of operations that modifies some data as if it had exclusive access to it, and in which either all operations complete successfully or none take effect.

2
Q

What does “ACID” stand for in the context of transaction properties?

A

(A)tomicity: Guarantees that partial failures are not possible; transactions are either committed successfully or rolled back.
(C)onsistency (Correctness): Guarantees application-level invariants are preserved. (Note: This “C” is different from consistency in CAP or consistency models).
(I)solation: Guarantees that the concurrent execution of transactions does not cause race conditions.
(D)urability: Guarantees that once a transaction is committed, the changes persist.

3
Q

How is consensus used in distributed transactions (e.g., with 2PC)?

A

Consensus is not used to vote on the transaction outcome itself. The transaction protocol (such as 2PC) determines the outcome; consensus is then used to agree on that final outcome (commit or abort), so that the decision is reliably replicated.

4
Q

Explain the role of participants in a distributed transaction.

A

All participants must vote “Yes” (agree to commit) before a transaction can be committed. Even one “No” vote (or a timeout) means the transaction must abort. This unanimity requirement is what ensures atomicity across participants.

5
Q

What is Atomicity in ACID properties?

A

It guarantees that all operations within a transaction are completed successfully (committed), or if any part fails, the entire transaction is rolled back as if it never happened. Partial failures are not possible.
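
The all-or-nothing behaviour can be illustrated on a single node with Python's built-in sqlite3 module. This is only a sketch: the accounts table, the account names, and the transfer helper are made up for illustration and are not part of the course material.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Failing here undoes the debit above: no partial result survives.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts"))

# Hypothetical example data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

A failed transfer leaves both balances exactly as they were, which is the "as if it never happened" part of the definition.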

6
Q

What is Consistency (Correctness) in ACID properties?

A

It guarantees that a transaction brings the database from one valid state to another, preserving application-level invariants. This is distinct from consistency in consistency models.

7
Q

What is Isolation in ACID properties?

A

It guarantees that the concurrent execution of transactions does not lead to race conditions. The system behaves as if transactions are executed sequentially, even if they run concurrently.

8
Q

Name and briefly describe common race conditions that Isolation levels protect against.

A

Dirty Write: A transaction overwrites a value written by another uncommitted transaction.
Dirty Read: A transaction observes a write from an uncommitted transaction.
Fuzzy Read (Non-Repeatable Read): A transaction reads an object twice and sees different values because another committed transaction updated it in between.
Phantom Read: A transaction reads a set of objects matching a condition, and another transaction adds/updates/deletes an object matching the same condition, so repeating the read would return a different set.

9
Q

What is Serializability as an isolation level?

A

It is the strongest isolation level, ensuring that the execution of concurrent transactions is equivalent to some serial execution order of those transactions, guarding against all possible race conditions.

10
Q

What are two common concurrency control mechanisms to achieve Serializability?

A

Pessimistic: Two-Phase Locking (2PL)
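Optimistic: Multi-Version Concurrency Control (MVCC). Note that MVCC on its own typically provides snapshot isolation; serializable variants add conflict detection on top of it.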
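
The pessimistic approach can be sketched as follows. This is a minimal illustration of the two-phase rule (a growing phase in which locks are only acquired, then a shrinking phase in which they are only released), assuming exclusive locks only and no deadlock detection. The class name TwoPhaseLockingTxn and the lock table layout are hypothetical.

```python
import threading

class TwoPhaseLockingTxn:
    """Toy transaction that enforces the 2PL rule on a shared lock table."""

    def __init__(self, lock_table):
        self._lock_table = lock_table  # shared dict: key -> threading.Lock
        self._held = {}                # key -> lock currently held by this transaction
        self._shrinking = False        # becomes True after the first release

    def acquire(self, key):
        # Growing phase: new locks may be taken only before any lock is released.
        if self._shrinking:
            raise RuntimeError("2PL violation: cannot acquire a lock after releasing")
        if key in self._held:
            return                     # already held; threading.Lock is not reentrant
        lock = self._lock_table[key]
        lock.acquire()                 # blocks while another transaction holds the key
        self._held[key] = lock

    def release_all(self):
        # Shrinking phase: releasing everything at commit/abort makes this strict 2PL.
        self._shrinking = True
        for lock in self._held.values():
            lock.release()
        self._held.clear()
```

Releasing all locks only at the end is a deliberate simplification; a real lock manager would also distinguish shared from exclusive locks and handle deadlocks.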

11
Q

What is Durability in ACID properties?

A

It guarantees that once a transaction has been committed, its changes are permanently stored and will survive system failures (e.g., power loss, hardware failures).

12
Q

What is the Two-Phase Commit Protocol (2PC)?

A

A distributed algorithm that ensures that the processes participating in a distributed transaction either all commit or all abort. It involves a coordinator and participants.

13
Q

Describe Phase 1 of the Two-Phase Commit (2PC) protocol.

A

Phase 1a (Vote Request): The coordinator sends a VOTE-REQUEST (or pre-write) message to all participants.
Phase 1b (Voting): Each participant, upon receiving the VOTE-REQUEST, decides if it can commit. It writes its decision (and data) to a log and sends either VOTE-COMMIT or VOTE-ABORT to the coordinator. If it votes VOTE-ABORT, it aborts locally.

14
Q

Describe Phase 2 of the Two-Phase Commit (2PC) protocol.

A

Phase 2a (Decision): The coordinator collects votes. If all participants send VOTE-COMMIT, the coordinator decides to commit; otherwise (any VOTE-ABORT or a timeout), it decides to abort. The coordinator logs its decision before sending GLOBAL-COMMIT or GLOBAL-ABORT to all participants, so that the decision survives a coordinator crash.
Phase 2b (Completion): Participants wait for the coordinator’s decision. Upon receiving GLOBAL-COMMIT or GLOBAL-ABORT, they act accordingly (commit or abort their local transaction) and send an acknowledgement.
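
The two phases can be sketched in-process as follows. This is a simplified illustration, not a production protocol: method calls stand in for network messages, Python lists stand in for durable logs, and the Participant class, its can_commit flag, and the two_phase_commit function are hypothetical names.

```python
from enum import Enum

class Vote(Enum):
    COMMIT = "VOTE-COMMIT"
    ABORT = "VOTE-ABORT"

class Decision(Enum):
    COMMIT = "GLOBAL-COMMIT"
    ABORT = "GLOBAL-ABORT"

class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit   # stand-in for "local checks passed"
        self.log = []                  # stand-in for a durable log
        self.state = "INIT"

    def on_vote_request(self):
        # Phase 1b: log the vote before replying; a VOTE-ABORT aborts locally.
        vote = Vote.COMMIT if self.can_commit else Vote.ABORT
        self.log.append(vote.value)
        self.state = "READY" if vote is Vote.COMMIT else "ABORTED"
        return vote

    def on_decision(self, decision):
        # Phase 2b: apply the coordinator's decision and acknowledge it.
        self.log.append(decision.value)
        self.state = "COMMITTED" if decision is Decision.COMMIT else "ABORTED"
        return "ACK"

def two_phase_commit(participants, coordinator_log):
    # Phase 1a: send VOTE-REQUEST to everyone; a timeout counts as an abort vote.
    votes = []
    for p in participants:
        try:
            votes.append(p.on_vote_request())
        except TimeoutError:
            votes.append(Vote.ABORT)
    # Phase 2a: commit only on unanimous VOTE-COMMIT, and log before broadcasting.
    unanimous = all(v is Vote.COMMIT for v in votes)
    decision = Decision.COMMIT if unanimous else Decision.ABORT
    coordinator_log.append(decision.value)
    for p in participants:
        p.on_decision(decision)
    return decision
```

A participant left in the READY state with no decision is exactly the case in which the protocol blocks: it has promised to commit, so it can neither commit nor abort on its own.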

15
Q

What is a major problem with 2PC if the coordinator crashes?

A

If the coordinator crashes after sending VOTE-REQUEST but before broadcasting the final decision (GLOBAL-COMMIT/ABORT), participants that voted VOTE-COMMIT are blocked. They cannot unilaterally decide to commit or abort because that might violate atomicity. The protocol blocks until the coordinator recovers.

16
Q

What is Linearizability (Strong Consistency / Atomic Consistency)?

A

It ensures that every operation appears to take effect atomically at some point between its invocation and its completion. All operations behave as if executed on a single, up-to-date copy of the data, even with multiple replicas.

17
Q

Is Linearizability the same as Serializability?

A

No. Linearizability is a recency guarantee on individual operations on a single object, while Serializability is about the equivalence of concurrent transaction executions to some serial order. The combination of both is called Strict Serializability (also known as strong one-copy serializability).

18
Q

What are recovery points in the context of Durability?

A

Snapshots of a system’s state (checkpoints) to which the system can return in case of a failure, as part of backward error recovery.

19
Q

What is a consistent global checkpoint (recovery line)?

A

A set of local checkpoints, one for each process, such that no message sent by one process after its checkpoint is received by another process before its checkpoint. In other words, every message recorded as received in some process’s checkpoint is also recorded as sent in the sender’s checkpoint.
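
The condition can be checked mechanically. The sketch below assumes each process numbers its own events 1, 2, 3, ... and that a checkpoint is identified by the index of the last event it includes; the Message record and the is_consistent function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Message:
    sender: str
    send_time: int   # sender's local event index at which the message was sent
    receiver: str
    recv_time: int   # receiver's local event index at which it was received

def is_consistent(checkpoints, messages):
    """checkpoints maps each process to the local event index of its checkpoint."""
    for m in messages:
        received_in_checkpoint = m.recv_time <= checkpoints[m.receiver]
        sent_in_checkpoint = m.send_time <= checkpoints[m.sender]
        # An "orphan" message: its receipt is recorded but its sending is not.
        if received_in_checkpoint and not sent_in_checkpoint:
            return False
    return True
```

A message that was sent before the sender's checkpoint but not yet received is allowed: it is merely in flight, not orphaned.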

20
Q

Describe Coordinated Checkpointing.

A

A method where a coordinator multicasts a checkpoint request. Each participant takes a local checkpoint, stops sending application messages (queuing them instead), and reports back. Once all have confirmed, the coordinator broadcasts a “checkpoint done” message, after which participants resume sending.

21
Q

Describe Independent Checkpointing.

A

Each process independently takes checkpoints. To reconstruct a consistent state, dependencies between checkpoint intervals (due to message passing) are tracked. This can lead to cascaded rollbacks.

22
Q

What is Cascaded Rollback?

A

A situation in independent checkpointing where the rollback of one process to an earlier checkpoint can trigger the rollback of other processes, potentially all the way back to the system’s startup if checkpoints are taken at “wrong” instants.
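
The domino effect can be reproduced with a small search for the recovery line. This is a sketch under simplifying assumptions: events are numbered locally per process, index 0 denotes the initial state, and messages are plain tuples. The recovery_line function is a hypothetical helper, not a standard algorithm implementation.

```python
def recovery_line(checkpoints, messages):
    """Find the latest consistent set of checkpoints by repeated rollback.

    checkpoints: process -> list of local event indexes at which it checkpointed
    messages: list of (sender, send_time, receiver, recv_time) tuples
    """
    # Start from every process's most recent checkpoint.
    line = {p: max(times, default=0) for p, times in checkpoints.items()}
    changed = True
    while changed:
        changed = False
        for sender, send_time, receiver, recv_time in messages:
            # Orphan: receipt is inside the receiver's checkpoint, sending is not
            # inside the sender's, so the receiver must forget the message.
            if recv_time <= line[receiver] and send_time > line[sender]:
                earlier = [t for t in checkpoints[receiver] if t < recv_time]
                line[receiver] = max(earlier, default=0)
                changed = True
    return line

# Each rollback orphans another message, so both processes fall back to the start.
domino = [("P1", 7, "P2", 7), ("P2", 5, "P1", 5), ("P1", 3, "P2", 3), ("P2", 1, "P1", 1)]
```

With checkpoints at events 2 and 6 on P1 and at 4 and 8 on P2, the messages in domino force every checkpoint to be discarded in turn.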

23
Q

What is Message Logging for durability/recovery?

A

Instead of frequent (expensive) checkpoints, processes log non-deterministic events (like message receipts). In case of failure, a process can replay its behavior from the most recent checkpoint using the logged messages. This assumes a piecewise deterministic execution model.
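
Replay from a checkpoint can be sketched as below. The example assumes the only non-deterministic events are message receipts and that the log survives the crash; the LoggedProcess class and its running-total state are made up for illustration.

```python
import copy

class LoggedProcess:
    """Piecewise-deterministic process: its state changes only when a message arrives."""

    def __init__(self):
        self.state = {"total": 0}
        self.checkpoint = copy.deepcopy(self.state)
        self.message_log = []  # receipts since the last checkpoint (assumed durable)

    def receive(self, msg):
        self.message_log.append(msg)  # log the non-deterministic event first
        self._apply(msg)

    def _apply(self, msg):
        # Deterministic handler: same state plus same message gives same result.
        self.state["total"] += msg

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)
        self.message_log.clear()      # older messages are covered by the checkpoint

    def recover(self):
        # Restore the last checkpoint, then replay logged messages in order.
        self.state = copy.deepcopy(self.checkpoint)
        for msg in self.message_log:
            self._apply(msg)
```

Because the handler is deterministic, replaying the same messages in the same order reconstructs the pre-crash state without a checkpoint after every message.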