Paxos, Gorums Flashcards
(30 cards)
Q: What is Gorums and what problem does it solve?
A:
Framework for building fault-tolerant distributed systems in Go
Key features:
Quorum-based communication
Built on gRPC
Code generation from protobuf
Failure handling built-in
Solves problems of:
Manual quorum handling
Network communication complexity
Distributed error handling
Type-safe RPC generation
Simple Explanation:
Gorums makes it easier to build systems that need agreement from multiple servers. Instead of manually counting votes or handling network issues, Gorums does this for you. Think of it as a smart messenger that knows how many responses you need before proceeding.
Q: How does Gorums implement quorum specifications (QSpec)?
A:
Core concept: QSpec interface defines how to process responses
Two main components:
Quorum size (e.g., majority: n/2 + 1)
Quorum functions (QF) that process replies
Example from Paxos:
type PaxosQSpec struct {
    quorum int // (n/2 + 1)
}

func (qs PaxosQSpec) PrepareQF(prepare PrepareMsg, replies map[uint32]PromiseMsg) (*PromiseMsg, bool)
A QSpec is like a vote counter with rules. You tell it how many votes you need (quorum) and what makes a vote valid (quorum function). When responses come in, it applies these rules automatically.
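A minimal, self-contained sketch of such a quorum function. The simplified message types and the Rnd field are assumptions for illustration; a real handin works with the generated protobuf types:
// Sketch only: simplified stand-ins for the generated protobuf types.
type PrepareMsg struct{ Rnd uint32 }
type PromiseMsg struct{ Rnd uint32 }

type PaxosQSpec struct {
    quorum int // majority: n/2 + 1
}

// PrepareQF waits until a quorum of acceptors has replied, then combines
// their promises into one aggregate reply that Gorums returns to the caller.
func (qs PaxosQSpec) PrepareQF(prepare PrepareMsg, replies map[uint32]PromiseMsg) (*PromiseMsg, bool) {
    if len(replies) < qs.quorum {
        return nil, false // not enough promises yet; keep waiting for more replies
    }
    // Quorum reached. A full implementation would also carry over the
    // accepted value with the highest round for each slot.
    return &PromiseMsg{Rnd: prepare.Rnd}, true
}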
Q: How does Gorums handle shared memory versus message passing?
A:
Gorums uses message passing, not shared memory:
Communication model:
All state transfer via gRPC messages
No direct memory sharing between nodes
Each node maintains its own state copy
Consistency through:
Quorum-based voting
State replication via messages
Explicit state transfer protocols
Benefits:
Better fault isolation
No memory synchronization issues
Cleaner separation between nodes
Scales across network boundaries
Simple Explanation:
Instead of sharing memory directly, nodes in Gorums talk by sending messages. Each node keeps its own copy of the data, and replicas stay in sync by voting on changes. This is safer than shared memory, especially when nodes crash or the network fails.
Q: How does Gorums integrate with gRPC?
A:
Built on top of gRPC with extensions:
Components:
Protobuf service definitions
Generated Gorums-specific code
Custom RPC patterns for quorums
Example configuration:
mgr := NewManager(
    gorums.WithGrpcDialOptions(
        grpc.WithTransportCredentials(insecure.NewCredentials()),
    ),
)
config, err := mgr.NewConfiguration(qspec, gorums.WithNodeMap(nodeMap))
Gorums uses gRPC’s reliable communication but adds special features for group communication. It’s like having gRPC’s reliable phone calls but with the ability to conference call and count votes from multiple participants.
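A small sketch of issuing a quorum call on the resulting configuration. The Prepare method, the pb package name, and the message fields are assumptions based on a generated Paxos service, not part of core Gorums:
// Sketch only: assumes the generated package is imported as pb.
func sendPrepare(ctx context.Context, config *pb.Configuration, rnd uint32) (*pb.PromiseMsg, error) {
    // Gorums sends the request to every node in config and returns once the
    // quorum function (PrepareQF) reports that enough replies have arrived.
    return config.Prepare(ctx, &pb.PrepareMsg{Rnd: rnd})
}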
Q: How does Gorums handle failure detection and timeouts?
A:
Built-in failure handling mechanisms:
Timeout handling:
Context-based timeouts
Configurable deadline per call
Automatic cleanup of timed-out calls
Failure detection:
Node health monitoring
Automatic removal of failed nodes
Reconfiguration support
Simple Explanation:
Gorums watches for problems like slow or crashed servers. It sets time limits for responses and automatically handles cases where servers don’t respond in time. This helps your system keep running even when some parts fail.
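A sketch of a per-call deadline, reusing the hypothetical sendPrepare helper from the previous sketch and assuming the standard context, log, and time packages are imported:
// Sketch only: bounding a quorum call with a per-call deadline.
func prepareWithTimeout(config *pb.Configuration, rnd uint32) {
    ctx, cancel := context.WithTimeout(context.Background(), 500*time.Millisecond)
    defer cancel()

    promise, err := sendPrepare(ctx, config, rnd)
    if err != nil {
        // Covers both unreachable nodes and the deadline expiring before a
        // quorum of replies arrived; the caller can retry or give up.
        log.Printf("prepare failed: %v", err)
        return
    }
    log.Printf("aggregated promise: %v", promise)
}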
Q: What are the main types of shared memory access in the Multi-Paxos implementation?
A:
Two primary mechanisms:
Protected Shared State (Mutex-based):
Replica state (leader status, configuration)
Paxos component state (proposer, acceptor, learner)
Client handler response mappings
Communication Channels:
Leader change notifications
Client value queue
Response channels for clients
Simple Explanation:
The system uses both traditional locked shared memory (like a locked diary multiple people need to write in) and channels (like passing messages through a pipe). Locks protect shared state that needs updating, while channels handle communication between different parts.
“Mutexes lock the ‘current leader’ variable during updates, while channels notify components (e.g., ‘Leader changed! Update your state!’).”
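A minimal Go sketch of the two mechanisms side by side; the names are illustrative, not the actual handin code (sync import assumed):
// Sketch only: illustrative names, not the actual handin code.
type leaderState struct {
    mu       sync.Mutex
    leaderID uint32 // protected shared state: always read/written under mu

    leaderChanged chan uint32 // message passing: notifies other goroutines
}

// setLeader updates the locked state, then announces the change on the channel.
func (s *leaderState) setLeader(id uint32) {
    s.mu.Lock()
    s.leaderID = id
    s.mu.Unlock()

    select {
    case s.leaderChanged <- id:
    default: // drop the notification rather than block if nobody is listening
    }
}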
Q: How does Multi-Paxos handle shared state in its core components?
A:
Each core component has protected state:
Proposer: its state sits behind a mutex
Acceptor: its state sits behind a mutex
Learner: its state sits behind a mutex
(a sketch of what these structs might look like follows at the end of this card)
When Used:
During phase transitions
Value updates
State queries
Simple Explanation:
Each Paxos role (Proposer, Acceptor, Learner) keeps its own state safe using locks. It’s like having a special key for each diary - only one person can write at a time, preventing confusion.
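A rough sketch of how each role might guard its state with its own mutex. The field names and value types are assumptions for illustration, not the actual multipaxos package:
// Sketch only: field names are assumptions, not the actual multipaxos types.
type Proposer struct {
    mu       sync.Mutex
    crnd     uint32 // round this proposer currently owns
    nextSlot uint32 // next slot to assign to a client value
}

type Acceptor struct {
    mu       sync.Mutex
    rnd      uint32                  // highest round promised so far
    accepted map[uint32]acceptedSlot // accepted round/value per slot
}

type Learner struct {
    mu      sync.Mutex
    learns  map[uint32]int    // number of learns received per slot
    decided map[uint32]string // decided value per slot (placeholder type)
}

type acceptedSlot struct {
    rnd uint32
    val string // placeholder for the accepted value
}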
Q: How does the system handle leader changes with shared memory?
A:
Three-part mechanism:
Subscription Channel:
leaderChangeChannel := r.leaderDetector.Subscribe()
State Protection:
func (r *Replica) setLeader(isLeader bool) {
    r.mu.Lock()
    r.isLeaderFlag = isLeader
    r.mu.Unlock()
}
Event Processing:
case newLeaderID := <-leaderChangeChannel:
    r.handleLeaderChange(ctx, newLeaderID)
Simple Explanation:
Leader changes use both types of shared memory: channels to notify about changes, and protected state to safely update who’s leader. It’s like having both an announcement system (channels) and a protected whiteboard (mutex) for tracking leadership.
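A sketch of how these pieces typically meet in the replica's event loop, assuming the fields and methods shown in this card plus an r.id field:
// Sketch only: the loop shape is illustrative, not the exact handin code.
func (r *Replica) run(ctx context.Context) {
    leaderChangeChannel := r.leaderDetector.Subscribe()
    for {
        select {
        case newLeaderID := <-leaderChangeChannel:
            r.setLeader(newLeaderID == r.id) // mutex-protected update
            r.handleLeaderChange(ctx, newLeaderID)
        case <-ctx.Done():
            return // clean shutdown when the replica is stopped
        }
    }
}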
Q: How does client request handling use shared memory?
A:
Multi-layered approach:
Value Queue:
clientValueQueue: make(chan *pb.Value, multipaxos.Alpha)
Response Tracking:
type clientHandler struct {
    mu              sync.Mutex
    responseChanMap map[uint64]chan *pb.Response
}
Used when:
Receiving client requests
Processing responses
Leader changes
Request completion
Simple Explanation:
Client handling uses both queues (channels) for incoming requests and protected maps for tracking responses. It’s like having both a ticket system (queue) and a protected ledger (mutex-protected map) for handling customer requests.
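A sketch of the response-tracking side, built on the clientHandler struct above; the clientSeq key and helper names are assumptions:
// Sketch only: illustrative helpers, not the exact handin code.
func (h *clientHandler) registerRequest(clientSeq uint64) chan *pb.Response {
    respChan := make(chan *pb.Response, 1)
    h.mu.Lock()
    h.responseChanMap[clientSeq] = respChan // protected map: who to answer later
    h.mu.Unlock()
    return respChan
}

// deliverResponse runs once the slot holding this request is decided; it hands
// the result to the goroutine waiting on the registered channel.
func (h *clientHandler) deliverResponse(clientSeq uint64, resp *pb.Response) {
    h.mu.Lock()
    respChan, ok := h.responseChanMap[clientSeq]
    delete(h.responseChanMap, clientSeq)
    h.mu.Unlock()
    if ok {
        respChan <- resp
    }
}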
Q: Why does Multi-Paxos use both mutexes and channels for shared memory?
A:
Different needs require different solutions:
Mutexes Used For:
Protecting critical state
Ensuring atomic updates
Preventing race conditions
Channels Used For:
Asynchronous communication
Event notification
Request queuing
Clean shutdown handling
Benefits:
Better separation of concerns
Natural Go concurrency patterns
Clearer communication flows
Easier deadlock prevention
Simple Explanation:
Mutexes and channels serve different purposes - mutexes protect shared data (like a lock on a diary), while channels handle communication (like a message pipeline). Using both gives the best of both worlds: safe data access and clean communication.
Q: What triggers round number (ballot number) advancement in Paxos?
A:
Two main triggers:
New Leader Election:
A proposer that believes it is the new leader picks a round number higher than any it has seen
It sends a prepare message with that higher round number
Failure Scenarios:
Actual crash of old proposer
False suspicion of crash by other nodes
Simple Explanation:
Round numbers increase when leadership changes, either due to real failures or when nodes suspect (correctly or incorrectly) that the leader has failed.
Q: What does multiple active proposers (leaders) tell us about system synchrony?
A:
Key Implications:
Asynchronous Behavior:
Multiple leaders indicate system asynchrony
At least one false suspicion exists
Network or timing issues present
Normal Operation:
Exactly one leader is expected
A non-leader only steps up when it suspects the default leader has failed
Multiple simultaneous leaders therefore signal failure-detection uncertainty
Simple Explanation:
Having multiple leaders means the system isn’t behaving in a perfectly timed (synchronous) way. It’s like having two conductors leading an orchestra - it happens when one conductor can’t see that the other is still active.
Q: What is the equation for fault tolerance in Paxos and why?
A:
Equation: n = 2f + 1
where:
n = total number of replicas
f = number of failures to mask
Root Causes:
Need majority for progress
Can’t distinguish between:
Actually crashed replicas
Slow replicas
Network partitioned replicas
Why This Number:
Must prevent minority from making progress
Need majority quorum for both:
Promise messages
Learn messages
Ensures safety when partitioned
Simple Explanation:
We need more than half the replicas (majority) to be working to make decisions. This prevents split-brain scenarios where different groups might make conflicting decisions.
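Worked example: to mask f = 1 failure you need n = 2·1 + 1 = 3 replicas with majority quorums of size 2; for f = 2, n = 5 with quorums of size 3. Any two majorities overlap in at least one replica, which is how knowledge of a chosen value survives into later rounds.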
Q: Can Paxos enter an infinite loop?
A:
The algorithm itself never violates safety, but in practice yes, it can livelock without making progress:
Causes of Infinite Loops:
Leader Competition:
Two proposers repeatedly compete
Neither gets enough time to complete
Failure Scenarios:
More than half nodes failed
Leader keeps starting new rounds
Never achieves majority
Failure Detector Interaction:
Continuous false suspicions
Repeated leader changes
Simple Explanation:
While Paxos itself can’t loop forever, the interaction with failure detection and leader election can cause continuous round changes without progress. It’s like having two people repeatedly interrupting each other before either can finish speaking.
Q: Can an acceptor accept different values in different rounds?
A:
Yes, but with important constraints:
Conditions:
Can accept new value in higher round
Only if leader didn’t see previous promise
Majority acceptance ensures value preservation
Safety Guarantee:
If value V accepted by majority in round R
All future rounds will propose V
Ensures consistency
Simple Explanation:
An acceptor can change its mind in a higher round, but the protocol ensures that once enough acceptors agree on a value, that value sticks - future rounds will maintain it.
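Example: suppose only acceptor A accepted value V in round 3. A new leader running round 7 may gather promises from a majority that excludes A, see no accepted value, and propose W; A then accepts W in round 7, so it has accepted different values in different rounds. If a majority had accepted V in round 3, every promise quorum for round 7 would intersect that majority, forcing the leader to re-propose V.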
Q: What is the difference between Single-Decree Paxos and Multi-Paxos?
A:
Key Differences:
Phase 1 Execution:
Single-Decree: Runs Phase 1 for every value
Multi-Paxos: Runs Phase 1 only on leader change
Slot Management:
Single-Decree: One instance for one value
Multi-Paxos: Multiple slots for sequence of values
Leader can propose multiple values after single Phase 1
Promise Messages:
Single-Decree: Contains single value
Multi-Paxos: Contains array of accepted values for different slots
Simple Explanation:
Multi-Paxos is like having a long-term chairman (leader) who can propose multiple items without re-election, while Single-Decree Paxos requires a new election for each decision.
Q: How does Multi-Paxos handle client requests?
A:
Three-Layer Architecture:
Client Handler:
type clientHandler struct {
    id              uint32
    active          bool // leader status
    responseChanMap map[uint64]chan *pb.Response
}
Request Flow:
Client sends request to any replica
Only leader processes requests
Non-leaders forward to leader
Leader assigns slot number
Runs Accept phase
Response Handling:
Wait for Learn messages
Notify client when decided
Handle timeouts
Simple Explanation:
Like a restaurant where any waiter can take your order, but only the head chef (leader) decides when and how to prepare it. Other waiters forward orders to the head chef.
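A sketch of the leader check at the start of that flow, using fields shown in earlier cards; forwardToLeader is a hypothetical helper:
// Sketch only: illustrative request path, not the exact handin code.
func (r *Replica) handleClientRequest(ctx context.Context, val *pb.Value) {
    r.mu.Lock()
    isLeader := r.isLeaderFlag
    r.mu.Unlock()

    if !isLeader {
        // Non-leaders never propose; hand the request to the current leader.
        r.forwardToLeader(ctx, val)
        return
    }
    // The leader queues the value; a proposer goroutine assigns it the next
    // free slot and runs the Accept phase for that slot.
    r.clientValueQueue <- val
}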
Q: What role does the Failure Detector play in Multi-Paxos?
A:
Key Components:
Leader Detection:
Monitors node health
Triggers leader changes
Uses heartbeat mechanism
Integration:
type Replica struct {
    failureDetector gorumsfd.FailureDetector
    leaderDetector  *leaderdetector.MonLeaderDetector
}
Actions on Failure:
Suspect current leader
Trigger new leader election
Start Phase 1 if becoming leader
Block client requests during transition
Simple Explanation:
The Failure Detector is like a health monitoring system that watches all nodes. If it suspects the leader is down, it triggers the election of a new leader to keep the system running.
Q: How does Multi-Paxos ensure consistency across replicas?
A:
Multiple Mechanisms:
Slot Ordering:
Sequential slot numbers
Leader assigns slots
Replicas process in order
Value Selection:
func (qs PaxosQSpec) PrepareQF(…) {
    // Find highest round number for each slot
    // Fill gaps with no-op values
    // Ensure consistent ordering
}
State Synchronization:
New leader learns previous decisions
Re-proposes uncommitted values
Fills gaps with no-ops
Simple Explanation:
Like maintaining a shared ledger where pages (slots) must be filled in order, and new leaders must first understand what’s already written before adding new entries.
“If slot 5 is empty after a leader change, the new leader proposes a no-op for slot 5 before handling slot 6, preserving order.”
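A sketch of the gap-filling step with simplified types; this is illustrative, not the actual PrepareQF implementation:
// Sketch only: simplified types showing how a new leader could fill gaps with
// no-ops while keeping slot order.
type slotValue struct {
    rnd  uint32
    val  string // placeholder for the proposed value
    noop bool
}

// fillGaps takes the accepted values reported by a quorum of promises (highest
// round already chosen per slot) and returns a dense list of proposals from
// minSlot to maxSlot, inserting no-ops for slots nobody reported.
func fillGaps(accepted map[uint32]slotValue, minSlot, maxSlot uint32) []slotValue {
    out := make([]slotValue, 0, maxSlot-minSlot+1)
    for slot := minSlot; slot <= maxSlot; slot++ {
        if v, ok := accepted[slot]; ok {
            out = append(out, v) // re-propose the previously accepted value
        } else {
            out = append(out, slotValue{noop: true}) // keep the slot order intact
        }
    }
    return out
}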
Q: How does Multi-Paxos handle concurrent client requests?
A:
Request Queuing:
clientValueQueue: make(chan *pb.Value, multipaxos.Alpha)
Slot Assignment:
Leader assigns sequential slots
Can batch multiple requests
Maintains ordering guarantees
Performance Optimizations:
Pipelining of requests
Batch processing
Concurrent Phase 2 instances
Simple Explanation:
Like a ticket system where requests are queued and processed in batches while maintaining a strict order. The leader can work on multiple requests simultaneously but ensures they complete in sequence.
Q: What happens during a leader change in Multi-Paxos?
A:
Detection and Notification:
leaderChangeChannel := r.leaderDetector.Subscribe()
case newLeaderID := <-leaderChangeChannel:
    r.handleLeaderChange(ctx, newLeaderID)
New Leader Actions:
Stop accepting client requests
Run Phase 1 for all slots
Learn uncommitted values
Resume client handling
Other Replica Actions:
Update leader reference
Forward pending requests
Clear old state
Simple Explanation:
Like changing shift managers - the new leader must first understand all pending work (Phase 1), decide what needs to be redone, and then resume normal operations. Other workers need to recognize the new manager.
Q: What is Paxos and its primary use case?
A: Paxos is a consensus algorithm for agreeing on a single value among distributed processes. It’s used in replicated state machines to order client commands consistently across servers, ensuring fault-tolerant and highly available systems.
Q: What are the three consensus safety properties in Paxos?
A:
Only proposed values can be chosen
Exactly one value is chosen
Processes learn chosen values only after they're committed
Q: What optimization does Paxos employ with a stable leader? (Multipaxos)
A: The prepare phase (Phase 1) is skipped. The leader directly sends accept messages using its existing leadership authority, reducing latency by 1 round trip time.