Scalability (L12) Flashcards

Question 1

Q

What is Scalability in distributed systems?

Answer

A

Scalability is a system’s ability to handle growing amounts of work or users without suffering a drop in performance. A scalable system can expand its resources (like CPU, memory, or network capacity) to keep up with increasing demand.

Question 2

Q

What are the three dimensions of scalability mentioned in the lecture?

Answer

A

● Size Scalability: Can the system handle more users or data?
● Geographical Scalability: Can the system operate efficiently across large distances?
● Administrative Scalability: Can multiple organizations or teams manage their own parts without conflict?

Question 3

Q

What are the common root causes for scalability problems in centralized solutions?

Answer

A

● CPU limitations: One machine can only do so much.
● Storage and I/O bottlenecks: Disk or database speed becomes a constraint.
● Network congestion: Too many users talking to one central server overloads the network.

Question 4

Q

Name three fundamental techniques for scaling distributed systems.

Answer

A

● Hide communication latencies: Use async requests or local caching to avoid waiting.
● Replication and Caching: Store copies of data closer to users.
● Modularization and Decomposition: Break the system into smaller, independent parts that can scale separately.
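
As a rough illustration of hiding communication latency with asynchronous requests, the sketch below starts three calls at once instead of waiting for each in turn. The `fetch` function and its service names are hypothetical, and `asyncio.sleep` stands in for real network delay.

```python
import asyncio
import time


async def fetch(name: str, delay_s: float) -> str:
    # Hypothetical remote call: asyncio.sleep stands in for network latency.
    await asyncio.sleep(delay_s)
    return f"{name}: done"


async def fetch_all() -> list:
    # All three requests are in flight at the same time, so the total wait
    # is close to the slowest single call rather than the sum of all calls.
    return await asyncio.gather(
        fetch("users", 0.1),
        fetch("orders", 0.1),
        fetch("inventory", 0.1),
    )


start = time.perf_counter()
results = asyncio.run(fetch_all())
elapsed_s = time.perf_counter() - start
print(results, f"{elapsed_s:.2f}s")
```

Issued one after another, the same three calls would wait roughly three times as long; the work per call is unchanged, only the waiting overlaps.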

Question 5

Q

What is Modularization, and what are its general design principles?

Answer

A

Modularization means breaking a system into well-defined parts (modules), each handling one responsibility. Key principles:
● Explicit Interfaces: Make dependencies visible.
● Low Coupling: Keep modules loosely connected.
● Small Interfaces: Don’t expose more than necessary.
● High Cohesion: Each module should do one thing well.
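
One way these principles can look in code is sketched below. The names `Storage`, `InMemoryStorage`, and `SessionService` are invented for illustration and do not come from the lecture.

```python
from typing import Optional, Protocol


class Storage(Protocol):
    # Explicit, small interface: the only two operations callers may rely on.
    def get(self, key: str) -> Optional[str]: ...
    def put(self, key: str, value: str) -> None: ...


class InMemoryStorage:
    # One possible implementation; it could be swapped for a database-backed one.
    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class SessionService:
    # Low coupling: depends only on the Storage interface, not on a concrete class.
    # High cohesion: it only manages sessions.
    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def login(self, user: str, token: str) -> None:
        self._storage.put(f"session:{user}", token)

    def token_for(self, user: str) -> Optional[str]:
        return self._storage.get(f"session:{user}")


service = SessionService(InMemoryStorage())
service.login("ada", "token-1")
```

Because `SessionService` only sees `get` and `put`, the storage module can be replaced or scaled on its own without touching the session logic.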

Question 6

Q

Differentiate between “Scaling Up” and “Scaling Out”.

Answer

A

● Scaling Up (Vertical): Upgrade a single machine (e.g., more CPU or RAM).
● Scaling Out (Horizontal): Add more machines or servers to handle the load.

Question 7

Q

What are the three further “Scaling Out” dimensions discussed?

Answer

A

● Functional Decomposition: Split the app into smaller pieces (e.g., microservices).
● Partitioning (Sharding): Divide the data across servers.
● Duplication: Clone services and balance the load between them.

Question 8

Q

What is Functional Decomposition in practice?

Answer

A

It’s the process of breaking a monolithic application into independent, focused services (like microservices), which communicate through APIs.

Question 9

Q

What is Partitioning (Sharding)?

Answer

A

Sharding means splitting a dataset into smaller chunks, stored on different nodes, so no single node is overloaded.

Question 10

Q

What are two common methods for Partitioning?

Answer

A

● Range Partitioning: Data is split based on sorted key ranges (e.g., A–F, G–M…).
● Hash Partitioning: A hash of the key selects the shard, which spreads data roughly evenly across shards.
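
The two methods can be sketched as follows. The boundaries, the shard count, and the function names are assumptions chosen for illustration; the range version assumes non-empty keys that start with a letter.

```python
import bisect
import hashlib

# Assumed split points: first letter a-f -> shard 0, g-m -> shard 1, n-z -> shard 2.
RANGE_BOUNDARIES = ["g", "n"]


def range_shard(key: str) -> int:
    # Range partitioning: find which sorted key range the first letter falls into.
    return bisect.bisect_right(RANGE_BOUNDARIES, key[0].lower())


def hash_shard(key: str, num_shards: int) -> int:
    # Hash partitioning: a stable hash of the key, reduced modulo the shard count.
    # hashlib is used because Python's built-in hash() for strings is
    # randomized per process and so is not stable across runs.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for name in ["alice", "george", "mallory", "nina"]:
    print(name, range_shard(name), hash_shard(name, 3))
```

Range partitioning keeps neighbouring keys together, which helps range scans but can create hot shards; hash partitioning gives up ordering in exchange for a more even spread.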

Question 11

Q

What is a Distributed Hash Table (DHT)?

Answer

A

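A DHT is a peer-to-peer structure that maps keys to nodes, typically using consistent hashing. In ring-based designs such as Chord, nodes are arranged on a ring and each one stores a portion of the key space. DHTs scale well and tolerate node changes: when a node joins or leaves, only the keys in its neighbourhood of the ring have to move.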
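
The key-to-node mapping of a ring-based design can be sketched in a single process as below. This is only a local simulation under assumed names (`HashRing`, `node-a`, and so on): a real DHT also needs peer-to-peer routing between nodes, which is not shown.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    # Place node names and keys on the same ring using one stable hash.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes):
        # Sorted list of (position, node name) pairs.
        self._ring = sorted((ring_position(n), n) for n in nodes)

    def add(self, node: str) -> None:
        bisect.insort(self._ring, (ring_position(node), node))

    def lookup(self, key: str) -> str:
        # A key belongs to the first node clockwise from its position,
        # wrapping around to the start of the ring.
        positions = [p for p, _ in self._ring]
        index = bisect.bisect_right(positions, ring_position(key)) % len(self._ring)
        return self._ring[index][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(200)]
before = {k: ring.lookup(k) for k in keys}
ring.add("node-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Every key that changes owner moves to the newly added node; all other keys stay where they were, which is the property that makes node changes cheap.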

Question 12

Q

What is Duplication (Multi-tenancy) in the context of scalability?

Answer

A

Duplication means running multiple identical instances of a service to increase capacity. A load balancer distributes incoming requests. Platforms like AWS Auto Scaling automate this based on real-time demand.
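
A minimal sketch of the load-balancing part, assuming a simple round-robin policy (the lecture does not say which policy is used). The class and instance names are hypothetical, and health checks and auto scaling are left out.

```python
import itertools


class RoundRobinBalancer:
    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("at least one instance is required")
        # Cycle through the identical instances in a fixed order.
        self._cycle = itertools.cycle(instances)

    def route(self, request: str) -> str:
        # Return the instance that should handle this request.
        return next(self._cycle)


balancer = RoundRobinBalancer(["instance-1", "instance-2", "instance-3"])
assignments = [balancer.route(f"request-{i}") for i in range(6)]
print(assignments)
```

Because the instances are identical, any of them can serve any request, so adding capacity amounts to adding another entry to the list.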

Question 13

Q

What is a Metric in system performance monitoring?

Answer

A

A metric is a numeric value (like requests per second) tracked over time. Metrics often include labels (e.g., service="auth") to filter or group them in dashboards.
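
To make filtering and grouping by label concrete, here is a small sketch over made-up samples. The metric name, label values, and helper functions are all assumptions for illustration and do not reflect any particular monitoring tool.

```python
from collections import defaultdict

# Hypothetical samples: (timestamp in seconds, metric name, labels, value).
SAMPLES = [
    (0, "requests_per_second", {"service": "auth", "region": "eu"}, 120.0),
    (0, "requests_per_second", {"service": "cart", "region": "eu"}, 80.0),
    (60, "requests_per_second", {"service": "auth", "region": "eu"}, 150.0),
    (60, "requests_per_second", {"service": "cart", "region": "us"}, 95.0),
]


def select(samples, name, **labels):
    # Filter: keep samples of one metric whose labels match every key=value given.
    return [
        s for s in samples
        if s[1] == name and all(s[2].get(k) == v for k, v in labels.items())
    ]


def mean_by(samples, label):
    # Group: average the values of all samples that share the same label value.
    groups = defaultdict(list)
    for _, _, labels, value in samples:
        groups[labels.get(label)].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


auth_samples = select(SAMPLES, "requests_per_second", service="auth")
per_service = mean_by(SAMPLES, "service")
print(len(auth_samples), per_service)
```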

Question 14

Q

What are Quality of Service (QoS) attributes?

Answer

A

QoS attributes define how well a system performs from the user’s perspective. Examples:
● Response Time
● Throughput
● Availability
● Reliability
● Security
● Scalability
● Extensibility

Question 15

Q

What is a Service-Level Indicator (SLI)?

Answer

A

An SLI is a quantitative measurement of how a service behaves. For example, “the percentage of requests answered in under 200 ms” is an SLI; a measured value such as 99.9% is its current reading.

Question 16

Q

What is a Service-Level Objective (SLO)?

Answer

A

An SLO defines a target for an SLI, that is, the level of performance you aim to maintain. For example, “Uptime should be at least 99.99%”.
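
The relationship between the two can be sketched as: measure the indicator, then compare it with the objective. The latencies, the 200 ms threshold, and the targets below are invented numbers, not figures from the lecture.

```python
def latency_sli(latencies_ms, threshold_ms=200.0):
    # SLI: the fraction of requests answered faster than the threshold.
    if not latencies_ms:
        raise ValueError("no requests observed")
    good = sum(1 for latency in latencies_ms if latency < threshold_ms)
    return good / len(latencies_ms)


def meets_slo(sli: float, target: float) -> bool:
    # SLO: the target the measured SLI is expected to reach or exceed.
    return sli >= target


# Hypothetical request latencies in milliseconds.
latencies = [120, 95, 180, 210, 150, 130, 400, 110, 90, 175]
sli = latency_sli(latencies)
print(sli, meets_slo(sli, 0.999), meets_slo(sli, 0.75))
```

Here 8 of the 10 requests finish in under 200 ms, so the measured SLI is 0.8; that misses a 99.9% objective but would satisfy a 75% one.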

Question 17

Q

What is the Utilization Law in performance analysis?

Answer

A

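The Utilization Law relates how busy a resource is to its throughput and service time:
● Utilization is defined as U = B / T, where B is busy time and T is total observation time
● The law states U = X × S, where X is throughput (X = C / T) and S is the mean service time per completion (S = B / C)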
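
A short worked example with made-up measurements shows that both forms give the same utilization, since U = B / T = (C / T) × (B / C) = X × S. The numbers below are assumptions chosen for easy arithmetic.

```python
import math

# Hypothetical observation of a single server.
total_time_s = 60.0   # T: length of the observation window
busy_time_s = 45.0    # B: time the server spent serving requests
completions = 150     # C: requests completed in the window

utilization_direct = busy_time_s / total_time_s        # U = B / T
throughput = completions / total_time_s                # X = C / T
service_time_s = busy_time_s / completions             # S = B / C
utilization_law = throughput * service_time_s          # U = X * S

print(utilization_direct, throughput, service_time_s, utilization_law)
```

The server completes 2.5 requests per second at about 0.3 seconds each, so it is busy about 75% of the time by either calculation.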

Question 18

Q

How is Throughput (X) calculated?

Answer

A

Throughput is how much work gets done per unit of time. Formula: X = C / T, where C is the number of completions and T is the total time.
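
In practice C is often counted from a log of completion times. The sketch below assumes such a log (the timestamps are invented) and counts the completions that fall inside the observation window.

```python
import math


def throughput(completion_times_s, window_start_s, window_end_s):
    # X = C / T: completions inside the window divided by the window length.
    duration_s = window_end_s - window_start_s
    if duration_s <= 0:
        raise ValueError("the observation window must have positive length")
    completed = sum(
        1 for t in completion_times_s if window_start_s <= t < window_end_s
    )
    return completed / duration_s


# Hypothetical completion timestamps in seconds.
completion_log = [0.4, 1.1, 1.9, 2.5, 3.2, 4.8, 5.1, 7.7, 9.3, 10.2]
x = throughput(completion_log, 0.0, 10.0)
print(x)
```

Nine of the ten completions fall inside the 10-second window, giving 0.9 completions per second; the last one at 10.2 s is outside the window and is not counted.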

(18 cards)