System Design Flashcards

(101 cards)

1
Q

7 steps of systems design problems

A
  1. Requirements clarification
  2. Back-of-the-envelope estimation
  3. System interface definition
  4. Defining data model
  5. High-level design
  6. Detailed design
  7. Identifying and resolving bottlenecks
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Step 1: Requirements Clarification

A
  • Determine exact scope
  • Define end goals of system
  • Clarify which parts of system to focus on (e.g. back-end vs. front-end)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Step 2: Back-of-the-envelope estimation

A
  • Estimate and quantify scale of system
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Step 3: System interface definition

A

Define what APIs are expected from the system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Step 4: Define data model

A
  • How will data flow between components?
  • How will different entities interact with each other?
  • How will we partition and manage data? Specific choices:
    • Which database system should we use? NoSQL vs SQL?
    • What kind of block storage should we use to store files? (e.g. multimedia)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Step 5: High-level design

A
  • Draw a block diagram representing core components to solve the actual problem from end-to-end.
  • Possibly describe the system verbally or type out in some kind of list format
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Step 6: Detailed design

A
  • Dig deeper in to 2-3 major components, guided by interviewer feedback.
  • Consider tradeoffs between different approaches.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Step 7: Identifying and resolving bottlenecks

A
  • Identify any single points of failure and discuss mitigation
  • Discuss redundancy and backup plans for data and services
  • Discuss performance monitoring
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

key characteristics of distributed systems

A
  • Scalability
  • Reliability
  • Availability
  • Efficiency
  • Serviceability or Manageability
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Scalability

A

capability of a system, process, or network to grow and manage increased demand

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Horizontal vs vertical scaling

A
  • Horizontal is easier to scale dynamically by adding machines
  • Vertical scaling is upper limited and may involve downtime
  • Horizontal scaling examples: Cassandra, MongoDB
  • Vertical scaling examples: MySQL
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Reliability

A
  • Probably a system will fail in a given period
  • Distributed system: keeps delivering services with one or several component failures
  • Availability over time
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Availability

A
  • Time a system remains operational over a specific period

* Accounts for maintainability and repair

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Efficiency

A
  • Latency / response time to requests (correlates to number of messages)
  • throughput / bandwidth (correlates to size of messages)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Serviceability / manageability

A
  • Ease to operate and maintain
  • Simplicity and speed of repair (as it increases, availability/reliability decrease)
  • Considerations: ease of diagnostics, ease of updates
  • Automated fault detection
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Load balancer

A
  • component to spread traffic across a cluster of servers
  • improves responsiveness and availability
  • crucial to horizontal scaling
  • Ensure health of chosen server
  • Select a healthy server
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Load balancing placements

A
  • Between client and web server
  • between web servers and internal layer (app servers)
  • between internal layer and database
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Load balancer: least connection method

A
  • directs traffic to server with fewest connections

* good for large number of persistent connections

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Load balancer: least response time method

A
  • directs traffic to the server with the lowest response time
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Load balancer: least bandwidth method

A
  • selects the server that is currently serving the least amount of traffic
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Load balancer: round robin method

A
  • cycles through available servers and sends each request to the next server
  • good for servers of equal specs and few persistent requests
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Load balancer: weighted round robin method

A
  • round robin but with weights on different servers based upon processing capacity
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Load balancer: IP hash method

A
  • client IP address is hashed and servers are each assigned blocks of hashes
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Load balancer redundancy

A
  • can be a single point of failure
  • can add more LBs to form a cluster of active/passive instances
  • clustered LBs monitor each other and passive takes over if active fails
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Caching
* locality of reference principle: recently requested data is likely to be requested again * often implemented near front end to reduce downstream traffic
26
Application server cache
* cache placed directly on request layer, check if each request is in the cache before fetching from disk * can have multiple layers, e.g. node memory -> node disk (still faster than network)
27
Content Delivery Network (CDN)
* cache for sites serving large amounts of static media
28
Cache invalidation
* maintenance to keep cache consistent with database when data changes
29
Write-through cache
* Data is written into the cache and DB simultaneously * Maximizes consistency, minimizes risk of loss * Higher latency as a result of double write operations
30
Write-around cache
* Data is written directly to permanent storage, bypassing the cache * avoids flooding the cache with writes * increases chance of cache misses, which must then be read from back end with higher latency
31
Write-back cache
* Data is written to cache alone * Write to permanent storage is done in intervals or chunks * Lowest latency, highest throughput for write-intensive apps * Risk of data loss due to crash since there is no cache backup
32
Cache eviction policies (6)
* FIFO - evicts oldest block first * LIFO - evicts newest block first * LRU - evicts least recently used items first * MRU - evicts most recently used items first * LFU - evicts least frequently used items first * RR - random replacement
33
Importance of estimation
* important later for scaling, partitioning, load balancing, and caching
34
Examples of system parameters to estimate
* Examples of things to quantify: number of actions, amount of storage, expected network bandwidth usage
35
Components to include in high level design
*Clients, load balancing, application servers, databases, file storage
36
Examples of detailed design topics
* How will we partition data types between multiple databases? * How should we optimize data storage further (e.g. recency)? * How much and at what layer should we implement caching? * Which components need load balancing?
37
Horizontal scaling
Horizontal add more servers in to your resource pool
38
Vertical scaling
Adding more power/capacity to an existing server
39
Cache misses
when request data is not found in the cache
40
Cache invalidation: 3 main schemas
* write-through * write-around * write-back
41
CDN for smaller systems
* for smaller systems, we can design for future transition to a CDN with a separate subdomain for static media
42
Data partitioning
technique to break up a big database in to many smaller parts across multiple machines
43
Benefits of data partitioning
Improves the manageability, performance, availability, and load balancing of an application
44
Justification for data partiioning
After a certain scale point, it is cheaper and easier to scale horizontally by adding machines than it is to grow vertically
45
Popular data partitioning methods/schemes
* Horizontal partitioning * Vertical partitioning * Directory-based partitioning
46
Horizontal partitioning (range-based partitioning) (data sharding)
Putting different rows in to different tables based upon range of a certain value
47
Main risk with horizontal partitioning
If the value chosen for partitioning isn't evenly distributed, then the scheme will lead to unbalanced servers
48
Vertical partitioning
* divide data to store tables related to a specific feature in their own server * different types of data in different servers
49
Main risk with vertical partitioning
If the app grows, we may need to further/horizontally partition a feature-specific database
50
Directory Based Partitioning
* loosely coupled approach * create a *lookup service* which knows your current partitioning scheme * separates partitioning from the DB access code * functionality: query the directory server which holds the mapping between key and DB server
51
Directory based partitioning benefits
* because it is loosely coupled, we can add servers to the DB pool or change the partitioning scheme without impacting the application
52
Key or hash-based partitioning
* apply a hash to some attributes of the entity we are storing, yielding a partition number * problem: effectively fixes the total number of DB servers - workaround is to use consistent hashing
53
List partitioning
* each partition is assigned a list of values | * check each record against the list and store it in the relevant partition
54
Round-robin partitioning
* ensures uniform data distribution by rotating data assignment between partitions
55
Composite partitioning
* combination of any partitioning schemes to devise a new scheme * e.g. first applying list partitioning then hash based
56
Consistent hashing
* hash-based approach that handles adding/removing servers * hash the objects and the servers randomly to a unit circle * hash(o) mod(360) * assign each object to the next server in the circle in clockwise order
57
Hash-based partitioning algorithm
* in a system with n servers, place object o in server with id hash(o) mod n
58
Consistent hashing benefits
If a server fails: only objects mapped to the failed server need to be reassigned to the next server clockwise If a server is added: only objects mapped to the new server need to be moved In either case, most objects maintain their prior assignments
59
Partitioning issues: joins and denormalization
Joins are often not feasible across partitions Common workaround is to denormalize the DB by adding redundant copies across multiple databases Denormalization downside is increased risk of data inconsistently
60
Partitioning issues: Referential integrity
Most RDBMS do not support foreign keys across DBs on different servers Apps that require referential integrity across partitioned DBs often have to enforce it in application code
61
Partitioning issues: rebalancing
Rebalancing is difficult without incurring downtime since we have to move resources across partitions Directory-based partitioning can make rebalancing easier at the cost of increased system complexity and a new single point of failure on the lookup service
62
Database indexing
Create an index on particular DB tables to make it faster to search. Index can be created across one or more columns and includes a pointer to the full row
63
Indexing use cases
* Finding a small payload in a large dataset
64
Indexing performance impacts
* Can speed up retrieval * Increased write time * Increased storage requirements
65
Proxy server
* intermediate server between the requests from clients and the servers that handle those requests
66
Proxy server use cases
* Filter, log, and transform requests | * Can serve requests from its cache to reduce downstream load
67
Open proxy
proxy server accessible to any internet user
68
Reverse proxy
Proxy within an internal network, not visible to the client. Can include load balancing, caching, security to protect internal servers from direct access
69
Forward proxy
Proxy managed by a client to handle requests to an external server
70
Redundancy
Duplication of critical components or functions in a system to increase the reliability or improve performance
71
Replication
Sharing information to ensure consistency between redundant resources. Often used in DBMS where updates are written to a primary server which then passes it to the replica servers
72
Relational database (SQL)
* Structured with predefined schemas * Each row contains information about an entity * Each column contains a particular point of data
73
Non-relational database (NoSQL)
* Unstructured * easily distributed * dynamic schema
74
Common types of NoSQL DBs
* Key-value stores * Document DBs * wide-column databases * graph databases
75
Key-value store and examples
* Stores an array of key-value pairs | * Examples: Redis
76
Document databases
* Data is stored in documents which are grouped in to collections * Each document can have a unique structure * Example: MongoDB
77
Wide-column database
* Instead of tables, uses column families which are containers for rows * Don't need to know all the columns up front * Each row can have different numbers of columns * Best for analyzing largedatasets * Examples: Cassandra, HBase
78
Graph database
* Store data graph relations * Data saved in the form of nodes and their properties, and lines/connections between nodes * Examples: Neo4J
79
SQL vs NoSQL: Storage
SQL: data stored in tables where each row represents an entity NoSQL: variety of storage models
80
SQL vs NoSQL: Schema
SQL: each record conforms to a fixed/predefined schema. Higher referential integrity. NoSQL: schemas are dynamic and changeable
81
SQL vs NoSQL: querying
SQL: uses SQL to manipulate and precisely retrieve subsets of data NoSQL: queries are focused on retrieving collections of full records
82
SQL vs NoSQL: scalability
SQL: vertically scalable, difficult to scale horizontally without duplication or indexing NoSQL: highly horizontally scalable since referential integrity is less of a priority
83
SQL vs NoSQL: ACID compliance
ACID: Atomicity, Consistency, Isolation, Durability SQL: usually ACID compliant therefore more reliable NoSQL: sacrifices ACID compliance for performance and scalability
84
CAP Theorem
States that it is impossible for a distributed system to simultaneously provide more than 2/3 of the following: * consistency, availability, and partition tolerance
85
Consistency
Every read retrieves the most recent write
86
Availability
Every request receives a response
87
Partition tolerance
The system continues to work despite message loss or partial failure. Data is sufficiently replicated across nodes and networks to handle intermittent/partial outages
88
standard HTTP web request sequence of events
1. Client opens a connection and requests data from server 2. The server calculates the response 3. The server sends back the response
89
AJAX (asynchronous Javascript) polling
Client repeatedly polls a server for data
90
Long-polling (Hanging GET)
Client requests information from the server, server holds the request open and waits until data is available
91
WebSockets
Persistent connection between a client and server over a single TCP connection * connection established via a WebSocket handshake * allows for real-time data transfer
92
Server side events (SSEs)
* Clients establish a persistent, long term connection to the server * Server uses this connection to send data to a client
93
storage capacity model
margin of extra storage above anticipated needs
94
REST API
REpresentational State Transfer
95
4 possible API actions (CRUD) and associated methods
* Create (POST) * Read (GET) * Update (PUT/PATCH) * Delete (DELETE)
96
HTTP Request headers
optional set of key value properties shared from client to server
97
JSON document
JavaScript Object Notation. Document in a key value format frequently used for data transfer via REST APIs
98
Ways to authenticate
* basic authentication (username/password) | * secret token (e.g. oAuth)
99
80-20 rule
20% of requests generate 80% of traffic
100
Key generation service (KGS)
* Service to generate random keys in advance of requests | * Removes risk of duplicates/key collisions
101
Purging/DB cleanup considerations
* storage is cheap, low cost of storing things for a long time * searching for expired data can be costly * active cleanup logic should consider app traffic * passive cleanup logic could recognize and delete expired data on request