System Design Flashcards
(27 cards)
1
Q
Delivery Framework
A
- Requirements (5 min)
- Core Entities (2 min)
- API (5 min)
- (Optional) Data Flow (5 min)
- High Level Design (10 - 15 min)
- Deep Dives (10 min)
2
Q
Delivery Framework: Requirements
A
- Functional requirements (prioritize ~3)
- Non-functional
- (Optional) capacity estimation
3
Q
Non-functional requirements
A
- CAP Theorem (consistency or availability)
- Environmental constraints
- Scalability (reads vs writes, hot spots)
- Latency
- Durability
- Security
- Fault Tolerance
4
Q
Metrics units
A
- Thousand = kilo
- Million = mega
- Billion = giga
- Trillion = tera
- Quadrillion = peta
5
Q
Common latencies
A
- Reading 1 MB sequentially from memory = 0.25 ms
- Reading 1 MB sequentially from SSD = 1 ms
- Reading 1 MB sequentially from spinning disk = 20 ms
- Round trip network latency CA to Netherlands = 150 ms
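As a quick worked example, the per-MB figures on this card can be scaled up to 1 GB. This is a rough sketch that assumes read time grows linearly with size and takes 1 GB = 1000 MB for round numbers; the helper name `read_time_ms` is illustrative.

```python
# Per-MB sequential read latencies from the card, in milliseconds.
READ_MS_PER_MB = {
    "memory": 0.25,
    "ssd": 1.0,
    "spinning_disk": 20.0,
}


def read_time_ms(medium: str, size_mb: float) -> float:
    """Estimate sequential read time, assuming latency scales linearly with size."""
    return READ_MS_PER_MB[medium] * size_mb


for medium in READ_MS_PER_MB:
    seconds = read_time_ms(medium, 1000) / 1000
    print(f"{medium}: {seconds:.2f} s per GB")
```

Under these assumptions, a spinning disk takes about 20 s per GB against 0.25 s for memory, which is why hot data is usually kept in memory.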
6
Q
Common storage
A
- 2-hour movie = 1 GB
- Small book of plain text = 1 MB
- High-resolution photo = 1 MB
- Medium-resolution image or web graphic = 100 KB
7
Q
Common domain estimations
A
- DAUs on a social media network = 1B
- Hours of video streamed on Netflix/day = 100M
- Google searches/second = 100k
- Size of Wikipedia = 100 GB
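A back-of-envelope sketch of how these figures combine during capacity estimation. It reuses the "2-hour movie = 1 GB" figure from the Common storage card and decimal units (1 PB = 1,000,000 GB); the variable names are illustrative, and the results are order-of-magnitude estimates only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Google searches/second = 100k  ->  searches per day
searches_per_second = 100_000
searches_per_day = searches_per_second * SECONDS_PER_DAY

# Hours streamed on Netflix/day = 100M, at roughly 0.5 GB per hour
hours_streamed_per_day = 100_000_000
gb_per_hour = 1 / 2  # from "2-hour movie = 1 GB"
gb_streamed_per_day = hours_streamed_per_day * gb_per_hour
pb_streamed_per_day = gb_streamed_per_day / 1_000_000  # GB -> PB

print(f"{searches_per_day:,} searches/day")
print(f"{pb_streamed_per_day:.0f} PB streamed/day")
```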
8
Q
2 types of Scaling
A
- horizontally - adding more machines
- vertically - adding more resources to a single machine
9
Q
Requirements for horizontal scaling
A
- load balancer
- load balancer strategy (round robin, queuing system, least connections, utilization-based)
- try to partition data such that a single node has all the data it needs
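Two of the load balancer strategies named above, round robin and least connections, can be sketched as follows. This is a minimal in-process illustration, not how a production balancer is built; the class and method names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to `server` closes.
        self.active[server] -= 1
```

Round robin needs no per-server state, while least connections tracks open connections and so adapts better when requests vary in duration.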
10
Q
Specialized Indexes
A
- Use ElasticSearch
- Types - geospatial, vector (find image or document), full-text (search document)
- Set up ElasticSearch to index most databases using Change Data Capture (CDC)
- Drawbacks - new failure point, new source of latency, stale data
11
Q
Communication Protocols
Internally
A
HTTP(S) or gRPC
12
Q
Communication Protocols
With Client
A
- REST (Request -> Response)
- Long polling
- SSE (Server-Sent Events)
- Websockets (Bi-directional Channel)
13
Q
Long polling
A
- Use when you need to give clients near-realtime updates
- Client makes a request and server holds the request open until it has data
- Client can then make another request
- Works with standard load balancers and firewalls
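The hold-then-re-request cycle above can be simulated in a single process. In this sketch a `queue.Queue` stands in for the server's pending updates and a function call stands in for the HTTP request; `handle_poll` and `client_loop` are hypothetical names, and a real deployment would use an HTTP server and client instead.

```python
import queue
import threading

updates = queue.Queue()  # stands in for the server's pending updates


def handle_poll(timeout_s: float):
    """Server side: hold the request open until data arrives or the timeout expires."""
    try:
        return updates.get(timeout=timeout_s)
    except queue.Empty:
        return None  # nothing to report; the client simply polls again


def client_loop(max_polls: int, timeout_s: float = 1.0):
    """Client side: issue a new request as soon as the previous one returns."""
    received = []
    for _ in range(max_polls):
        data = handle_poll(timeout_s)
        if data is not None:
            received.append(data)
    return received


# Publish one update shortly after the client starts waiting.
threading.Timer(0.1, updates.put, args=["price=42"]).start()
print(client_loop(max_polls=2, timeout_s=0.5))
```

The first poll returns as soon as the update is published; the second times out empty, which is the case where the client would just poll again.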
14
Q
Websockets
A
- Use when you need realtime, bidirectional communication
- Challenge - must maintain many long-lived open connections
- Common pattern - a message broker handles communication with clients, and backend services talk to the broker instead of to clients directly (centralizes client connections)
15
Q
Server Sent Events (SSE)
A
- Use when client needs multiple updates from server
- Requires single long-lived HTTP connection
- Requires less specialized infrastructure than websockets
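On the wire, an SSE response is a `text/event-stream` body in which each event is a set of `field: value` lines ended by a blank line. The helper below is a minimal sketch of that framing; `format_sse` is a hypothetical name, and only the `event` and `data` fields are covered.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Frame one Server-Sent Event; multi-line data becomes several data: lines."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    for line in data.splitlines() or [""]:
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


print(format_sse("score=3", event="update"), end="")
```

The server writes one such frame per update over the same long-lived HTTP response.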
16
Q
Security
A
- API Gateway for Authentication/Authorization
- Encryption
- Don’t pass userId or similar identity fields in endpoint paths or request bodies; they should come from headers
17
Q
Search Optimized Database
A
- Allows for full text search using indexing, tokenization, stemming
- Inverted Index - index from word to document
- Can configure whether fuzzy search is allowed
- ElasticSearch
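The inverted index idea (word to documents) can be sketched in a few lines. This toy version only lowercases and strips basic punctuation; it does no stemming or fuzzy matching, and the function names are illustrative rather than any search engine's API.

```python
from collections import defaultdict

PUNCTUATION = ".,!?;:"


def tokenize(text):
    """Lowercase and split on whitespace, stripping basic punctuation (no stemming)."""
    words = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in words if word]


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query token (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

A lookup touches only the posting sets for the query words, so it does not need to scan every document.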
18
Q
API Gateway
A
- Routes requests to correct microservice
- Authentication
- Rate limiting
- Logging
19
Q
Load Balancer
A
- Need a load balancer whenever you have multiple machines capable of handling the same request
- Can leave it out of the box-and-pointer diagram and just mention it
- AWS Elastic Load Balancer
20
Q
When to use a Queue
A
- Buffer for bursty traffic
- Distribute work across a system
If there are strict latency requirements (< 500 ms), a queue will probably exceed them
21
Q
Queues
A
- Message Ordering - typically FIFO but can be priority
- Retry configurations
- Dead letter queue for debugging/auditing
- Scaling with partitions (requires partition key)
- Backpressure to slow down producers
AWS SQS
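The retry and dead letter queue behaviour above can be sketched with an in-memory FIFO. This is an illustration under simplifying assumptions (single consumer, immediate retries, no backoff); `process_with_retries` and `max_attempts` are hypothetical names, not SQS settings.

```python
from collections import deque


def process_with_retries(messages, handler, max_attempts=3):
    """Drain a FIFO queue; failed messages are retried, then dead-lettered."""
    main_queue = deque((msg, 1) for msg in messages)  # (message, attempt number)
    processed, dead_letter = [], []
    while main_queue:
        msg, attempt = main_queue.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception:
            if attempt < max_attempts:
                # Re-enqueue at the back for another attempt.
                main_queue.append((msg, attempt + 1))
            else:
                # Out of retries: park the message for debugging/auditing.
                dead_letter.append(msg)
    return processed, dead_letter
```

Note that re-enqueueing a failed message at the back changes its position relative to later messages, so strict FIFO ordering and retries pull against each other.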
22
Q
When to use a Stream
A
- Process large amounts of data in real-time (think analytics dashboard)
- Support complex processing scenarios like event sourcing (think transactions at a bank)
- Support multiple consumers reading from the same stream (think chat room)
23
Q
Streams
A
- Scaling with Partitioning
- Multiple consumers
- Replication
- Windowing
Kinesis
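Windowing groups stream events by time so they can be aggregated. The sketch below shows one common variant, a tumbling (fixed, non-overlapping) window, over an in-memory list of `(timestamp, payload)` tuples; the function name and event shape are assumptions for illustration, not a Kinesis API.

```python
from collections import Counter


def tumbling_window_counts(events, window_s):
    """Count events per fixed, non-overlapping window, keyed by window start time."""
    counts = Counter()
    for timestamp, _payload in events:
        # Round the timestamp down to the start of its window.
        window_start = (timestamp // window_s) * window_s
        counts[window_start] += 1
    return dict(sorted(counts.items()))


events = [(0, "a"), (3, "b"), (10, "c"), (19, "d"), (20, "e")]
print(tumbling_window_counts(events, window_s=10))
```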
24
Q
Distributed Lock
A
- Need to lock a resource for a period of time (maybe 10 min)
- Use distributed key-value store like Redis to create a hash map of item -> lock.
- Only one system or process can lock the particular item at a time
- Can set an expiration on the lock so if process crashes, item doesn’t get stuck in locked state
Think: item in inventory while in cart, assignment of driver to rider
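The item -> lock map with expiration can be sketched with an in-memory stand-in for the key-value store. With Redis, the acquire step is commonly done with a single `SET key value NX EX ttl` command; the class below only imitates that set-if-absent-with-TTL behaviour locally, and its names are hypothetical.

```python
import time


class ExpiringLockStore:
    """In-memory stand-in for a key-value store with set-if-absent and TTL."""

    def __init__(self, clock=time.monotonic):
        self._locks = {}  # item -> (owner, expires_at)
        self._clock = clock

    def acquire(self, item, owner, ttl_s):
        """Take the lock if it is free or expired; return True on success."""
        now = self._clock()
        held = self._locks.get(item)
        if held is not None and held[1] > now:
            return False  # someone else holds an unexpired lock
        self._locks[item] = (owner, now + ttl_s)
        return True

    def release(self, item, owner):
        """Release only if `owner` still holds an unexpired lock."""
        held = self._locks.get(item)
        if held is not None and held[0] == owner and held[1] > self._clock():
            del self._locks[item]
            return True
        return False
```

Checking the owner on release keeps a slow process whose lock has already expired from releasing a lock that now belongs to someone else.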
25
Q
Cache Eviction Policy
A
- Least Recently Used
- FIFO
- Least Frequently Used
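Least Recently Used eviction can be sketched with the standard library's `OrderedDict`, which tracks order for us. This is a minimal single-threaded illustration; the `LRUCache` class and its method names are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Evict the least recently used key once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # oldest entry first, most recent last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry
```

FIFO would differ only in that `get` would not call `move_to_end`, so reads would not protect an entry from eviction.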
26
Q
Cache Write Strategy
A
- Write-through cache - writes data to both cache and database simultaneously
- Write-around cache - writes only to the database (data is cached on the next read)
- Write-back cache - writes to cache and then asynchronously to the database (may lose data)
Redis
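The three write strategies differ only in where a write lands first. The sketch below uses plain dicts as stand-ins for the cache and the database, and an explicit `flush()` in place of the asynchronous write; all class and method names are hypothetical.

```python
class WriteThroughCache:
    """Every write goes to both the database and the cache."""

    def __init__(self, db):
        self.db, self.cache = db, {}

    def write(self, key, value):
        self.db[key] = value
        self.cache[key] = value


class WriteAroundCache:
    """Writes go only to the database; the cache fills on the next read."""

    def __init__(self, db):
        self.db, self.cache = db, {}

    def write(self, key, value):
        self.db[key] = value
        self.cache.pop(key, None)  # drop any stale cached copy

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.db[key]
        return self.cache[key]


class WriteBackCache:
    """Writes go to the cache first and reach the database on flush()."""

    def __init__(self, db):
        self.db, self.cache, self.dirty = db, {}, set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)  # not yet durable: lost if the cache dies now

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

The window between `write` and `flush` in the write-back variant is where the "may lose data" caveat comes from.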
27