System Design Flashcards
(52 cards)
TCP v. UDP
TCP
Transport layer – accuracy > speed
Connection-oriented – client and server must establish a connection before data is sent
stateful protocol – detects errors and retransmits lost packets
UDP
speed > accuracy
suits real-time services; packets can be lost or arrive out of order
HTTP and HTTPS
HTTP defines request methods, addressing (URLs), and default ports (80 for HTTP, 443 for HTTPS)
HTTPS is HTTP running on top of Transport Layer Security (TLS)
TLS Handshake
client sends a request
server responds with its digital certificate
if the client accepts the certificate, it generates a session key to encrypt data sent during the session
handshake finishes, session begins
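A minimal sketch of the client side of these steps, using Python's standard ssl module. The helper names make_client_context and tls_version are illustrative, not from the cards, and the host you pass in is up to you.

```python
import socket
import ssl


def make_client_context():
    """Client-side TLS settings: verify the server certificate and hostname."""
    return ssl.create_default_context()


def tls_version(host, port=443, timeout=5.0):
    """Connect, run the handshake, and report the negotiated TLS version."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # wrap_socket performs the handshake here; it raises an ssl.SSLError
        # subclass if the client rejects the server's certificate.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Once wrap_socket returns, the handshake is done and everything sent on the wrapped socket is encrypted with the session keys.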
websocket
server push: the server can send without being asked
server sends data to clients without receiving a request first
full duplex: messages pass back and forth over one connection
use case: real-time data where up-to-date info is critical
transport layer
tcp – accuracy > speed
udp – speed > accuracy (video streaming)
retries
fail fast: keep the retry limit low and alert the user
risk: thundering herd (many clients retrying at the same moment)
add jitter to randomize the timing of retried requests
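One way to combine a low retry limit with jitter is exponential backoff with "full jitter". This is a sketch under assumptions: the names backoff_delays and call_with_retries, and the default base and cap values, are illustrative.

```python
import random
import time


def backoff_delays(max_attempts, base=0.1, cap=5.0, rng=random.random):
    """Yield one jittered delay per retry: uniform in [0, min(cap, base * 2**n))."""
    for attempt in range(max_attempts - 1):
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_retries(fn, max_attempts=3, sleep=time.sleep, **backoff):
    """Call fn(); on exception wait a jittered delay and retry, then re-raise."""
    delays = backoff_delays(max_attempts, **backoff)
    while True:
        try:
            return fn()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                # Low attempt limit reached: fail fast so the caller can alert the user.
                raise
            sleep(delay)
```

Because each client draws a different random delay, clients that failed together do not all retry at the same instant.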
circuit breakers
opens (stops sending requests) when a problem is detected
prevents cascading failures when a shared resource goes down
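A minimal circuit breaker sketch. The class name, the failure_threshold and reset_timeout parameters, and the half-open trial call are assumptions for illustration; production libraries differ in detail.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None  # a success closes the circuit
        return result
```

While the circuit is open, callers get an immediate error instead of piling more load onto the failing resource.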
rate limiting
cap usage, prevent autoscaling beyond budget, control requests per customer
token bucket
leaky bucket
fixed and sliding window
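A sketch of the token bucket algorithm from this card. The TokenBucket class and its rate and capacity parameters are illustrative names; the clock is injectable so the behavior can be checked without sleeping.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1):
        """Return True and spend tokens if the request fits, else False."""
        now = self.clock()
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

To control requests per customer, one option is to keep one bucket per customer ID in a dict.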
queue based load leveling
put tasks in a queue when they concurrently request a service
introduces latency
good for scenarios where latency is acceptable and order matters
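A small sketch of the idea with the standard library: producers can burst, a FIFO queue absorbs the spike, and a single worker drains it in order. The function name run_leveled and the None sentinel are assumptions for illustration.

```python
import queue
import threading


def run_leveled(tasks, handler):
    """Push tasks onto a FIFO queue; one worker drains it at its own pace."""
    q = queue.Queue()
    results = []

    def worker():
        while True:
            task = q.get()
            if task is None:   # sentinel: no more work
                break
            results.append(handler(task))

    t = threading.Thread(target=worker)
    t.start()
    for task in tasks:         # a burst of requests lands in the queue, not on the service
        q.put(task)
    q.put(None)
    t.join()
    return results
```

The service only ever sees one task at a time, at the cost of each task waiting its turn.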
gateway aggregation
put a gateway in front of the backend to aggregate requests and then dispatch them.
Risk: single point of failure.
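A sketch of a gateway that turns one client request into several backend calls and merges the replies. fetch_profile and fetch_orders are hypothetical stand-ins for real backend services, and the data they return is made up.

```python
import asyncio


async def fetch_profile(user_id):      # hypothetical backend call
    await asyncio.sleep(0.01)
    return {"id": user_id, "name": "Ada"}


async def fetch_orders(user_id):       # hypothetical backend call
    await asyncio.sleep(0.01)
    return [{"order": 1}, {"order": 2}]


async def gateway_user_page(user_id):
    """One client request fans out to both backends concurrently; replies are merged."""
    profile, orders = await asyncio.gather(
        fetch_profile(user_id), fetch_orders(user_id)
    )
    return {"profile": profile, "orders": orders}
```

The client makes a single round trip, which is also why the gateway itself must stay available.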
load balancing methods
round robin, least connections, consistent hashing
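The three methods can be sketched briefly. All names here (round_robin, least_connections, HashRing, replicas) are illustrative, and SHA-256 is just one reasonable choice of stable hash.

```python
import bisect
import hashlib
import itertools


def round_robin(backends):
    """Cycle through backends in order, forever."""
    return itertools.cycle(backends)


def least_connections(active):
    """Pick the backend with the fewest active connections (dict: name -> count)."""
    return min(active, key=active.get)


class HashRing:
    """Consistent hashing: a key maps to the first node point clockwise on the ring."""

    def __init__(self, nodes, replicas=100):
        # Each node gets several points on the ring to even out the load.
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self.points = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode()).hexdigest(), 16)

    def node_for(self, key):
        i = bisect.bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]
```

With the ring, adding a backend only moves the keys that now fall on the new node's points; every other key keeps its backend.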
load balancing industry standard
nginx, amazon elb
load balancing pros
reliability, scalability, performance
load balancing risks
bottleneck
need to share session data across backends
longer deploys
scalable systems features
reliability (retries)
availability (rate limiting)
load balancing
sql db pros
relational - foreign keys
SQL query language
structured data
ACID compliant - all or nothing transactions
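To make "all or nothing" concrete, here is a sketch with Python's built-in sqlite3 module. The accounts table, the CHECK constraint, and the transfer function are made up for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()
```

If the debit would push a balance below zero, the CHECK fails, the exception leaves the with block, and the credit that already ran is rolled back too.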
sql db cons
hard to scale write-heavy systems
more work to define schema
harder to store unstructured data
nosql db pros
good for unstructured data
data stored as key-value pairs or documents
scales well: supports write-heavy and read-heavy systems
nosql db cons
eventual consistency
harder to query across multiple collections (joins are limited)
types of db sharding
geo sharding
range sharding (e.g. by first letter)
hash sharding
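Each sharding type boils down to a function from a record's key to a shard. The sketch below assumes three letter ranges, four hash shards, and a made-up country-to-region map; all names and values are illustrative.

```python
import hashlib

# Hypothetical country -> region map for geo sharding.
REGIONS = {"US": "us", "CA": "us", "DE": "eu", "FR": "eu"}


def geo_shard(country, default="us"):
    """Geo sharding: route by the user's location."""
    return REGIONS.get(country, default)


def range_shard(name, boundaries=("h", "p")):
    """Range sharding on first letter: a-g -> 0, h-o -> 1, p-z -> 2."""
    first = name[0].lower()
    return sum(first >= b for b in boundaries)


def hash_shard(key, num_shards=4):
    """Hash sharding: a stable hash spreads keys across shards."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Range shards keep neighboring keys together but can become uneven; hash shards spread load more evenly but scatter range scans across shards.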
sharding pros
more scalable
faster queries: each shard holds less data and smaller indexes
one shard downtime won’t affect all
reduce hardware costs
sharding cons
not all data can be sharded
foreign key relations are only maintained within a single shard
joins across shards are very expensive
analytics
batch processing
web crawling
batch processing