System Design Flashcards
Delivery Framework
- Requirements
- Core Entities
- API or System Interface
- Data Flow
- High Level Design
- Deep Dive
Requirements
- Functional Requirements - “Users/clients should be able to…” Top 3
- Non-functional Requirements - “System should be / should be able to…” Top 3
- Capacity Estimations
Nonfunctional Requirements Checklist (8)
- CAP theorem - in a distributed system partition tolerance is a given, so the real trade-off is consistency vs. availability
- Environment constraints, e.g. battery life or limited memory
- Scalability, unique requirements such as bursty traffic or a skewed read/write ratio
- Latency, especially for anything with meaningful computation
- Durability, how important it is that data is not lost
- Security, e.g. data protection, access control
- Fault tolerance, e.g. redundancy, failover, recovery mechanisms
- Compliance, e.g. legal or regulatory requirements or standards
Bytes to store data
ASCII - 1 byte
Unicode - 2 bytes (common estimate; UTF-8 actually uses 1-4 bytes per character)
Split seconds
Millisecond (ms) 1/1000
Microsecond (us) 1/1,000,000
Nanosecond (ns) 1/1,000,000,000
Read latency
Memory: 1 MB / 0.25 ms (~4 GB/s)
SSD (~4x slower than memory): 1 MB / 1 ms (~1 GB/s)
Disk (~20x slower than SSD): 1 MB / 20 ms (~50 MB/s)
Worldwide network round trip: ~6 per second (~170 ms each)
Request calculations per second
~2.5 million seconds per month
1 million per month = .4/s
2.5 million per month = 1/s
10 million per month = 4/s
100 million per month = 40/s
1 billion per month = 400/s
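These per-second figures fall out of the ~2.5 million seconds in a month; a quick sketch of the arithmetic in Python (the monthly volumes are just the examples above):

```python
# Back-of-envelope: average requests/second from a monthly request volume.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # ~2.6 million; ~2.5M is the usual rounding

def requests_per_second(requests_per_month: float) -> float:
    return requests_per_month / SECONDS_PER_MONTH

for monthly in (1e6, 2.5e6, 10e6, 100e6, 1e9):
    print(f"{monthly:,.0f}/month ≈ {requests_per_second(monthly):.1f}/s")
```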
Storage estimates:
2 hr movie ≈ 1 GB
Small plain text book ≈ 1 MB
High res photo ≈ 1 MB
Med res image ≈ 100 KB
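A quick worked example of using these numbers (the 10 million uploads/day figure is a hypothetical assumption for illustration):

```python
# Back-of-envelope: daily and yearly storage for medium-res image uploads.
MED_RES_IMAGE_BYTES = 100 * 1024      # ~100 KB per image, per the estimate above
DAILY_UPLOADS = 10_000_000            # hypothetical: 10M uploads per day

daily_bytes = MED_RES_IMAGE_BYTES * DAILY_UPLOADS
yearly_bytes = daily_bytes * 365
print(f"~{daily_bytes / 1024**4:.1f} TB/day, ~{yearly_bytes / 1024**5:.2f} PB/year")
```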
DB Writes vs Reads
Write is 40x more expensive than read
Core Entities
Spend ~2 minutes here.
The entities the API will exchange and the system will persist in its data model. Example: User, Tweet, Follow for Twitter.
Present as a bullet list.
API or System Interface
RESTful or GraphQL
Endpoints with path and parameters
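A minimal sketch of what those endpoints could look like for the Twitter example from Core Entities (Flask and the specific paths/parameters are illustrative assumptions, not part of the notes):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.post("/v1/tweets")
def create_tweet():
    body = request.get_json()                   # {"user_id": ..., "text": ...}
    return jsonify({"id": "tweet_123", "text": body["text"]}), 201

@app.get("/v1/users/<user_id>/tweets")
def list_tweets(user_id):
    limit = int(request.args.get("limit", 20))  # pagination via query params
    return jsonify({"user_id": user_id, "tweets": [], "limit": limit})

@app.post("/v1/users/<user_id>/follows")
def follow(user_id):
    return "", 204                              # authenticated caller follows user_id

if __name__ == "__main__":
    app.run(port=8000)
```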
Data Flow
Actions or processes that the system performs on the input to produce the desired outputs
Core Concepts
Scaling - work distribution and data distribution
Consistency
Locking
Indexing
Communication Protocols
Security - authentication and authorization, encryption, data protection
Monitoring - infrastructure, system level, application level
Key Technologies
Core DB
Blob storage
Search optimized DB
API gateway
Load balancer
Queue
Streams / event sourcing
Distributed lock (see the sketch after this list)
Distributed cache
CDN
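For the distributed lock entry above, a minimal sketch of the common Redis approach (SET with NX and a TTL); redis-py, the key names, and the TTL are illustrative assumptions:

```python
import uuid
import redis  # redis-py client

r = redis.Redis(host="localhost", port=6379)

def acquire_lock(name: str, ttl_seconds: int = 10) -> str | None:
    """Returns an owner token if the lock was acquired, else None."""
    token = str(uuid.uuid4())
    # SET key value NX EX ttl: succeeds only if the key does not already exist
    if r.set(f"lock:{name}", token, nx=True, ex=ttl_seconds):
        return token
    return None

def release_lock(name: str, token: str) -> None:
    # Best-effort check-and-delete; a Lua script would make this atomic.
    if r.get(f"lock:{name}") == token.encode():
        r.delete(f"lock:{name}")

token = acquire_lock("nightly-report")
if token:
    try:
        ...  # critical section: only one worker runs this at a time
    finally:
        release_lock("nightly-report", token)
```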
Patterns
DB backed CRUD with caching
Async job worker pool (see the sketch after this list)
2 stage architecture
Event driven architecture
Durable job processing
Proximity based services
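As an illustration of the async job worker pool pattern above, a small in-process sketch (a real system would pull from SQS, RabbitMQ, or similar rather than an in-memory queue):

```python
import queue
import threading

jobs: queue.Queue = queue.Queue()

def worker(worker_id: int) -> None:
    # Each worker pulls jobs off the shared queue until it sees the sentinel.
    while True:
        job = jobs.get()
        if job is None:
            break
        print(f"worker {worker_id} processing {job}")

# Producer enqueues work; the pool drains it asynchronously.
pool = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in pool:
    t.start()
for job_id in range(10):
    jobs.put(f"job-{job_id}")
for _ in pool:
    jobs.put(None)  # one sentinel per worker to shut the pool down
for t in pool:
    t.join()
```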
Core API - high level overview
“Our Core API uses a layered .NET architecture, deployed in EKS. Controllers
handle HTTP routing, Services handle business logic, and a Data layer interacts
with Aurora and Redis. This lets us scale the service horizontally while keeping
the codebase maintainable.”
Core API - layered architecture justification
“We wanted to separate concerns—controllers focus on HTTP requests, services
encapsulate domain rules, and our data layer deals with Aurora and caching. This
approach cuts down on coupling and makes it easier to adapt or extract
microservices down the road.”
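A toy sketch of that layering (the real service is .NET; the Python class and method names here are purely illustrative):

```python
class StatsRepository:
    """Data layer: talks to the database/cache (Aurora and Redis in the notes)."""
    def get(self, game_id: str) -> dict:
        return {"game_id": game_id, "score": "2-1"}  # stand-in for a DB/cache read

class StatsService:
    """Service layer: business rules, no HTTP or SQL details."""
    def __init__(self, repo: StatsRepository) -> None:
        self.repo = repo
    def get_game_stats(self, game_id: str) -> dict:
        stats = self.repo.get(game_id)
        # domain rules (validation, enrichment, authorization) live here
        return stats

class StatsController:
    """Controller layer: HTTP routing and serialization only; delegates to the service."""
    def __init__(self, service: StatsService) -> None:
        self.service = service
    def handle_get(self, game_id: str) -> tuple[int, dict]:
        return 200, self.service.get_game_stats(game_id)

controller = StatsController(StatsService(StatsRepository()))
print(controller.handle_get("game_42"))
```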
Core processor - explanation
“We have a central ETL pipeline—the Core Processor—which ingests data from
multiple providers, stores raw payloads in S3, and then transforms/loads it into Aurora.
Tasks run on a cron-based scheduler and retry on failure with exponential backoff, ensuring resilience even if a provider is temporarily down.”
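A minimal sketch of retry with exponential backoff as described above (the delays, attempt count, and fetch_from_provider are illustrative assumptions):

```python
import random
import time

def fetch_from_provider() -> dict:
    raise ConnectionError("provider temporarily down")  # stand-in for a real API call

def fetch_with_backoff(max_attempts: int = 5, base_delay: float = 1.0) -> dict:
    for attempt in range(max_attempts):
        try:
            return fetch_from_provider()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            # exponential backoff with jitter: ~1s, 2s, 4s, 8s between attempts
            time.sleep(base_delay * (2 ** attempt) + random.random())
```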
Core API - why K8s?
“Kubernetes gave us automated scaling and rolling updates out of the box. We
can spin up more pods during major sporting events and scale back when traffic is
low, all while ensuring near-zero downtime.”
Core API - EKS rolling updates
“We use a rolling update strategy so that when deploying a new version of the
API, only one old pod goes down at a time—our system stays online, and if
something fails, we can roll back quickly.”
Core API - stateless pods
“Even though our application manages a lot of data, we designed each pod to be stateless. Any persistent data—sessions, user info, or stats—resides in Aurora, Redis, or S3.
That means losing a pod doesn’t risk losing data.”
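A minimal sketch of what "stateless pods" means in practice: session state is written to Redis keyed by session id, so any pod can serve any request (redis-py and the key names are assumptions for illustration):

```python
import json
import redis  # redis-py client

r = redis.Redis(host="localhost", port=6379)

def save_session(session_id: str, data: dict, ttl_seconds: int = 3600) -> None:
    # State goes to Redis, not pod-local memory, so losing a pod loses nothing.
    r.set(f"session:{session_id}", json.dumps(data), ex=ttl_seconds)

def load_session(session_id: str) -> dict | None:
    raw = r.get(f"session:{session_id}")
    return json.loads(raw) if raw else None

save_session("abc123", {"user_id": 42})
print(load_session("abc123"))  # works no matter which pod handles the request
```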
Core API - Ingress and Helm templating
“We have an internal ALB that terminates TLS and checks liveness via /health.
The ALB is configured via Ingress annotations in our Helm chart, ensuring only
healthy pods receive requests. We define everything in Helm charts, from replicas
and resource limits to Ingress rules. Environment-specific overrides like
values-stage.yaml and values-prod.yaml let us run the same code in staging vs.
production with minimal overhead.”
Core API - CI/CD pipeline
“We use CircleCI to build Docker images, run tests, push the image to ECR, then
automatically update our Helm chart. If linting or validation fails, the deployment
never proceeds—meaning we catch issues before they hit production.”
Core API - automatic rollbacks
“Our pipeline can roll back a Helm release if we detect a spike in 500 errors or
failing health checks. That safety net lets us move fast and confidently ship
updates.”