System Design '23 Flashcards

(63 cards)

1
Q

What distributed problems does a Load Balancer solve?

A

Server hotspot and Parallel Requests

2
Q

What problems can be solved by Data replication?

A

Geo location, server hotspot, parallel requests.

3
Q

What problems can Data Sharding solve?

A

Parallel requests, Data size limitations, and data hotspots.

4
Q

What problems can caching solve?

A

Data hotspots, parallel requests, and geolocation problems.

5
Q

What’s the format for writing an API on the whiteboard?

A

POST /order
request:
{
"user": user_id,
"quantity": number,
"item": barcode,
"date": datetime
}
response:
{
…data
}

6
Q

What is a 401 code?

A

Unauthorized

7
Q

What code is used for Unauthorized?

A

401

8
Q

What does a 500 Response code mean?

A

Internal Server Error

9
Q

What does a 403 mean?

A

Forbidden: the credentials are valid, but the user is not authorized for this resource.

10
Q

What’s the diff between 401 and 403 response codes?

A

401 means no credentials or invalid credentials were supplied. 403 means the credentials are valid but the user isn’t authorized.
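
The 401/403 distinction can be sketched as a small decision function. This is only an illustration: the token table, the admin set, and the function name are hypothetical, not part of any real framework.

```python
from typing import Optional

# Hypothetical in-memory data, for illustration only.
VALID_TOKENS = {"token-alice": "alice", "token-bob": "bob"}
ADMINS = {"alice"}


def status_for_admin_request(token: Optional[str]) -> int:
    """Return the HTTP status code for a request to an admin-only resource."""
    user = VALID_TOKENS.get(token) if token else None
    if user is None:
        return 401  # no credentials, or credentials we cannot validate
    if user not in ADMINS:
        return 403  # we know who this is, but they lack permission
    return 200
```

A request with no token or an unknown token gets 401, a valid non-admin token gets 403, and a valid admin token gets 200.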

11
Q

What type of database do you use with any kind of file storage?

A

Blob (object) storage, such as S3.

12
Q

What do you need to mention when file transfers are mentioned?

A

Chunking, or the torrent protocol. Break the file into chunks and store them one at a time. The server can stay stateless and just keep a map of the chunks that make up each file.
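
A minimal sketch of chunking, assuming chunks are identified by their SHA-256 digest and that a plain dict stands in for blob storage. The function names and the chunk size are illustrative choices, not a standard.

```python
import hashlib
from typing import Dict, List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut the payload into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def build_manifest(chunks: List[bytes]) -> List[str]:
    """Ordered list of chunk IDs; this map is all the server has to keep."""
    return [hashlib.sha256(chunk).hexdigest() for chunk in chunks]


def reassemble(manifest: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the file by fetching each chunk in manifest order."""
    return b"".join(store[chunk_id] for chunk_id in manifest)


data = b"example file contents " * 100
chunks = split_into_chunks(data, chunk_size=256)
manifest = build_manifest(chunks)
store = dict(zip(manifest, chunks))  # stand-in for blob storage
restored = reassemble(manifest, store)
```

Because each chunk is stored and fetched independently, an interrupted transfer only needs to retry the missing chunks.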

13
Q

Describe linearizability.

A

This is how you achieve strong consistency: all operations appear to take effect one at a time, in a single sequential order, as if there were only one copy of the data.

14
Q

What is the cost of linearizability on a distributed system?

A

Availability and latency: each transaction has to be sent to all the replicas before they are allowed to process a subsequent transaction.

15
Q

What is a serialization problem on a non-distributed system?

A

When parallel transactions occur, how do you make sure they take effect in order, as if they had run serially?

16
Q

What does Blob stand for?

A

Binary Large Object.

17
Q

What does serialization mean?

A

Transactions happening in a sequential order.

18
Q

What’s the difference between web sockets and polling?

A

With polling, the client repeatedly asks the server whether the data has changed. Web sockets keep a two-way connection open, so either side can send a message whenever it needs to.

19
Q

What is session affinity?

A

When an LB stores a session and sends that user's requests to the same backend server.

20
Q

What is gRPC used for?

A

It can be used in internal microservices

21
Q

What are the 6 main API architectures?

A

SOAP, gRPC, REST, GraphQL, WebSockets, webhooks.

22
Q

What’s a good use case for web sockets?

A

Real time updates (Uber, delivery status, chat app)

23
Q

What are the 3 main standards for Event Driven APIs?

A

Webhooks, WebSockets, HTTP Streaming

24
Q

What type of connection is gRPC/RPC good for?

A

Microservice communication

25
What are 3 use cases for consistent Hashing?
Distributed Cache, Load Balancing, Distributed databases
26
How is consistent hashing achieved?
Hash Ring
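
A minimal hash ring sketch, assuming MD5 as the hash function and 100 virtual nodes per server. Both are illustrative choices, and the class and node names are hypothetical.

```python
import bisect
import hashlib
from typing import List, Tuple


def _hash(key: str) -> int:
    """Map a string to a point on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        # Each node is placed on the ring many times to even out the load.
        self._ring: List[Tuple[int, str]] = []
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((_hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._points = [point for point, _ in self._ring]

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first point after the key, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = [f"user:{i}" for i in range(1000)]
ring3 = HashRing(["cache-a", "cache-b", "cache-c"])
ring4 = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
moved = sum(1 for k in keys if ring3.get_node(k) != ring4.get_node(k))
```

Adding `cache-d` only moves the keys that now land on `cache-d`; every other key stays on the node it had before, which is the property that makes the ring useful for caches and sharded databases.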
27
What are the 4 components of Torrent Protocol?
1. Tracker - central server to track pieces 2. Selection Algo - choose the rarest chunks first 3. Swarm - group of peers hosting files 4. Seeding - Once you download you seed.
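
The rarest-first selection step can be sketched as below. This is a simplified illustration with hypothetical peer names; real clients add more rules, such as how the very first pieces are chosen.

```python
from collections import Counter
from typing import Dict, Optional, Set


def pick_rarest_piece(have: Set[int], peers: Dict[str, Set[int]]) -> Optional[int]:
    """Pick the missing piece held by the fewest peers (ties go to the lowest index)."""
    counts = Counter(
        piece for pieces in peers.values() for piece in pieces if piece not in have
    )
    if not counts:
        return None  # nothing left that any peer can give us
    return min(counts, key=lambda piece: (counts[piece], piece))


# Hypothetical swarm: which pieces each peer currently holds.
peers = {
    "peer-1": {0, 1, 2},
    "peer-2": {0, 1},
    "peer-3": {0, 3},
}
```

Here pieces 2 and 3 are each held by a single peer, so they are requested before the widely available pieces 0 and 1.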
28
What's the relationship of Torrent Protocol to chunking?
Torrent protocol utilizes chunking, but chunking can be used in any file transfer
29
What benefits do Web Sockets provide over Http requests?
Each HTTP request carries headers, roughly 2-3 KB of overhead per request. A WebSocket has an initial handshake overhead, but each subsequent message adds only a few bytes of overhead.
30
What are some use cases for Web Hooks?
Event Driven Actions, Asynchronous processing, Data synchronization, and decoupled architecture
31
What is 100 million * 10 KB?
1,000,000,000,000 bytes = 1 TB
32
What is 500,000 * 10KB?
5,000,000,000 bytes = 5 GB
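
The two storage estimates above can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 KB = 1,000 bytes), which is the usual shortcut for whiteboard estimates; the helper name is hypothetical.

```python
KB = 1_000
GB = 1_000_000_000
TB = 1_000_000_000_000


def total_bytes(item_count: int, bytes_per_item: int) -> int:
    """Total storage needed for item_count items of a fixed size."""
    return item_count * bytes_per_item


hundred_million_items = total_bytes(100_000_000, 10 * KB)
half_million_items = total_bytes(500_000, 10 * KB)
```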
33
What is 1,000,000/day in x per second?
1,000,000 / 100,000 = about 10/second
34
How many seconds in a day? What do we round to?
86,400. Round to 100,000
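
A short sketch of the per-second conversion, showing both the rounded shortcut and the exact figure. The function and constant names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
ROUNDED_SECONDS_PER_DAY = 100_000  # the whiteboard shortcut


def per_second(requests_per_day: int, seconds_per_day: int = ROUNDED_SECONDS_PER_DAY) -> float:
    """Convert a daily request count into requests per second."""
    return requests_per_day / seconds_per_day


estimate = per_second(1_000_000)
exact = per_second(1_000_000, SECONDS_PER_DAY)
```

The rounded estimate is 10 per second; the exact value is a little higher (about 11.6), so the shortcut underestimates by roughly 14 percent.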
35
When is an API Gateway needed?
When many microservices exist in an architecture, the gateway can be a uniform entry point for users.
36
What is 1 MB * 1 MB?
1 TB, reading it as 1 million items of 1 MB each (10^6 * 10^6 = 10^12 bytes).
37
What is 1 KB * 1 KB?
1 MB, reading it as 1 thousand items of 1 KB each (10^3 * 10^3 = 10^6 bytes).
38
What's a clustered index?
A DB index that is part of the table itself: the rows are stored in the order of the index key, so there can be only one per table.
39
What's a non clustered index?
A DB index stored outside of the main table, as a separate structure that points back to the rows.
40
What's the diff between a webhook and a normal HTTP connection?
1. Webhooks are server-initiated, often server-to-server communication. 2. Webhooks are asynchronous. 3. Webhooks are more of a notification or one-way communication rather than waiting for a response.
41
What are the 4 ways of writing to cache and a backend DB?
Write Through, Write Back (Later), Write Around, Read Through
42
Describe Write Through
Data is written to the db at the same time it is written to the cache.
43
Describe Read Through
Data is written to the backend, but only put in the cache when it is retrieved later: on a cache miss, the value is loaded from the backend and stored in the cache.
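
A minimal sketch of write-through and read-through, using plain dicts as stand-ins for the cache and the backend DB. The class names are hypothetical, and real caches also handle eviction and expiry.

```python
from typing import Dict, Optional


class WriteThroughCache:
    def __init__(self, db: Dict[str, str]) -> None:
        self.db = db
        self.cache: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        # Both copies are updated as part of the same write.
        self.cache[key] = value
        self.db[key] = value


class ReadThroughCache:
    def __init__(self, db: Dict[str, str]) -> None:
        self.db = db
        self.cache: Dict[str, str] = {}
        self.misses = 0

    def write(self, key: str, value: str) -> None:
        self.db[key] = value       # writes go to the backend only
        self.cache.pop(key, None)  # drop any stale cached copy

    def read(self, key: str) -> Optional[str]:
        if key not in self.cache:
            self.misses += 1
            if key not in self.db:
                return None
            self.cache[key] = self.db[key]  # populate on first read
        return self.cache[key]
```

With write-through the cache is warm immediately after a write; with read-through the first read pays a miss and later reads are served from the cache.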
44
What does WebRTC solve? What problem does this cause?
Low latency live broadcasts. Harder to do adaptive bitrates.
45
What protocol does Zoom-like streaming use?
WebRTC?
46
What might Twitch use to handle both live streaming and user interactions?
HLS for the streaming and websockets for the interactions.
47
Describe Write Around
Data is written directly to the backend, skipping the cache altogether.
48
Describe Write Later
Data is kept in the cache and only written to the DB later, for example when more processing capacity is available.
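
A minimal sketch of write-around and write-later (write-back), again with dicts standing in for the cache and the DB. The class names and the explicit `flush` method are hypothetical; in practice the flush would run on a timer or in a background worker.

```python
from typing import Dict, Set


class WriteAroundCache:
    def __init__(self, db: Dict[str, str]) -> None:
        self.db = db
        self.cache: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self.db[key] = value       # straight to the backend
        self.cache.pop(key, None)  # make sure no stale copy is served


class WriteBackCache:
    def __init__(self, db: Dict[str, str]) -> None:
        self.db = db
        self.cache: Dict[str, str] = {}
        self.dirty: Set[str] = set()

    def write(self, key: str, value: str) -> None:
        self.cache[key] = value  # fast path: cache only
        self.dirty.add(key)      # remember what still has to reach the DB

    def flush(self) -> None:
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

Write-later gives the fastest writes, but anything still marked dirty is lost if the cache fails before the flush.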
49
What's the difference between session persistence and session affinity?
Both mean the LB routes the user to the same backend server. Persistence means session state might even be maintained on the backend server.
50
What are sticky sessions?
An implementation of session affinity where the user is routed to the same backend server by IP address or another identifier.
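
One way to sketch IP-based sticky routing is to hash the client IP and use it to index into the backend list. The function name and backend names are hypothetical, and real load balancers may use cookies or other identifiers instead.

```python
import hashlib
from typing import List


def pick_backend(client_ip: str, backends: List[str]) -> str:
    """Map a client IP to a backend; the same IP always gets the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["app-1", "app-2", "app-3"]
first = pick_backend("203.0.113.7", backends)
second = pick_backend("203.0.113.7", backends)
```

The mapping is stable only while the backend list stays the same; adding or removing a server reshuffles most clients.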
51
What is centralized session storage?
A shared store for session state that any backend server can access.
52
What's the term for round robin that takes capacity into account?
Weighted Round Robin
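
A minimal weighted round robin sketch that repeats each server in proportion to its weight. The server names and weights are hypothetical, and production load balancers usually interleave the picks more smoothly than this.

```python
import itertools
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Cycle through servers, repeating each one according to its weight."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


# A server with three times the capacity gets three times the requests.
picker = weighted_round_robin({"big": 3, "small": 1})
first_eight = [next(picker) for _ in range(8)]
```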
53
What's a good way to ensure a user's shopping cart is the same across all servers?
Centralized session storage. Any backend server can access the storage.
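
A small sketch of the idea, with an in-memory class standing in for a shared store such as a Redis instance. All class, method, and session names here are hypothetical.

```python
from typing import Dict, List


class SessionStore:
    """Stand-in for a shared store that every backend server can reach."""

    def __init__(self) -> None:
        self._carts: Dict[str, List[str]] = {}

    def add_item(self, session_id: str, item: str) -> None:
        self._carts.setdefault(session_id, []).append(item)

    def get_cart(self, session_id: str) -> List[str]:
        return list(self._carts.get(session_id, []))


class AppServer:
    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store  # every server points at the same store

    def add_to_cart(self, session_id: str, item: str) -> None:
        self.store.add_item(session_id, item)

    def view_cart(self, session_id: str) -> List[str]:
        return self.store.get_cart(session_id)


store = SessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)
server_a.add_to_cart("session-42", "book")
cart_seen_by_b = server_b.view_cart("session-42")
```

Because the servers keep no cart state of their own, the load balancer is free to send each request to any of them.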
54
What is Dash? What are some companies that use it?
HTTP DASH is a video streaming protocol, used by popular companies like Netflix, Hulu, Amazon Prime, and Disney.
55
What does dash stand for?
"Dynamic Adaptive Streaming over HTTP."
56
1 Billion requests per day is how many per second?
10,000
57
500 million requests per day is how many per second?
500 million/100,000 = 5K/second
58
How does Redis handle leader/follower?
Redis Sentinel and/or Cluster.
59
What does Redis use for sharding and partitioning data?
Redis Cluster.
60
If a cache needs to be distributed with high availability, what is a good tool?
Redis Cluster
61
What is Redis Sentinel used for?
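Monitoring nodes and promoting a follower to leader when the leader goes down (automatic failover). The data replication itself is done by Redis's leader/follower replication, which Sentinel supervises.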
62
What is the main limitation of Hadoop? What was a solution to that?
Its data can only be accessed sequentially. The solution was HBase, which can be accessed randomly.
63
Is it preferable to use PATCH or PUT?
PUT. It is less error-prone because it replaces the whole resource, so you don't have to decide what to do with omitted values (whether to delete them).
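
The difference in how omitted values are treated can be sketched with two small functions. This assumes PATCH is implemented as a shallow merge, which is one common convention rather than the only one; the function names and sample data are hypothetical.

```python
from typing import Any, Dict


def apply_put(resource: Dict[str, Any], body: Dict[str, Any]) -> Dict[str, Any]:
    """PUT: the body is the complete new state; anything omitted is gone."""
    return dict(body)


def apply_patch(resource: Dict[str, Any], body: Dict[str, Any]) -> Dict[str, Any]:
    """PATCH (as a shallow merge): omitted fields keep their old values."""
    merged = dict(resource)
    merged.update(body)
    return merged


current = {"name": "Ada", "email": "ada@example.com"}
body = {"name": "Ada L."}
after_put = apply_put(current, body)
after_patch = apply_patch(current, body)
```

With PUT the outcome is fully determined by the body, which is why the card calls it less error-prone; with PATCH the server has to define what an omitted or null field means.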