Systems Design Flashcards

1
Q

Vertical Scaling

A

Adding more power (CPU, RAM) to servers. Think vertical like turning up the volume.

Inherently limited (you can’t add CPU forever) and doesn’t offer redundancy: one server with a ton of RAM is still just one server, and you’re SOL if it goes down.

2
Q

Horizontal Scaling

A

Adding more servers. Think horizontal like a row of servers all lined up.

This is the preferred approach for most large-scale organizations, since it offers redundancy and far more headroom than a single machine. When you need more capacity, you add more servers.

3
Q

Load balancer

A

Evenly distributes incoming traffic amongst web servers
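
One simple way to spread traffic evenly is round-robin rotation. The sketch below is only an illustration of that idea, not how any particular load balancer is implemented; the class name and the server names are made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation so each gets an equal share."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the server list forever, in order.
        self._rotation = cycle(list(servers))

    def next_server(self):
        # Each call advances the rotation by one server.
        return next(self._rotation)


# Hypothetical server names, used only for this example.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
```

Six requests across three servers land two on each. Real load balancers often also weigh server health and current load, which this sketch ignores.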

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Database Replication

A

A strategy to support failover and redundancy at the database level.

  • Primary/replica relationship amongst the databases
  • Only the primary receives writes
  • Replicas get copies from the primary and only support reads
  • Most databases see more reads than writes, so there are usually many replicas to one primary
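
The primary/replica split can be sketched with plain dictionaries standing in for databases. Everything here (the class, its method names, the explicit replicate() step) is a hypothetical toy for illustration; real replication is a continuous stream handled by the database itself.

```python
import random


class ReplicatedDatabase:
    """Toy router: writes go to the primary, reads go to a replica."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        # Only the primary accepts writes.
        self.primary[key] = value

    def replicate(self):
        # Stand-in for the replication stream: copy the primary's
        # state to every replica.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.primary)

    def read(self, key):
        # Reads are spread across replicas; a replica may lag the primary.
        return random.choice(self.replicas).get(key)


db = ReplicatedDatabase(replica_count=3)
db.write("user:1", "octocat")
stale = db.read("user:1")   # replication has not run yet, so this is None
db.replicate()
fresh = db.read("user:1")   # now every replica has the row
print(stale, fresh)
```

The stale read before replicate() runs is the replication lag that the next card is about.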
5
Q

What happens if a primary database goes down?

A

You promote a replica to be the primary, and that database starts receiving writes. You will likely lose whatever data had not yet been replicated when the old primary failed (for example, if the replica copies data from the primary every 10 minutes, you could lose up to 10 minutes of data and need to recover it).

To mitigate that at GitHub we used MySQL’s semi-synchronous replication. A primary does not acknowledge a transaction commit until the change is known to have shipped to one or more replicas. It provides a way to achieve lossless failovers: any change applied on the primary is either applied or waiting to be applied on one of the replicas.

Consistency comes with a cost: a risk to availability. Should no replica acknowledge receipt of changes, the primary will block and writes will stall. Fortunately, there is a timeout configuration, after which the primary can revert to asynchronous replication mode, making writes available again.

We set our timeout at a reasonably low value: 500 ms.
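
The trade-off can be modeled in a few lines. This is a simulation of the behavior described above, not MySQL's API: the class, the ack-delay list, and the returned strings are all invented for the example, and no real waiting happens.

```python
class SemiSyncPrimary:
    """Simulated primary that waits for one replica ack, up to a timeout."""

    def __init__(self, replica_ack_delays_ms, timeout_ms=500):
        # Hypothetical input: how long each replica takes to acknowledge.
        self.replica_ack_delays_ms = replica_ack_delays_ms
        self.timeout_ms = timeout_ms
        self.mode = "semi-sync"

    def commit(self, change):
        if self.mode == "async":
            # After a fallback, commits no longer wait for replicas.
            return "committed (async, replicas may lag)"
        # Semi-sync needs only one replica, so the fastest ack decides.
        fastest = min(self.replica_ack_delays_ms, default=float("inf"))
        if fastest <= self.timeout_ms:
            return "committed (acknowledged by a replica)"
        # Nobody answered in time: stop blocking writes and fall back.
        self.mode = "async"
        return "committed after timeout (fell back to async)"


healthy = SemiSyncPrimary([20, 35])
slow = SemiSyncPrimary([900, 1200])
print(healthy.commit("row 1"))
print(slow.commit("row 1"))
print(slow.commit("row 2"))
```

With a 500 ms timeout, the healthy primary stays lossless, while the one with slow replicas gives up consistency to keep writes available.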

6
Q

Cache

A

A temporary storage area that keeps frequently accessed data in memory so subsequent calls can be served more quickly.
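
A minimal sketch of the idea, assuming a time-based expiry: the first call pays for the slow lookup, and later calls are answered from memory until the entry expires. The class, the loader function, and the 60-second lifetime are hypothetical choices for the example.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: skip the slow lookup
        value = loader(key)  # cache miss or expired entry: load and store
        self._entries[key] = (value, now + self.ttl_seconds)
        return value


calls = []


def slow_lookup(key):
    # Stand-in for a database query or remote call.
    calls.append(key)
    return key.upper()


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load("repo", slow_lookup)
second = cache.get_or_load("repo", slow_lookup)
print(first, second, len(calls))  # REPO REPO 1
```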

7
Q

Describe GitHub’s tech stack

A

MySQL databases, Redis and Aqueduct for queues, and Redis/Memcached for ephemeral caches.

8
Q

Memcached

A

In-memory, key-value store for caching. Designed for simplicity

9
Q

Redis

A

In-memory, key-value store for caching. More feature-rich than Memcached: it offers things like advanced data structures, pub/sub, replication, etc. that Memcached does not.

10
Q

Name 5 things to consider when caching.

A
  • When to use caching, i.e., for data that’s read frequently but written infrequently
  • When to expire the cache
  • Consistency. Keeping the cache and data store in sync
  • Mitigating Failures. You need multiple cache systems across different data centers so there’s no single point of failure
  • Eviction. If the cache is full, what should you remove to make room for new things? Least Recently Used (LRU) is the most common strategy. Least Frequently Used (LFU) and First In First Out (FIFO) are common alternatives.
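
One eviction policy, least recently used (LRU), can be sketched with the standard library's OrderedDict. This is a minimal illustration with a made-up class name and a tiny capacity, not a production cache.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used key when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        if key not in self._items:
            return None
        # Reading a key counts as a use, so move it to the newest end.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Drop the entry at the oldest end.
            self._items.popitem(last=False)

    def keys(self):
        return list(self._items)


lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")      # "a" is now more recently used than "b"
lru.put("c", 3)   # cache is full, so "b" is evicted
print(lru.keys())  # ['a', 'c']
```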
11
Q

What is stateful architecture?

A

When the user’s session data is stored on a particular server. If your first visit hits Server A, every subsequent request also has to be routed to Server A, because only that server holds your session data.

Like when you go to check in for a race and you have to go to the line designated for your last name. None of the other check-in stations will have your information.

12
Q

What is stateless architecture?

A

When HTTP requests for a given user can be sent to any server, because they all fetch the session data from a shared data store.

Instead of having to go to your designated check-in line at a race, you can go to whichever line is shortest, because all of the check-in booths are checking people in on iPads that pull from the same list.
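
A rough sketch of the stateless setup: two web servers share one session store, so either can serve the same user. The dictionary stands in for a shared store such as Redis or a database, and the class, session ID, and cart contents are hypothetical.

```python
class WebServer:
    """Web server that keeps no session data of its own."""

    def __init__(self, name, session_store):
        self.name = name
        self.session_store = session_store  # shared by every server

    def handle(self, session_id, item=None):
        # Fetch the session from the shared store, creating it if needed.
        session = self.session_store.setdefault(session_id, {"cart": []})
        if item is not None:
            session["cart"].append(item)
        return self.name, list(session["cart"])


shared_store = {}  # stand-in for a shared data store
server_a = WebServer("A", shared_store)
server_b = WebServer("B", shared_store)

server_a.handle("sess-42", "book")   # first request lands on Server A
result = server_b.handle("sess-42")  # next request lands on Server B
print(result)  # ('B', ['book'])
```

Because neither server owns the session, a load balancer is free to send each request wherever there is capacity.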
