Systems Flashcards
What are some common bottlenecks when scaling up a web service?
Scaling the Database
CPU Bound Application
Architecture Bottlenecks
IO Bound Application
What is a REST API?
REST, or REpresentational State Transfer, is an architectural style for providing standards between computer systems on the web, making it easier for systems to communicate with each other. REST-compliant systems, often called RESTful systems, are characterized by how they are stateless and separate the concerns of client and server.
Trade-offs to consider regarding storage
Storage is about holding information. Any app, system, or service that you program will need to store and retrieve data, and those are the two fundamental purposes of storage. When choosing a storage solution, the main trade-offs to weigh are:
- the shape (structure) of your data
- what sort of availability it needs (what level of downtime is OK for your storage)
- scalability (how fast you need to read and write data, and whether these reads and writes will happen concurrently (simultaneously) or sequentially)
- consistency (if you protect against downtime using distributed storage, how consistent is the data across your stores?)
Define latency
Latency is simply the measure of a duration. What duration? The duration for an action to complete something or produce a result. For example: for data to move from one place in the system to another. You may think of it as a lag, or just simply the time taken to complete an operation.
Define throughput
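Throughput is the amount of work a machine or system completes in a given period of time; its maximum is the system’s capacity. The term is often used in factories to describe how much work an assembly line can do in an hour or a day, or some other unit of time. For a web service it is usually measured in requests or operations per second.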
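To make the latency and throughput definitions concrete, here is a minimal Python sketch that computes both from a log of request timings. The `Request` record, the function names, and the sample numbers are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    start: float  # seconds since the measurement began
    end: float    # seconds since the measurement began


def average_latency(requests):
    """Mean time taken per request, in seconds."""
    return sum(r.end - r.start for r in requests) / len(requests)


def throughput(requests):
    """Requests completed per second over the whole observation window."""
    window = max(r.end for r in requests) - min(r.start for r in requests)
    return len(requests) / window


# Hypothetical log: four requests handled two at a time, each taking 0.5 s.
log = [Request(0.0, 0.5), Request(0.0, 0.5), Request(0.5, 1.0), Request(0.5, 1.0)]
print(average_latency(log))  # 0.5
print(throughput(log))       # 4.0
```

The two measures are independent: in this example, handling requests in parallel doubles throughput without making any single request faster.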
What are SLAs?
Service Level Agreements/Assurances
In order to make online services competitive and meet the market’s expectations, online service providers typically offer Service Level Agreements/Assurances. These are a set of guaranteed service level metrics. 99.999% uptime is one such metric and is often offered as part of premium subscriptions.
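As a rough illustration of what such an uptime metric allows, the following sketch converts an uptime percentage into a downtime budget. The function name and the 365-day period are assumptions made for this example; a real SLA defines its own measurement window.

```python
def allowed_downtime_seconds(uptime_percent, period_days=365):
    """Seconds of downtime permitted in the period at the given uptime level."""
    period_seconds = period_days * 24 * 60 * 60
    return period_seconds * (1 - uptime_percent / 100)


# 99.999% ("five nines") leaves a little over five minutes per year.
print(round(allowed_downtime_seconds(99.999), 1))        # 315.4
# 99.9% ("three nines") leaves under nine hours per year.
print(round(allowed_downtime_seconds(99.9) / 3600, 2))   # 8.76
```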
How to design a high availability system?
When designing a high availability (HA) system, you need to reduce or eliminate “single points of failure”. A single point of failure is an element whose failure, on its own, can make the whole system unavailable.
You eliminate single points of failure by designing ‘redundancy’ into the system. Redundancy means providing one or more alternatives (i.e. backups) to each element that is critical for high availability.
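The effect of redundancy can be estimated with a small calculation. This sketch assumes that the copies fail independently of each other and that failover to a backup is instant and always works, which real systems only approximate. The function name is made up for the example.

```python
def combined_availability(single, copies):
    """Availability of a group of redundant copies.

    Assumes independent failures: the group is down only when
    every copy is down at the same time.
    """
    return 1 - (1 - single) ** copies


# One server that is up 99% of the time, then with one and two backups.
print(round(combined_availability(0.99, 1), 6))  # 0.99
print(round(combined_availability(0.99, 2), 6))  # 0.9999
print(round(combined_availability(0.99, 3), 6))  # 0.999999
```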
What are relational databases?
A relational database is one that has strictly enforced relationships between things stored in the database. These relationships are typically made possible by requiring the database to represent each such thing (called the “entity”) as a structured table, with zero or more rows (“records”, “entries”) and one or more columns (“attributes”, “fields”).
By forcing such a structure on an entity, we can ensure that each item/entry/record has the right data to go with it. It makes for better consistency and the ability to make tight relationships between the entities.
What are ACID transactions?
ACID is a set of properties that describe the transactions a good relational database will support. ACID = “Atomicity, Consistency, Isolation, Durability”. A transaction is an interaction with a database, typically made up of one or more read or write operations.
What does the A in ACID stand for?
Atomicity requires that when a single transaction comprises more than one operation, the database must guarantee that if one operation fails, the entire transaction (all operations) also fails. It’s “all or nothing”. That way, if the transaction succeeds, you know that all the sub-operations completed successfully, and if an operation fails, you know that none of the operations that went with it took effect.
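A small sketch of “all or nothing” using Python’s built-in `sqlite3` module follows. The `accounts` table, the `transfer` and `balances` helpers, and the sample balances are hypothetical; the point is only that the first update is undone when the second one fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money between two accounts as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            # Violates the CHECK constraint if the sender cannot afford it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


print(transfer(conn, "alice", "bob", 500))  # False
print(balances(conn))                       # {'alice': 100, 'bob': 50}
print(transfer(conn, "alice", "bob", 30))   # True
print(balances(conn))                       # {'alice': 70, 'bob': 80}
```

In the failed transfer, the credit to bob had already been applied inside the transaction, but the rollback removed it along with the failed debit.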
What does the C in ACID stand for?
Consistency requires that each transaction in a database is valid according to the database’s defined rules, and that when the database changes state (some information has changed), the change is valid and does not corrupt the data. Each transaction moves the database from one valid state to another valid state. Note that this is different from consistency in distributed systems, where the term means that every “read” operation receives the results of the most recent “write” operation.
What does the I in ACID stand for?
Isolation means that you can “concurrently” (at the same time) run multiple transactions on a database, but the database will end up with a state that looks as though each transaction had been run serially (in a sequence, like a queue of operations). I personally think “Isolation” is not a very descriptive term for the concept, but I guess ACCD is less easy to say than ACID.
What does the D in ACID stand for?
Durability is the promise that once a transaction has been committed, its data will remain stored in the database. It will be “persistent”: stored on disk and not just in “memory”, so it survives a crash or restart.
What are non relational databases?
In contrast, a non-relational database has a less rigid, or, put another way, a more flexible structure to its data. The data is often stored as “key-value” pairs, although document, column-family, and graph stores are also common.
NoSQL database properties are sometimes referred to as BASE:
- Basically Available: the system guarantees availability
- Soft State: the state of the system may change over time, even without input
- Eventual Consistency: the system will become consistent over a period of time (often very short), provided no further inputs are received during that period
What is replication?
Replication means to duplicate (make copies of, replicate) your database.
Redundancy helps a system maintain high availability, and replication provides that redundancy for the database: if one copy goes down, another can take over. But it also raises the question of how to synchronize data across the replicas, since they’re meant to have the same data. Replication of write and update operations can happen synchronously (at the same time as the changes to the main database) or asynchronously (shortly afterwards, in the background).
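The difference between the two modes can be sketched with in-memory dictionaries standing in for databases. The `Primary` and `Replica` classes and their method names are invented for this illustration and do not correspond to any particular database product.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}


class Primary:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()  # changes not yet copied to the replicas

    def write_sync(self, key, value):
        """Return only after every replica has the change."""
        self.data[key] = value
        for replica in self.replicas:
            replica.data[key] = value

    def write_async(self, key, value):
        """Return immediately; replicas catch up when flush() runs."""
        self.data[key] = value
        self.pending.append((key, value))

    def flush(self):
        """Copy queued changes to the replicas, in order."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value


replica = Replica()
primary = Primary([replica])

primary.write_sync("a", 1)
print(replica.data)  # {'a': 1}

primary.write_async("b", 2)
print(replica.data)  # {'a': 1}  (replica is briefly stale)
primary.flush()
print(replica.data)  # {'a': 1, 'b': 2}
```

Synchronous writes keep the replicas identical at the cost of slower writes, while asynchronous writes return faster but leave a window in which a replica can serve old data.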
What is sharding?
Sharding breaks your huge database into smaller databases. You can work out how you want to shard your data depending on its structure. It could be as simple as saving every 5 million rows in a different shard, or you could choose another strategy that better fits your data, needs, and locations served.
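Two possible ways of picking a shard are sketched below: the range-based rule mentioned above, and a hash-based rule as one example of an alternative strategy. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib

ROWS_PER_SHARD = 5_000_000


def range_shard(row_id):
    """Range-based: rows 0..4,999,999 go to shard 0, the next 5 million to shard 1, etc."""
    return row_id // ROWS_PER_SHARD


def hash_shard(key, shard_count):
    """Hash-based: spread keys across a fixed number of shards."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


print(range_shard(4_999_999))   # 0
print(range_shard(5_000_000))   # 1
print(range_shard(12_345_678))  # 2

# The same key always maps to the same shard.
print(hash_shard("user:42", 4) == hash_shard("user:42", 4))  # True
```

Range-based sharding keeps neighbouring rows together, which suits range scans, while hashing tends to spread load more evenly but makes changing the number of shards harder.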
What is polling?
Polling is simply having your client check on a server by sending it a network request and asking for updated data. These requests are typically made at regular intervals like 5 seconds, 15 seconds, 1 minute or any other interval required by your use case.
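A minimal polling loop might look like the sketch below. The `poll` function and its parameters are hypothetical, and `fetch` stands in for whatever network request the client would really make; here it is faked with a list of canned responses so the example runs without a server.

```python
import time


def poll(fetch, interval_seconds, max_attempts, sleep=time.sleep):
    """Call fetch() at a fixed interval until it returns data or attempts run out."""
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking the server again
    return None


# Fake server: no new data on the first two requests, then an update.
responses = iter([None, None, {"status": "done"}])
result = poll(lambda: next(responses), interval_seconds=0.01, max_attempts=5)
print(result)  # {'status': 'done'}
```

A shorter interval delivers updates sooner but sends more requests that return nothing new, which is the main cost of polling.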
What is pubsub messaging?
The key concept is that publishers ‘publish’ messages and subscribers ‘subscribe’ to messages. To give greater granularity, messages can belong to a certain “topic”, which is like a category. These topics are like dedicated “channels” or pipes, where each pipe exclusively handles messages belonging to a specific topic. Subscribers choose which topics they want to subscribe to and get notified of messages in those topics. The advantage of this system is that the publisher and the subscriber can be completely de-coupled - i.e. they don’t need to know about each other. The publisher announces, and the subscriber listens for announcements on the topics that it is on the lookout for.
A server is often the publisher of messages, and there are usually several topics (channels) that get published to. A consumer subscribes to the topics it is interested in. There is no direct communication between the server (publisher) and the subscriber (which could be another server). The only interaction is between publisher and topic, and between topic and subscriber.
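The decoupling described above can be sketched with a tiny in-process broker. The `Broker` class, the topic names, and the subscriber labels are invented for this example; real pub/sub systems run the broker as a separate service and deliver messages over the network.

```python
from collections import defaultdict


class Broker:
    """Holds the topics; publishers and subscribers only ever talk to it."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver to every subscriber of this topic, and to nobody else.
        for callback in self._subscribers.get(topic, ()):
            callback(message)


broker = Broker()
received = []
broker.subscribe("orders", lambda msg: received.append(("billing", msg)))
broker.subscribe("orders", lambda msg: received.append(("shipping", msg)))
broker.subscribe("signups", lambda msg: received.append(("email", msg)))

broker.publish("orders", "order #1 placed")
print(received)
# [('billing', 'order #1 placed'), ('shipping', 'order #1 placed')]
```

The publisher never references the billing or shipping subscribers directly, so new subscribers can be added without changing the publishing code.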
Steps for system design interview
Step 1: Outline use cases, constraints, and assumptions
Step 2: Create a high level design
Step 3: Design core components
Step 4: Scale the design
What questions would you ask to outline the use case?
Who is going to use it?
How are they going to use it?
How many users are there?
What does the system do?
What are the inputs and outputs of the system?
How much data do we expect to handle?
How many requests per second do we expect?
What is the expected read to write ratio?
What should you consider when scaling an application?
Load balancer
Horizontal scaling
Caching
Database sharding
SQL vs NoSQL - reasons for SQL
- Structured data
- Strict schema
- Relational data
- Need for complex joins
- Transactions
- Clear patterns for scaling
- More established: developers, community, code, tools, etc
- Lookups by index are very fast
SQL vs NoSQL - reasons for NoSQL
- Semi-structured data
- Dynamic or flexible schema
- Non-relational data
- No need for complex joins
- Store many TB (or PB) of data
- Very data intensive workload
- Very high throughput for IOPS
What are message queues?
Message queues receive, hold, and deliver messages. If an operation is too slow to perform inline, you can use a message queue with the following workflow:
- An application publishes a job to the queue, then notifies the user of job status
- A worker picks up the job from the queue, processes it, then signals the job is complete
The user is not blocked and the job is processed in the background. During this time, the client might optionally do a small amount of processing to make it seem like the task has completed. For example, if posting a tweet, the tweet could be instantly posted to your timeline, but it could take some time before your tweet is actually delivered to all of your followers.
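The workflow above can be sketched with Python’s standard `queue` and `threading` modules standing in for a real message broker. The `submit` and `worker` functions, the `results` dictionary, and the upper-casing “job” are placeholders for illustration only.

```python
import queue
import threading

jobs = queue.Queue()
results = {}


def worker():
    """Pick up jobs from the queue and process them in the background."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: stop the worker
            jobs.task_done()
            break
        job_id, text = job
        results[job_id] = text.upper()  # stand-in for slow work
        jobs.task_done()                # signal that the job is complete


def submit(job_id, text):
    """Publish a job and return a status immediately, without waiting."""
    jobs.put((job_id, text))
    return "queued"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

status = submit(1, "hello")
jobs.join()                # wait here only so the example can show the result
print(status, results[1])  # queued HELLO

jobs.put(None)
thread.join()
```

Unlike a hosted or networked broker, this in-process queue loses its jobs if the program exits, so it only illustrates the shape of the workflow.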
Redis is useful as a simple message broker but messages can be lost.
RabbitMQ is popular but requires you to adapt to the ‘AMQP’ protocol and manage your own nodes.
Amazon SQS is hosted but can have high latency and has the possibility of messages being delivered twice.