Systems Flashcards

Question

Performance vs scalability

Answer 1

A service is scalable if adding resources increases its performance in proportion to the resources added. Generally, increasing performance means serving more units of work, but it can also mean handling larger units of work, such as when datasets grow. Another way to look at performance vs scalability: If you have a performance problem, your system is slow for a single user. If you have a scalability problem, your system is fast for a single user but slow under heavy load.

Answer 2

Latency is the time to perform some action or to produce some result. Throughput is the number of such actions or results per unit of time. Generally, you should aim for maximal throughput with acceptable latency.
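As a rough illustration, both numbers can be computed from the same set of request timings. The `summarize` helper and the sample timings below are hypothetical.

```python
def summarize(durations_s, window_s):
    """Return (average latency in seconds, throughput in requests/second).

    durations_s: how long each completed request took.
    window_s: length of the observation window in which they completed.
    """
    avg_latency = sum(durations_s) / len(durations_s)
    throughput = len(durations_s) / window_s
    return avg_latency, throughput

# Four requests completed within a 2-second window.
latency, throughput = summarize([0.5, 0.25, 0.75, 0.5], window_s=2.0)
print(latency, throughput)  # 0.5 seconds per request, 2.0 requests per second
```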

Answer 3

In a distributed computer system, you can only support two of the following three guarantees:

Consistency - Every read receives the most recent write or an error
Availability - Every request receives a response, without a guarantee that it contains the most recent version of the information
Partition Tolerance - The system continues to operate despite arbitrary partitioning due to network failures

Networks aren't reliable, so you'll need to support partition tolerance. That leaves a software tradeoff between consistency and availability.

CP - consistency and partition tolerance: Waiting for a response from the partitioned node might result in a timeout error. CP is a good choice if your business needs require atomic reads and writes.

AP - availability and partition tolerance: Responses return the most readily available version of the data on any node, which might not be the latest. Writes might take some time to propagate when the partition is resolved. AP is a good choice if the business needs allow for eventual consistency or when the system needs to continue working despite external errors.
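The CP and AP choices can be sketched with a toy replica that has been cut off from its primary. The `Replica` class and its `mode` flag are hypothetical and only model the read path. A real system would detect the partition itself.

```python
class Replica:
    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.value = "v1"           # last value this replica has seen
        self.partitioned = False    # True when cut off from the primary

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: refuse to answer rather than risk returning stale data
            raise TimeoutError("cannot confirm latest write during partition")
        return self.value           # AP: answer with local, possibly stale data

cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True   # assume the primary has moved on to "v2"

stale = ap.read()                        # still answers, with the old value
try:
    cp.read()
    cp_result = "ok"
except TimeoutError:
    cp_result = "error"
```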

Answer 4

Increased security - Hide information about backend servers, blacklist IPs, limit number of connections per client
Increased scalability and flexibility - Clients only see the reverse proxy's IP, allowing you to scale servers or change their configuration
SSL termination - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations. Removes the need to install X.509 certificates on each server
Compression - Compress server responses
Caching - Return the response for cached requests
Static content - Serve static content directly

Answer 5

A type of database transaction that has four important properties:

Atomicity: The operations that constitute the transaction will either all succeed or all fail. There is no in-between state.
Consistency: The transaction cannot bring the database to an invalid state. After the transaction is committed or rolled back, the rules for each record will still apply, and all future transactions will see the effect of the transaction. Sometimes loosely called Strong Consistency, in contrast to eventual consistency.
Isolation: The execution of multiple transactions concurrently will have the same effect as if they had been executed sequentially.
Durability: Any committed transaction is written to non-volatile storage. It will not be undone by a crash, power loss, or network partition.
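Atomicity can be seen with Python's built-in `sqlite3` module. This is a minimal sketch: the `accounts` table, the `transfer` helper, and the balances are made up for illustration. The second update violates a CHECK constraint, so the first update is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 150)  # would overdraw alice, so it fails
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, bob's credit has been undone too, so no partial state is visible.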

Answer 6

Also known as public-key encryption, asymmetric encryption relies on two keys (a public key and a private key) to encrypt and decrypt data. The keys are generated using cryptographic algorithms and are mathematically connected such that data encrypted with the public key can only be decrypted with the private key. While the private key must be kept secret for the scheme to remain secure, the public key can be openly shared. Asymmetric-key algorithms tend to be slower than their symmetric counterparts.
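The relationship between the two keys can be sketched with textbook RSA on tiny primes. This is a toy for illustration only: the primes, exponent, and message are arbitrary, there is no padding, and nothing here is secure. Real code should use a vetted cryptography library.

```python
# Toy RSA with tiny primes: for illustration only, never for real security.
p, q = 61, 53
n = p * q                  # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent, chosen coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

public_key, private_key = (e, n), (d, n)

def encrypt(message, key):
    exponent, modulus = key
    return pow(message, exponent, modulus)

def decrypt(ciphertext, key):
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)

message = 65                                   # must be smaller than n
ciphertext = encrypt(message, public_key)      # anyone can do this
recovered = decrypt(ciphertext, private_key)   # only the key owner can do this
```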

Answer 7

Widely used kind of storage, in small and large scale systems. They don’t really count as databases per se, partially because they only allow the user to store and retrieve data based on the name of the blob. This is sort of like a key-value store, but blob stores usually have different guarantees. They might be slower than KV stores, but values can be megabytes large (or sometimes gigabytes large). People usually use them to store things like large binaries, database snapshots, or images and other static assets that a website might have. Blob storage is rather complicated to run on premises, and in practice it is mostly offered by large cloud providers such as Google and Amazon. So in the context of System Design interviews you can usually assume that you will be able to use GCS or S3. These are blob storage services hosted by Google and Amazon respectively, priced by how much storage you use and how often you store and retrieve blobs.

Answer 8

A piece of hardware or software that stores data, typically meant to retrieve that data faster than otherwise. Caches are often used to store responses to network requests as well as results of computationally expensive operations. Note that data in a cache can become stale if the main source of truth for that data (i.e., the main database behind the cache) gets updated and the cache doesn't.

Answer 9

The policy by which values get evicted or removed from a cache. Popular cache eviction policies include LRU (least-recently used), FIFO (first in first out), and LFU (least-frequently used).
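As one example, an LRU policy can be sketched with `collections.OrderedDict`. The `LRUCache` class below is a minimal illustration, not a production cache.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # ordered from least to most recently used

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now more recent than "b"
cache.put("c", 3)    # over capacity: evicts "b"
```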

Answer 10

The paradigm by which modern systems are designed, which consists of clients requesting data or service from servers and servers providing data or service to clients.

Answer 11

A type of hashing that minimizes the number of keys that need to be remapped when a hash table gets resized. It's often used by load balancers to distribute traffic to servers; it minimizes the number of requests that get forwarded to different servers when new servers are added or when existing servers are brought down.
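A minimal sketch of a hash ring, assuming SHA-256 as the hash function and 100 virtual nodes per server (both arbitrary choices). The `HashRing` class and server names are hypothetical. Adding a fourth server should move only the keys that the new server takes over.

```python
import bisect
import hashlib

def _hash(value):
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each server is placed at many points on the ring to even out load.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[index][1]

keys = [f"user-{i}" for i in range(1000)]
before = HashRing(["s1", "s2", "s3"])
after = HashRing(["s1", "s2", "s3", "s4"])
moved = [k for k in keys if before.server_for(k) != after.server_for(k)]
print(f"{len(moved)} of {len(keys)} keys moved")
```

With a plain `hash(key) % number_of_servers` scheme, going from three to four servers would remap most keys instead.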

Answer 12

A CDN is a third-party service that acts like a cache for your servers. Sometimes, web applications can be slow for users in a particular region if your servers are located only in another region. A CDN has servers all around the world, meaning that the latency to a CDN's servers will almost always be far better than the latency to your servers. A CDN's servers are often referred to as PoPs (Points of Presence). Two of the most popular CDNs are Cloudflare and Google Cloud CDN.

Answer 13

A special auxiliary data structure that allows your database to perform certain queries much faster. Indexes can typically only exist to reference structured data, like data stored in relational databases. In practice, you create an index on one or multiple columns in your database to greatly speed up read queries that you run very often, with the downside of slightly longer writes to your database, since writes have to also take place in the relevant index.
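The effect of an index can be observed with SQLite's `EXPLAIN QUERY PLAN`, using Python's built-in `sqlite3` module. The `users` table and the `idx_users_email` index are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

def query_plan(conn):
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
        ("user42@example.com",),
    ).fetchall()
    return " ".join(row[-1] for row in rows)  # last column is the plan detail

plan_before = query_plan(conn)   # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn)    # lookup through the new index
print(plan_before)
print(plan_after)
```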

Answer 14

In a relational database that provides ACID transactions, updating rows inside a table will cause a lock to be held on that table or on the rows you are updating. If a second transaction tries to update the same rows, it will block before the update until the first transaction releases that lock. This is one of the core mechanisms behind the Isolation of ACID transactions.

Answer 15

A Distributed File System is an abstraction over a (usually large) cluster of machines that allows them to act like one large file system. The two most popular implementations of a DFS are the Google File System (GFS) and the Hadoop Distributed File System (HDFS). Typically, DFSs take care of the classic availability and replication guarantees that can be tricky to obtain in a distributed-system setting. The overarching idea is that files are split into chunks of a certain size (4 MB or 64 MB, for instance), and those chunks are sharded across a large cluster of machines. A central control plane is in charge of deciding where each chunk resides, routing reads to the right nodes, and handling communication between machines. Different DFS implementations have slightly different APIs and semantics, but they achieve the same common goal: extremely large-scale persistent storage.
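The chunking idea can be sketched in a few lines. The helpers below, the round-robin placement, and the replica count of 2 are assumptions made for illustration. Real systems such as GFS or HDFS use their own placement logic.

```python
def split_into_chunks(data, chunk_size):
    """Split a byte string into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def place_chunks(num_chunks, nodes, replicas=2):
    """Control-plane stand-in: map each chunk index to `replicas` nodes."""
    return {
        i: [nodes[(i + r) % len(nodes)] for r in range(replicas)]
        for i in range(num_chunks)
    }

data = b"x" * 10
chunks = split_into_chunks(data, chunk_size=4)   # chunk sizes: 4, 4, 2
placement = place_chunks(len(chunks), ["node-a", "node-b", "node-c"])
```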

Answer 16

An operation that has the same ultimate outcome regardless of how many times it's performed. If an operation can be performed multiple times without changing its overall effect, it's idempotent. Operations performed through a Pub/Sub messaging system typically have to be idempotent, since Pub/Sub systems tend to allow the same messages to be consumed multiple times. For example, increasing an integer value in a database is not an idempotent operation, since repeating this operation will not have the same effect as if it had been performed only once. Conversely, setting a value to "COMPLETE" is an idempotent operation, since repeating this operation will always yield the same result: the value will be "COMPLETE".
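The two examples above can be sketched directly. The `record` dictionary stands in for a database row and is purely illustrative.

```python
record = {"counter": 0, "status": "PENDING"}

def increment(rec):
    rec["counter"] += 1          # not idempotent: each call changes the result

def mark_complete(rec):
    rec["status"] = "COMPLETE"   # idempotent: repeating it changes nothing

# Simulate the same message being delivered three times.
for _ in range(3):
    increment(record)
    mark_complete(record)

print(record)  # counter reflects every duplicate; status does not
```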

Answer 17

A popular framework for processing very large datasets in a distributed setting efficiently, quickly, and in a fault-tolerant manner. A MapReduce job consists of three main steps:

The Map step, which runs a map function on the various chunks of the dataset and transforms these chunks into intermediate key-value pairs.
The Shuffle step, which reorganizes the intermediate key-value pairs such that pairs of the same key are routed to the same machine in the final step.
The Reduce step, which runs a reduce function on the newly shuffled key-value pairs and transforms them into more meaningful data.

The canonical example of a MapReduce use case is counting the number of occurrences of words in a large text file. When dealing with a MapReduce library, engineers and/or systems administrators only need to worry about the map and reduce functions, as well as their inputs and outputs. All other concerns, including the parallelization of tasks and the fault-tolerance of the MapReduce job, are abstracted away and taken care of by the MapReduce implementation.
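The word-count example can be sketched in a single process. The function names are illustrative, and a real framework would run the map and reduce functions on many machines and handle the shuffle itself.

```python
from collections import defaultdict

def map_step(chunk):
    """Turn one chunk of text into intermediate (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle_step(mapped_chunks):
    """Group values by key so each key is reduced in one place."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped

def reduce_step(key, values):
    """Collapse all values for one key into a final count."""
    return key, sum(values)

chunks = ["the quick brown fox", "the lazy dog", "The fox"]
mapped = [map_step(chunk) for chunk in chunks]
grouped = shuffle_step(mapped)
counts = dict(reduce_step(key, values) for key, values in grouped.items())
```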

Answer 18

Often shortened as Pub/Sub, the Publish/Subscribe pattern is a popular messaging model that consists of publishers and subscribers. Publishers publish messages to special topics (sometimes called channels) without caring about or even knowing who will read those messages, and subscribers subscribe to topics and read messages coming through those topics. Pub/Sub systems often come with very powerful guarantees like at-least-once delivery, persistent storage, ordering of messages, and replayability of messages.
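The decoupling between publishers and subscribers can be sketched with a toy in-memory broker. The `Broker` class is hypothetical and provides none of the delivery, persistence, ordering, or replay guarantees mentioned above.

```python
from collections import defaultdict

class Broker:
    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher does not know who, if anyone, is listening.
        for callback in self._subscribers[topic]:
            callback(message)

broker = Broker()
emails, audit_log = [], []
broker.subscribe("orders", emails.append)
broker.subscribe("orders", audit_log.append)

broker.publish("orders", {"id": 1})     # delivered to both subscribers
broker.publish("payments", {"id": 2})   # no subscribers, so nothing happens
```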

Answer 19

An in-memory key-value store. Does offer some persistent storage options but is typically used as a really fast, best-effort caching solution. Redis is also often used to implement rate limiting.
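Rate limiting is commonly built on a per-client counter that resets every time window. The sketch below keeps the counters in a plain dictionary as a stand-in for Redis. With Redis, the counter would typically be a key that is incremented and given an expiry. The `FixedWindowLimiter` class and its parameters are hypothetical.

```python
class FixedWindowLimiter:
    def __init__(self, limit, window_s):
        self.limit = limit
        self.window_s = window_s
        self._counts = {}  # (client, window index) -> request count

    def allow(self, client, now_s):
        """Return True if this request fits within the client's current window."""
        key = (client, int(now_s // self.window_s))
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key] <= self.limit

limiter = FixedWindowLimiter(limit=2, window_s=60)
# Requests at t = 0, 10, 20 fall in the first window; t = 61 starts a new one.
results = [limiter.allow("client-1", t) for t in (0, 10, 20, 61)]
```

This toy version never deletes old windows. An expiring key in a real store would take care of that.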

Answer 20

The act of duplicating the data from one database server to others. This is sometimes used to increase the redundancy of your system and tolerate regional failures, for instance. Other times, you can use replication to move data closer to your clients, thus decreasing the latency of accessing specific data.

Answer 21

Sometimes called data partitioning, sharding is the act of splitting a database into two or more pieces called shards and is typically done to increase the throughput of your database. Popular sharding strategies include:

Sharding based on a client's region
Sharding based on the type of data being stored (e.g., user data gets stored in one shard, payments data gets stored in another shard)
Sharding based on the hash of a column (only for structured data)
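Hash-based sharding can be sketched as follows. The `shard_for` helper, the use of SHA-256, and the shard count of 4 are assumptions for illustration. A stable hash is used because Python's built-in `hash()` for strings differs between processes.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a column value to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

NUM_SHARDS = 4
shards = {i: [] for i in range(NUM_SHARDS)}
for user_id in ("alice", "bob", "carol", "dave", "erin"):
    shards[shard_for(user_id, NUM_SHARDS)].append(user_id)
```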

Answer 22

A server is usually called "stateless" if it does not require state to be persisted to disk in order to run successfully. Many server processes still hold some state in memory (caching layers, for instance), but being stateless typically means that we can run the server process the same way on any machine and move it around whenever we want. This contrasts with stateful processes.

Answer 23

The process through which a client and a server communicating over HTTPS exchange encryption-related information and establish a secure connection. The typical steps in a TLS handshake are roughly as follows:

The client sends a client hello (a string of random bytes) to the server.
The server responds with a server hello (another string of random bytes) as well as its SSL certificate, which contains its public key.
The client verifies that the certificate was issued by a certificate authority and sends a premaster secret (yet another string of random bytes, this time encrypted with the server's public key) to the server.
The client and the server use the client hello, the server hello, and the premaster secret to then generate the same symmetric-encryption session keys, to be used to encrypt and decrypt all data communicated during the remainder of the connection.
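The last step, where both sides arrive at the same session key, can be sketched as below. This is a toy: it uses a single HMAC-SHA256 call as the derivation function, which is an assumption for illustration and not the key schedule that real TLS specifies. It also skips certificates and the encryption of the premaster secret.

```python
import hashlib
import hmac
import os

def derive_session_key(client_random, server_random, premaster_secret):
    # Toy derivation: HMAC-SHA256 keyed by the premaster secret.
    # Real TLS uses its own, more involved key derivation.
    return hmac.new(
        premaster_secret, client_random + server_random, hashlib.sha256
    ).digest()

client_random = os.urandom(32)     # sent in the client hello
server_random = os.urandom(32)     # sent in the server hello
premaster_secret = os.urandom(48)  # sent encrypted with the server's public key

# Each side runs the same derivation on the same three inputs.
client_key = derive_session_key(client_random, server_random, premaster_secret)
server_key = derive_session_key(client_random, server_random, premaster_secret)
```

An eavesdropper sees both random values but not the premaster secret, so it cannot compute the same key.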

Answer 24

A distributed messaging system created by LinkedIn. Very useful when using the streaming paradigm as opposed to polling.

(48 cards)