System Design Flashcards
(86 cards)
what are the two types of requirements you want to collect at the start of a system design interview?
functional and non-functional requirements
what are functional requirements?
The specific features that bring the user to the service, i.e. what the system must do. They are directly tied to providing the key service
what are non-functional requirements?
Qualities that affect the overall operation of the system rather than any one specific feature of the service. Examples include scalability, reliability, usability, security, and performance
why is it important that application servers are stateless?
so that any server can handle any request, which makes load balancing and failover straightforward and the system more resilient
is mongodb a sql or nosql db?
nosql
in mongodb, what happens when a primary shard node goes down?
the secondary nodes will automatically elect a new leader
in mongodb, how does the mongos service know which shard a data request belongs to?
it looks up the mapping in the config servers, which store the cluster metadata
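A minimal sketch of that lookup, assuming range-based sharding. The chunk table, the shard names, and the `route` function are hypothetical stand-ins for the metadata kept in the config servers; this is not MongoDB's actual API.

```python
import bisect

# Hypothetical chunk table standing in for config-server metadata.
# Each entry is (inclusive lower bound of a shard-key range, shard name),
# sorted by lower bound. A range ends where the next one begins.
CHUNKS = [
    ("", "shard-a"),   # keys from "" up to (not including) "h"
    ("h", "shard-b"),  # keys from "h" up to (not including) "q"
    ("q", "shard-c"),  # keys from "q" upward
]
LOWER_BOUNDS = [low for low, _ in CHUNKS]


def route(shard_key: str) -> str:
    """Return the shard whose key range contains shard_key."""
    # Find the last chunk whose lower bound is <= shard_key.
    index = bisect.bisect_right(LOWER_BOUNDS, shard_key) - 1
    return CHUNKS[index][1]
```

With this table, `route("alice")` picks `shard-a` and `route("zed")` picks `shard-c`.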
how does cassandra trade off consistency to get more availability?
because any node can accept a read or write, the system is more available, but it can take time for the data to propagate to the other nodes, so you only get eventual consistency
what is the hotspot problem in nosql databases?
aka the celebrity problem: one shard gets a lot more traffic than the others due to usage patterns. many modern systems can reshard based on traffic patterns to avoid this
how can you design your data schema to make it easy to scale horizontally using nosql?
think in terms of simple key-value lookups. maybe break complex joins into a couple of simple key-value lookups. that's much easier to shard
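One way to see why key-value lookups shard easily: the key alone decides which shard holds the data. A rough sketch, assuming simple hash-based sharding; the shard count and the `shard_for`, `put`, and `get` helpers are made up for illustration, and in-memory dicts stand in for real shard servers.

```python
import hashlib

NUM_SHARDS = 4  # assumed, fixed shard count for this sketch
shards = [{} for _ in range(NUM_SHARDS)]  # each dict stands in for one shard


def shard_for(key: str) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def put(key: str, value) -> None:
    shards[shard_for(key)][key] = value


def get(key: str):
    # Only one shard is ever consulted for a given key.
    return shards[shard_for(key)].get(key)
```

A join, by contrast, may need rows that live on different shards, which is why breaking it into a few independent key lookups is easier to scale.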
what does it mean to say that data is normalized?
data is stored in logical tables, one per entity, and rows refer to each other via foreign keys. data is not duplicated all over the place. for example, a dinner reservation has a customer_id, which points to the row with that id in the customer table
what are the advantages of having normalised data?
there is less (or no) data duplication, saving space, and you can update data in a single place and have that reflected everywhere
why might you choose to denormalise your data?
reads are faster. you get all the data you need with a single lookup, instead of having to do multiple lookups to join all the data. the cost you pay is data duplication
what are the downsides of denormalised data?
it costs extra space due to data duplication, and updating data requires you to update it in multiple locations, which takes time and means the copies are only eventually consistent
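A small sketch of the update cost, using a made-up reservation store in which the customer's name is copied into every reservation. All names and values here are illustrative.

```python
# Denormalised store: customer_name is duplicated in each reservation.
reservations = {
    "r1": {"customer_id": "c7", "customer_name": "Ada", "time": "19:30"},
    "r2": {"customer_id": "c7", "customer_name": "Ada", "time": "12:00"},
    "r3": {"customer_id": "c9", "customer_name": "Grace", "time": "20:00"},
}


def rename_customer(customer_id: str, new_name: str) -> int:
    """Update every duplicated copy of the name; return how many were touched."""
    updated = 0
    for reservation in reservations.values():
        if reservation["customer_id"] == customer_id:
            reservation["customer_name"] = new_name
            updated += 1
    return updated
```

One logical change fans out into one write per copy, and until every write lands, readers can see a mix of old and new names.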
can you have normalised data in a nosql database?
yes, it just means that you would need to do multiple simple lookups. for example, looking up a reservation, and then looking up the customer_id in that reservation to get the customer data
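The two-step lookup can be sketched like this, with plain dicts standing in for two key-value collections. The records and the `reservation_with_customer` helper are invented for illustration.

```python
# Normalised: the reservation stores only the customer's id.
reservations = {"r1": {"customer_id": "c7", "time": "19:30", "party_size": 2}}
customers = {"c7": {"name": "Ada", "phone": "555-0100"}}


def reservation_with_customer(reservation_id: str) -> dict:
    """Join a reservation with its customer using two simple key lookups."""
    reservation = reservations[reservation_id]        # lookup 1: by reservation id
    customer = customers[reservation["customer_id"]]  # lookup 2: by customer id
    return {**reservation, "customer": customer}
```

The customer data lives in exactly one place, at the price of a second round trip on every read.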
would you generally start with normalised or denormalised data? when might you switch?
normalised, because it’s simpler, saves space, and is more consistent. when performance becomes a bottleneck, you might denormalise so that you need fewer lookups and reads get faster.
if you design an LRU cache, what two data structures might you use under the hood?
a hashmap for finding a kv-pair in constant time, as well as a doubly-linked list so you can move an item to the front whenever it is read and always evict the last item in the list.
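A compact sketch of that design, with a dict as the hashmap and a hand-rolled doubly-linked list. Class and method names are illustrative, and the two sentinel nodes are a convenience of this sketch, not something the card specifies.

```python
class _Node:
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.map = {}          # key -> node, for O(1) lookup
        self.head = _Node()    # sentinel: most recently used side
        self.tail = _Node()    # sentinel: least recently used side
        self.head.next = self.tail
        self.tail.prev = self.head

    def _unlink(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        node.prev = self.head
        node.next = self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.map.get(key)
        if node is None:
            return None
        # A read makes the item the most recently used.
        self._unlink(node)
        self._push_front(node)
        return node.value

    def put(self, key, value):
        node = self.map.get(key)
        if node is not None:
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.map) >= self.capacity:
            lru = self.tail.prev  # last real node = least recently used
            self._unlink(lru)
            del self.map[lru.key]
        node = _Node(key, value)
        self.map[key] = node
        self._push_front(node)
```

Both `get` and `put` run in constant time: the dict finds the node, and the list pointers move it without scanning.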
what are three common cache eviction strategies?
LRU, LFU, FIFO
what is LRU?
least recently used - a cache eviction strategy
what is LFU?
least frequently used - a cache eviction strategy
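A deliberately simple LFU sketch for contrast with LRU. It counts accesses per key and scans for the smallest count on eviction, so eviction is O(n); production implementations usually use frequency buckets to avoid the scan. Names are illustrative, and ties go to whichever key was inserted first.

```python
class LFUCache:
    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.values = {}  # key -> value
        self.counts = {}  # key -> number of accesses (puts and gets)

    def get(self, key):
        if key not in self.values:
            return None
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            # Evict the key with the lowest access count (linear scan).
            victim = min(self.counts, key=self.counts.get)
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1
```

Unlike LRU, an item that was read many times long ago can outlive one that was read once just now.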
what are some popular caching solutions?
memcached, redis, elasticache, ncache, ehcache
what is memcached?
a simple in-memory key-value store, without the extra features that redis has.
what does CDN stand for?
Content Delivery Network
dns based geo-routing of traffic to the correct region