Performance, Scaling & Trade-offs Flashcards
Master how NoSQL databases scale and perform under high load. "How do you scale MongoDB horizontally?" "What are the trade-offs between consistency and availability in Cassandra?" Read/write scaling strategies Sharding vs replication Consistency models per database type Caching strategies (Redis, DynamoDB DAX) Indexing and performance tuning Data modeling strategies (denormalization, access patterns) Write amplification and compaction (32 cards)
How do you scale MongoDB horizontally?
MongoDB is scaled horizontally using sharding, which partitions data across multiple servers (shards) based on a shard key.
What are the trade-offs between consistency and availability in Cassandra?
Cassandra favors availability and partition tolerance (AP in CAP theorem) and provides tunable consistency (e.g., QUORUM, ONE, ALL), allowing flexibility in consistency vs latency trade-offs.
What is sharding in NoSQL databases?
Sharding is a horizontal scaling technique that splits data across multiple machines, improving read/write throughput and enabling storage of large datasets.
What is replication in NoSQL databases?
Replication involves copying data across multiple servers to improve fault tolerance, availability, and read scalability.
What is the difference between sharding and replication?
Sharding splits and distributes different parts of data across nodes, while replication copies the same data to multiple nodes for redundancy and availability.
What are read/write scaling strategies in NoSQL?
Use replication for read scalability, sharding for write scalability, and caching to reduce database load and latency.
What is eventual consistency?
Eventual consistency means that given enough time, all nodes will converge to the same data, commonly used in NoSQL databases to improve availability.
What is strong consistency?
Strong consistency ensures that reads always return the most recent write, which may reduce availability or increase latency in distributed systems.
What are consistency models in NoSQL databases?
NoSQL databases offer models like strong, eventual, causal, and tunable consistency depending on the use case and system requirements.
What is Redis used for in performance tuning?
Redis is often used as a caching layer to reduce latency and database load by storing frequently accessed data in memory.
What is DynamoDB DAX?
DynamoDB DAX (DynamoDB Accelerator) is a fully managed, in-memory cache for DynamoDB that improves read performance by reducing response times to microseconds.
What are indexing strategies in NoSQL databases?
Create indexes on frequently queried fields, avoid over-indexing, use compound indexes when needed, and monitor query patterns for optimization.
What is write amplification?
Write amplification refers to the phenomenon where more data is written to disk than originally intended, often due to compaction, logging, or replication.
What is compaction in NoSQL databases?
Compaction is a process that merges and reorganizes data on disk to reduce storage usage and improve read performance.
What is denormalization in NoSQL data modeling?
Denormalization involves duplicating data to optimize for read performance and access patterns, trading off some storage efficiency and consistency.
Why is modeling by access pattern important in NoSQL?
Access pattern-based modeling ensures queries are efficient and avoid costly joins or deep scans, which are less supported in NoSQL.
What are common data modeling strategies in NoSQL?
Use denormalization, embed related data, use partition keys wisely, and design based on how data will be queried.
What is the impact of sharding on system design?
Sharding adds complexity in data partitioning and routing but allows massive scalability and distributed workload handling.
What is the impact of replication on system design?
Replication improves read performance and availability but can complicate consistency and failover logic.
What are best practices for scaling NoSQL databases?
Choose appropriate shard keys, use caching, monitor query patterns, optimize indexes, and avoid hotspots or large partitions.
What are advantages of sharding?
Improves horizontal scalability, allows storage of large datasets, and enables high throughput writes across shards.
What are disadvantages of sharding?
Can lead to uneven load distribution (hotspots), requires careful shard key design, and increases operational complexity.
What is an example of using Redis as a cache?
Store user session data or product details in Redis to serve fast responses and reduce load on the primary database.
What is a real-world use case of sharding?
Social media platforms shard user data by user ID to distribute load and scale horizontally.