Performance, Scaling & Trade-offs Flashcards

Master how NoSQL databases scale and perform under high load. Sample questions: "How do you scale MongoDB horizontally?" and "What are the trade-offs between consistency and availability in Cassandra?"

Topics covered: read/write scaling strategies; sharding vs replication; consistency models per database type; caching strategies (Redis, DynamoDB DAX); indexing and performance tuning; data modeling strategies (denormalization, access patterns); write amplification and compaction.

(32 cards)

1
Q

How do you scale MongoDB horizontally?

A

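MongoDB scales horizontally through sharding, which partitions a collection's data across multiple servers (shards) based on a shard key; a query router (mongos) directs each operation to the shards that hold the relevant data.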
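As an illustration of routing by shard key, here is a minimal sketch of hash-based placement. It is not MongoDB's implementation: the shard names, the `shard_for` helper, and the `user_id` key field are all hypothetical, and a real cluster tracks ownership through metadata rather than a simple modulo.

```python
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names


def shard_for(shard_key: str, shards=SHARDS) -> str:
    """Map a shard key value to one shard using a stable hash."""
    digest = hashlib.md5(shard_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]


def route_documents(docs, key_field="user_id"):
    """Group documents by the shard that would own them."""
    placement = {name: [] for name in SHARDS}
    for doc in docs:
        placement[shard_for(str(doc[key_field]))].append(doc)
    return placement
```

The same key always lands on the same shard, so a query that includes the shard key can be sent to a single shard instead of all of them.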

2
Q

What are the trade-offs between consistency and availability in Cassandra?

A

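Cassandra favors availability and partition tolerance (AP in the CAP theorem) and offers tunable consistency per operation (e.g., ONE, QUORUM, ALL). Lower levels such as ONE favor latency and availability; higher levels such as QUORUM or ALL give stronger consistency at the cost of latency and tolerance to node failures.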
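The trade-off can be checked with simple arithmetic. The sketch below assumes the usual overlap rule (a read sees the latest write when read replicas plus write replicas exceed the replication factor); the function names are illustrative, not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas in a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """Read and write replica sets must overlap: R + W > N."""
    return read_replicas + write_replicas > replication_factor


def tolerated_failures(replication_factor: int, required_replicas: int) -> int:
    """Replicas that can be down while the operation still succeeds."""
    return replication_factor - required_replicas
```

With a replication factor of 3, QUORUM reads and writes (2 replicas each) overlap and survive one node failure, while ONE/ONE does not guarantee overlap but survives two.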

3
Q

What is sharding in NoSQL databases?

A

Sharding is a horizontal scaling technique that splits data across multiple machines, improving read/write throughput and enabling storage of large datasets.

4
Q

What is replication in NoSQL databases?

A

Replication involves copying data across multiple servers to improve fault tolerance, availability, and read scalability.

5
Q

What is the difference between sharding and replication?

A

Sharding splits and distributes different parts of data across nodes, while replication copies the same data to multiple nodes for redundancy and availability.

6
Q

What are read/write scaling strategies in NoSQL?

A

Use replication for read scalability, sharding for write scalability, and caching to reduce database load and latency.

7
Q

What is eventual consistency?

A

Eventual consistency means that if no new updates are made, all replicas will converge to the same data given enough time. Many NoSQL databases use it to improve availability and latency, at the cost of reads that may temporarily return stale data.
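A minimal sketch of convergence, assuming last-write-wins by timestamp as the conflict rule (one common choice, not the only one). The `Replica` class and `anti_entropy` function are hypothetical names for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    data: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        """Accept the write only if it is newer than what is stored."""
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def anti_entropy(a: Replica, b: Replica) -> None:
    """Exchange entries both ways; the newer timestamp wins on each key."""
    for key, (ts, value) in list(a.data.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.data.items()):
        a.write(key, value, ts)
```

Before the exchange, two replicas can return different values for the same key; after it, both return the value with the newest timestamp.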

8
Q

What is strong consistency?

A

Strong consistency ensures that reads always return the most recent write, which may reduce availability or increase latency in distributed systems.

9
Q

What are consistency models in NoSQL databases?

A

NoSQL databases offer models like strong, eventual, causal, and tunable consistency depending on the use case and system requirements.

10
Q

What is Redis used for in performance tuning?

A

Redis is often used as a caching layer to reduce latency and database load by storing frequently accessed data in memory.
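The cache-aside pattern behind this can be sketched without a Redis client. The `TTLCache` class below is an in-memory stand-in, and `get_product` and `load_from_db` are hypothetical names; a real deployment would replace the dictionary with calls to the cache server.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache such as Redis (illustration only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:  # expired: treat as a miss
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def delete(self, key):
        self._store.pop(key, None)


def get_product(product_id, cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    cached = cache.get(product_id)
    if cached is not None:
        return cached
    value = load_from_db(product_id)
    cache.set(product_id, value)
    return value
```

The time-to-live bounds how stale a cached entry can get; calling `delete` after a write to the database is one simple invalidation strategy.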

11
Q

What is DynamoDB DAX?

A

DynamoDB Accelerator (DAX) is a fully managed, in-memory cache for DynamoDB that improves read performance, reducing response times for cached reads from milliseconds to microseconds.

12
Q

What are indexing strategies in NoSQL databases?

A

Create indexes on frequently queried fields, avoid over-indexing, use compound indexes when needed, and monitor query patterns for optimization.
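To see why both under-indexing and over-indexing cost something, here is a toy model. The `Collection` class is hypothetical and stores documents in a list with dictionary-based indexes; it only counts work and does not reflect any particular database's index structure.

```python
from collections import defaultdict


class Collection:
    def __init__(self, indexed_fields=()):
        self.docs = []
        self.indexes = {f: defaultdict(list) for f in indexed_fields}
        self.index_updates = 0  # extra work performed on writes

    def insert(self, doc):
        position = len(self.docs)
        self.docs.append(doc)
        # Every index must be maintained on every write.
        for field_name, index in self.indexes.items():
            index[doc.get(field_name)].append(position)
            self.index_updates += 1

    def find(self, field_name, value):
        """Return (matches, documents_examined)."""
        if field_name in self.indexes:
            positions = self.indexes[field_name].get(value, [])
            return [self.docs[p] for p in positions], len(positions)
        # No index on this field: fall back to a full scan.
        matches = [d for d in self.docs if d.get(field_name) == value]
        return matches, len(self.docs)
```

A lookup on an indexed field examines only the matching documents, a lookup on an unindexed field examines all of them, and each additional index adds one more update per insert.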

13
Q

What is write amplification?

A

Write amplification refers to the phenomenon where the amount of data physically written to storage exceeds the amount the application logically wrote, often due to compaction, logging, or replication.
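A rough way to put a number on it is the ratio of storage writes to application writes. The model below is a deliberate simplification with hypothetical parameters: it assumes each logical byte is written once to a log, once on flush, and once more per compaction rewrite, on every replica.

```python
def total_storage_writes(app_bytes, wal=True, flush=True,
                         compaction_rewrites=0, replicas=1):
    """Bytes physically written under the simplified model described above."""
    per_replica = app_bytes * (int(wal) + int(flush) + compaction_rewrites)
    return per_replica * replicas


def write_amplification(bytes_written_to_storage, bytes_written_by_app):
    """Ratio of physical writes to logical writes."""
    return bytes_written_to_storage / bytes_written_by_app
```

For example, 100 bytes written with a log, a flush, three compaction rewrites, and three replicas produce 100 × (1 + 1 + 3) × 3 = 1500 bytes of storage writes, an amplification factor of 15 under these assumptions.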

14
Q

What is compaction in NoSQL databases?

A

Compaction is a process, typically run in the background, that merges data files on disk and discards overwritten or deleted entries, which reduces storage usage and improves read performance.
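A minimal sketch of the merge step, assuming each on-disk run is represented as a dictionary and deletions are recorded as tombstone markers. The `compact` function and `TOMBSTONE` marker are hypothetical; real engines also wait until it is safe before dropping tombstones.

```python
TOMBSTONE = object()  # marker recorded in place of a deleted key's value


def compact(runs):
    """Merge runs ordered oldest to newest into one sorted run.

    Later runs override earlier ones; keys whose latest entry is a
    tombstone are dropped from the output.
    """
    merged = {}
    for run in runs:  # oldest first, so newer values overwrite older ones
        merged.update(run)
    return {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}
```

After compaction a read has one run to consult instead of several, and the space held by overwritten and deleted entries is reclaimed.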

15
Q

What is denormalization in NoSQL data modeling?

A

Denormalization involves duplicating data to optimize for read performance and access patterns, trading off some storage efficiency and consistency.
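The trade-off shows up as soon as duplicated data changes. In this sketch the `users` and `posts` shapes and both function names are hypothetical: copying the author's name into each post removes a lookup on reads, but a rename must then touch every copy.

```python
def denormalize(posts, users):
    """Copy each author's name into their posts so reads need no second lookup."""
    return [{**post, "author_name": users[post["author_id"]]["name"]}
            for post in posts]


def rename_user(user_id, new_name, users, posts_denormalized):
    """Update the source record and every duplicated copy; return copies touched."""
    users[user_id]["name"] = new_name
    touched = 0
    for post in posts_denormalized:
        if post["author_id"] == user_id:
            post["author_name"] = new_name
            touched += 1
    return touched
```

If the update of the copies is skipped or fails partway, the duplicated names drift from the source record, which is the consistency cost the card refers to.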

16
Q

Why is modeling by access pattern important in NoSQL?

A

Modeling by access pattern ensures queries are efficient and avoids costly joins or large scans, which many NoSQL databases support poorly or not at all.
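As a sketch of designing for a known query, suppose the access pattern is "list one customer's orders by date". The `PartitionedTable` class and the `CUSTOMER#`/`ORDER#` key scheme below are hypothetical; the point is that the query reads a single partition and needs no join or table-wide scan.

```python
from collections import defaultdict


class PartitionedTable:
    def __init__(self):
        # partition key -> {sort key: item}
        self._partitions = defaultdict(dict)

    def put(self, partition_key, sort_key, item):
        self._partitions[partition_key][sort_key] = item

    def query(self, partition_key, sort_key_prefix=""):
        """Read one partition only, returning items in sort-key order."""
        partition = self._partitions.get(partition_key, {})
        return [partition[sk] for sk in sorted(partition)
                if sk.startswith(sort_key_prefix)]
```

Because ISO-style dates sort lexicographically, a sort key such as `ORDER#2024-01-15` returns orders in chronological order without extra work.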

17
Q

What are common data modeling strategies in NoSQL?

A

Use denormalization, embed related data, use partition keys wisely, and design based on how data will be queried.

18
Q

What is the impact of sharding on system design?

A

Sharding adds complexity in data partitioning and routing but allows massive scalability and distributed workload handling.

19
Q

What is the impact of replication on system design?

A

Replication improves read performance and availability but can complicate consistency and failover logic.

20
Q

What are best practices for scaling NoSQL databases?

A

Choose appropriate shard keys, use caching, monitor query patterns, optimize indexes, and avoid hotspots or large partitions.

21
Q

What are advantages of sharding?

A

Improves horizontal scalability, allows storage of large datasets, and enables high throughput writes across shards.

22
Q

What are disadvantages of sharding?

A

Can lead to uneven load distribution (hotspots), requires careful shard key design, and increases operational complexity.
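The hotspot problem is easy to reproduce. The sketch below assumes four shards, a monotonically increasing key (such as a timestamp), and hypothetical range boundaries; it compares range-based placement with hashed placement for the most recent writes.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4


def range_shard(key: int, boundaries=(1000, 2000, 3000)) -> int:
    """Range sharding: keys below each boundary go to the matching shard."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)


def hashed_shard(key: int) -> int:
    """Hashed sharding: a stable hash spreads adjacent keys apart."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def write_distribution(keys, shard_fn):
    """Count how many writes each shard receives."""
    return Counter(shard_fn(k) for k in keys)


recent_keys = range(3000, 3400)  # new writes always carry the newest keys
```

With range placement every recent write lands on the last shard, while hashing spreads the same writes across shards; the price of hashing is that range queries on the key must now contact multiple shards.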

23
Q

What is an example of using Redis as a cache?

A

Store user session data or product details in Redis to serve fast responses and reduce load on the primary database.

24
Q

What is a real-world use case of sharding?

A

Social media platforms shard user data by user ID to distribute load and scale horizontally.

25
Q

What is a real-world use case of replication?

A

E-commerce systems replicate order data across regions for high availability and faster local reads.

26
Q

How does indexing affect performance?

A

Indexes speed up read queries but add overhead on writes, so it's essential to balance based on query patterns.

27
Q

What are architectural implications of using caching?

A

Caching improves performance but requires cache invalidation strategies and consistency management between cache and source of truth.

28
Q

How do you monitor NoSQL performance?

A

Use tools like Prometheus, Grafana, CloudWatch, and database-native tools to track latency, throughput, replication lag, and storage.

29
Q

How do you debug NoSQL performance issues?

A

Analyze slow queries, check index usage, monitor replication/shard health, and inspect logs or metrics for resource bottlenecks.

30
Q

What are common trade-offs in NoSQL performance tuning?

A

Balancing consistency vs availability, optimizing read vs write performance, and managing storage vs speed through compaction and caching.

31
Q

What are common interview questions about NoSQL performance and scaling?

A

'What is sharding?', 'How does Cassandra handle write scalability?', 'What is eventual consistency?', 'How do you optimize reads in MongoDB?'

32
Q

What are potential gotchas when scaling NoSQL systems?

A

Poor shard key choice, uneven data distribution, missing indexes, cache inconsistency, write amplification, and overly large partitions.