Week 9 - Physical design Flashcards

1
Q

What is the goal of the Conceptual Design phase in database design?

A

To study the problem domain and identify the entities and relationships involved, creating an abstract model of the data without worrying about how it will be stored.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What happens during the Logical Design phase?

A

Entities and relationships are mapped to tables, and normalization is applied to remove redundancy and ensure data integrity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is the focus of the Physical Design phase?

A

To optimize performance by adjusting table design for storage on disk, choosing indexes, and fine-tuning how data will be physically stored and accessed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the purpose of Physical Design in database systems?

A

The goal is to improve the efficiency of the system while maintaining correctness.
Physical design (or database tuning) focuses on:

How data is stored on disk

Choosing indexes, partitioning, and access paths

Making performance-enhancing decisions based on expected usage

It doesn’t involve building a DBMS from scratch but helps you understand what problems are addressed and why tuning techniques work.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Where are database records stored during implementation?

A

On permanent storage media like hard disks, in special files managed by the DBMS.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How does a DBMS access data from storage?

A

It loads data in pages (e.g., 2KB to 32KB) as needed, rather than reading files sequentially.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Why is disk access a concern in physical design?

A

Because disk access is much slower than main memory or CPU, so minimizing disk I/O is crucial for performance.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is a buffer pool in a DBMS?

A

It’s a part of main memory used to store pages temporarily, allowing faster access to recently used data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What happens to a page after it’s fetched and processed?

A

It’s kept in the buffer pool instead of being discarded immediately, improving performance if it’s needed again soon.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Can the buffer pool store the entire database?

A

No — the buffer pool is limited, so only a portion of the database can be held in memory at a time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Why should the number of pages used to store a table be minimized?

A

Because fewer pages mean less disk I/O, which improves performance by reducing the time spent reading from storage.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Why is scanning every record of a large table expensive?

A

Because it requires loading many pages from disk, which is slow and resource-intensive.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the selection operation (σC), and how is it typically evaluated?

A

The selection operation picks rows based on a condition C. Typically, it requires scanning every row to check if the condition applies. If the condition is simple, the DBMS may know exactly which pages contain the relevant rows.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How can the selection operation σC be optimized?

A

If there’s an index on the attribute(s) in the condition, the DBMS can locate only the relevant pages, avoiding the need to scan every row.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the natural join (T₁ ⋈ T₂) operation, and how is it typically executed?

A

The natural join combines rows from T₁ and T₂ that agree on the common attributes. It usually involves comparing every row of T₁ with every row of T₂, which can be inefficient for large tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How can the natural join (T₁ ⋈ T₂) be optimized to avoid unnecessary comparisons?

A

The goal is to reduce comparisons by focusing only on rows that are likely to match, using techniques such as hashing or sorting to speed up the join process.

17
Q

Why must the intermediate result of a query be “reasonably small” for optimization to be effective?

A

Because many optimizations (like pushing selections or reordering joins) rely on operating over smaller tables—large intermediate results negate their benefit.

18
Q

Why are joins considered an expensive operation in query evaluation?

A

Because a naive join compares rows of one table with rows of another, leading to potentially huge numbers of comparisons and disk I/O.

19
Q

What guideline helps keep join costs under control with respect to result size?

A

Ensure that the size of the join result does not exceed the size of the input tables by a large factor—smaller results are faster to process.

20
Q

How can a DBMS be “guided” to find matching rows quickly and avoid non-matches?

A

By providing indexes, join hints, or up-to-date statistics so the optimizer knows where to look and which join algorithms to choose.