10. Database Files and Storage Flashcards

(7 cards)

1
Q

Storage Hierarchy:

A

Multiple levels of storage that trade off speed, cost, and capacity: Primary (directly accessible by the CPU: cache, RAM), Secondary (disk: HDDs, SSDs), Tertiary (optical, tape). Speed and cost per byte fall, and capacity grows, moving from primary to tertiary. Databases rely heavily on secondary storage.

2
Q

Blocks:

A

Data is transferred between disk and memory in fixed-size units called blocks (or pages). Transferring a whole block is more efficient than transferring individual records because every disk access pays a fixed cost (seek time and rotational latency) before any data moves, so that cost is best spread over many records at once.
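A rough cost model can make the point concrete. The sketch below is illustrative only: the seek time, rotational latency, transfer rate, block size, and record size are assumed figures, not values from the card, and the record-at-a-time case assumes the worst case in which every access pays the full seek and rotation.

```python
# Illustrative figures only (assumptions, not taken from the card).
SEEK_MS = 8.0                     # average seek time
ROTATIONAL_MS = 4.0               # average rotational latency
TRANSFER_BYTES_PER_MS = 100_000   # roughly 100 MB/s sustained transfer

BLOCK_SIZE = 4096                 # bytes per block
RECORD_SIZE = 100                 # bytes per record


def access_time_ms(num_bytes: int) -> float:
    """Time for one disk access: seek + rotational latency + transfer."""
    return SEEK_MS + ROTATIONAL_MS + num_bytes / TRANSFER_BYTES_PER_MS


records_per_block = BLOCK_SIZE // RECORD_SIZE

# One access that brings in the whole block.
one_block_ms = access_time_ms(BLOCK_SIZE)

# Worst case: a separate access for each record in that block.
record_at_a_time_ms = records_per_block * access_time_ms(RECORD_SIZE)

print(f"whole block:       {one_block_ms:.2f} ms")
print(f"record at a time:  {record_at_a_time_ms:.2f} ms")
```

Under these assumptions the transfer itself is a tiny fraction of each access, so the fixed positioning cost dominates and reading per record multiplies it by the number of records.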

3
Q

Buffering:

A

Reserving areas of main memory (buffers) to hold disk blocks. With more than one buffer, the CPU can process one block while the next is being read or written, so I/O and CPU processing overlap.

4
Q

Records (Collections of related data values):

A

◦ Fixed Length: All records have the same size.
◦ Variable Length: Records have different sizes.

5
Q

Spanned vs Unspanned Records:

A

◦ Unspanned: A record cannot cross block boundaries. Simple to implement, but can waste space if records don’t fit neatly into blocks.
◦ Spanned: A record can be split across multiple blocks. More efficient use of block space, especially for large or variable-length records, but more complex to implement and access.
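The space trade-off between the two schemes can be estimated with a short calculation. This is a minimal sketch for fixed-length records; the function names are hypothetical, and the spanned estimate ignores the small per-block pointer that a real implementation would need to link the pieces of a split record.

```python
import math


def blocks_needed_unspanned(num_records: int, record_size: int, block_size: int) -> int:
    """Blocks required when no record may cross a block boundary."""
    per_block = block_size // record_size
    if per_block == 0:
        raise ValueError("record is larger than a block; unspanned storage is impossible")
    return math.ceil(num_records / per_block)


def blocks_needed_spanned(num_records: int, record_size: int, block_size: int) -> int:
    """Blocks required when records may be split across blocks.

    Simplification: ignores the pointer overhead for continued records.
    """
    return math.ceil(num_records * record_size / block_size)


# Assumed example: 1000 records of 300 bytes in 512-byte blocks.
unspanned = blocks_needed_unspanned(1000, 300, 512)
spanned = blocks_needed_spanned(1000, 300, 512)
print(unspanned, spanned)
```

In this assumed example only one 300-byte record fits in each 512-byte block when unspanned, so 212 bytes per block go unused; spanning packs the same data into far fewer blocks at the cost of sometimes reading two blocks for one record.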

6
Q

Record Blocking:

A

The blocking factor (bfr) is the number of records stored per block. For fixed-length records in an unspanned organization, bfr = floor(Block Size / Record Size); any bytes left over in each block are unused.
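The formula can be checked with a few lines of code. This is a small sketch for fixed-length, unspanned records; the function names and the example sizes (512-byte blocks, 100-byte records) are assumptions for illustration.

```python
def blocking_factor(block_size: int, record_size: int) -> int:
    """bfr = floor(block_size / record_size) for fixed-length, unspanned records."""
    if record_size <= 0 or record_size > block_size:
        raise ValueError("record size must be between 1 and the block size")
    return block_size // record_size


def unused_bytes_per_block(block_size: int, record_size: int) -> int:
    """Bytes left over in each block after storing bfr whole records."""
    return block_size - blocking_factor(block_size, record_size) * record_size


bfr = blocking_factor(512, 100)
wasted = unused_bytes_per_block(512, 100)
print(bfr, wasted)  # 5 records per block, 12 bytes unused
```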

7
Q

File Organization (How records are arranged in a file on disk):

A

◦ Unordered Records (Heap Files): Records are inserted at the end. Efficient for insertion. Slow for search (linear scan). Deletion leaves fragmented space. Sorting requires creating a new copy.
◦ Ordered Records (Sorted Files): Records are sorted based on a key field. Efficient for sequential reading and range searches on the sort key. Binary search on the sort key takes about log2(b) block accesses, where b is the number of blocks in the file. Expensive for insertion, deletion, and updates that change the sort key value, because records must be shifted to keep the order. No help for searching on non-sort key fields.
◦ Hashed Records (Hash Files): Records are placed based on a hash function applied to a key field. Extremely fast access (direct address) for exact matches on the hash key. Easier insertion/deletion than sorted files. Collision resolution is needed.
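The search costs of the three organizations can be compared by counting block reads in a toy in-memory model. This is a sketch under stated assumptions, not a real storage engine: each "block" is a Python list of integer keys, the helper names are hypothetical, the hash function is simply key modulo the number of buckets, and the hash file is assumed to have no overflow blocks.

```python
def linear_search(blocks, key):
    """Heap file: scan blocks in order. Returns (found, blocks_read)."""
    for reads, block in enumerate(blocks, start=1):
        if key in block:
            return True, reads
    return False, len(blocks)


def binary_search(blocks, key):
    """Sorted file: binary search over blocks. Returns (found, blocks_read)."""
    lo, hi, reads = 0, len(blocks) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]          # one block read
        reads += 1
        if key < block[0]:
            hi = mid - 1
        elif key > block[-1]:
            lo = mid + 1
        else:
            return key in block, reads
    return False, reads


def hash_lookup(buckets, key):
    """Hash file: go straight to the bucket. One read, assuming no overflow."""
    bucket = buckets[key % len(buckets)]
    return key in bucket, 1


NUM_KEYS, PER_BLOCK = 1024, 8
NUM_BLOCKS = NUM_KEYS // PER_BLOCK   # b = 128 blocks

# Heap file: keys in an arbitrary (here, deterministic but unsorted) order.
shuffled = [(k * 7) % NUM_KEYS for k in range(NUM_KEYS)]
heap_blocks = [shuffled[i:i + PER_BLOCK] for i in range(0, NUM_KEYS, PER_BLOCK)]

# Sorted file: keys in ascending order.
sorted_blocks = [list(range(i, i + PER_BLOCK)) for i in range(0, NUM_KEYS, PER_BLOCK)]

# Hash file: one bucket per block.
hash_buckets = [[] for _ in range(NUM_BLOCKS)]
for k in range(NUM_KEYS):
    hash_buckets[k % NUM_BLOCKS].append(k)

heap_result = linear_search(heap_blocks, 1000)
sorted_result = binary_search(sorted_blocks, 1000)
hash_result = hash_lookup(hash_buckets, 1000)
print(heap_result, sorted_result, hash_result)
```

With b = 128 blocks, the binary search never reads more than about log2(128) = 7 blocks, the heap scan may read up to all 128, and the hash lookup reads one. The hash file gives no help for range queries, since neighbouring keys land in unrelated buckets.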
