Data Storage & Organization Flashcards

Flashcards in Data Storage & Organization Deck (32):

What is transfer size?

the unit of memory that can be individually accessed, read and written


What is latency?

the time it takes for info to be delivered after the initial request is made


What is bandwidth?

The rate at which info can be delivered.
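
Latency and bandwidth combine into a rough delivery-time estimate. A minimal sketch, assuming a simple model where total time is latency plus size divided by bandwidth; the function name and the numbers are illustrative, not from the deck:

```python
def transfer_time_ms(size_bytes, latency_ms, bandwidth_bytes_per_s):
    """Estimate delivery time: fixed latency plus size / bandwidth (in ms)."""
    return latency_ms + (size_bytes / bandwidth_bytes_per_s) * 1000.0


# Illustrative values only: 1 MB request, 5 ms latency, 100 MB/s bandwidth.
estimate = transfer_time_ms(1_000_000, 5.0, 100_000_000)
print(f"{estimate:.1f} ms")
```

For small requests the latency term dominates; for large requests the bandwidth term dominates.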


What is Processor cache?

Faster memory storing recently used data that reduces the average memory access time.


What is a Solid State Drive?

- uses flash memory for storage


What is RAID (Redundant Arrays of Independent Disks)?

a disk organization technique that utilizes a large number of inexpensive, mass-market disks to provide increased reliability, performance and storage.


What do RAIDs do?

store extra (redundant) data in case of disk failure.

- Mirror or shadow

- duplicates entire disks on multiple disks


What is mean time to failure (MTTF)?

the average time the device is expected to run continuously without any failure.


Explain RAID Level 0 and what is its capacity?

Striping at the block level (non-redundant)

- used for high performance where data loss is not critical (parallelism)

Capacity: N
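
Block-level striping can be sketched as a mapping from a logical block number to a disk and a position on that disk. This assumes simple round-robin placement; `locate_block` is a hypothetical helper, not part of any RAID tool:

```python
def locate_block(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return logical_block % num_disks, logical_block // num_disks


# Consecutive logical blocks land on different disks, so they can be
# read or written in parallel.
for b in range(6):
    print(b, locate_block(b, 3))
```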


Explain RAID Level 1 and what is its capacity?

Mirrored disks (redundancy)

- for apps that require redundancy (protection from disk failure)

Capacity: N/2


Explain RAID Level 2

Memory-style error-correcting codes (ECC) with bit striping


Explain RAID Level 5 and what is its capacity?

- offers both reliability & increased performance

Capacity: N-1
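
RAID 5 gets its reliability from parity, which is why one disk's worth of capacity is given up. A minimal sketch of XOR parity over equal-sized blocks; it ignores how parity blocks are distributed across the disks, and `xor_blocks` is a hypothetical helper:

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # blocks on three data disks
parity = xor_blocks(data)                        # stored as the parity block

# If the second disk fails, XOR the surviving blocks with the parity block.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```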


Explain RAID Level 6 and what is its capacity?

- offers extra redundancy compared to Level 5

- used to deal with multiple drive failures

Capacity: N - X
(X = # of parity drives, such as 2)
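
The capacity rules from the RAID cards can be collected in one place. A small sketch, assuming N equal-sized disks and counting capacity in disks; `usable_disks` is a hypothetical name:

```python
def usable_disks(level, n, parity_disks=2):
    """Return how many of n equal-sized disks hold user data for a RAID level."""
    if level == 0:
        return n                 # striping only, no redundancy
    if level == 1:
        return n / 2             # every disk has a mirror copy
    if level == 5:
        return n - 1             # one disk's worth of parity
    if level == 6:
        return n - parity_disks  # X disks' worth of parity (such as 2)
    raise ValueError(f"capacity rule not listed for RAID level {level}")


for level in (0, 1, 5, 6):
    print(level, usable_disks(level, 6))
```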


What is a block-level interface?

Allows the program to read & write a chunk of memory called a block (or page) from the device


What is a byte-level interface?

allows the program to read & write individually addressable bytes from the device


What is a file-level interface?

abstracts away the device's addressing characteristics & provides a standard byte-level interface for files to programs running on the OS


Hierarchy of a database

Database is made up of files

- each file contains blocks

- each block contains records

- each record contains fields

- each field is a representation of a data item in a record


What does a record consist of?

One or more fields grouped together


What are the two main types of records? And what are they?

1. Variable-length records: the size of the record varies

2. Fixed-length records: all records have the same size
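
The difference shows up directly in the byte layout. A minimal sketch using Python's `struct` module; the two-field record (integer id plus name) and the length-prefix scheme for the variable-length case are assumptions for illustration:

```python
import struct

# Fixed-length: 4-byte id + name padded (or truncated) to 20 bytes.
FIXED = struct.Struct("<i20s")


def pack_fixed(rec_id, name):
    """Every record occupies exactly FIXED.size bytes."""
    return FIXED.pack(rec_id, name.encode("utf-8"))


def pack_variable(rec_id, name):
    """4-byte id + 2-byte length prefix + only the bytes the name needs."""
    data = name.encode("utf-8")
    return struct.pack("<iH", rec_id, len(data)) + data


for name in ("Al", "Alexandria"):
    print(name, len(pack_fixed(1, name)), len(pack_variable(1, name)))
```

Fixed-length records make it easy to compute where record i starts; variable-length records save space but need a length or delimiter to find the next record.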


4 situations where variable formats are useful

1. The data doesn't have a regular structure in most cases

2. The data values are sparse in the records

3. There are repeating fields in the records

4. The data evolves quickly so schema evolution is challenging


3 disadvantages of variable formats

1. Waste space by repeating schema info for every record

2. Allocating variable-sized records efficiently is challenging

3. Query processing is more difficult and less efficient when the structure of the data varies


What are the 6 issues related to storing records in blocks? Describe each one.

1. Separation
- how do we separate adjacent records?

2. Spanning
- can a record cross a block boundary?

3. Clustering
- can a block store multiple record types?

4. Splitting
- are records allocated in multiple blocks?

5. Ordering
- are the records sorted in any way?

6. Addressing
- how do we reference a given record?


What are the 2 options when records do not fit in a block?

Describe them

- Waste the space at the end of the block (unspanned)

- Start a record at the end of a block and continue it on the next (spanned)
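
The space cost of the two options can be compared with a little arithmetic. A rough sketch, assuming fixed-length records and ignoring block and record headers; both function names are hypothetical:

```python
import math


def blocks_unspanned(num_records, record_size, block_size):
    """Records never cross a block boundary; leftover space is wasted."""
    per_block = block_size // record_size
    if per_block == 0:
        raise ValueError("record does not fit in one block without spanning")
    return math.ceil(num_records / per_block)


def blocks_spanned(num_records, record_size, block_size):
    """Records may continue on the next block, so blocks are filled completely."""
    return math.ceil(num_records * record_size / block_size)


# Illustrative sizes: 10 records of 1500 bytes in 4096-byte blocks.
print(blocks_unspanned(10, 1500, 4096), blocks_spanned(10, 1500, 4096))
```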


What is clustering?
(6 storing issues)

allocating records of different types together on the same block (or same file) because they are frequently accessed together.


What is a split record?
(6 storing issues)

a record whose portions are allocated on multiple blocks for reasons other than spanning


What is ordering records?
(6 storing issues)

when the records in a file (block) are sorted based on the value of one or more fields


What are the 2 methods of ordering records, describe them.
(6 storing issues)

1. Physically ordered
- the records are allocated in blocks in sorted order

2. Logically ordered
- the records are not physically sorted, but each record contains a pointer to the next record in the sorted order
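
Logical ordering behaves like a linked list laid over unsorted storage. A minimal sketch, assuming records live in numbered slots and each stores the slot of its successor; the slot layout and field names are made up for illustration:

```python
# Slots are in arrival order, not key order.
records = {
    0: {"key": 30, "next": None},
    1: {"key": 10, "next": 2},
    2: {"key": 20, "next": 0},
}
head = 1  # slot of the record with the smallest key


def scan_in_order(records, head):
    """Follow the 'next' pointers to visit keys in sorted order."""
    keys = []
    slot = head
    while slot is not None:
        keys.append(records[slot]["key"])
        slot = records[slot]["next"]
    return keys


print(scan_in_order(records, head))
```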


What is addressing records?
(6 storing issues)

A method for defining a unique value or address to reference a particular record


What are the 2 methods of addressing records, describe them.
(6 storing issues)
(Think ordering records)

1. Physically addressed
- a record has a physical address based on the device where it is located

2. Logically addressed
- A record has a key value or some other identifier that can be used to look up its physical address
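
Logical addressing adds one level of indirection. A small sketch, assuming a physical address is a (block number, byte offset) pair and the lookup structure is an in-memory map; the keys and addresses are illustrative:

```python
# Map table: logical key -> physical address (block number, byte offset).
map_table = {
    "emp-1001": (12, 0),
    "emp-1002": (12, 64),
}


def resolve(logical_key):
    """Translate a logical address into the record's current physical address."""
    return map_table[logical_key]


def relocate(logical_key, new_physical):
    """Move a record: only the map entry changes, references by key stay valid."""
    map_table[logical_key] = new_physical


relocate("emp-1002", (40, 0))
print(resolve("emp-1002"))
```

The extra lookup costs time, but records can move without updating every reference to them.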


What is pointer swizzling?

the process for converting disk pointers to memory pointers and vice versa when blocks move between memory and disk
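
One way to picture swizzling is a translation table kept alongside the loaded blocks. A minimal sketch, assuming disk pointers are (block, offset) pairs and memory pointers are ordinary Python object references; `TranslationTable` is a hypothetical class, not a real library API:

```python
class TranslationTable:
    """Tracks which disk addresses currently have an in-memory copy."""

    def __init__(self):
        self._to_mem = {}   # disk address -> in-memory object
        self._to_disk = {}  # id(in-memory object) -> disk address

    def register(self, disk_addr, obj):
        """Record that the object at disk_addr has been loaded into memory."""
        self._to_mem[disk_addr] = obj
        self._to_disk[id(obj)] = disk_addr

    def swizzle(self, disk_addr):
        """Return the memory object if loaded, else the disk address unchanged."""
        return self._to_mem.get(disk_addr, disk_addr)

    def unswizzle(self, obj):
        """Convert a memory reference back to its disk address before writing out."""
        return self._to_disk[id(obj)]


table = TranslationTable()
record = {"name": "a"}
table.register((7, 128), record)
print(table.swizzle((7, 128)) is record, table.unswizzle(record))
```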


What is a buffer and a buffer manager?

A buffer is a portion of main memory available to store copies of disk blocks

A buffer manager is a subsystem responsible for allocating buffer space in main memory


What is a buffer replacement strategy?

determines which block should be removed from the buffer when space is required
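
Least recently used (LRU) is one common replacement strategy; the card does not name a specific one, so LRU is chosen here only as an example. A minimal sketch, where `read_block` stands in for a hypothetical disk read and dirty-block write-back is ignored:

```python
from collections import OrderedDict


class LRUBuffer:
    """Holds up to `capacity` blocks; evicts the least recently used one."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block  # callable: block id -> block contents
        self.frames = OrderedDict()   # block id -> contents, oldest first

    def get(self, block_id):
        if block_id in self.frames:
            self.frames.move_to_end(block_id)  # mark as most recently used
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)    # evict least recently used
        data = self.read_block(block_id)
        self.frames[block_id] = data
        return data


reads = []


def fake_read(block_id):
    """Stand-in for a disk read; records which blocks were fetched."""
    reads.append(block_id)
    return f"block-{block_id}"


buf = LRUBuffer(2, fake_read)
for b in (1, 2, 1, 3):
    buf.get(b)
print(list(buf.frames), reads)
```

In this run, block 2 is evicted when block 3 arrives because block 1 was touched more recently.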