Module 12 - Data Storage Design Flashcards

1
Q

What are the 2 basic types of data storage?

A

Files
Databases

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What are files used for?

A

Optimizing a specific transaction:
Master files
look-up files
transaction files
audit files
history files

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are databases for?

A

Databases are for storing a collection of files

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a relational database?

A

A database that uses primary keys and foreign keys. It is a collection of tables.

SQL is used for CRUD

The DMBS ensures referential integrity i.e. all foreign key elements cannot be null.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is a multidimensional database?

A

Used in data warehousing (collecting data from a wide range of sources)

Allows facts to be aggregated quickly on various dimensions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is a NoSQL Database?

A

Think of firebase, collections -> documents

Wide column stores

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

File strengths and weaknesses:

A

Good performance
Good for short term data storage
Redundant data; data must be updated using programs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is an advantage and disadvantage of Legacy databases?

A

Very mature products
Outdated, not efficient, limited future

Ouch.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Relational database pros and cons

A

Leader in database market.
Uses SQL.
Lower level database with flexibility, the opposite of a no SQL database that is more abstracted.

Cannot handle complex data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Object databases pros and cons

A

Can handle complex data.
Limited market

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Multidimensional database pros and cons

A

Can provide business facts quickly.
Highly specialized use.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

NoSQL database pros and cons

A

Good for huge varied datasets.
New in the market
Specialized use

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Entities –> tables

A

Each row represents a single entity instance

Each column represents an attribute

All values in a column have the same data format.

Each column has a range of values knows as the attribute domain

The order of rows and columns do not matter

Each table must have a primary key

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a primary key?

A

An identifier composed of one or more attributes.

A table can have multiple candidates for a primary key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

1:M Relationship using PKs and FKs

A

Put the PK of the 1: in the table of the :M as a foreign key on the far left

You can do this because every instance of the many will refer to the one and only entity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a foreign key?

A

The primary key of an outside table to create a COMMON ATTRIBUTE (the FK).

17
Q

Referential Integrity

A

Every reference to an entity instance through an FK is technically valid, it can be either null or a valid value.

Null values should be avoided as much as possible

This can happen in relationships where there can be one or no entities.

18
Q

1:1 Relationship with PKs and FKs

A

We add the FK in the middle of the table where it makes sense

19
Q

M:N Relationship with PKs and FKs

A

Relationships are implemented through common attributes, what common attributes should we define for M:N?

We create an intersection or bridge entity, then we add the FKs to it on the far left.

I think this effectively creates two 1:M relationships

20
Q

Why should null values in a database be avoided?

A

Because they threaten data integrity, think of comp 1800 and missing images.

21
Q

What is the best way to optimize data storage for efficiency?

A

Normalization

22
Q

What is denormalization?

A

Adding back redundant attributes to avoid aggregation (combing two tables together to access values from multiple tables)

23
Q

What is indexing

A

Similar to a library catalogue, index keys POINT to table rows.

Indexing requires overhead, they need to be updated whenever data is inserted, deleted, or changed.