Database-Chapter 7 Flashcards
Normalization (17 cards)
Normalization
Process for evaluating and correcting table structures to minimize data redundancies thereby reducing the likelihood of data anomalies
Characteristics of a relational table
-A table is perceived as a 2D structure composed of rows and columns
-Each row(tuple) represents a single entity and must be distinct
-Each table column represents an attribute , and each column has a distinct name
-Each cell should contain an atomic value-single cell value
-All values in a column must conform to the same data format
The need for normalization
-The table structure appears to work because the report can be generated with ease
-Unfortunately, report may yield different results depending on what data anomaly has occurred
-Unnecessary repetition of data redundancy
-Ease of making spelling mistakes in multiple occurrences of the same data
Objectives of normalization
1)Each table should represent a single entity
2)No data item is to be unnecessarily stored in more than one table(redundancy)
3)All attributes in a table should be dependent on the PK
Normal forms
1NF:No repeating groups and PK identified
2NF: 1NF and no partial dependencies
3NF: 2NF and no transitive dependencies
Boyce-Codd NF: Every determinant is a candidate key
4NF: 3NF and no independent multivalued dependencies
Partial dependency
Attributes which are only dependent on part of the composite PK
Transitive dependency
Attribute is dependent on any other attribute except PK, they still yield data anomalies
Repeating group
Is an attribute ,or group of attributes, within a table that occurs with multiple values for a single occurrence of the nominated key attribute(s) for that table
Conversion to 1NF
Eliminate the repeating groups
Identify the PK
Identify all dependencies
Conversion to 2NF
Write each PK component on a separate line
Assign corresponding dependent attributes
Conversion to 3NF
Identify each new determinant
Identify the dependent attributes
Remove the dependent attributes from the transitive dependencies
Surrogate key
Is an artificial PK introduced by the designer with the purpose of simplifying the assignment of PKs to tables ,surrogate keys are numeric, they are automatically generated by the DBMS ,free of semantic content(they have no special meaning) and are usually hidden from the end users
Surrogate key considerations
When PK is considered to be unsuitable, designers use surrogate keys
Purpose of improving the design
Is to improve the databases’ ability to provide info and on enhancing its operational characteristics, it should also consider end-user demand for fast performance, hence why we denormalize
Improving the design
-Table structures are cleaned up to eliminate troublesome initial partial and transitive dependencies
-Normalization cannot, by itself, be relied on to make good designs
-It is valuable because its use helps eliminate data redundancies
-Issues to address in order to produce good normalized set of tables:
-Evaluate PK assignments
-Evaluate naming conventions
-Refine attribute atomicity
-Identify new attributes
-Identify new relationships
Refine PKs as required for data granularity(level of detail represented by the values stored in a table’s row. Lowest level of granularity is atomic data
-Maintain historical accuracy
-Evaluate using derived attributes
Denormalization
The number of database table expands if tables decomposed to conform to nomarlization requirements
Unnormalized tables in production database tend to suffer from these defects:
-Data updates are less efficient because programs that read and update tables must deal with larger tables
-Indexing is more cumbersome
-Unnormalized tables yield no simple strategies for creating virtual tables known as views
Tables are sometimes denormalized to yield less I/O which increases processing speed