Normal Forms and Functional Dependency1 Flashcards
What is redundancy?
having multiple copies of the same data in the database
What is the effect of redundancy?
Anomaly: inconsistencies that can arise due to data changes in a database with
insertion, deletion, and update
What are types of anomaly?
- Insertion Anomaly
- Deletion Anomaly
- Update Anomaly
What is Insertion Anomaly?
If there are lot of data redundancy.. then When the insertion of a data record is not possible without adding some additional
unrelated data to the record is called insertion anomaly
What is Deletion Anomaly?
When deletion of a data record results in losing some unrelated information that
was stored as part of the record that was deleted from a table
What is Update Anomaly?
When a data is changed, which could involve many records having to be changed,
leading to the possibility of some changes being made incorrectly.
What is the cause of redundancy?
Dependency ⇒ Redundancy ⇒ Anomaly
How to remove or atleast minimize redundancy?
Decompose (partition) the relation into smaller relations
What is a good decomposition?
Good Decomposition ⇒ Minimization of Dependency
Is every decomposition good?
No. It needs to preserve information, honour the dependencies, be efficient etc.
◦ Various schemes of normalization ensure good decomposition
◦ Normalization ⇒ Good Decomposition
What is lossy decomposition?
If we cannot reconstruct the original
relation from the decomposed part relations, it is called a lossy decomposition. Loss doesn’t necessarily mean missing data.. even if there is additional data.. the decomposition is said to be lossy.
What is lossless decomposition?
Lossless Join Decomposition is a decomposition of a relation R into relations R1, R2
such that if we perform natural join of two smaller relations it will return the original
relation
1. R1 ∪ R2 = R, R1 ∩ R2 6= φ
2. ∀r ∈ R,r1 = uR1(r),r2 = uR2(r)
3. r1 ./ r2 = r (join)
In other words by lossless decomposition it becomes feasible to reconstruct the relation
R from decomposed tables R1 and R2 by using Joins
What is First Normal Form (1NF)?
A domain is atomic if its elements are considered to be indivisible units
A relational schema R is in First Normal Form (INF) if
◦ the domains of all attributes of R are atomic
◦ the value of each attribute contains only a single value from that domain
NO Composite, multivalued attributes
If an attribute is multivalued.. then split into 2 relations (one to many relationship)
What does atomicity mean?
Atomicity is actually a property of how the elements of the domain are used
◦ Strings would normally be considered indivisible
◦ Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127
◦ If the first two characters are extracted to find the department, the domain
of roll numbers is NOT atomic
What is a functional dependency?
The functional dependency or FD
α → β
holds on R if and only if for any legal relations r(R), whenever any two tuples t1 and t2
of r agree on the attributes α, they also agree on the attributes β. That is,
t1[α] = t2[α] ⇒ t1[β] = t2[β]
In Short, if A →B means if A then B
When is a Functional Dependency called Trivial?
- A functional dependency is trivial if it is satisfied by all instances of a relation
◦ Example:
. ID, name → ID
. name → name - In general, α → β is trivial if β ⊆ α.
What is a Super Key?
K is a superkey for relation schema R if and only if K → R
That is , K is super key if it uniquely identifies a row in the relation.
What is a candidate key?
K is a candidate key for R if and only if
◦ K → R and
◦ for no α ⊂ K, α → R
That is, Candidate key is minimal super key.
Candidate key is subset of a Super Key
How to find the candidate keys in a relation R?
start with all the attributes of the relation as super key.. remove attributes one by one which can be derived from other.. (i.e, maintaining closure). then , check with each and every proper subset to see if any of them has closure of R. If not then we have a candidate key. else we take the new attirbute which has closure to R and check its proper subsets. After arriving at a candidate key. we check the FDs to see if we could derive an attribute in candidate key from another FD. If so, we replace that attribute alone and check again. We do this for all attributes until no more replacements could be made.. then we would have all the candidate keys of the relation
What are Armstrong’s Axioms, which help us dervie new functional dependencies from existing functional dependencies?
Armstrong’s Axioms:
◦ Reflexivity: if β ⊆ α, then α → β
◦ Augmentation: if α → β, then γα → γβ
◦ Transitivity: if α → β and β → γ, then α → γ
* These axioms can be repeatedly applied to generate new FDs and added to F
What does Reflexivity axiom of Armstrong’s axiom derive?
◦ Reflexivity: if β ⊆ α, then α → β
Eg: (Student,Dept) -> Student
(Student, Dept) -> Dept
How to compute closure of a set of Functional Dependencies - F+
To compute the closure of a set of functional dependencies F:
F+ ← F
repeat
for each functional dependency f in F+
apply reflexivity and augmentation rules on f
add the resulting functional dependencies to F+
for each pair of functional dependencies f1 and f2 in F+
if f1 and f2 can be combined using transitivity
then add the resulting functional dependency to F+
until F+ does not change any further
What are some of additional derived rules from Armstrong’s Axioms
◦ Union: if α → β holds and α → γ holds, then α → βγ holds
◦ Decomposition: if α → βγ holds, then α → β holds and α → γ holds
◦ Pseudotransitivity: if α → β holds and γβ → δ holds, then αγ → δ holds
How to test for a superkey?
To test if α is a superkey, we compute α
+, and check if α+ contains all attributes
of R