3. Relational Algebra & Query Optimization Flashcards
(5 cards)
What is Relational Algebra?
A formal language for manipulating relations, using operators that take relations as input and return relations as output
What are the Primitive Operators (e.g. set)?
◦ Union (∪): R ∪ S contains all tuples in R or S (duplicates removed). Operands must be union/type compatible (same heading).
◦ Intersection (∩): R ∩ S contains tuples in both R and S. Operands must be union/type compatible.
◦ Difference (−): R − S contains tuples in R but not in S. Operands must be union/type compatible. Difference is not commutative.
◦ Cartesian Product (×): R × S combines every tuple of R with every tuple of S. R and S don’t need to be compatible. Often used with other operators.
What are the Special Relational Operators?
◦ Restrict (σ): σ⟨condition⟩(R) returns tuples from R satisfying the Boolean condition. It’s a horizontal slice. Condition involves attributes, constants, comparison operators (<, ≤, >, ≥, =, ≠), and logical operators (∧, ∨, ¬). Applied per tuple. Commutative. Cascades can be merged with AND. Equivalent to SQL’s WHERE clause.
◦ Project (π): π{attributeset}(R) returns a relation with only the specified attributes. It’s a vertical slice. Duplicate tuples are removed. Not commutative. Equivalent to SQL’s SELECT list.
◦ Rename (ρ): ρS(R), ρ{b1,…}(R{a1,…}), or ρS{b1,…}(R{a1,…}) changes relation or attribute names. Useful for disambiguation or combining operations.
◦ Join (⨝): R ⨝⟨condition⟩ S combines tuples from R and S satisfying the condition. Derived from Cartesian Product and Restrict.
◦ Natural Join (⨝): R ⨝ S is an equijoin over all common attributes. Common attributes must have same name and compatible domains. Only one instance of common attributes appears in result. Can be dangerous if unrelated attributes share the same name.
◦ Outer Join: Retains tuples from one or both relations even if they don’t satisfy the join condition, padding with nulls. Types: LEFT, RIGHT, FULL. Contrasts with INNER join (default).
What is a Semijoin (⋉)?
R ⋉⟨condition⟩ S returns tuples from R only that satisfy the join condition. Equivalent to π{R}(R ⨝ S). Useful in distributed databases to reduce data transfer by sending only matching join attribute values from one side
What is query optimization?
Relational algebra expressions can be represented as trees. Restructuring the tree (reordering operations) can improve performance by producing smaller intermediate results. Restrict operations are generally more efficient when applied before join operations because they reduce the number of tuples involved in the join.