What is database normalization?
Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.
What is a CROSS JOIN?
This join performs a Cartesian Product of the two input tables. In other words, it performs a multiplication between the tables yielding a row for each combination of rows from both sides. If you have m rows in table T1 and n rows in table T2, the result of T1 CROSS JOIN T2 is m * n rows. CROSS JOIN doesn’t use the ON clause.
In a self join, is table aliasing mandatory?
Yes. If you don’t assign different aliases to the two instances of the table, you end up with an invalid result because there are duplicate column names. By aliasing the tables, you can refer to the columns in an unambiguous way, e.g. D.n vs S.n.
What is the older (equivalent) syntax for CROSS JOIN?
T1 CROSS JOIN T2 is equivalent to FROM T1, T2. Standard SQL and T-SQL support both. However, it is recommended to stick to the newer syntax.
What is an INNER JOIN?
The inner join returns only matching rows for which the predicate evaluates to true. Rows from either side for which the predicate evaluates to false or unknown are discarded.
For INNER JOINs, what is the difference between the ON and the WHERE clauses, and does it matter if you specify your predicate in one or the other?
For inner joins, it does not matter. Since the WHERE is evaluated right after the FROM, conceptually, it is equivalent to concatenating the predicates with an AND operator. However, it’s more intuitive to match columns from both sides in the ON clause and predicates that filter columns from only one side in the WHERE clause.
What is the older (equivalent) syntax for INNER JOIN?
Both SQL and T-SQL support the FROM T1, T2 comma syntax for INNER JOINs and then put all predicates in the WHERE clause. If you don’t include the WHERE clause with the join predicate, you could end up with an unintentional CROSS JOIN.
When using the newer syntax, is an INNER JOIN valid syntactically without an ON clause?
What happens when you leave off the “INNER” portion of an INNER JOIN?
Since INNER JOIN is the most commonly used type of join, the standard decided to make it the default in case you specify just the JOIN keyword. T1 JOIN T2 is equivalent to T1 INNER JOIN T2.
What is an outer join?
With outer joins, you can request to preserve all rows from one or both sides of the join regardless if there are matching rows in the other side.
What is a LEFT OUTER JOIN?
A LEFT OUTER JOIN (or LEFT JOIN for short) preserves the left table. The join returns what an inner join would normally return, i.e. matches (inner rows). In addition, the join also returns rows from the left table that had no matches in the right table (outer rows) with NULLs used as placeholders in the right side.
For outer joins, what is the difference between the ON and the WHERE clauses, and does it matter if you specify your predicate in one or the other?
(See pg 109) For outer joins, the ON and WHERE clauses play different roles and they are not interchangeable. The WHERE clause still plays a simple filtering role; however, the ON clause is more a matching role. A row in the preserved side will be returned whether the ON predicate finds a match for it or not. The ON is designed to find matches on the non preserved side. In other words, ON is not final with respect to the preserved side of the join. WHERE is final.
What is a RIGHT OUTER JOIN?
A RIGHT OUTER JOIN (or RIGHT JOIN for short) is similar to a LEFT OUTER JOIN except it preserves the right table and returns NULLs for non matches on the left.
What is a FULL OUTER JOIN?
A FULL OUTER JOIN or FULL JOIN preserves both sides. It returns the inner rows that are normally returned from an INNER JOIN plus rows from the left that don’t have matches in the right plus rows from the right that don’t have matches in the left- nulls as placeholders on both sides where matches are not found.
How are multi joins handles in T-SQL?
A join in T-SQL takes place conceptually between two tables at a time. A multi-join query evaluates the joins conceptually from left to right. The result of one join is used as the left input to the next join. For INNER JOINs and CROSS JOINs, order is not important; however when outer joins are involved, you need to be careful.
How can you handle the logical bugs that come up with multi-joins?
You can separate some of the joins to their own independent logical phase using parenthesis: SELECT .. FROM Production.Suppliers AS S LEFT OUTER JOIN (Production.Products AS P INNER JOIN Production.Categories AS C ON C.categoryid = P.category.id) ON S.suppplierid = P.supplierid.
What are the shortened forms for INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN?
JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
What is a self-contained subquery?
A subquery that has no dependency on the outer query. If you want, you can highlight the inner query and run it independently.
What are the different types of results a subquery can return?
A subquery can return a single value, multiple values, or even an entire table result.
What happens when a scalar subquery returns more than one value?
The code fails at runtime.
What happens when a scalar subquery returns an empty set?
The empty set is converted to NULL.
What’s an example of how a scalar subquery can be used?
It can be used where a single value expression is expected like in a one side comparison. WHERE unitprice = (SELECT MIN(unitprice) FROM Production.Products);
What’s an example of how a multi-valued subquery can be used?
It can be used where a multi-valued result is expected such as when using the IN or NOT IN predicate, e.g. WHERE supplierid IN (SELECT supplierid FROM Production.Suppliers WHERE country=’Japan’)
What are the two types of subqueries?
Self contained and correlated.
What is a correlated subquery?
A subquery where the inner query has a reference to a column from the table in the outer query. Unlike the self-contained subquery, you can’t simply highlight the inner portion and run it independently.
What is the EXISTS predicate?
The EXISTS predicate accepts a subquery as input and returns true when the subquery returns at least one row and false otherwise. Because it only returns true or false, the SQL Server Query Optimizer ignores the SELECT list of the subquery. Whatever you specify there will not affect optimization choices like index selection.
What is a table expression?
A table expression is a named query. It has three main parts: You write an (1) inner query, (2) name it, and (3) query it from an outer query. T-SQL supports 4 forms of table expressions (1) derived tables, (2) common table expressions (CTEs), (3) views, (4) inline table-valued functions.
What is a derived table?
A named table expression that is visible only to the statement that defines it. A derived table’s inner query is defined in parenthesis in the FROM clause of the outer query. The name is specified after the parenthesis. It has the form FROM () AS Name;
What is a common table expression (CTE)?
A named table expression that is visible only to the statement that defines it. Has the following form: WITH AS () ;
What is a view?
A reusable table expression whose definition is stored as an object in the database. Since it’s an object in the database, you can control permissions. Does not accept parameters.
What is an inline table-valued function?
A reusable table expression whose definition is stored as an object in the database. Since it’s an object in the database, you can control permissions. Accepts parameters. CREATE FUNCTION HR.GetManagers(@empid AS INT) RETURNS TABLE AS RETURN … GO
How is a table expression supposed to be relational?
The inner query needs to be relational. All columns returned by the inner query must have names (use aliases if the column is the result of an expression), the names must be unique, and the inner query is not allowed to have an ORDER BY clause. *Only TOP/OFFSET-FETCH options in the inner query are allowed to have an ORDER BY, but that’s functional and not for presentation.
Do table expressions have any additional performance side to them?
No. SQL Server un-nests the table’s inner query and therefore interacts with the underlying tables directly. It does not somehow persist the table expression’s result in an internal work table and then interact with that work table.
What column aliasing options are available to you when working with derived tables?
Inline and External. With inline, you specify the column alias as part of the expression as in AS alias. With external, you name all target columns right after the derived table’s name as in FROM (…) AS D(rownum, categoryid, productid, …)
What are some of the problems with derived tables?
(1) It’s difficult to refer to one derived table from another. In such as case, you end up nesting the derived tables. (2) The name assigned to a derived table is not visible to other elements that appear in the same logical query processing phase. As a result, you can’t join multiple instances of the same derived table.
Can multiple CTE’s be defined in the same statement?
Yes. Simply separate them by commas. Each CTE can refer to the previously defined CTE and the outer query can refer to all of them, e.g. WITH C1 AS (…), C2 AS (SELECT FROM C1) SELECT FROM C2…
Can multiple instances of the same CTE be joined?
Yes. Because the CTE name is assigned before the start of the outer query, you can refer to multiple instances of the same CTE name.
Can CTE’s be recursive?
Yes. The body of the recursive CTE has two or more queries - usually separated by a UNION ALL operator: the anchor member and the recursive member which has a reference to the CTE’s name. The recursive member is invoked repeatedly until an empty result set is returned.
What is the APPLY operator?
Used to apply a table expression given to it as the right input to each row from a table expression given to it as the left input. The right table expression can be correlated to the left table. Can replace the use of cursors in some cases. There are two forms of the APPLY operator: (1) CROSS and (2) OUTER.
What is the CROSS APPLY operator?
Operates on left and right table expressions. The right table expression can be correlated to the left table. The right table expression is applied to each row from the left input. If the right table expression returns an empty set for a left row, the left row isn’t returned.
What is the OUTER APPLY operator?
The OUTER APPLY operator does what the CROSS APPLY operator does but also includes the result rows from the left side that get an empty set back from the right side. NULLs are used as placeholders for the result columns from the right side. In other words, the OUTER APPLY preserves the left side.
What is the difference between the APPLY and JOIN operators?
With a JOIN operator, both inputs represent static relations. With APPLY, the left side is a static relation, but the right side can be a table expression with correlations to elements from the left table.
What is the difference between CROSS APPLY and CROSS JOIN when there isn’t any correlation between the right and left tables?
There isn’t any difference when there’s no correlation between the left and right tables. CROSS APPLY applies all rows from T2 to each row from T1.
What are the set operators that T-SQL supports?
(1) UNION, (2) INTERSECT, (3) EXCEPT. It also supports one multi-set operator: UNION ALL.
What are the general guidelines that accompany set operators?
(1) Because complete rows are matched between result sets, the number of columns in the queries has to be the same and the column types of corresponding columns need to be compatible (or implicitly convertible). (2) Set operators consider two NULLs as equal for comparison purposes. (3) The individual queries are not allowed to have ORDER BY clauses. (4) The ORDER BY can be added to the result of the set operator. (5) The column names of result columns are determined by the first query.
What is the UNION operator?
The UNION set operator unifies the results of the two input queries. UNION has an implied DISTINCT property meaning that it does not return duplicate rows.
What is the UNION ALL operator?
UNION ALL unifies the results of two input queries similar to UNION, but it doesn’t try to eliminate duplicates.
If the sets you’re unifying are disjoint and there’s no potential for duplicates, which set operator should you use?
UNION and UNION ALL will return the same result; however, UNION ALL will yield better performance because it doesn’t try to eliminate duplicates. And we don’t need to eliminate duplicates in this case.
What is the INTERSECT operator?
The INTERSECT operator returns only distinct rows that are common to both sets. In other words, if a row appears at least once in the first set and at least once in ht second set, it will appear once in the result of the INTERSECT operator.
What is the EXCEPT operator?
The EXCEPT operator performs set difference. It returns distinct rows that appear in the first query but not the second. In other words, if a row appears at least once in the first query and zero times in the second, it’s returned once in the output.
For which set operators does order of the input queries matter?
With UNION and INTERSECT, the order of the input queries doesn’t matter. However, with EXCEPT, there’s different meaning to EXCEPT vs EXCEPT .
What is the order of precedence for set operators?
INTERSECT precedes UNION and EXCEPT. UNION and EXCEPT are considered equal. You can always force precedence by using parenthesis.
In what order is UNION INTERSECT evaluated?
First, the intersection between query 2 and query 3 and then a union between that result and query 1.
Which set operators remove duplicates from the result?
UNION, INTERSECT, and EXCEPT.