SQL Intro Flashcards
(31 cards)
What are the BIG SIX statements and the correct order?
- SELECT
- FROM
- WHERE
- GROUP BY
- HAVING
- ORDER BY
What command to select the database to work with?
use the “USE” command/statement
How do you add comments to the SQL script?
for single line comments use “–” or “#”
for multi line comment use “/*” at the beginning and end of comments
What are aliases and how do you use them?
an aliases allows you to assign a name to a custom column.
to use: count(customer_id) AS customer_count
What is the SELECT statement?
-this is the first statement and is required
-this is where you identify the columns to pull into your results
-separate columns with a comma
What is the FROM statement?
-this is the second statement and is required
-this is where you identify the table your query will select data from
-NOTE: you will need to use JOIN to use multiple tables
What is SELECT DISTINCT?
this is used, in place of SELECT, to return the distinct/unique values in those columns
**if you include mutiple columns, your result set will return all distinct combinations of values across those columns
What are the aggregate functions?
used with the GROUP BY function to provide group level summaries
Count(), Count Distinct(), MIN(), MAX(), AVG(), and SUM()
Note for Count() and Count Distinct() NULL values get skipped. Need to add Count(*) to count the empty/null values
What is the WHERE statement?
-this is an option statement that comes after FROM
-this statement specifies criteria for filtering the records of your result set
- you can use any of the logical operators to filter the results: = (equal), <> (not equal), > (greater than), < (less than), >= (greater than or equal to), <= (less than or equal to), BETWEEN, LIKE, and IN()
-if using a non numerical value such as a date or name, need to wrap that date in single quotes
-example code: WHERE category = ‘Sci-Fi’
What are the logical conditions you can use in your WHERE statement?
Yes, you can use the following:
-WHERE & AND
-WHERE & OR
-WHERE & IN
-WHERE & LIKE
this goes after the FROM statement
WHERE & AND. What is it and example code?
use AND to return results that satisfy both criteria
WHERE amount = 0.99
AND payment_date > ‘2006-01-01’
WHERE & OR. What is it and example code?
use OR to return records which satisfy any criteria
WHERE amount = 0.99
OR amount = 0.5
OR amount = 0.25
WHERE & IN. What is it and example code?
use this instead of OR to simplify the code. this will save time and make the code easier to understand
OR Logic:
WHERE amount > 5
OR customer_id = 10
OR customer_id = 15
OR customer_id = 25
IN:
WHERE amount > 5
OR customer_id IN (10,15,25)
this results in the same results!
WHERE & LIKE. What is it and example code?
LIKE allows you to use pattern matching instead of exact matching
You can use two wildcards:
% - multiple characters (ie “%erry” will return blueberry, Burberry, Jerry, Terry, etc)
- exactly one character (ie “_erry” will return Terry, Jerry, etc)
WHERE name LIKE ‘%erry’
IMPORTANT: with “_” you can specify the number of characters that should be a wildcard. for example, you can write “____ton” (4 underscores), and this will search all records for value/text ending in “ton” with 4 characters in front of it.
NOTE: Capitalization MATTERS!!
WHERE & NOT LIKE. What is it and example code?
NOT LIKE is used to filter out records where values match a pattern you provide. It uses the same wildcards and capitalization rules as LIKE
What is GROUP BY?
it is an optional statement that specifies how to group
allows us to perform segment analysis on the data
if you use an aggregate function in the SELECT statement, GROUP BY is required
example code:
GROUP BY
rating
Can you use multiple GROUP BYs?
Yes, you can. This allows you to create groups and sub-groups in your result set.
Just like specifying multiple multiple row or column labels in a Pivot Table
example code:
GROUP BY
rating,
rental_duration,
replacement_cost
What is HAVING and what is an example?
HAVING is an option statement, you can use with GROUP BY to limit your result set to groups which satisfy certain logical criteria
In other words, HAVING is a great way to limit your results to the most important groups for your business
this statement comes after GROUP BY but before ORDER BY
you use the same aggregate functions as WHERE
What is ORDER BY and what is an example?
this is an optional statement, and is the last statement of the big six.
ORDER BY specifies the order in which your query results are displayed
if you want the list to be descending you need to write DESC, if not, the default setting is ascending
Example Code:
FROM payment
ORDER BY amount DESC
Note: you can use ORDER BY with multiple criteria. ie.
FROM payment
ORDER BY amount DESC, created_date DESC
**the list will be sorted on amount first, then created_date
What is the LIMIT statement
this statement is written at the end. by default mysql will limit the list to a 1000 records but if you use the limit statement you can set the total records returned.
Example code:
LIMIT 5000
this will return 5000 records
What is the CASE statement?
the CASE statement allow you to process a series of IF/THEN logical operators in a specific order.
it goes in the SELECT section
the logic is WHEN XX = YY, THEN ‘comment’
need to add END AS “columnName” at the end of the string
include an ELSE condition at the end as a catch all and write something like “Oops…check logic”
CASE
WHEN xx=xx, THEN ‘yy’
WHEN xx IN (1,3,5) THEN ‘yy’
ELSE ‘uh oh…check logic’
END AS ‘columnName’
Using COUNT & CASE to pivot data (advanced)
excel’s ability to pivot to columns can be replicated in SQL
when using CASE pivoting, we use COUNT() and only count records that match certain criteria
need to use GROUP BY
Example code:
SELECT
film_id,
COUNT (CASE WHEN store_id = 1THEN
inventory_id ELSE NULL END) AS store_1_copies,
COUNT (CASE WHEN store_id = 2THEN
inventory_id ELSE NULL END) AS store_2_copies
COUNT (inventory_id) AS total_copies
GROUP BY
film_id
ORDER BY
film_id
What are two key concepts to structured data??
Normalization - the process of structuring tables and columns in a relational database to minimize redundancy and preserve data integrity.
-Benefits include: eliminating duplicate data, reducing errors and anomalies -In practice normalization involves breaking out data from a single merged table into multiple merged tables
Cardinality - refers to the uniqueness in values in a column (or attribute) of a table, and is commonly used to describe how two tables relate (one to one, one to many, many to many
-Primary keys are unique and cannot repeat.
there is only one instance of each primary key
value in a column
-Foreign keys are non unique. They can repeat so
there can be many instances of each foreign key
value in a column
-We can create a one to many relationship by
connecting a foreign key in one table to a
primary key in another
What are the most common JOIN types?
- INNER JOIN (will use this or LEFT JOIN the most)
- LEFT JOIN (will use this or INNER JOIN the most)
- RIGHT JOIN (rarely used)
- FULL OUTER JOIN
- UNION
*these are all used directly after the FROM statement
**remember, when using JOINs need to specify the table in front of the column (ie table.column)
***all require the use of ON to specify the link between the two tables
table1.column_1=table2.column_1