SQL Intermediate Flashcards

(66 cards)

1
Q

COUNT

A

Returns the count of only non-null rows

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Write a query to count the number of non-null rows in the low column.

A

SELECT COUNT(low) AS low FROM tutorial.aapl_historical_stock_price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Write a query that determines counts of every single column.

A

SELECT COUNT(year) AS year,

COUNT(month) AS month,

COUNT(open) AS open,

COUNT(high) AS high,

COUNT(low) AS low,

COUNT(close) AS close,

COUNT(volume) AS volume

FROM tutorial.aapl_historical_stock_price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Write a query to calculate the average opening price

A

SELECT (SUM(open)/COUNT(open)) as avg_open
FROM tutorial.aapl_historical_stock_price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

MIN and MAX

A

Depending on the column type, MIN will return the lowest number, earliest date, or non-numerical value as close alphabetically to “A” as possible. As you might suspect, MAX does the opposite—it returns the highest number, the latest date, or the non-numerical value closest alphabetically to “Z.”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

select the minimum and maximum volumes of stock

A

SELECT MIN(volume) AS min_volume, MAX(volume) AS max_volume FROM tutorial.aapl_historical_stock_price

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What was Apple’s lowest stock price (at the time of this data collection)?

A
SELECT MIN(low)
FROM tutorial.aapl\_historical\_stock\_price
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What was the highest single-day increase in Apple’s share value?

A
SELECT MAX(close-open)
FROM tutorial.aapl\_historical\_stock\_price
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

AVG

A

calculates the average of a selected group of values. It’s very useful, but has some limitations. First, it can only be used on numerical columns. Second, it ignores nulls completely.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

SELECT AVG(high) FROM tutorial.aapl_historical_stock_price WHERE high IS NOT NULL

VS

SELECT AVG(high) FROM tutorial.aapl_historical_stock_price

A

No difference as AVG ignores nulls

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Write a query that calculates the average daily trade volume for Apple stock.

A
SELECT AVG(volume)
FROM tutorial.aapl\_historical\_stock\_price
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

GROUP BY

A

GROUP BY allows you to separate data into groups, which can be aggregated independently of one another.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Calculate the total number of shares traded each month for each year. Order your results chronologically.

A

SELECT year, month, SUM(volume) as total_num_shares
FROM tutorial.aapl_historical_stock_price
GROUP BY 1, 2
ORDER BY 1, 2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Using GROUP BY with ORDER BY

A

The order of column names in your GROUP BY clause doesn’t matter—the results will be the same regardless. If you want to control how the aggregations are grouped together, use ORDER BY.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

SELECT year, month, COUNT(*) AS count FROM tutorial.aapl_historical_stock_price GROUP BY year, month ORDER BY month, year

VS

SELECT year, month, COUNT(*) AS count FROM tutorial.aapl_historical_stock_price GROUP BY year, month ORDER BY year, month

A

2000 1 20
2001 1 21
2002 1 21
2003 1 21
2004 1 20

vs

2000 1 20
2000 2 20
2000 3 23
2000 4 19

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Write a query to calculate the average daily price change in Apple stock, grouped by year.

A

SELECT year,
AVG(close-open) AS avg_change
FROM tutorial.aapl_historical_stock_price
GROUP BY 1
ORDER BY 1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Write a query that calculates the lowest and highest prices that Apple stock achieved each month.

A

SELECT year, month,
MAX(high) AS highest,
MIN(low) AS lowest
FROM tutorial.aapl_historical_stock_price
GROUP BY 1, 2
ORDER BY 1, 2

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

find every month during which AAPL stock where the max was over $400/share.

A

SELECT year, month, MAX(high) AS month_high FROM tutorial.aapl_historical_stock_price GROUP BY year, month HAVING MAX(high) > 400 ORDER BY year, month

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

ORDER of SQL operations

A

SELECT

FROM

WHERE

GROUP BY

HAVING

ORDER BY

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

CASE

A

The CASE statement is SQL’s way of handling if/then logic. The CASE statement is followed by at least one pair of WHEN and THEN statements—SQL’s equivalent of IF/THEN in Excel. Because of this pairing, you might be tempted to call this SQL CASE WHEN, but CASE is the accepted term.

Every CASE statement must end with the END statement. The ELSE statement is optional, and provides a way to capture values not specified in the WHEN/THEN statements.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

write a query that adds a new column called is_a_senior that returns yes if the player is a senior and no if not

A

SELECT player_name, year,

CASE

WHEN year = ‘SR’ THEN ‘yes’

ELSE

‘no’

END AS is_a_senior

FROM benn.college_football_players

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Write a query that includes a column that is flagged “yes” when a player is from California, and sort the results with those players first.

A

SELECT player_name,
CASE
WHEN state = ‘CA’ THEN ‘yes’
ELSE ‘no’
END AS from_ca
FROM benn.college_football_players
ORDER BY 2 DESC

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

sort players into three weight classes:

over 250

201 - 250

176 - 200

175 and under

A

SELECT player_name, weight,

CASE WHEN weight > 250 THEN ‘over 250’

WHEN weight > 200 AND weight <= 250 THEN ‘201-250’

WHEN weight > 175 AND weight <= 200 THEN ‘176-200’

ELSE ‘175 or under’ END AS weight_group

FROM benn.college_football_players

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Write a query that includes players’ names and a column that classifies them into four categories based on height. Keep in mind that the answer we provide is only one of many possible answers, since you could divide players’ heights in many ways.

A

SELECT player_name, height,

CASE

WHEN height > 74 THEN ‘over 74’

WHEN height > 72 AND height <= 74 THEN ‘73-74’

WHEN height > 70 AND height <= 72 THEN ‘71-72’

ELSE ‘under 70’

END AS height_group

FROM benn.college_football_players

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Write a query that selects all columns from benn.college\_football\_players and adds an additional column that displays the player's name if that player is a junior or senior.
SELECT \*, CASE WHEN year = 'JR' OR year = 'SR' THEN player\_name ELSE 'not\_jr\_sr' END AS jr\_sr FROM benn.college\_football\_players
26
count the number of students by their year, INCLUDING null values "hint: case"
SELECT CASE WHEN year = 'FR' THEN 'FR' WHEN year = 'SO' THEN 'SO' WHEN year = 'JR' THEN 'JR' WHEN year = 'SR' THEN 'SR' ELSE 'No Year Data' END AS year\_group, COUNT(1) AS count FROM benn.college\_football\_players GROUP BY 1 normally count ignores nulls, by making them a case, you can count them
27
Write a query that counts the number of 300lb+ players for each of the following regions: West Coast (CA, OR, WA), Texas, and Other (Everywhere else).
SELECT CASE WHEN state IN ('CA', 'OR', 'WA') THEN 'west\_coast' WHEN state = 'TX' THEN 'texas' ELSE 'other' END AS region, COUNT(1) as players --count the case to get all the nulls FROM benn.college\_football\_players WHERE weight \>= 300 GROUP BY 1
28
Write a query that calculates the combined weight of all underclass players (FR/SO) in California as well as the combined weight of all upperclass players (JR/SR) in California.
SELECT CASE WHEN year IN ('FR', 'SO') THEN 'fr\_so' WHEN year IN ('JR', 'SR') THEN 'jr\_sr' ELSE NULL END AS year\_in\_school, SUM(weight) as total\_weight --count the case to get all the nulls FROM benn.college\_football\_players WHERE state = 'CA' GROUP BY 1
29
SELECT CASE WHEN year = 'FR' THEN 'FR' WHEN year = 'SO' THEN 'SO' WHEN year = 'JR' THEN 'JR' WHEN year = 'SR' THEN 'SR' ELSE 'No Year Data' END AS year\_group, COUNT(1) AS count FROM benn.college\_football\_players GROUP BY 1 pivot this not using a pivot command
``` SELECT COUNT(CASE WHEN year = 'FR' THEN 1 ELSE NULL END) AS fr\_count, COUNT(CASE WHEN year = 'SO' THEN 1 ELSE NULL END) AS so\_count, COUNT(CASE WHEN year = 'JR' THEN 1 ELSE NULL END) AS jr\_count, COUNT(CASE WHEN year = 'SR' THEN 1 ELSE NULL END) AS sr\_count FROM benn.college\_football\_players ``` Output: fr\_count so\_count jr\_count sr\_count 9665 5881 5665 5087
30
Write a query that displays the number of players in each state, with FR, SO, JR, and SR players in separate columns and another column for the total number of players. Order results such that states with the most players come first.
SELECT state, COUNT(CASE WHEN year = 'FR' THEN 1 ELSE NULL END) AS fr\_count, COUNT(CASE WHEN year = 'SO' THEN 1 ELSE NULL END) AS so\_count, COUNT(CASE WHEN year = 'JR' THEN 1 ELSE NULL END) AS jr\_count, COUNT(CASE WHEN year = 'SR' THEN 1 ELSE NULL END) AS sr\_count, count(\*) as total\_players FROM benn.college\_football\_players GROUP BY state ORDER BY total\_players DESC
31
Write a query that shows the number of players at schools with school names that start with A through M, and the number at schools with names starting with N - Z.
SELECT CASE WHEN school\_name \< 'n' THEN 'A-M' WHEN school\_name \>= 'n' THEN 'N-Z' ELSE NULL END AS school\_name\_group, COUNT(1) AS players FROM benn.college\_football\_players GROUP BY 1
32
For each school give the number of students whose names start with A-m and n-z
SELECT full\_school\_name, COUNT(CASE WHEN player\_name BETWEEN 'A' AND 'M' THEN 1 ELSE NULL END) AS a\_m, COUNT(CASE WHEN player\_name BETWEEN 'N' AND 'Z' THEN 1 ELSE NULL END) AS n\_z FROM benn.college\_football\_players GROUP BY 1
33
SELECT DISTINCT year, month FROM tutorial.aapl\_historical\_stock\_price which values will this return unique ones for
both year and month
34
count the unique values in the month column.
SELECT COUNT(DISTINCT month) AS unique\_months FROM tutorial.aapl\_historical\_stock\_price
35
taking average trade volumes by month
SELECT month, AVG(volume) AS avg\_trade\_volume FROM tutorial.aapl\_historical\_stock\_price GROUP BY month ORDER BY 2 DESC
36
Write a query that counts the number of unique values in the month column for each year.
SELECT year, COUNT(DISTINCT month) AS unique\_months FROM tutorial.aapl\_historical\_stock\_price GROUP BY 1 ORDER BY 1
37
Write a query that separately counts the number of unique values in the month column and the number of unique values in the `year` column.
SELECT COUNT(DISTINCT year) AS unique\_years, COUNT(DISTINCT month) AS unique\_months FROM tutorial.aapl\_historical\_stock\_price
38
Write a query that selects the school name, player name, position, and weight for every player in Georgia, ordered by weight (heaviest to lightest). Be sure to make an alias for the table, and to reference all column names in relation to the alias. get school\_name from teams
SELECT teams.school\_name, players.player\_name, players.position, players.weight FROM benn.college\_football\_players players JOIN benn.college\_football\_teams teams ON players.school\_name = teams.school\_name WHERE players.state = 'GA' ORDER BY players.weight DESC
39
INNER JOIN
nner joins eliminate rows from both tables that do not satisfy the join condition set forth in the ON statement. In mathematical terms, an inner join is the intersection of the two tables. Therefore, if a player goes to a school that isn't in the teams table, that player won't be included in the result from an inner join. Similarly, if there are schools in the teams table that don't match to any schools in the players table, those rows won't be included in the results either.
40
Inner join with two columns of the same name
The results can only support one column with a given name—when you include 2 columns of the same name, the results will simply show the exact same result set for both columns even if the two columns should contain different data. You can avoid this by naming the columns individually.
41
Write a query that displays player names, school names and conferences for schools in the "FBS (Division I-A Teams)" division.
SELECT players.player\_name, players.school\_name, teams.conference FROM benn.college\_football\_players players INNER JOIN benn.college\_football\_teams teams ON teams.school\_name = players.school\_name WHERE teams.division = 'FBS (Division I-A Teams)'
42
Outer joins
unmatched rows in one or both tables can be returned.
43
Left Join
Keep everything from the left database, the first one chosen in the on statement
44
Right Join
45
get schema from postgresql
\d table\_name or \d+ table\_name SELECT table\_name, column\_name, data\_type FROM information\_schema.columns WHERE table\_name = 'city';
46
loginto postgre sgl
psql -U postgres -W postgres is the name of the user for a specific database psql -d database -U user -W
47
switch to the table you want
postgres=# \c dvdrental Password for user postgres: You are now connected to database "dvdrental" as user "postgres".
48
list all available databases in postgre
\l
49
list all available tables
\dt
50
Write a query that performs an inner join between the tutorial.crunchbase\_acquisitions table and the tutorial.crunchbase\_companies table, but instead of listing individual rows, count the number of non-null permalink rows in each table.
SELECT COUNT(companies.permalink) AS companies\_rowcount, COUNT(acquisitions.company\_permalink) AS acquisitions\_rowcount FROM tutorial.crunchbase\_companies companies JOIN tutorial.crunchbase\_acquisitions acquisitions ON companies.permalink = acquisitions.company\_permalink
51
SELECT COUNT(companies.permalink), COUNT(acquisitions.company\_permalink) FROM tutorial.crunchbase\_companies companies LEFT JOIN tutorial.crunchbase\_acquisitions acquisitions ON companies.permalink = acquisitions.company\_permalink VS -- Returns first 100 rows from tutorial.crunchbase\_companies SELECT COUNT(companies.permalink), COUNT(acquisitions.company\_permalink) FROM tutorial.crunchbase\_companies companies JOIN tutorial.crunchbase\_acquisitions acquisitions ON companies.permalink = acquisitions.company\_permalink
The first will return counts of all values in the left table and only counts of rows in the right table that are equal to the left. The second will return the counts of only rows that are in both table
52
Count the number of unique companies (don't double-count companies) and unique acquired companies by state. Do not include results for which there is no state data, and order by the number of acquired companies from highest to lowest.
SELECT companies.state\_code, COUNT(DISTINCT companies.permalink) AS unique\_companies, COUNT(DISTINCT acquisitions.company\_permalink) AS unique\_companies\_acquired FROM tutorial.crunchbase\_companies companies LEFT JOIN tutorial.crunchbase\_acquisitions acquisitions ON companies.permalink = acquisitions.company\_permalink WHERE companies.state\_code IS NOT NULL GROUP BY 1 ORDER BY 3 DESC
53
SELECT companies.permalink AS companies\_permalink, companies.name AS companies\_name, acquisitions.company\_permalink AS acquisitions\_permalink, acquisitions.acquired\_at AS acquired\_date FROM tutorial.crunchbase\_companies companies LEFT JOIN tutorial.crunchbase\_acquisitions acquisitions ON companies.permalink = acquisitions.company\_permalink AND acquisitions.company\_permalink != '/company/1000memories' ORDER BY 1 VS SELECT companies.permalink AS companies\_permalink, companies.name AS companies\_name, acquisitions.company\_permalink AS acquisitions\_permalink, acquisitions.acquired\_at AS acquired\_date FROM tutorial.crunchbase\_companies companies LEFT JOIN tutorial.crunchbase\_acquisitions acquisitions ON companies.permalink = acquisitions.company\_permalink WHERE acquisitions.company\_permalink != '/company/1000memories' OR acquisitions.company\_permalink IS NULL ORDER BY 1
Compare the following query to the previous one and you will see that everything in the tutorial.crunchbase\_acquisitions table was joined on except for the row for which company\_permalink is '/company/1000memories'. It does return the row from the companies table with that permalink You can see that the 1000memories line is not returned (it would have been between the two highlighted lines below). Also note that filtering in the WHERE clause can also filter null values, so we added an extra line to make sure to include the nulls.
54
Write a query that shows a company's name, "status" (found in the Companies table), and the number of unique investors in that company. Order by the number of investors from most to fewest. Limit to only companies in the state of New York.
SELECT companies.name, companies.status, COUNT(DISTINCT investments.investor\_name) FROM tutorial.crunchbase\_companies companies LEFT JOIN tutorial.crunchbase\_investments investments ON companies.permalink = investments.company\_permalink WHERE companies.state\_code = 'NY' GROUP BY 1,2 ORDER BY 3 DESC
55
Write a query that lists investors based on the number of companies in which they are invested. Include a row for companies with no investor, and order from most companies to least.
56
write a query using a full join that counts the number of rows in the acquisitions table, the companies table and both tables.
SELECT COUNT(CASE WHEN companies.permalink IS NOT NULL AND acquisitions.company\_permalink IS NULL THEN companies.permalink ELSE NULL END) AS companies\_only, COUNT(CASE WHEN companies.permalink IS NOT NULL AND acquisitions.company\_permalink IS NOT NULL THEN companies.permalink ELSE NULL END) AS both\_tables, COUNT(CASE WHEN companies.permalink IS NULL AND acquisitions.company\_permalink IS NOT NULL THEN acquisitions.company\_permalink ELSE NULL END) AS acquisitions\_only FROM tutorial.crunchbase\_companies companies FULL JOIN tutorial.crunchbase\_acquisitions acquisitions ON companies.permalink = acquisitions.company\_permalink
57
Write a query that joins tutorial.crunchbase\_companies and tutorial.crunchbase\_investments\_part1 using a FULL JOIN. Count up the number of rows that are matched/unmatched as in the example above.
SELECT COUNT(CASE WHEN companies.permalink IS NOT NULL AND investments.company\_permalink ISNULL THEN 1 ELSE NULL END)AS company\_only, COUNT(CASE WHEN companies.permalink ISNULL AND investments.company\_permalink IS NOT NULL THEN 1 ELSE NULL END)AS investment\_only, COUNT(CASE WHEN companies.permalink IS NOT NULL AND investments.company\_permalink IS NOT NULL THEN 1 ELSE NULL END)AS both\_tables FROM tutorial.crunchbase\_companies companies FULL JOIN tutorial.crunchbase\_investments\_part1 investments ON companies.permalink = investments.company\_permalink
58
59
What would this result in: SELECT \* FROM tutorial.crunchbase\_investments\_part1 UNION SELECT \* FROM tutorial.crunchbase\_investments\_part2
This would stack the two together ontop of each other. More specifically, when you use UNION, the dataset is appended, and any rows in the appended table that are exactly identical to rows in the first table are dropped. If you'd like to append all the values from the second table, use UNION ALL. Both tables must have the same number of columns The columns must have the same data types in the same order as the first table
60
Write a query that appends the two crunchbase\_investments datasets above (including duplicate values). Filter the first dataset to only companies with names that start with the letter "T", and filter the second to companies with names starting with "M" (both not case-sensitive). Only include the company\_permalink, company\_name, and investor\_name columns.
SELECT company\_permalink, company\_name, investor\_name FROM tutorial.crunchbase\_investments\_part1 WHERE company\_name ILIKE 't%' UNION ALL SELECT company\_permalink, company\_name, investor\_name FROM tutorial.crunchbase\_investments\_part2 WHERE company\_name ILIKE 'm%'
61
Write a query that shows 3 columns. The first indicates which dataset (part 1 or 2) the data comes from, the second shows company status, and the third is a count of the number of investors. Hint: you will have to use the tutorial.crunchbase\_companies table as well as the investments tables. And you'll want to group by status and dataset.
SELECT 'investments\_part1' AS dataset\_name, companies.status, COUNT(DISTINCT investments.investor\_permalink) AS investors FROM tutorial.crunchbase\_companies companies LEFT JOIN tutorial.crunchbase\_investments\_part1 investments ON companies.permalink = investments.company\_permalink GROUP BY 1,2 UNION ALL SELECT 'investments\_part2' AS dataset\_name, companies.status, COUNT(DISTINCT investments.investor\_permalink) AS investors FROM tutorial.crunchbase\_companies companies LEFT JOIN tutorial.crunchbase\_investments\_part2 investments ON companies.permalink = investments.company\_permalink GROUP BY 1,2
62
SELECT companies.permalink, companies.name, companies.status, COUNT(investments.investor\_permalink) AS investors FROM tutorial.crunchbase\_companies companies LEFT JOIN tutorial.crunchbase\_investments\_part1 investments ON companies.permalink = investments.company\_permalink AND investments.funded\_year \> companies.founded\_year + 5 GROUP BY 1,2, 3
Here's an example using \> to join only investments that occurred more than 5 years after each company's founding year:
63
couple reasons you might want to join tables on multiple foreign keys
accuracy speed joins with two keys can increase the speed due to the the way that SQL indexes
64
Identify companies that received investments from great Britain following an investment from japan, using the investments database
SELECT DISTINCT japan\_investments.company\_name, japan\_investments.company\_permalink FROM tutorial.crunchbase\_investments\_part1 as japan\_investments JOIN tutorial.crunchbase\_investments\_part1 as gb\_investments ON japan\_investments.company\_permalink = gb\_investments.company\_permalink AND gb\_investments.investor\_country\_code = 'GBR' AND gb\_investments.funded\_at \> japan\_investments.funded\_at WHERE japan\_investments.investor\_country\_code = 'JPN' ORDER BY 1
65
Convert the funding\_total\_usd and founded\_at\_clean columns in the tutorial.crunchbase\_companies\_clean\_date table to strings (varchar format) using a different formatting function for each one.
``` SELECT CAST(funding\_total\_usd AS varchar), founded\_at\_clean::varchar FROM tutorial.crunchbase\_companies\_clean\_date ```
66