Intermediate SQL Flashcards

1
Q

What are the aggregation functions used in SQL?

A

COUNT counts how many rows are in a particular column.
SUM adds together all the values in a particular column.
MIN and MAX return the lowest and highest values in a particular column, respectively.
AVG calculates the average of a group of selected values.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What does COUNT command do?

A

COUNT is a SQL aggregate function for counting the number of rows in a particular column.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Typing COUNT(1) has the same effect as COUNT(*). Which one you use is a matter of personal preference. True/False?

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What does below code do?
SELECT COUNT(high)
FROM tutorial.aapl_historical_stock_price

A

The code will provide a count of all rows in which the high column is NOT NULL.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What does below code do?
SELECT COUNT(date) AS count_of_date
FROM tutorial.aapl_historical_stock_price

A

The column header for count(date) is changed from “count” to “count_of_date”.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Does COUNT command, count null values?

A

No

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What does SUM do?

A

SUM is a SQL aggregate function. that totals the values in a given column.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

An important thing to remember: aggregators only aggregate vertically. If you want to perform a calculation across rows, you would do this with simple arithmetic. True/False?

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How does SUM treat nulls?

A

SUM treats nulls as 0.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What do MIN / MAX commands do?

A

MIN and MAX are SQL aggregation functions that return the lowest and highest values in a particular column.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

MIN/MAX can be used on non-numerical columns. True/False, if True, how do they do it?

A

True.
MIN will return the lowest number, earliest date, or non-numerical value as close alphabetically to “A” as possible.
As you might suspect, MAX does the opposite—it returns the highest number, the latest date, or the non-numerical value closest alphabetically to “Z.”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What does AVG command do?

A

AVG is a SQL aggregate function that calculates the average of a selected group of values.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What does GROUP BY do?

A

GROUP BY allows you to separate data into groups, which can be aggregated independently of one another.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

When is it recommended to use numbers instead of column names for group by?

A

It’s generally recommended to do this only when you’re grouping many columns, or if something else is causing the text in the GROUP BY clause to be excessively long

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

We use HAVING clause with aggregation, because WHERE doesn’t allow you to filter on aggregate columns. True/False?

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is SQL’s way of handling if/then logic?

A

CASE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

The CASE statement is followed by at least one pair of WHEN and THEN statements. True/False?

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

How do we end a CASE statement?

A

Every CASE statement MUST end with the END statement.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

The ELSE statement is optional, and provides a way to capture values not specified in the WHEN/THEN statements. True/False?

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What does below code do?
SELECT CASE WHEN year = ‘FR’ THEN ‘FR’
WHEN year = ‘SO’ THEN ‘SO’
WHEN year = ‘JR’ THEN ‘JR’
WHEN year = ‘SR’ THEN ‘SR’
ELSE ‘No Year Data’ END AS year_group,
COUNT(1) AS count
FROM benn.college_football_players
GROUP BY 1

A

It counts the number of players in each year_group

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What does below code do?
SELECT state,
COUNT(CASE WHEN year = ‘FR’ THEN 1 ELSE NULL END) AS fr_count,
COUNT(CASE WHEN year = ‘SO’ THEN 1 ELSE NULL END) AS so_count,
COUNT(CASE WHEN year = ‘JR’ THEN 1 ELSE NULL END) AS jr_count,
COUNT(CASE WHEN year = ‘SR’ THEN 1 ELSE NULL END) AS sr_count,
COUNT(1) AS total_players
FROM benn.college_football_players
GROUP BY state
ORDER BY total_players DESC

A

displays the number of players in each state, with FR, SO, JR, and SR players in separate columns and another column for the total number of players. Order results such that states with the most players come first.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What does below code do?
SELECT CASE WHEN state IN (‘CA’, ‘OR’, ‘WA’) THEN ‘West Coast’
WHEN state = ‘TX’ THEN ‘Texas’
ELSE ‘Other’ END AS arbitrary_regional_designation,
COUNT(1) AS players
FROM benn.college_football_players
WHERE weight >= 300
GROUP BY 1

A

Counts the number of 300lb+ players for each of the following regions: West Coast (CA, OR, WA), Texas, and Other (everywhere else).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

When do we use SELECT DISTINCT?

A

You’ll occasionally want to look at only the unique values in a particular column. You can do this using SELECT DISTINCT syntax

23
Q

What will happen if you include two (or more) columns in a SELECT DISTINCT clause?

A

Your results will contain all the unique pairs of those two columns

24
Q

You only need to include DISTINCT once in your SELECT clause—you do not need to add it for each column name. True/False?

A

True

25
Q

Which aggregation function is used with DISTINCT more commonly?

A

You can use DISTINCT when performing an aggregation. You’ll probably use it most commonly with the COUNT function.

26
Q

DISTINCT goes outside the aggregate function rather than at the beginning of the SELECT clause. True/False?

A

False, DISTINCT goes INSIDE the aggregate function rather than at the beginning of the SELECT clause.

27
Q

SUM, AVG, MIN, MAX have very little practical use with DISTINCT, True/False?

A

True, SUM or AVG the distinct values in a column, but there are fewer practical applications for them. For MAX and MIN, you probably shouldn’t ever use DISTINCT because the results will be the same as without DISTINCT, and the DISTINCT function will make your query substantially slower to return results.

28
Q

It’s worth noting that using DISTINCT, particularly in aggregations, can slow your queries down quite a bit. True/False

A

True

29
Q

How can we give a table an alias?

A

benn.college_football_teams teams

30
Q

Once you’ve given a table an alias, you can refer to columns in that table in the SELECT clause using the alias name. True/False?

A

True

31
Q

How can we select columns using table aliases? Code

A

SELECT teams.conference AS conference,
AVG(players.weight) AS average_weight

32
Q

What should we do when 2 tables have columns with the same name and we want to join these tables and see these 2 columns?

A

The results can only support one column with a given name—when you include 2 columns of the same name, the results will simply show the exact same result set for both columns even if the two columns should contain different data. You can avoid this by naming the columns individually.
SELECT players.school_name AS players_school_name,
teams.school_name AS teams_school_name
FROM benn.college_football_players players
JOIN benn.college_football_teams teams
ON teams.school_name = players.school_name

33
Q

Why is right join rarely used?

A

RIGHT JOIN is rarely used because you can achieve the results of a RIGHT JOIN by simply switching the two joined table names in a LEFT JOIN

34
Q

Why do we use COUNT(DISTINCT companies.permalink) to count distinct company names instead of using COUNT(DISTINCT companies.company_name) on MODE website?

A

Because it returned results much faster

35
Q

What’s the difference between below two codes?
FROM tutorial.crunchbase_companies companies
LEFT JOIN tutorial.crunchbase_acquisitions acquisitions
ON companies.permalink = acquisitions.company_permalink
AND acquisitions.company_permalink != ‘/company/1000memories’

FROM tutorial.crunchbase_companies companies
LEFT JOIN tutorial.crunchbase_acquisitions acquisitions
ON companies.permalink = acquisitions.company_permalink
WHERE acquisitions.company_permalink != ‘/company/1000memories’
OR acquisitions.company_permalink IS NULL
.

A

In the first one the conditional statement AND… is evaluated before the join occurs. You can think of it as a WHERE clause that only applies to one of the tables.
In the second one, the filter happens after the tables are joined

36
Q

How can we count NULL values in a table in SQL? What should we write in the SELECT section?

A

By using COUNT(CASE…)
For example: COUNT(CASE WHEN column IS NULL THEN ‘It’s empty’)

37
Q

What’s the difference between join and union?

A

SQL joins allow you to combine two datasets side-by-side, but UNION allows you to stack one dataset on top of the other. Put differently, UNION allows you to write two separate SELECT statements, and to have the results of one statement display in the same table as the results from the other statement.

38
Q

“UNION only appends distinct values.” What does this mean?

A

When you use UNION, the dataset is appended, and any rows in the appended table that are exactly identical to rows in the first table are dropped.

39
Q

What can we do when we want all the values (even duplicates) in the second table, to be appended to the first table?

A

If you’d like to append all the values from the second table, use UNION ALL.

40
Q

You’ll likely use UNION ALL far more often than UNION. True/False?

A

True

41
Q

What are the rules of appending datasets in SQL?

A

SQL has strict rules for appending data:

1- Both tables must have the same number of columns
2- The columns must have the same data types in the same order as the first table

42
Q

How can we create a column and fill it with a fixed value?

A

SELECT ‘investments_part1’ AS dataset_name,
FROM table

43
Q

What are 2 reasons for using multiple keys for joining datasets?

A

There are couple reasons you might want to join tables on multiple foreign keys. The first has to do with accuracy. The second reason has to do with performance.

44
Q

What does below code do?

SELECT DISTINCT japan_investments.company_name,
japan_investments.company_permalink
FROM tutorial.crunchbase_investments_part1 japan_investments
JOIN tutorial.crunchbase_investments_part1 gb_investments
ON japan_investments.company_name = gb_investments.company_name
AND gb_investments.investor_country_code = ‘GBR’
AND gb_investments.funded_at > japan_investments.funded_at
WHERE japan_investments.investor_country_code = ‘JPN’
ORDER BY 1

A

to identify companies that received an investment from Great Britain following an investment from Japan.

45
Q

You only need to include DISTINCT once in your SELECT clause—you do not need to add it for each column name. True/False?

A

True

46
Q

What will happen If you include two (or more) columns in a SELECT DISTINCT clause?

A

Your results will contain all the unique pairs of those two columns

47
Q

When do we use SELECT DISTINCT?

A

You’ll occasionally want to look at only the unique values in a particular column. You can do this using SELECT DISTINCT syntax

48
Q

What does below code do?
SELECT CASE WHEN state IN (‘CA’, ‘OR’, ‘WA’) THEN ‘West Coast’
WHEN state = ‘TX’ THEN ‘Texas’
ELSE ‘Other’ END AS arbitrary_regional_designation,
COUNT(1) AS players
FROM benn.college_football_players
WHERE weight >= 300
GROUP BY 1

A

Counts the number of 300lb+ players for each of the following regions: West Coast (CA, OR, WA), Texas, and Other (everywhere else).

49
Q

What does below code do?
SELECT state,
COUNT(CASE WHEN year = ‘FR’ THEN 1 ELSE NULL END) AS fr_count,
COUNT(CASE WHEN year = ‘SO’ THEN 1 ELSE NULL END) AS so_count,
COUNT(CASE WHEN year = ‘JR’ THEN 1 ELSE NULL END) AS jr_count,
COUNT(CASE WHEN year = ‘SR’ THEN 1 ELSE NULL END) AS sr_count,
COUNT(1) AS total_players
FROM benn.college_football_players
GROUP BY state
ORDER BY total_players DESC

A

displays the number of players in each state, with FR, SO, JR, and SR players in separate columns and another column for the total number of players. Order results such that states with the most players come first.

50
Q

What does below code do?
SELECT CASE WHEN year = ‘FR’ THEN ‘FR’
WHEN year = ‘SO’ THEN ‘SO’
WHEN year = ‘JR’ THEN ‘JR’
WHEN year = ‘SR’ THEN ‘SR’
ELSE ‘No Year Data’ END AS year_group,
COUNT(1) AS count
FROM benn.college_football_players
GROUP BY 1

A

It counts the number of players in each year_group

51
Q

The ELSE statement is optional, and provides a way to capture values not specified in the WHEN/THEN statements. True/False?

A

True

52
Q

How do we end a CASE statement?

A

Every CASE statement MUST end with the END statement.

53
Q

The CASE statement is followed by at least one pair of WHEN and THEN statements. True/False?

A

True

54
Q

What is SQL’s way of handling if/then logic?

A

CASE

55
Q

We use HAVING clause with aggregation, because WHERE doesn’t allow you to filter on aggregate columns. True/False?

A

True