sql Flashcards

(80 cards)

1
Q

What is SQL?

A

Structured Query Language, used to manage and query relational databases.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is BigQuery?

A

A cloud-based data warehouse service by Google for querying large datasets using SQL.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

How do you create a client object in BigQuery?

A

Use client = bigquery.Client() in Python.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a dataset in BigQuery?

A

A container for tables, similar to a folder in a file system.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is a table schema?

A

The structure of a table, including column names, data types, and descriptions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How do you list tables in a dataset?

A

Use list(client.list_tables(dataset)).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What does SELECT do?

A

Retrieves specified columns from a table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What does FROM specify?

A

The table from which to retrieve data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What does WHERE do?

A

Filters rows based on a condition.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How do you select all columns?

A

Use SELECT *.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is GROUP BY used for?

A

Groups rows with the same values into summary rows.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What does COUNT() do?

A

Returns the number of rows in a group or column.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is HAVING used for?

A

Filters groups based on aggregate conditions (used after GROUP BY).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What does ORDER BY do?

A

Sorts the result set by specified columns.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What does ASC/DESC mean?

A

ASC: ascending order (default), DESC: descending order.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How do you extract part of a date?

A

Use EXTRACT, e.g., EXTRACT(DAY FROM date_column).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is a DATE format?

A

YYYY-MM-DD (e.g., 2019-01-01).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is a DATETIME format?

A

YYYY-MM-DD HH:MM:SS (includes time).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is aliasing in SQL?

A

Renaming a column or table using AS for readability.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What is a CTE (Common Table Expression)?

A

A temporary named result set defined within a query using WITH.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

How do you join two tables?

A

Use JOIN with ON to specify matching columns.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is an INNER JOIN?

A

Returns only rows with matching values in both tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is a LEFT JOIN?

A

Returns all rows from the left table and matched rows from the right table (NULL if no match).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is a RIGHT JOIN?

A

Returns all rows from the right table and matched rows from the left table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
What is a FULL JOIN?
Returns all rows when there is a match in either table (NULL for non-matching sides).
26
What is a UNION?
Combines results of two or more SELECT queries vertically (columns must match).
27
What is the difference between UNION and UNION ALL?
UNION removes duplicates; UNION ALL keeps all rows, including duplicates.
28
What are aggregate functions?
Functions like COUNT, SUM, AVG, MIN, MAX that operate on multiple rows.
29
What are analytic (window) functions?
Functions that perform calculations across rows related to the current row (e.g., ROW_NUMBER, SUM with OVER).
30
What does OVER() do?
Defines the window or set of rows for an analytic function.
31
What is PARTITION BY in window functions?
Divides rows into partitions to perform calculations separately.
32
What is ROWS BETWEEN in window functions?
Defines the range of rows relative to the current row for calculations.
33
What is nested data in BigQuery?
A column with type STRUCT or RECORD containing nested fields.
34
What is repeated data in BigQuery?
A column with mode REPEATED, containing arrays of values.
35
How do you flatten repeated data?
Use UNNEST(column_name) to expand arrays into separate rows.
36
How can you estimate query cost in BigQuery?
Use `dry_run=True` in QueryJobConfig to see bytes processed.
37
How do you limit bytes billed in BigQuery?
Set `maximum_bytes_billed` in QueryJobConfig.
38
Why should you avoid SELECT * in large tables?
It scans all columns, increasing cost and time.
39
What is an N:N JOIN?
A join where many rows in one table match many rows in another, potentially exploding result size.
40
What is a 1:1 JOIN?
Each row in one table matches at most one row in the other.
41
What is an N:1 JOIN?
Many rows in one table match a single row in another table.
42
What is LEAD()?
An analytic function that returns the value from a following row.
43
What is LAG()?
An analytic function that returns the value from a preceding row.
44
What is ROW_NUMBER()?
Assigns a unique sequential integer to each row within a partition.
45
What is RANK()?
Assigns a rank to each row, with gaps for ties (same rank, next rank skips).
46
What is DENSE_RANK()?
Assigns ranks without gaps (consecutive integers even with ties).
47
How do you filter grouped results without HAVING?
Use a subquery or CTE, but HAVING is the standard way.
48
What is the difference between WHERE and HAVING?
WHERE filters rows before grouping; HAVING filters groups after aggregation.
49
How do you handle NULL values in WHERE?
Use IS NULL or IS NOT NULL (not = NULL).
50
What does the EXTRACT function return?
A part of a date/time, such as YEAR, MONTH, DAY, HOUR, etc.
51
How do you convert a DATETIME to DATE?
Use DATE(datetime_column) or EXTRACT(DATE FROM datetime_column).
52
What does the DATE() function do?
Extracts the date part from a DATETIME or TIMESTAMP.
53
What is the purpose of backticks (`) in BigQuery?
To quote identifiers like table names containing special characters or reserved keywords.
54
What is the difference between STRING and INTEGER in BigQuery?
STRING stores text; INTEGER stores whole numbers.
55
How can you improve query performance?
Select only needed columns, filter early, avoid N:N joins, use efficient joins.
56
What is the role of the query optimizer?
It interprets and executes queries in the most efficient way possible.
57
How do you count distinct values?
Use COUNT(DISTINCT column_name).
58
What is the difference between COUNT(1) and COUNT(column)?
COUNT(1) counts all rows; COUNT(column) counts non-NULL values in that column.
59
How do you create a moving average?
Use AVG() with OVER and ROWS BETWEEN clause.
60
What does FIRST_VALUE() do?
Returns the first value in an ordered partition.
61
What does LAST_VALUE() do?
Returns the last value in an ordered partition.
62
How do you concatenate strings in SQL?
Use the CONCAT() function or || operator (depends on SQL dialect).
63
What is the default sorting order of ORDER BY?
Ascending (ASC).
64
How do you round numbers in SQL?
Use ROUND(number, decimals).
65
What is the purpose of LIMIT?
Restricts the number of rows returned by a query.
66
How do you skip rows in a result?
Use OFFSET with LIMIT (e.g., LIMIT 10 OFFSET 5).
67
What is a subquery?
A query nested inside another query (in SELECT, FROM, or WHERE).
68
How do you delete duplicate rows?
Use DISTINCT in SELECT or GROUP BY to remove duplicates.
69
What is a primary key?
A unique identifier for each row in a table (not explicitly defined in BigQuery but conceptually used).
70
What is a foreign key?
A column that references the primary key of another table to establish a relationship.
71
How do you update data in a table?
Use UPDATE statement (but BigQuery is append-optimized; prefer INSERT or MERGE).
72
What is the MERGE statement?
Combines INSERT, UPDATE, and DELETE operations into a single statement.
73
How do you insert data into a table?
Use INSERT INTO table_name (columns) VALUES (values).
74
What is a view?
A virtual table defined by a query, not storing data itself.
75
How do you create a view?
Use CREATE VIEW view_name AS SELECT ...
76
What is materialized view?
A view that stores the result physically for faster access (BigQuery supports this).
77
What is clustering in BigQuery?
Organizing data based on column values to improve query performance and reduce cost.
78
What is partitioning in BigQuery?
Dividing a table into segments (partitions) based on a column (e.g., date) for efficiency.
79
How do you create a partitioned table?
Use PARTITION BY clause when creating a table (e.g., PARTITION BY DATE(date_column)).
80
What is the difference between clustering and partitioning?
Partitioning divides data into segments; clustering sorts data within partitions for better compression and scanning.