SQL Flashcards

1
Q

Calculate median in SQL

A

with t1 as (
select
row_number() as rn,
count() over () as cnt,
round(count(
) over () / 2) as mid
from table
)

select * from t1
where rn = mid
or (cnt % 2 = 0 and rn = mid + 1)

– Pull where rn = md, so the middle number. This works if dataset is odd.
– If dataset is even, pull where the rn is mid AND another row mid + 1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Tips for Optimizing a Query

A
  • Minimize full table scans
  • Filter data
  • Use CTE’s
  • Temp tables
  • Avoid joins
  • User inner (limits result set)
  • No select *
  • Avoid distinct / union
  • Materialize frequently used queries
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Get difference between two times

A

timestampsidff(
second,
time1,
time2
)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Get min/max between two or more columns

A
  • least()
  • greatest()
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What type of language is SQL

A

Declarative (as opposed to procedural)

You describe the results you want - NOT how to do it

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Group strings

A

group_contact()
list_agg()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Scalar function to add strings together

A

concat()

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Result of FULL OUTER JOIN where join keys are null

A

All from A and all from B but NOT where A and B intersect

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Result of FULL OUTER JOIN

A

All from A and B and intersection

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

count(*) vs count(1) vs count(col)

A
  • count(*): All rows including nulls
  • count(1): All rows including nulls
  • count(col): All rows EXCEPT nulls
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Find start and end of continuous ranges

1
2
3
7
8

A

with t1 as (
select
log_id,
log_id - row_number() over (order by log_id asc) as grp
from logs
)

select
min(log_id) as start_id,
max(log_id) as end_id

from t1
group by grp

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Recursive function in SQL

A

with recursive t1 as (

select 1 as col1
union all
select col1 + 1 from t1 where col1 < 10

)

select * from t1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Partial Dependencies

A

When non-key is dependent on part of a candidate key

studentid | projectno | studentname | projectname

studentname and projectname are both dependent on the ids in the table.

Problem with this is if we don’t have any students assigned to a project, then we can’t get the project name from anywhere

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Pros of 3NF

A
  • Fast writes
  • Better organization
  • Reduction of redundant data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

2nd Normal Form Rules

A
  • Be in 1NF
  • Have no partial dependencies
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Fan-out Trap

A

Two 1-to-many joins

17
Q

Where to use an Index

A
  • Typically on columns in WHERE and JOIN clauses
  • Column that appears in max number of queries
18
Q

When would Indexing fail

A
  • Have too many indexes
  • If index column is hardly queried
  • Too many DML actions
19
Q

Nested Joins

A

For every value in OUTER, look for it in INNER

Can be extremely slow

20
Q

Cache

A

Temporarily storing an expressive operation to speed up results

21
Q

Repeatable Read Isolation Level

A

Can read within same transaction without anything changing

22
Q

Read Uncommitted Isolation Level

A

Can read uncommitted data (dirty reads)

23
Q

Read Committed Isolation Level

A

Each read gets latest data committed

24
Q

Purpose of database normalization

A
  • Eliminating redundant data
  • Ensure data dependencies make logical sense
  • Prevent insert, update and removal anomalies
25
Q

Chasm Trap

A

When 2 many-to-one joins converge on a single table. Pathway between is broke

26
Q

Merge Join

A

Sort table on join columns

For each value in OUTER, iterate over INNER to find, but, stop when reach a value greater than what we’re searching for (since it’s in order)

27
Q

Deletion Anomaly

A

Deletion of unwanted data causes loss of desired data

If there are partial dependencies (i.e. teacher phone number in classes table), having no teachers with classes means we no longer have their phone number

28
Q

1st Normal Form Rules

A
  • Each cell has a single value
  • Column should be of all same type
  • Unique column names
29
Q

Update Anomaly

A

Miss updating all records in a table

30
Q

Hash Join

A

Apply a hash function to both INNER and OUTER tables to create hash tables. Can then lookup each value in O(1)

Takes more time as the tables need to be hashed first

31
Q

Cardinality

A
  • Uniqueness of data
  • Low: More duplicates
  • High: More unique
32
Q

Dirty reads

A

When a transaction reads data that is written to a concurrently uncommitted transaction

33
Q

3NF Cons

A
  • Highly normalized
  • Many joins
  • Setup high
34
Q

Boyce-Codd Normal Form

A
  • Non-prime attributes must depend on all candidate keys (super keys)

i.e. If a key is NOT the primary key, but it could have been

35
Q

3rd Normal From Rules

A
  • Must be in 2NF
  • No transative functional dependencies (when a non-primary key depends on another non-primary key)
36
Q

4th Normal Form

A
  • Should not contain any multi-valued dependencies
  • Looks like a cross join

A multi-valued dependency occurs when a single attribute determines more than one other attribute, and these attributes are independent of each other. For example, if a table stores the hobbies and skills of employees, and each employee can have multiple hobbies and skills, then there is a multi-valued dependency between the employee ID and the hobbies and skills. To achieve 4NF, we need to split the table into two separate tables, one for hobbies and one for skills, and link them with a foreign key.

37
Q

5th Normal Form

A
  • Cannot look like a join of tables
  • Basically, a table is in 5th normal form if it cannot be made into any smaller tables with different keys (most tables can obviously be made into smaller tables with the same key!).

Fifth normal form (5NF) is a level of database normalization that ensures that a table has no join dependencies. A join dependency occurs when a table can be decomposed into two or more smaller tables, and then reconstructed by joining them on their primary keys, without losing any information. For example, if a table stores the products, colors, and sizes that a company sells, and each product can have multiple colors and sizes, then there is a join dependency between the product ID, the color, and the size. To achieve 5NF, we need to split the table into three separate tables, one for products, one for colors, and one for sizes, and link them with a composite key. The benefit of 5NF is that it ensures that the database is fully normalized and has no redundancy or anomalies. The drawback of 5NF is that it may create too many tables and relationships, and make the database design more difficult to understand and maintain.

38
Q

Window functions RANGE vs ROW

A

Use ROWS when you want to define your window in terms of a specific number of rows, regardless of their values.

Use RANGE when you want to define your window based on the values in the ORDER BY column. But be cautious about non-uniform value distributions or duplicates!