Final Flashcards

(50 cards)

1
Q

Linear Search (A1) Cost

A

b_r + t_S
One initial seek
b_r block transfers, where b_r is the number of blocks in a file

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Linear Search (A1) Cost, Equality on Key Average Case.

A

t_s+ (b_r/2). Average case, scan can be terminated as soon as a record is found.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Primary B+-tree Index search, Equality on Key (A2)

A

(h_i + 1) * t_T
h_i - tree height.
Index traverses tree height and 1 I/O to fetch the record.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Primary B+-tree Index, Equality on Nonkey (A3)

A

h_i * t_T + b * t_T
Height of the tree and then however many blocks we get

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

SQL Query Processor Steps

A

Parser -> Translator -> Optimizer -> Evaluator -> Query Output

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Predominant Cost in Evaluation

A

Disk Access

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

t_T

A

Time to transfer one block

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

t_S

A

Time for one seek

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Generic Cost Algorithm

A

bt_T + St_S

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

b_r

A

Number of blocks containing records from relation r

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Secondary Index, Equality on Key (A4)

A

(h_i+1)*t_t

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

TF: Index Scans are Search Algorithms

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

b

A

Blocks with records with the specified search key

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Secondary Index, Equality on Nonkey (A4)

A

(h_i + n) * t_T

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Primary Index Comparison (A5)

A

h_i*t_T + b * t_T
Less than: Start at first tuple and scan until end
Greater than: Use index to find first tuple and scan onwards

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Secondary Index Comparison (A6)

A

(h_i+n) * t_T
Same as A5 but you’re finding pointers to records
Linear file scan might be cheaper

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Conjunctive Selection with One Index (A7)

A

Select algos in A1-7 for lowest cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Composite Index Matching with Tree Indexes

A

Involves attributes in prefix of search key. Ex, given a,b,c.
a=1, b=2 is good
a=1, c=3 is bad

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Composite Index Matching with Hash Indexes

A

every term must have a value, order does not matter

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Disjunctive Selection (A10)

A

Can use if all conditions have indices, else use Linear Scan

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Negation

A

Use linear scan

22
Q

Nested Loop Algorithm

A

for t in t_r
for u in t_s
if cond(t,u); then add to result

23
Q

Nested Loop Cost

A

n_r * b_s + b_r

24
Q

Nested Loop Cost if inner relation fits in memory

25
Block Nested Loop Cost
b_r * b_s + b_r
26
Improved Block Nested Loop Cost
[b_r / (M-2)] * b_s + b_r 3 partitions of M needed for outer, inner, and output
27
TF: In a block nested-loop join, if equi-join attribute forms a key on the inner relation, stop the inner loop on the first match
True
28
Index Nested-Loop Join Conditions
Join is equi or natural join Index is available on inner relation's join attribute
29
Index Nested-Loop Join Process
For each tuple in t_r, use the index to look up tuples in s that satisfy the outer join condition
30
Index Nested-Loop Join Process Worst Case
Buffer has space for one page of r and for each tuple in r, we do an index lookup on s b_r + n_r * c c = cost of fetching all matching s tuples for r. Cost of a single selection on s using the join condition.
31
TF: If indices are available on join attributes on r and s, use the relation with fewer tuples as the inner relation
False, as the outer
32
Merge-Join Conditions
Join is equi or natural join. If R and S are sorted, then join can be computed in a similar manner as the merge-sort algo
33
Merge-Join Cost
b_r + b_s + sort cost Worst Case Sort: M log M + N log N
34
Hybrid Merge Join Process
One relation is sorted, and the other is unsorted but has a secondary index on the join attribute. Merge sorted relation with leaf entries of the secondary index
35
Hash Join Process
Partition s using hash function h, reserving 1 block of memory as the output buffer for each partition Do the same with r for each i: - load s_i into memory - build an in-memory hash index on s_i using the join attribute and a different h - Read tuples from r_i one by one - Find the matching s_i using the hash index
36
Hash Join Build Input
Used to build in-memory hash index (s)
37
Hash Join Probe Input
Used to find matches of in-built index (r)
38
When is recursive partitioning needed in the hash join?
When n_h + 1 >= M Also sqrt(b_s) >= M
39
When does hash table overflow occur?
If hash index on s_i is larger than main memory
40
When is hash table overflow common?
- Many tuples in s with same join value. This can be subverted with block nested loop join - Bad hash function
41
Hash Join Cost
3(b_r + b_s). Just b_r + b_s if build input can be kept in memory.
42
Hybrid Hash Join
Keeps first partition of build relation in memory. Most useful when M is much greater than sqrt(b_s)
43
Joins with a conjunctive / disjunctive condition
Nested block joins
44
Duplicate elimination
Sorting - next to each other Hashing - in same bucket
45
TF: Secondary indices must always be dense
True
46
Difference between primary and secondary indices
Primary stored in sequential order, secondary are not
47
TF: The Result of an SQL statement is always a relation
True
48
Cost of External File Sort
2N*passes, where N is the number of pages
49
Runs in first pas of external file sort
N/B, where N is the number of pages, and B is the number of available buffer pages
50
Number of passes needed to sort a file completely
1+ceil(log(N/B,B-1))