Sorting, Searching and Selection Flashcards

1
Q

Theorem for the lower bound of sorting algorithms

A

For any comparison-based sorting algorithm there exists at least one input of length n that requires omega(n*log(n)) comparisons

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How does bucket sort work?

A

Suppose the values you want to sort are between 0 and k-1
(3, 2, 1, 2, 0, 3, 2, 1, 4, 0, 4, 3, 0)

Create k queues (buckets) B[0] to B[k-1]

Go through the array placing the values in their appropriate buckets
B[0] = 0, 0, 0 | B[1] = 1, 1 | B[2] = 2, 2, 2 | B[3] = 3, 3, 3 | B[4] = 4, 4

Empty the buckets in order
0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4

If we’re not sorting numbers but rather arbitrary things, then we just assign a numerical key to each item and use that for sorting

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Explain the time complexity of bucket sort

A

Worst case: O(n+k) where k = bound, n = number of items
Creating buckets: O(k)
Putting items into buckets: O(n)
“Spilling” buckets: O(n)
If k gets really large then the running time is dominated by O(k)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is radix sort?

A

When sorting numbers, create as many buckets as there are digits in the numbers (maximum 10 in base-10)
Bucket-sort by the rightmost digit
Bucket-sort again, this time by the second rightmost digit
The number of rounds depends on the length of the longest digit

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Explain the time complexity of radix sort

A

If there are k buckets, then the biggest number has d = log_10(k) digits. We perform bucket sort (O(n)) d times, giving us a time complexity of O(dn) = O(n*log(k))

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Compare radix sort to bucket sort

A

Radix sort: O(nlog(k))
Bucket sort: O(n+k)
if k = n, it’s O(n
log(n)) vs O(2n) and bucket sort wins
if k = n^2, it’s O(n*log(n)) vs O(n^2) and radix sort wins
if k = 2^n, it’s O(n^2) vs O(2^n) and radix sort wins

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How would you search for x in a sorted list length n using binary search?

A

Look at position ceil(n/2); if it contains x then we’re done
If x is bigger than this middle value, then recursively binary search the top half of the list
If x is smaller than this middle value, then recursively binary search the bottom half of the list.
O(log(n)) time complexity vs O(n) for linear sort

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is selection?

A

The process of finding the i’th smallest element in an unsorted array

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Describe how QuickSelect works

A

Searching for the i’th smallest item in A:

Call a partition function to get the index of the pivot item
If the pivot index is i, return A[i]
If the pivot index is greater than i, QuickSelect the part of the list before the pivot
If the pivot index is smaller than i, QuickSelect the part of the list after the pivot

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

QuickSelect performance

A

Worst case (already sorted and rightmost is pivot): the part being recursed into is only one element smaller than the current input. T(n) = T(n-1) + O(n), T(n) = O(n^2)

Average running time is O(n)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How does Median-of-Medians work?

A

Same as QuickSelect, but to find the pivot index you divide the n elements into groups of 5, and then find the medians of these groups. You then use QuickSelect to select the x’th item in the list of medians

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Median-of-Medians analysis

A

At least half of the group medians are less than x, which is floor(floor(n/5)/2) = floor(n/10) group medians
at least 3floor(n/10) elements are less than x
at least 3
floor(n/10) elements are greater than x
The second recursive call is executed on at most (n-3)*floor(n/10) elements

(n-3)*floor(n/10) < (n-3)(n/10 - 1) <= 7n/10 + 3

T(n) = O(n) (medians of 5-element groups) + T(n/5) (selecting xth median in the list of medians) + O(n) (partitioning) + T(7n/10 + 3) (rest of the code)

T(n) = T(n/5) + T(7n/10 + 3) + n
T <= cn therefore T is O(n)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly