Week 10 + 11 + 12 Flashcards
Define Map ADT.
Collection where each value is “mapped” to a unique key. You need a key-value pair to insert new mapping. You only need the key to find or remove mappings.
ADT Functions:
size()
isEmpty()
get(k)
put(k, v)
remove(k)
keySet()
entrySet()
values()
Explain unsorted map implementation. What is the time complexity?
We store the key-value pairs in a doubly linked list in an arbitrary order. O(n) linear search.
Explain hashing concept.
We implement Map as an array. Where key-values pairs are distributed evenly throughout the array by a hash function.
What two parts is the hash function a composition of?
Hash Code: keys -> integers
Compression Function: integers -> [0, N - 1]
How does Chain Hash Map handle collisions?
Map array is called a bucket array where each entry stores a pointer to another type of data structure. (Usually a linked list of AVL Tree). When a key-value pair maps to a bucket we simply add it to the data structure. We keep efficiency by keeping the size of these data structures small (i.e. less collisions).
How does Probe Hash Map handle collisions>
Open addressing is a collision resolution strategy where collisions are resolved by storing the
colliding key in a different location when the natural choice is full. Linear probing is an example of
open addressing, where we ‘probe’ linearly from the natural location until a suitable location is found. The advantage of this form is no auxiliary data structure is needed.
Give code for polynomial hashcode using constant N used on String?
int h = 0;
for(int i = 0; i < String.length(); i++){
h = h * N + ((int) String.charAt(i));
}
return h;
Code for MAD compression function.
(int) ((Math.abs(key.hashCode() * scale + shift) % prime) % capacity
Where prime > capacity
And scale, shift chosen at random from range [0,prime - 1]
What is collision?
Two keys mapping to the same location in a hash table.
How can collision be reduced?
A good hash function.
What is the time complexity of a Chain Hash Map.
O(ceiling(n / N))
What is the load factor?
l.f. = n / b = the ratio of the number of items in the table to the total number of buckets.
What is time complexity of Chain Hash Map when l.f. is below 1.
O(1)
What is the advantage of open addresing (probe hash map) over separate chaining (chain hash map)?
Don’t use an auxiliary data structure to hole entries with colliding keys. Open addressing has a much lower space complexity as a result. Benefits from caching as stored in memory sequentially.
Which type of hash map is ideal for a load factor above 1?
Chain Hash Map. Probe Hash Map cannot function if load factor is above 1.
Explain linear probing.
If try inset an entry into bucket A[j] but it is occupied we try A[j + 1 mod N] then A[j + 2 mod N] and so on. More collisions leads to longer sequence of probes.
If we try to get we must probe from given position until keys match.
When deleting we leave a special defunt marker so the probe skips over this element, later if key not found remembers this defunct as is a valid place to insert.
What is the formula for the expected number of probes?
1 / 1 - load factor
Name one advantage and one disadvantage of hashtables over other data structures.
A: Hashtables are much faster with larger tables or predictable datasets than other data structures.
D: Ordering not guaranteed. Inefficient when small number of entries or lots of collisions.
What is the BST property?
For a given node z, all nodes in the left subtree of z are less than z and all nodes in the right subtree of z are greater than z.
Explain binary search.
Recursive algorithm, if node = value we are looking for terminate. Else search in the left subtree (if value < current node) or in the right subtree (if value > current node). If we reach a leaf, we return the position where we expected to find value.
Define Sorted Map ADT.
A collection of unique key-value pairs where keys are maintained in sorted order. Supports accessing keys by relative position.
ADT functions.
firstEntry()
lastEntry()
ceilingEntry()
floorEntry()
lowerEntry()
HigherEntry()
subMap(k1, k2)
What is the complexity of treeSearch?
O(h) where h is the height of the tree.
Explain BST insertion.
Call tree search if node is found with given key override value else insert where treeSearch expected it to be found.
Explain BST deletion.
Use treeSearch to find node to be deleted.
If leaf simply delete.
If only one child replace with child.
If two children replace with predecessor (guaranteeing at most only one child) and then replace with child or if no children delete.