# Tables and Hashing Flashcards

1
Q

What is a ADT Table: Data?

A
• A table is a collection of entries, where each entry contains:
• a key(unique identifier for the entry)
• rest of the entry (other data associated with this entry

Sometimes called a dictionary

2
Q

What are a table’s standard operations?

A
3
Q

What are the posibiities for implementing a table and what are the advantages/disadvantages?

A
4
Q

What is the most important table operation?

A

search/retrieve

5
Q

How fast is a hash table, insert, delete and search?

A

O(1)

Means that execution time is independent of problem size, it takes a constant number of operations

6
Q

What is the general idea of how to implement an ADT table using a hash table?(psudo code)

A
7
Q

What is a Hash function?

A

A mathematical function that, given a key k gives an index h(k) into the hash table(an array) where the item with key k can be found.

Example:

• Keys are in 0,1,2,3,…
• So use h(k) = k
• Search, insert, delete would all take O(1) time in an array
8
Q

What are the potential problems with the simple hash function?

A

The keys could be:

• Large (compared to a smaller array size)
• Alpha-numeric(non-numeric, at least)
• Collisions (two keys hashing to the same location)
9
Q

What must a good hash function do?

A
• Minimize collisions
• fast to compute
• Hash values should be evenly distributed, even if the keys are not
10
Q

What are the two steps of a hash function?

A
1. Make the key an integer(called a hash code) - If it isn’t an integer, map the key to some integer
2. Make the integer into an array index(compression map) - Map the hash code into the desired array index range.
11
Q

When turning a key into a hash code that has the same size of int, such as short, char and float?

A

treat the key as an int, cast it to an int

12
Q

Hash code: how do you turn a key that is longer than an int into a hash code?(example double or long)

A
• Split the key into int-size pieces and add them together
• Can’t just use one piece as the hash code, that piece may be identitical or similar to a lot of different keys, leading to clustering and collisions
13
Q

How do you turn a non-integer key into an integer hash code?

A
1. Divide the key into a sequence of int-sized chunks
2. Treat each chunk as an int
• example String:
• Get the character at position i: word.charAt(i)
• cast the character at position i into an int: (int)word.charAt(i)
3. Turn the sequence of integers into one integer using the polynomial hash code
14
Q

Why use a polynomial hash for Strings? Why can’t we just add the characters together?

A

The polynomial hash code gives uniform distribution on english words.

Just adding the characters together ignores the position (commutative property) which differentiates words. “stop”, “pots”,”opts” would all hash to the same index

15
Q

How can you computer the polynomial hash code efficiently? What’s the difference in speed?

A

The inefficient is (s(s+1))/2 multiplications

Factor out the common factor (a) using Horner’s method.

With Horner’s method there is S additions and S multiplications

16
Q

How does Horner’s method work? Provide an example using the String “cat”

A

Processes the characters left to right, starting with hash code = 0

for each character:

• Multiply the current hash code by a and then
• add on to the next character
• do the computation mod the table size at each step so that the value computed doesn’t cause an integer overflow
17
Q

What are the problems with a simple compression map

(using h(k) = |k| mod N(has table size)

A
• If hash code k is even, then h(k) is even, meaning it will only hash to an even key in the table
• If the table is divisible by 5 (N = 10) then only positions 0 or 5 will be used in the hash table
• In general, if k and N are divisible by b, then h() is divisible by b (leading to many collisions)
18
Q

What is the problem with Polynomial Hash Code, Horner’s Method and Overflow? What is the solution?

A

Problem: When finding the polynomial hash code for a longer string, you can get an integer overflow quite quicky - all those large powers of a add up quickly

Solution: After every step in Horner’s method, take the current computer value mod N and use that in the next step

(current value*a+next letter) mod N