# hash tables Flashcards

1
Q

what is a hash table

A

data structure where items are stored in a location determined by their content - content based inde

2
Q

what is every hash table fundamentally composed of?

A

array of items

hash function: that converts item to an index

3
Q

what is a hash function

A

is a mapping from an item to an integer

4
Q

what are the requirements for a good hash function?

A

fast compute
deterministic
spread keys evenly in hash table - no large gaps

5
Q

what is a perfect has function

A

maps an every distinct key onto a distinct integer

minimal perfect hash function has no holes in the array

6
Q

A

7
Q

explain linear probing insert

A

generate hash code = h

while A[h] contains a key A[h] - {key, value}

8
Q

explain linear probing find

A
```generate hash code = h
while A[h] contains a key:
-- if A[h]{key} == key  return value
-- h = (h+1)%table size
9
Q

issues with linear probing

A
• primary clustering - multiple adjacent items and slow performance
• uneven gaps in the table
• full table
10
Q

what to do when the table becomes full?

A

throw an error
create new and larger table
automatically make a larger array and then rehash every thing

11
Q

A

the proportion of the table that is fill (λ)

12
Q

probability of a cell being empty

A

1-λ

13
Q

average number of cells to be examined for insert

A

T(λ) = (1+1/(1-λ)^2)/2

14
Q

average number of cells to be examined for successful find

A

T(λ) = (1+1/(1-λ))/2

15
Q

A

generate hash code h = hash(key), i =1
while a[h] contains a key
h = h+i*i%table size
a[h] = key, value

16
Q

A
```generate hash code h = hash(key), i =1
while a[h] contains a key
-- if A[h]{key} == key  return value
--- h = h+i*i%table size
--- i++```

17
Q

basic idea of collision resolution by chaining

A

use a table of pointers/references

each new item must be added to a linked list at that position in the table

18
Q

insertion algorithm for chaining

A

generate hash code h = hash(key), p = new Node
p.data = {key, value}; p.next = A[h]
A[h] = p

19
Q

find algorithm for chaining

A

generate hash code h = hash(key), p = A[h]
while p != null and p{key} != key
— p = p.next
return p

20
Q

analyze the chaining technique

A

n items and m table size - even distribution of hash values and use of all possible hash values
on ave each linked list is of size n/m
average search time - 1/2 n/m

21
Q

complexity analysis of chaining

A
```best - O(n/m)
ave - 1/2 n/m
worst - O(n)
m - tablesize
n - number of items```

All dependent on choice of M relative to n

22
Q

handling deletions open addressing vs chaining

A

• deletion not easily possible
• add a flag to mark deleted items - skip deleted items during search, unmark and over write deleted items on insert

chaining:

23
Q

other variations to chaining

A

double hashing - if there is a collision because of due to secondary clustering use a second hash function when there is a collision
h = H1 +H2 - where H2 is the rehashing function
a different sequence is checked for each key

chaining using BST or other data structures

24
Q

what is secondary clustering?

A

where a key generates the same sequence of locations to check