Dictionaries and Hash Tables Flashcards

(16 cards)

1
Q

What is a hash table?

A

A data structure that stores a key/value pair based on an index calculated from a hashing algorithm

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is a collision in hash tables?

A

When two key values compute to the same hash value

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are properties of good hash functions?

A
  • Make use of all info provided by the key
  • Uniformly distribute output across table to reduce clustering
  • Maps similar keys to very different hash values
  • Always give same hash value for a key
  • Make use of only computationally fast operations
  • Handle keys of any data type
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

How can collisions be dealt with?

A
  • (Rehashing)?
  • Linear probing
  • Seperate Chaining
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is linear probing and what are its disadvantages?

A
  • Look for the next available index in hash and insert item there
  • However increases potential for further collisions and can promote clustering
  • no known association between key and its location so linear search needs to be used to find item
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is clustering in hash tables?

A

Nearby indexes are not randomly or evenly distributed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is separate chaining?

A

Storing a list at each location in the hash table, rather than single key value pairs.

When item is inserted into hash table, it is appended to the list stored at the location indicated by the hash value of the key.

Allows multiple key-value pairs to be stored at the same hash table index

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is rehashing?

A

Process of increasing the size of a hash table and redistributing the elements to new indexes based on their new hash values

reduces likelihood of collisions as there are now more buckets/locations for key value pairs to be in

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the uses of hash tables?

A
  • Databases: quick storage and retrieval of data based on primary keys
  • Cache memory: determining the memory address for a given item of data
  • Operating Systems: store locations of its programs and utilities for quick access
    (Encryption)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How do hash tables work?

A
  • When an item is added, its key is processed by a hashing algorithm that returns a hash value
  • Items are placed in indexes specified by the hash value
  • To search for an item, key is processed again and the resulting index is checked for the item, if it does not exist we can confidently say it is not in the table
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What are the advantages of using hash tables to store data?

A
  • Faster to lookup a record
  • Time complexity of O(1) for retrieval
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What are the advantages of storing data in a dictionary?

A
  • More compact file as there are no empty records
  • Produce a sorted list
  • Do not have to design a hash algorithm so easier to design
  • No need to deal with collisions
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

How are records added to a hash table?

A
  • Hashing algorithm is applied to a key
  • Result is the location in the table where the record should be stored
  • If location is not empty then use next free location
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Benefits of separate chaining over linear probing

A

You can keep adding items without causing clustering or increasing likelihood of collisions

Key value pairs can still be found in known locations - no need search whole table to find item

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Drawbacks of separate chaining

A

The appropriate list still needs to be linear searched when accessing an item

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

When is rehashing performed

A

When a hash table load factor exceeds a predetermined load threshold.

The load factor is the ratio of stored key-value pairs to locations/buckets within the hash table.