4.2.6 Hash Tables Flashcards

(18 cards)

1
Q

Define Hash Tables.

A

A data structure where a hashing algorithm creates a mapping between keys and values. The data item can then be directly accessed by recalculation, without any search.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Define Hashing Algorithms.

A

An algorithm that calculates a value to determine the unique index where a data item is to be stored in a hash table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Is hash table static or dynamic? And what data structure is used?

A

Uses an array - static because you have to be able to access each position of the array directly by specifying index position.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are the 2 types of hashing algorithms?

A
  • for storing passwords/important text
  • gives an index that corresponds to hash table
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

How are hashing algorithms used in storing passwords?

A

Password hash is one way function. So when the password is entered the first time, it is hashed and saved, only the hashed password is saved in the records.
When the user logs in, the password is hashed and compared to the saved value, if it is the same then access is granted.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How is hashing algorithm used in giving index positions in hash tables?

A

The first time the value is inputted, the hash key along with the corresponding data is pushed into the table.

So when the value is input, which is the hash key, it will be pushed into the hashing algorithm, and converts it to a hash value.
This hash value is the index position in the hash table, and the data that is stored in that index position corresponds to the hash key value.

It can also store the value in that index position if it is empty.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are some main requirements of a hash function?

A
  • always produces the same hash value for the same key
  • provide a uniform distribution of hash values, this means that every value has an equal probability of being generated
  • minimize cluttering; arises when many different keys produce the same hash value
  • be very fast to compute
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Define Collisions in terms of hashing.

A

The phenomenon when two key values compute to the same hash.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Define Rehashing.

A

The process of rerunning the hashing algorithm in the event of a collision.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What are the 2 ways to resolve collisions in hashing?

A
  • Linear probing
  • Chaining
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Why will collisions arise?

A

Some bad hashing algorithms only use part of the value to generate the hash key so it creates collisions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Describe the process of linear probing.

A

Linear probing or open addressing is a way of resolving collisions.
If the key creates a hash value that references a position that is already occupied, the data is placed in the next free position in the hash table, this is continued until a free position is found.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What problem does linear probing lead to?

A

Clustering.
The cluster may continue to grow, making the process inefficient. As lots of positions are needed to search through until found.
As the previous one that is moved may be in the position of a new one that is needed to be added.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Describe the process of chaining.

A

Chaining or closed addressing is another way to resolve collisions.
This uses linked lists as the data structure within the hash table, and eliminates clustering.
Each data item is stored as a node with 3 attributes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What 3 attributes are stored in chaining within the data item?

A
  • key value
  • the data
  • a pointer to the next node
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is the significance of the pointer in chaining?

A

The first value to be stored will have a null pointer as it is the only item in the list.
When there is a collision, the pointer for the first node at that location will be updated to point to a new node.
This creates a chain of nodes whose key resolve to the same hash value.

17
Q

Why are linked lists used in chaining?

A

Linked lists are much more efficient than linear probing.
Although they need to search through the whole list from the beginning, they are much more efficient if there is a large hash table and lots of linked lists within.
The hashing function still gives repeated values, but there is less to search through compared to linear probing.

18
Q

Describe the process of rehashing.

A

Over time, items will be added and deleted from the hash table. If the hash table starts to fill up, or a large number of items are in the wrong place, the performance of the hash table will degrade.
When rehashing, a new hash table is created (larger if needed). The key for each item in the existing item will be rehashed and the item will be inserted into the new hash table.