Week 10 - Hashtable Flashcards
What is a hash table?
It is a data structure that maps keys to values (key-value pairs). It is a common implementation of the map (associative array) abstract data type.
How does a hash table locate the value for a key?
It uses a hash function to compute a hash code for the key, then uses that hash code to find the index of the associated value in an array of buckets (slots).
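As a rough illustration of the answer above, the sketch below turns a key into a bucket index. The names simple_hash and bucket_index are hypothetical, and the base-31 polynomial hash is just one common choice, not the only way to do it.

```python
def simple_hash(key: str) -> int:
    """Hypothetical hash function: combine character codes using base 31."""
    h = 0
    for ch in key:
        h = h * 31 + ord(ch)
    return h


def bucket_index(key: str, num_buckets: int) -> int:
    """Reduce the hash code to a valid index into the bucket array."""
    return simple_hash(key) % num_buckets


# A tiny bucket array with 8 slots (collisions are ignored in this sketch).
buckets = [None] * 8
idx = bucket_index("apple", len(buckets))
buckets[idx] = ("apple", 3)  # store the key-value pair in its slot
```

Looking the key up again repeats the same computation and reads buckets[idx] directly, with no scan over the other slots.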
What are the three operations that a hash table supports?
- Insert
- Delete
- Lookup
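Python's built-in dict is implemented as a hash table, so it gives a quick way to see the three operations. The variable names here are only for illustration.

```python
stock = {}

# Insert: associate a value with a key.
stock["apple"] = 3
stock["pear"] = 5

# Lookup: retrieve the value stored for a key.
count = stock["apple"]

# Delete: remove a key and its value.
del stock["pear"]

has_pear = "pear" in stock  # membership test is also a lookup
```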
What is O(1) time complexity in the context of a hash table?
O(1) time complexity means constant time operation, where the time it takes to perform an operation (like insertion, deletion, or lookup) is independent of the number of elements in the hash table.
How is O(1) time complexity achieved in a hash table?
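O(1) is achieved by using a hash function to map each key to an index in an array, allowing direct access to the data without searching through all elements. Different keys can occasionally map to the same index (a collision), so O(1) is the average case rather than a guarantee.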
What role does a hash function play in achieving O(1) time complexity in a hash table?
A hash function computes a hash code for each key, which is used to directly map to an index in the table’s underlying array, enabling constant time access for insertions, deletions, and lookups.
How can collisions affect O(1) time complexity in hash tables?
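Collisions (when multiple keys map to the same index) add extra work inside a bucket and can degrade operations toward O(n) in the worst case. Techniques like chaining (linked lists) or open addressing (probing) resolve collisions and keep the average time complexity at O(1), provided the hash function spreads keys well and the load factor stays low.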
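A minimal sketch of chaining, assuming each bucket holds a Python list of (key, value) pairs in place of a linked list. The class name ChainedHashTable and its methods are hypothetical, and the built-in hash() stands in for the hash function.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (sketch only)."""

    def __init__(self, num_buckets=8):
        # One chain (a list of (key, value) pairs) per bucket.
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, key):
        # Pick the chain for this key from its hash code.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # key already present: overwrite
                return
        chain.append((key, value))

    def lookup(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        raise KeyError(key)

    def delete(self, key):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                return
        raise KeyError(key)


table = ChainedHashTable(num_buckets=4)
table.insert(1, "one")
# In CPython small ints hash to themselves, so 1 and 5 share bucket 1 here.
table.insert(5, "five")
```

Both keys remain retrievable even though they collide. The cost of a lookup is the length of one chain, which stays short as long as keys are spread evenly.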
Why is resizing important for maintaining O(1) performance in hash tables?
Resizing keeps the load factor (the ratio of stored elements to the number of buckets) low. A high load factor increases the chance of collisions and slows operations down, so the table allocates a larger array and rehashes every entry into it when the load factor passes a threshold. This helps maintain O(1) average-time operations.
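The idea can be sketched for a chained table as follows. The 0.75 threshold and the doubling strategy are assumptions chosen for illustration, and the function names are hypothetical.

```python
MAX_LOAD = 0.75  # assumed threshold; real implementations choose their own


def load_factor(buckets):
    """Stored elements divided by number of buckets."""
    return sum(len(chain) for chain in buckets) / len(buckets)


def resize(buckets):
    """Double the bucket count and rehash every entry into the new array."""
    new_buckets = [[] for _ in range(2 * len(buckets))]
    for chain in buckets:
        for key, value in chain:
            # The index must be recomputed because the bucket count changed.
            new_buckets[hash(key) % len(new_buckets)].append((key, value))
    return new_buckets


# Fill a 4-bucket table with 4 entries, then resize if it is too full.
buckets = [[] for _ in range(4)]
for key in range(4):
    buckets[hash(key) % len(buckets)].append((key, str(key)))

before = load_factor(buckets)
if before > MAX_LOAD:
    buckets = resize(buckets)
after = load_factor(buckets)
```

Rehashing touches every entry, so a single resize is O(n), but because it happens rarely the cost averages out across many insertions.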
What is a Direct Address Table (DAT)?
A Direct Address Table (DAT) is a simple data structure that uses an array where the index directly corresponds to the key, allowing for constant time O(1) access to elements. It’s most useful when the universe of possible keys is small.
When is it appropriate to use a Direct Address Table (DAT)?
A DAT is ideal when the set of possible keys is small and known in advance, as it provides fast O(1) lookups, insertions, and deletions. It is not efficient for large key spaces because its memory use grows with the size of the key space, not with the number of keys actually stored.
How does a Direct Address Table (DAT) handle keys?
In a DAT, an array is created where each index corresponds directly to a key. For example, if keys are integers from 0 to 1000, the array will have 1001 elements, and the key value directly maps to the index of the array.
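The example in the answer above can be sketched like this, assuming integer keys from 0 to 1000. The class name DirectAddressTable and the use of None to mark an empty slot are illustrative choices.

```python
class DirectAddressTable:
    """Array indexed directly by integer key (sketch only)."""

    def __init__(self, max_key):
        # One slot per possible key, from 0 up to and including max_key.
        self.slots = [None] * (max_key + 1)

    def insert(self, key, value):
        self.slots[key] = value  # the key itself is the index

    def lookup(self, key):
        return self.slots[key]  # None means the key is absent

    def delete(self, key):
        self.slots[key] = None


dat = DirectAddressTable(max_key=1000)
dat.insert(42, "answer")
```

No hash function is involved: every operation is a single array access.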
How does the size of the key space affect the efficiency of a Direct Address Table (DAT)?
If the key space is large but only a small subset of keys is used, the table will still consume memory for the entire range of keys, leading to space inefficiency. This results in high memory usage, especially for sparse datasets.
What is a key problem of using a Direct Address Table (DAT) when the key space is large?
A Direct Address Table (DAT) requires an array size equal to the range of possible keys, so if the key space is large (e.g., 0 to 1,000,000), it leads to inefficient space usage. Most of the array may remain empty, wasting memory.
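To make the waste concrete, the sketch below allocates one slot for every key from 0 to 1,000,000 but stores only three keys. The specific keys are made up for illustration.

```python
key_range = 1_000_001            # one slot per key from 0 to 1,000,000
used_keys = [7, 5_000, 999_999]  # hypothetical keys actually stored

slots = [None] * key_range
for k in used_keys:
    slots[k] = True

occupied = sum(1 for s in slots if s is not None)
occupancy = occupied / key_range  # fraction of slots holding data
```

Here almost the entire array is empty, yet all of it must be allocated.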
Can a Direct Address Table (DAT) process key-value pairs like a hash table or map?
Only in a limited way. A Direct Address Table (DAT) can store a value (or just a presence flag such as true/false) in each slot, but the key must itself be a valid array index, so keys are restricted to integers within a small, known range. It cannot handle arbitrary keys such as strings or very large numbers the way a hash map or dictionary can.
What is the main purpose of a hash function?
The main purpose of a hash function is to generate hash codes from keys, which are then used to index entries in a hash table.
What are the key properties of a good hash function?
A good hash function should be:
- Deterministic: Equal keys must produce the same hash value.
- Efficient: It should be fast to compute.
- Uniformly distributed: It should spread keys evenly across the hash table to minimize collisions.
What does deterministic mean in the context of a hash function?
Deterministic means that for any given key, the hash function will always produce the same hash value every time it is applied. Equal keys must produce the same hash code.
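A small sketch contrasting a deterministic hash with a broken one. Both function names are hypothetical, and the modulus 1,000,003 is an arbitrary choice to keep values small.

```python
import random


def deterministic_hash(key: str) -> int:
    """Depends only on the key, so equal keys always give equal results."""
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % 1_000_003
    return h


def broken_hash(key: str) -> int:
    """Not deterministic: ignores the key and returns a random number."""
    return random.randrange(1_000_003)


same = deterministic_hash("apple") == deterministic_hash("apple")
```

With broken_hash, a value stored under "apple" would almost never be found again, because the lookup would compute a different index than the insertion did. Note that Python's built-in hash() for strings is deterministic within one run of a program but, by default, is salted differently on each run.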
Why is it important for a hash function to uniformly distribute keys?
A uniform distribution ensures that the keys are spread evenly across the hash table, reducing the likelihood of collisions. This helps maintain efficient operations (insertions, deletions, and lookups) with constant time complexity.
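The effect of distribution can be seen by counting how many keys land in each bucket. This sketch compares a deliberately poor hash (the key's length) with a simple polynomial hash. Both functions and the sample keys are made up for illustration.

```python
from collections import Counter


def length_hash(key: str) -> int:
    """Poor choice: every key of the same length gets the same hash."""
    return len(key)


def poly_hash(key: str) -> int:
    """Polynomial hash: every character influences the result."""
    h = 0
    for ch in key:
        h = h * 31 + ord(ch)
    return h


num_buckets = 10
keys = [f"user{i:03d}" for i in range(100)]  # "user000" ... "user099"

# Count how many keys fall into each bucket under each hash function.
poor = Counter(length_hash(k) % num_buckets for k in keys)
better = Counter(poly_hash(k) % num_buckets for k in keys)

worst_poor = max(poor.values())      # size of the fullest bucket
worst_better = max(better.values())
```

Because all the sample keys have the same length, length_hash sends every one of them to a single bucket, while poly_hash spreads them across many buckets and keeps each one short.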