.121 10-15 Flashcards
(53 cards)
What is indexed retrieval?
using an identifier to store + retrieve each object/record
What can be used as the identifier/key in indexed retrieval?
an unique attribute of each object (e.g. alphabetically ordering by name)
Why is indexed retrieval via keys helpful?
allows us to perform binary search (objects will be sorted by the given key)
How does indexed retrieval effect search efficiency?
linear search can be 26x faster if ordered alphabetically and all sections are equal
binary can be 26x faster (same as above)
worst case for both - if all objects are in 1 section (will be the usual time complexity for both algorithms)
What is a problem with storing objects?
collisions
Solutions to avoid collisions
indexed retrieval + chains
How does indexed retrieval reduce collisions?
when inserting extra data the index array is recomputed by shifting other records to make sure thereβs room for the object
How does using chains reduce collisions?
if alphabetical - store all objects beginning w/ the letter in an array for that letter where each element points to the appropriate chain
chains are dynamic (use pointers) so easy to insert extra objects w/o any recomputation of indexes!!
Efficiency of chains:
much faster (where objects distributed across many chains)
- same improvement as w/ using arrays
worst case - every object is in the same chain so O(N)
Explain hash tables:
data struct. that converts a key to an int which becomes the index for storing the key-value pair
What is the key of an element in a hash table?
the unique part/identifer
How is a key turned to an index for a hash table?
hash function takes the key to generate a hash code
same deterministic function to store + retrieve
β:π β{0,1,2,β¦,π β1} - h = hash function, U = universe of keys, m = array size
What makes a hash function good?
- produces a wide range of index values (helps minimise collisions)
- unform hashing - each key is equally likely to map to an array location
What operations are helpful for hashing?
- multiplication/division = helps create a wider range of indexes
- MOD - can % by array size to ensure index is within array bounds (helpful after multiplication)
- logical shifts + OR operations - can replace multiplication/division for better speeds
What is a key-value pair?
each element of data is associated with a unique identifier
must store both to distinguish between pairs with the same hash
hash table as an ADT:
- stores <k,v> pairs (where keys are unique + a handle to the associated value
- put (t, <k,v>) - to store <k, v> in table t
- get (t, k) - returns <k,v> if k is in t
- delete (t, k)
Explain rehashing:
used to make the table efficient again once chains get too long
create a new hash table at least double the current size
move all the key-value pairs over
though invisible to the ADT user, it can be an expensive operation
Why are hash tables useful?
designed to be fast to work w/
- when few collisions (collisions are resolved by chaining) = O(1)
Define an algorithm:
set of computational steps that transform the input β the output
How can algorithms be represented?
code, flow charts, sentences, punch cards + pseudocode
What is a greedy algorithm?
one that optimises for the current situation/moment without care for the best option for the future
What are the complexities?
time complexity - no. of steps relative to input size
may also be measured as actual running time (not as accurate)
space complexity - amount of memory required to run an algorithm to completion
space + time = bounded resources
What features are important when designing an algorithm?
correct, robust, simple, and space/time complexity
What is space complexity separated into?
fixed part - memory required to store certain data independent of the size of the problem
e.g. constants
variable part - space needed by variables so dependent on problem size
e.g. user inputted variables