12 Physical Representation: Indexing and Hashing Flashcards
(57 cards)
What are access methods?
A group of programs that allow operations to be applied to a file
What is the purpose of indexing mechanisms?
To speed up the access to desired data
What is an index?
A data structure that allows us to locate information faster than with sequential scan
What is a search key?
An attribute or set of attributes used to look up records in the file
What does an index file contain?
Records called index entries in the form (SearchKey, Pointer)
How do index entries compare in size to the original file?
They’re much smaller than the original file
Give an example of a search key.
studentID
What does a pointer indicate in an index entry?
Where to look in the original data file, e.g., ‘block 5, position 3’
What are the two basic kinds of indices?
- Ordered indices
- Hash indices
How are ordered indices structured?
Search keys sorted by search key value
How are hash indices structured?
Search keys distributed uniformly across buckets using hash function
What is a hash function?
A function always applied to the search key to determine bucket placement
Provide an example of a hash function application.
hash(StudentID) = 105 % 3 = 0, so add record to bucket 0
What determines the block a record will go in when using hash indices?
Bucket placement
How is record position inside a block usually determined?
Based on sequential insertion (first inserted is first)
What are the main evaluation metrics for an index?
Accuracy, efficiency, access time, insertion time, deletion time, space overhead.
These metrics help determine the effectiveness and performance of an indexing system.
What is the tradeoff associated with indexing?
Faster access vs more storage used.
This tradeoff highlights the balance between speed of retrieval and the storage space required for the index.
What is a primary index?
An index that specifies the sequential order of a file, typically using the primary key.
It is also known as a clustering index.
What is an index-sequential file?
An ordered sequential file that has a primary index.
This type of file structure allows for efficient data retrieval.
What is a secondary index?
An index used when data is not stored in a sequential order that matches the search key.
It is also referred to as a non-clustering index.
How does a secondary index differ from a primary index?
A secondary index points to a bucket rather than directly to records.
This can lead to increased storage overhead.
What is a dense index?
An index with one entry for every search key value in the file.
If a dense index has fewer entries than the original table, the search key cannot be the primary key.
What is a sparse index?
An index that contains records for only some search key values.
Sparse indices are efficient in terms of space and maintenance.
What is the advantage of using a sparse index?
Requires less space and less maintenance than dense indices.
However, it is generally slower for locating records.