12 Physical Representation: Indexing and Hashing Flashcards

(57 cards)

1
Q

What are access methods?

A

A group of programs that allow operations to be applied to a file

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is the purpose of indexing mechanisms?

A

To speed up the access to desired data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is an index?

A

A data structure that allows us to locate information faster than with sequential scan

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a search key?

A

An attribute or set of attributes used to look up records in the file

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What does an index file contain?

A

Records called index entries in the form (SearchKey, Pointer)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How do index entries compare in size to the original file?

A

They’re much smaller than the original file

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Give an example of a search key.

A

studentID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What does a pointer indicate in an index entry?

A

Where to look in the original data file, e.g., ‘block 5, position 3’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the two basic kinds of indices?

A
  • Ordered indices
  • Hash indices
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How are ordered indices structured?

A

Search keys sorted by search key value

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

How are hash indices structured?

A

Search keys distributed uniformly across buckets using hash function

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is a hash function?

A

A function always applied to the search key to determine bucket placement

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Provide an example of a hash function application.

A

hash(StudentID) = 105 % 3 = 0, so add record to bucket 0

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What determines the block a record will go in when using hash indices?

A

Bucket placement

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

How is record position inside a block usually determined?

A

Based on sequential insertion (first inserted is first)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are the main evaluation metrics for an index?

A

Accuracy, efficiency, access time, insertion time, deletion time, space overhead.

These metrics help determine the effectiveness and performance of an indexing system.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is the tradeoff associated with indexing?

A

Faster access vs more storage used.

This tradeoff highlights the balance between speed of retrieval and the storage space required for the index.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is a primary index?

A

An index that specifies the sequential order of a file, typically using the primary key.

It is also known as a clustering index.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is an index-sequential file?

A

An ordered sequential file that has a primary index.

This type of file structure allows for efficient data retrieval.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What is a secondary index?

A

An index used when data is not stored in a sequential order that matches the search key.

It is also referred to as a non-clustering index.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

How does a secondary index differ from a primary index?

A

A secondary index points to a bucket rather than directly to records.

This can lead to increased storage overhead.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is a dense index?

A

An index with one entry for every search key value in the file.

If a dense index has fewer entries than the original table, the search key cannot be the primary key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is a sparse index?

A

An index that contains records for only some search key values.

Sparse indices are efficient in terms of space and maintenance.

24
Q

What is the advantage of using a sparse index?

A

Requires less space and less maintenance than dense indices.

However, it is generally slower for locating records.

25
How do you locate a record with a search key value K using a sparse index?
Find the index record with the largest search key value < K. ## Footnote This method allows you to efficiently navigate to the desired record.
26
What is the main difference between a dense index and a sparse index?
Dense index allows faster retrieval but requires more space and maintenance. Sparse index is slower in retrieval but requires less space and maintenance.
27
What is a block in the context of a sparse index?
A block is a fixed unit of storage on disk that holds multiple records or index entries in a database system.
28
What solution can be used if the primary index does not fit in memory?
Create a sparse index on the primary index itself to reduce the amount of data that needs to be loaded into memory.
29
What is the outer index in the context of sparse indexing?
The outer index is a sparse index of the primary index.
30
What is multilevel indexing?
Multilevel indexing is creating an outer index for the outer index if it is still too large.
31
What must happen to indices at all levels during insertion or deletion?
Indices at all levels must be updated on insertion/deletion from the original file.
32
What is the root node in a B+ tree?
The root node is the topmost node of the tree and serves as the starting point for all search operations.
33
What are internal nodes in a B+ tree used for?
Internal nodes are used for indexing and routing search operations to the correct leaf node.
34
What do leaf nodes in a B+ tree contain?
Leaf nodes contain actual data records or pointers to data records.
35
What is a limitation of indexed-sequential files?
Performance degrades as the file grows due to overflow blocks, requiring periodic reorganization.
36
What is an advantage of B+ Trees over indexed-sequential files?
B+ Trees provide automatic reorganization and performance maintenance.
37
How does a B+ tree maintain efficiency as the file grows?
It is a self-balancing tree structure that automatically reorganizes itself with each insertion and deletion.
38
Where is data inserted in a B+ tree?
Data is inserted at leaf nodes.
39
What is a minor disadvantage of B+ trees?
The restructuring of the tree requires some extra overhead as nodes are split or merged.
40
What extra space is required in B+ trees?
Extra space is required because each internal node points to child nodes and each leaf node holds pointers to actual data records.
41
What properties does a B+ Tree satisfy?
1. All paths from root to leaf are of the same length. 2. Each non-root, non-leaf node has between [n/2] and n children. 3. A leaf node has between [(n-1)/2] and n-1 values.
42
What is the maximum branching factor in a B+ Tree?
n, which is the maximum number of children a node can have.
43
How can databases use indexes for a query with multiple conditions?
1. Use index on the first condition to filter records, then apply the second condition. 2. Find pointers for both conditions and intersect the results.
44
What does a composite search key mean?
An index built on more than one column, for example, (branch_name, balance).
45
When is an index on multiple attributes efficient?
1. When the query involves exact values for all indexed columns. 2. When it involves an exact value for the first column and a range for the following column.
46
When is an index on multiple attributes inefficient?
When the query involves a range for the first column and an exact value for the following column.
47
What is static hashing?
A method to quickly find data in a file using a hash function.
48
What is a bucket in static hashing?
A unit of storage containing one or more records, typically a disc block.
49
What is the purpose of a hash function in static hashing?
To convert a search key into a bucket address, allowing direct access to the corresponding bucket.
50
What is the cost of periodic reorganization in ordered indexing and hashing?
It depends on how often data is inserted or deleted, as both methods require reorganization.
51
When should hashing be preferred over ordered indexing?
For fast average performance and when searching by exact key.
52
When is ordered indexing better than hashing?
For predictable worst-case time performance and for range queries.
53
What does PostgreSQL support regarding hash indices?
PostgreSQL supports hash indices but discourages their use due to poor performance.
54
How can you create an index in SQL?
Using the command: CREATE INDEX index_name ON relation_name (attribute_list).
55
What is the difference between hash organization and hash index?
Hash organization stores records directly in a file using hashing, while hash index maps search keys to the location of records stored elsewhere.
56
What does the command CREATE UNIQUE INDEX specify?
It specifies that the search key is a candidate key.
57
How do you drop an index in SQL?
Using the command: DROP INDEX index_name.