Indexing Flashcards

Question

What are the constraints on the number of child nodes in every internal node of a B-Tree?

Answer 1

Every internal node in a B-Tree has between ⌊B/2⌋ and B child nodes, unless it is the root. The root, in turn, has between 2 and B child nodes. These bounds keep every non-root internal node at least about half full, which keeps the tree shallow and balanced.
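
As a minimal sketch of the constraint above, the check below tests whether a child count is legal for a given B. The function name child_count_ok and its parameters are illustrative assumptions, not part of the card.

```python
def child_count_ok(num_children: int, B: int, is_root: bool = False) -> bool:
    """Check the child-count constraint for an internal node of a B-Tree."""
    # The root needs at least 2 children; any other internal node needs floor(B/2).
    lower = 2 if is_root else B // 2
    return lower <= num_children <= B


print(child_count_ok(4, 8))                # True: 4 lies in [4, 8]
print(child_count_ok(3, 8))                # False: below floor(8/2)
print(child_count_ok(2, 8, is_root=True))  # True: the root may have as few as 2
```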

Answer 2

If node p is the parent of node u in a B-Tree, then p stores a routing element. This routing element equals the smallest element stored in the leaf nodes of the subtree rooted at u. The routing element helps in navigating the tree efficiently during search operations.
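
The definition of the routing element can be sketched as follows, assuming leaves keep their elements sorted and children are ordered left to right. The Node class and the function name routing_element are hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    elements: list = field(default_factory=list)  # leaf: sorted data elements
    children: list = field(default_factory=list)  # internal: child nodes, left to right


def routing_element(u: Node):
    """Smallest element stored in the leaves of the subtree rooted at u."""
    while u.children:        # follow the leftmost path down to a leaf
        u = u.children[0]
    return u.elements[0]     # leaves are sorted, so the first element is the smallest


leaf_a = Node(elements=[5, 8])
leaf_b = Node(elements=[12, 20])
u = Node(children=[leaf_a, leaf_b])
print(routing_element(u))       # 5
print(routing_element(leaf_b))  # 12
```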

Answer 3

The first step is to find the leaf node u where the element e should be inserted. The goal is to locate the appropriate position for the new element in the tree.

Answer 4

The decision is typically made based on a comparison of the value of the new element e with the values stored in the nodes of the B-Tree. The search process aims to find the appropriate leaf node where the element should be inserted to maintain the order of elements in the tree.
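
One way to picture this descent, as a sketch only: at each internal node, follow the last child whose routing element is not greater than e. The Node class, the routing list, and find_leaf are assumed names, and the tree here has just one internal level.

```python
from bisect import bisect_right


class Node:
    def __init__(self, elements=None, routing=None, children=None):
        self.elements = elements or []  # data elements (leaf nodes only)
        self.routing = routing or []    # routing[i] = smallest element under children[i]
        self.children = children or []  # child nodes (internal nodes only)


def find_leaf(root, e):
    """Descend from the root to the leaf where e is stored or should be inserted."""
    u = root
    while u.children:
        # Last child whose routing element is <= e; fall back to the first child.
        i = max(bisect_right(u.routing, e) - 1, 0)
        u = u.children[i]
    return u


leaves = [Node(elements=[5, 8]), Node(elements=[12, 20]), Node(elements=[31, 40])]
root = Node(routing=[5, 12, 31], children=leaves)
print(find_leaf(root, 15).elements)  # [12, 20]
print(find_leaf(root, 3).elements)   # [5, 8]
```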

Answer 5

In the second step, the element e is added to the leaf node u. This step involves inserting the new element into the leaf node at the determined position.

Answer 6

Overflow occurs when the leaf node u, after inserting the new element e, has more than B elements. In this scenario, the node is considered to be overflowing, and further steps are taken to address the overflow situation.

Answer 7

If the leaf node u overflows, meaning it has more than B elements, it is split into two nodes u1 and u2. The elements are distributed evenly between u1 and u2. If u is the root, a new root is created with u1 and u2 as the child nodes. If u is not the root, and p is the parent of u, then u1 and u2 become the child nodes of p, effectively redistributing the elements and maintaining the balance of the B-Tree.
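
The insert-then-split step at the leaf level can be sketched as below, with a leaf modeled as a sorted Python list. The function name insert_into_leaf is an assumption, and attaching u1 and u2 to the parent p (or creating a new root) is left to the caller.

```python
from bisect import insort


def insert_into_leaf(leaf, e, B):
    """Insert e into a sorted leaf; split it evenly if it then holds more than B elements.

    Returns the list of leaves that should take the place of the original leaf.
    """
    insort(leaf, e)                   # keep the leaf sorted (modifies leaf in place)
    if len(leaf) <= B:
        return [leaf]                 # no overflow
    mid = len(leaf) // 2
    return [leaf[:mid], leaf[mid:]]   # u1 and u2, sizes differ by at most one


print(insert_into_leaf([5, 8, 12], 20, B=4))     # [[5, 8, 12, 20]]
print(insert_into_leaf([5, 8, 12, 20], 9, B=4))  # [[5, 8], [9, 12, 20]]
```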

Answer 8

The first step is to find the leaf node u where the element e is stored. The goal is to locate the node that contains the element to be deleted.

Answer 9

In the second step, the element e is removed from the leaf node u. This involves deleting the specified element from the leaf node.

Answer 10

Underflow occurs when the leaf node u, after deleting the element e, has fewer than B/2 elements. In this scenario, the node is considered to be underflowing, and further steps are taken to address the underflow situation.

Answer 11

If the leaf node u underflows, meaning it has fewer than B/2 elements, it is merged with a neighboring sibling. If the merged node has more than B elements, a split operation is performed. The child nodes of the parent p are adjusted accordingly to maintain the balance of the B-Tree.
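
A sketch of the leaf-level deletion logic, assuming for simplicity that the neighboring sibling is the right neighbor of u, so concatenating the two lists keeps them sorted. The name delete_from_leaf is hypothetical, and updating the parent p is left to the caller.

```python
def delete_from_leaf(leaf, right_sibling, e, B):
    """Remove e from leaf; on underflow merge with the right sibling, re-splitting if needed.

    Returns the leaves that should replace (leaf, right_sibling) under the parent.
    """
    leaf = [x for x in leaf if x != e]
    if len(leaf) >= B / 2:                  # no underflow: nothing else to do
        return [leaf, right_sibling]
    merged = leaf + right_sibling           # still sorted, since the sibling is to the right
    if len(merged) > B:                     # merged node overflows: split it evenly
        mid = len(merged) // 2
        return [merged[:mid], merged[mid:]]
    return [merged]


print(delete_from_leaf([5, 8], [12, 20, 31], 8, B=4))      # [[5, 12, 20, 31]]
print(delete_from_leaf([5, 8], [12, 20, 31, 40], 5, B=4))  # [[8, 12], [20, 31, 40]]
```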

Answer 12

Node p is removed if it is the root and has only one child node left. Additionally, if p is not the root but underflows during the deletion process, it is handled in a similar fashion, potentially leading to its removal. This ensures the maintenance of the B-Tree structure.

Answer 13

The adjustment of child nodes of the parent p involves updating the structure of the B-Tree to reflect changes in the child nodes. This adjustment is performed to maintain the balance and integrity of the B-Tree.

Answer 14

Each node in the B-Tree structure is stored in a block, and the size of the block (denoted as B) depends on the block size of the storage system. Each block contains a set of elements and pointers (links), where each pointer is a block address.

Answer 15

The figure represents a B-tree structure built on the pid attribute of the PROF table. It shows several levels of nodes (u1 to u8), where each node contains a set of elements. The elements represent values from the pid attribute, and each element stores a pointer to the block where the corresponding tuple with that pid resides.

Answer 16

The height of a B-tree is defined as the maximum level in the tree structure. In the provided example, the levels are denoted as Level 0, Level 1, and Level 2. The root of the tree is at Level 0, and the height is determined by the maximum level in the tree.
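
Under this definition the height can be computed by walking the tree and tracking the level, as in the sketch below. The dictionary-based tree and the function name height are assumptions for illustration.

```python
def height(node, level=0):
    """Maximum level in the subtree, counting the root as Level 0."""
    children = node.get("children", [])
    if not children:
        return level                 # a leaf contributes its own level
    return max(height(child, level + 1) for child in children)


leaf = {"elements": [1, 2]}
tree = {"children": [{"children": [leaf, leaf]}, {"children": [leaf, leaf]}]}
print(height(tree))  # 2: the levels are 0, 1, and 2
```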

Answer 17

In a B-tree, all elements are stored only in leaf nodes. Leaf nodes are the nodes at the bottom level of the tree that contain the actual data or values.

Answer 18

The significance of the root being at Level 0 is that it is the topmost node in the hierarchy. Levels are numbered downward from the root, so the leaf nodes sit at the highest-numbered level, and the root serves as the starting point for traversing the B-tree.

Answer 19

Most RDBMS automatically build an index for each primary key. This ensures efficient retrieval and management of records based on the primary key.

Answer 20

An index is used to enforce the uniqueness constraint in an RDBMS. When the UNIQUE keyword is specified for an attribute, the RDBMS creates an index for that attribute to ensure that no duplicate values are allowed.
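
The behavior can be observed with Python's built-in sqlite3 module, as in this sketch. SQLite is used only because it ships with Python, and the email column and try_insert helper are hypothetical. Other systems behave analogously, but details such as index names differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the email column is invented for this example.
conn.execute("CREATE TABLE prof (pid INTEGER, email TEXT UNIQUE)")
conn.execute("INSERT INTO prof VALUES (1, 'a@example.org')")


def try_insert(pid, email):
    """Return True if the row was inserted, False if the UNIQUE constraint rejected it."""
    try:
        conn.execute("INSERT INTO prof VALUES (?, ?)", (pid, email))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(2, "b@example.org"))  # True
print(try_insert(3, "a@example.org"))  # False: duplicate email

# Indexes on prof that SQLite marks as unique (created automatically for UNIQUE).
unique_indexes = [row[1] for row in conn.execute("PRAGMA index_list('prof')") if row[2] == 1]
print(len(unique_indexes))  # 1
```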

Answer 21

Example: create index prof_index on prof (sal). In this example, sal is a non-key attribute for which an index named prof_index is created.
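
The statement can be tried out with Python's built-in sqlite3 module, as sketched below. The name column and the sample rows are assumptions, since the card names only pid and sal. The query plan text is printed without an expected value because its wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and rows: only pid and sal appear in the card.
conn.execute("CREATE TABLE prof (pid INTEGER PRIMARY KEY, name TEXT, sal INTEGER)")
conn.executemany("INSERT INTO prof VALUES (?, ?, ?)",
                 [(1, "Ada", 9000), (2, "Bob", 7000), (3, "Cy", 9000)])

conn.execute("create index prof_index on prof (sal)")

index_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'prof'")]
print(index_names)  # ['prof_index']

# Ask SQLite how it would evaluate a lookup on sal.
for row in conn.execute("EXPLAIN QUERY PLAN SELECT name FROM prof WHERE sal = 9000"):
    print(row[-1])
```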

Answer 22

A multi-column index is used to index multiple columns together, allowing for efficient retrieval based on the combined values of those columns. It is especially useful for queries that involve conditions on multiple attributes.

Answer 23

The ordering of columns matters when creating a multi-column index. For example, create index CFIndex on staff (floor, room) is different from create index CFIndex on staff (room, floor). The order determines how entries are sorted in the index, and therefore which queries it can serve efficiently: an index on (floor, room) helps with conditions on floor alone or on floor and room together, but is of little help for a condition on room alone.
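
The effect of column order can be illustrated without a database by sorting tuples, since a multi-column index keeps its entries sorted by the first column and then the second. In this sketch the sample data and the helper prefix_range are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical (floor, room) pairs from a staff table.
staff = [(1, 101), (2, 101), (1, 102), (3, 101), (2, 103)]

by_floor_room = sorted(staff)                                   # entry order for (floor, room)
by_room_floor = sorted((room, floor) for floor, room in staff)  # entry order for (room, floor)


def prefix_range(index, first_value):
    """Entries whose first column equals first_value, located by binary search."""
    lo = bisect_left(index, (first_value,))
    hi = bisect_right(index, (first_value, float("inf")))
    return index[lo:hi]


print(prefix_range(by_floor_room, 2))    # [(2, 101), (2, 103)]
print(prefix_range(by_room_floor, 101))  # [(101, 1), (101, 2), (101, 3)]

# In the (room, floor) order, the entries for floor 2 are not adjacent,
# so finding them means scanning the whole index.
print([i for i, (room, floor) in enumerate(by_room_floor) if floor == 2])  # [1, 4]
```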

Answer 24

A clustered index requires the data records to be physically sorted on the disk based on the indexed column. This physical ordering of data provides certain performance advantages for specific query types.

Answer 25

While a clustered index can offer quicker query performance due to the physical ordering of data, it comes with higher update overhead. This is because an insertion, a deletion, or a change to the indexed column may require records to be physically moved to maintain the sorted order.

Answer 26

A clustered index is typically considered for the primary key of a table. It is used when the physical ordering of data based on the primary key can significantly benefit certain query types.

Answer 27

A clustered index might be chosen when certain query types, especially those involving range queries, benefit from the physical sorting of data on the disk. However, this decision comes with the trade-off of higher update overhead.
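
The benefit for range queries can be illustrated with a toy model that counts how many data blocks hold the matching records. This is a simplified sketch: the layouts, the block capacity of 10 records, and the function blocks_touched are assumptions, and index lookups themselves are ignored.

```python
def blocks_touched(layout, lo, hi, per_block):
    """Count distinct blocks holding a key in [lo, hi]; layout lists keys in physical order."""
    return len({pos // per_block for pos, key in enumerate(layout) if lo <= key <= hi})


keys = list(range(100))
clustered = keys                            # records physically sorted by key
scattered = [(k * 37) % 100 for k in keys]  # fixed permutation standing in for unsorted storage

print(blocks_touched(clustered, 20, 29, per_block=10))  # 1
print(blocks_touched(scattered, 20, 29, per_block=10))  # 9
```

With the sorted layout the ten matching records share one block, while in the unsorted layout they are spread over nine blocks, which is the kind of gap that makes clustering attractive for range queries.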

Answer 28

Big-O notation is used to describe the performance characteristics of algorithms and data structures. In the context of clustered vs non-clustered indexes, different query types may have different Big-O complexities based on the choice of index. Understanding these complexities is essential for optimizing database performance for specific types of queries.

Answer 29

The choice between clustered and non-clustered indexes is a frequent interview question because it requires a deep understanding of database performance considerations. Interviewers may ask candidates about the benefits, drawbacks, and scenarios where each type of index is appropriate to assess their knowledge of database optimization.

(53 cards)