Indexes Flashcards
(10 cards)
What is the difference between a Heap table and a Clustered table?
A Heap table is a table in which, the data rows are not stored in any particular order within each data page. In addition, there is no particular order to control the data page sequence, that is not linked in a linked list. This is due to the fact that the heap table contains no clustered index.
A clustered table is a table that has a predefined clustered index on one column or multiple columns of the table that defines the storing order of the rows within the data pages and the order of the pages within the table, based on the clustered index key.
Explain how the SQL Server Engine uses an Index Allocation Map (IAM)?
IAM used with heap table to keep entry for each data page to track the allocation of the available pages, it’s the logical connection between the data pages that the SQL server engine used to move through the heap.
The heap is composed of IAM that points to all pages within the heap
What is the “Forwarding Pointers issue” and how can we fix it?
When a data modification operation is performed on heap table data pages, Forwarding Pointers will be inserted into the heap to point to the new location of the moved data. These forwarding pointers will cause performance issues over time due to visiting the old/original location vs the new location specified by the forwarding pointers to get a specific value.
Starting from SQL Server version 2008, a new method was introduced to overcome the forwarding pointers performance issue, by using the ALTER TABLE REBUILD command, that will rebuild the heap table.
What is a SQL Server Index?
Indexes are created to speed up the data retrieval and the query processing operations from a database table or view, by providing swift access to the database table rows, without the need to scan all the table’s data, in order to retrieve the requested data.
Describe the structure of a SQL Server Index that provides faster access to the table’s data?
- Index is created using the shape of B-tree, that made up of 8k pages size.
- Each page called index node.
- The B-tree provide a fast way to move through the table rows based on index key.
- the B-tree structure of the index consists of three main levels:
1- Root level: the top node that contains a single index page, from it sql server start its search.
2- Leaf level: the bottom level of the tree that contains the data pages of the actual table rows.
3- Intermediate level: one or multiple levels between root and leaf levels, that holds the index value and the pointer to the next intermediate level or to the leaf node.
Explain Index Depth, Density and Selectivity factors and how these factors affect index performance?
- Index depth is the number of levels from the index root node to the leaf nodes. An index that is quite deep will suffer from slow performance issue. An index with only 3 to 4 levels is very common.
- Index density is a measure of the lack of uniqueness of the data in a table. A dense column is one that has a high number of duplicates.
- Index selectivity is a measure of how many rows scanned compared to the total number of rows. An index with high selectivity means a small number of rows scanned when related to the total number of rows.
Why it is not recommended to create indexes on small tables?
It takes the SQL Server Engine less time scanning the underlying table than traversing the index when searching for specific data. In this case, the index will not be used but it will still negatively affect the performance of data modification operations, as it will be always adjusted when modifying the underlying table’s data.
What are some different ways to create an index?
- CREATE INDEX T-SQL statement.
- Using SQL Server Management Studio.
- Indirectly by defining the PRIMARY KEY and the UNIQUE constraint within the CREATE TABLE or ALTER TABLE statements.
How many Clustered indexes can be created on a table and why?
only one Clustered index per each table, as the data can be sorted in the table using only one order criteria.
Describe the characteristics ideal Clustered index keys.
- Short: Although SQL Server allows us to add up to 16 columns to the clustered index key, with a maximum key size of 900 bytes, the typical clustered index key is much smaller than what is allowed, with as few columns as possible. A wide Clustered index key will also affect all non-clustered indexes built over that clustered index, as the clustered index key will be used as a lookup key for all the non-clustered indexes pointing to it.
- Static: It is recommended to choose columns that are not changed frequently in a clustered index key. Changing the clustered index key values means that the whole row will be moved to the new proper page to keep the data values in the correct order.
- Increasing: Using an increasing (aka incrementing) column, such as the IDENTITY column, as a clustered index key will help in improving the INSERT process, that will directly insert the new values at the logical end of the table. This highly recommended choice will help in reducing the amount of memory required for the page buffers, minimize the need to split the page into two pages to fit the newly inserted values and the fragmentation occurrence, that required rebuilding or reorganizing the index again.
- Unique: It is recommended to declare the clustered index key column or combination of columns as unique to improve query performance. Otherwise, SQL Server will automatically add a uniqueifier column to enforce the clustered index key uniqueness.
- Accessed frequently: This is due to the fact that the rows will be stored in the clustered index in a sorted order based on that index key that is used to access the data.
- Used in the ORDER BY clause: In this case, there no need for the SQL Server Engine to sort the data in order to display it, as the rows are already sorted based on the index key used in the ORDER BY clause.