Indexing Flashcards

1
Q

What were the older methods used ?

A

we need a more efficient way of using indexes to retrieve data.

Hash: uses hash function of a set of hash fields. Allows direct access if hash fields are known.

Heap: No specific structure, uses linear search.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is an index ?

A

It is a data structure that allows DBMS to locate particular records in a file in less time.
Faster response to user queries
Can speed up retrieval of records if requirements are on the condition that it to be met.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is an index access ?

A

Is associated with particular search key
and has records consisting of a key value and address of logical record in file CONTAINING key value.

VALUES in index are usually sorted/ordered acc to indexing field.

When index is ordered we ca perform BINARY SEARCH.

we can have more than one index field.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are the two types of files ?

A

DATA:contains logical records

INDEX: Contains index records

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are the characteristics of indexes ?

A

Access types: the methods that can be supported efficiently.

Access time: the time needed to locate result set.

Insertion/deletion efficiency: how fast it can be done.

Storage overhead: additional storage required in index structure.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is primary index ?

A

If data is sequentially ordered and indexing field is a KEY FIELD TO THE FILE (unique).

Built for a data file sorted on its key field.

Index file is a sorted file whose records are fixed in length and has two fields:
First field is same data type as ORDEREING KEY FIElD -PRIMARY KEY
Second field is a pointer to a disk block.

One entry for each block in the data file.

First record in each block is called ANCHOR RECORD.

SPARSE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How is the performance of the primary index ?

A

Requires significantly FEWER BLOCKS , smaller in size that data field record

Binary search on index file requires fewer block accesses than data file.

Insertion and deletion is problematic and you have to change index entries,

Storage overhead is not a serious problem.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is clustering indexes ?

A

If the data file is sequentially ordered on anon-key files and indexing filed corresponds to non-key.

Built for a data file sorted on a non-key field.

Index file is another sorted file whose records are fixed length with two fields:
first is of same data type as clustering field of data file.
Second is a pointer to disk blocks.

One entry for each distinct value and a pointer to the first block in data file that holds at least one record with the value of clustering field.

SPARSE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How is the performance of clustering index ?

A

Requires significantly FEWER BLOCKS , smaller in size that data field record

Binary search on index file requires fewer block accesses than data file.

Insertion and deletion is problematic. We have to move records in data file

Storage overhead is not a problem.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is secondary index ?

A

An index defined on NON-ORDERING file of the data file.
A file can have one physical ordering field. It can either have primary / clustering not both.

It can have several secondary indexes as it doesn’t have any physical affect.

INdex file is SORTED file whose records are fixed/ variable in length with two fields:
First is same data type as indexing field
Second is a pointer to a disk block.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Define the case 1 of secondary index and its performance

A

This uses a DENSE(one entry for every record in data file) secondary index that maps to all records in a file.

When indexing filed is NOT ordering field then secondary index is on it where index field- SECONDARY INDEX.

One entry in index file for each entry in data file.

Each entry contains value of 2nd key for record and a pointer to the block/ record.
There may be DUPLICATE VLAUES IN INDEX FIELD.

BINARY SEARCH.

Needs more storage space and longer search items.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Define case 2 OF SECONDARY INDEX

A

Many records in data file have same value for indexing field.

User variable length records to hold an array of block pointers associated with indexing value field.

Use a single entry for each indexing field value. Extra level of redirection - multiple pointer.

SPARSE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is sparse and dense index ?

A

Sparse- has an index record for some of the search key values.

Dense: has an index record for every search key value.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are multi-level indexes ?

A

When an index file becomes large the search time increases, that is what this index tries to fix by :

treating index like any other file.

Split index into number of smaller indexes.

Maintain an index to the indexes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

How is the performance of multi-level ?

A

Search performance increases when searching for specific indexing field value.

Problems with insertion and deletion- for this problem SOME SPACE IN EACH BLOCK IS LEFT FOR INSERTING NEW ENTRIES.
This technique is called DYNAMIC MULTI-LEVEL and implemented by BALANCED TREE(B and b+ trees)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Explain the tree-data structure.

A

DEPTH of the tree is the maximum no of levels b/w ROOT AND LEAF NODE in tree.

If depth from root to leaf is same to each leaf- BALANCED TREE OR B-TREE.

DEGREE(ORDER)- max number of children allowed per parent.

One more than max number of key values per node.

Access time depends on depth. It is better to be a leafy shallow tree.

When node reaches max size the median is promoted to higher node and left & right tress are split surrounding the median.

17
Q

What are search trees ?

A

A special type of tree- search of a record.

Each block of entries-NODE

node- certain no of pointer and key values.

index value fields-GUIDES US TO THE NEXT NODE until we reach data block with our record.

Using a pointer will IGNORE ALL OTHER NODES that are not in the sub-tree.

18
Q

Difference between B- and B+ trees

A

B+ tree doesn’t allow storage of indexes anywhere else other than the leaves.

INternal nodes in B_ tree-indexes and pointers to other indexes.

19
Q

How is the performance of these trees ?

A

SEARCH takes more time in B_ tress because keys are not just available on the leaves.

B+ tree can maintain duplicates.

B-Tree insertion takes more time while B+ takes same time always.

Deletion of B_ tree is complex while B+ is easy as all indexes are at the leaves.

B_tree no redundant search keys but B+ may have it.