2017 Level 6 Exam Flashcards


What is data redundancy, and which characteristics of the file system can lead to it?

Data redundancy exists when unnecessarily duplicated data are found in the database, e.g., a customer's phone number stored in the customer file, the sales agent file, and the invoice file. Data redundancy is symptomatic of a (computer) file system, given its inability to represent and manage data relationships. Data redundancy may also be the result of poorly designed databases that allow the same data to be kept in different locations.


What is data independence, and why is it lacking in file systems?

Data independence exists when the programs that access the data are not dependent on the data storage characteristics of the data. File systems lack data independence because all data access programs are subject to change when any of the file system's data storage characteristics (such as a data type) change.


What is a DBMS, and what are its functions?

A DBMS is a collection of programs that manage the database structure and that control shared access to the data in the database.
The functions of a DBMS:

- The DBMS stores the definitions of data and their relationships (metadata) in a data dictionary; any changes made are automatically recorded in the data dictionary.

- The DBMS creates the complex structures required for data storage.

- The DBMS transforms entered data to conform to those data storage structures.

- The DBMS creates a security system and enforces security within that system.

- The DBMS creates complex structures that allow multiple-user access to the data.

- The DBMS performs backup and data recovery procedures to ensure data safety.

- The DBMS promotes and enforces integrity rules to eliminate data integrity problems.

- The DBMS provides access to the data via utility programs and programming language interfaces.

- The DBMS provides end-user access to data within a computer network environment.


What is structural independence, and why is it important?

Structural independence exists when data access programs are not subject to change when the file's structural characteristics, such as the number or order of the columns in a table, change. Structural independence is important because it decreases programming effort and program maintenance costs.


Explain the difference between data and information.

Data are raw facts. Information is data processed to reveal the meaning behind those facts.

- Data are the building blocks of information.

- Information is produced by processing data.

- Information is used to reveal the meaning of data.

- Good, relevant, and timely information is the key to good decision making.


What is the role of a DBMS, and what are its advantages? What are its disadvantages?

A database management system (DBMS) is a collection of programs that manages the database structure and controls access to the data stored in the database.
Advantages:
- Improved data sharing.
- Better data integration.
- Minimized data inconsistency.
- Improved data access.
- Improved decision making.
- Increased end-user productivity.

Disadvantages:
- Increased costs.
- Management complexity.
- Maintaining currency.
- Vendor dependence.
- Frequent upgrade/replacement cycles.


What are the main components of a database system?

hardware, software, people, procedures, and data.


What are metadata?

Metadata are data about data; e.g., metadata define data characteristics such as the data type (character or numeric) and the relationships that link the data.


Explain why database design is important.

Even a good application cannot overcome a badly designed database.


What are some basic database functions that a spreadsheet cannot perform?

Spreadsheets do not support self-documentation through metadata, enforcement of data types or
domains to ensure consistency of data within a column, defined relationships among tables, or
constraints to ensure consistency of data across related tables.
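The domain enforcement described above can be sketched with Python's built-in sqlite3 module. The product table, its columns, and the price rule are invented for illustration; the point is that the DBMS, unlike a spreadsheet, rejects values outside a column's domain:

```python
import sqlite3

# Hypothetical table: a CHECK constraint defines the domain of prod_price.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE product (
    prod_code  TEXT PRIMARY KEY,
    prod_price REAL NOT NULL CHECK (prod_price >= 0))""")

con.execute("INSERT INTO product VALUES ('P1', 12.50)")   # within the domain
try:
    con.execute("INSERT INTO product VALUES ('P2', -5.0)")  # violates the CHECK
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

A spreadsheet cell would happily accept the negative price; here the constraint is enforced for every row, regardless of which program inserts the data.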


What common problems do a collection of spreadsheets created by end users share with the
typical file system?

A collection of spreadsheets shares several problems with the typical file system. End users create their own copies of the data, which creates issues of data ownership. This situation also creates islands of information in which changes to one set of data are not reflected in all of the copies of the data, leading to a lack of data consistency. Because the data in the various spreadsheets may be intended to represent a view of the business environment, a lack of consistency in the data may lead to faulty decision making based on inaccurate data.


What is a business rule, and what is its purpose in data modeling?

A business rule is a brief, precise, and unambiguous description of a policy, procedure, or principle
within a specific organization’s environment.
They are used to define entities, attributes, relationships, and constraints.

Because these components form the basis for a database design, the careful derivation and
definition of business rules is crucial to good database design.


What is a table, and what role does it play in the relational model?

In the relational model, designers and users perceive a table to be a matrix consisting of a series of row/column intersections. Tables, also called relations, are related to each other by sharing a common entity characteristic. Tables are used to describe the entities they represent; they provide an easy way to summarize entity characteristics and relationships among entities. And, because they are purely conceptual constructs, the designer does not need to be concerned about the physical implementation aspects of the database design.


What is a relational diagram? Give an example.

A relational diagram is a visual representation of the relational database's entities, the attributes within those entities, and the relationships between those entities. For example, a relational diagram of CUSTOMER and INVOICE would show each table with its attributes and a line connecting the CUSTOMER primary key to the matching foreign key in INVOICE.


What is the difference between a database and a table?

A table is a logical structure that represents an entity set and is only one of the components of a database.

The database is a structure that houses one or more tables and metadata. Metadata include the data (attribute) characteristics and the relationships between the entity sets.


What does it mean to say that a database displays both entity integrity and referential integrity?

Entity integrity describes a condition in which all rows (tuples) within a table are uniquely identified by their primary key. The unique value requirement prohibits a null primary key value, because nulls are not unique.

Referential integrity means that the foreign key in any referencing table must always refer to a valid row in the referenced table. Referential integrity ensures that the relationship between two tables remains synchronized during updates and deletes.
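Both rules can be sketched with Python's built-in sqlite3 module. The customer/invoice tables and all values are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

con.execute("CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_name TEXT)")
con.execute("""CREATE TABLE invoice (
    inv_number INTEGER PRIMARY KEY,
    cus_code   INTEGER NOT NULL REFERENCES customer(cus_code))""")
con.execute("INSERT INTO customer VALUES (10010, 'Ramas')")

# Entity integrity: a duplicate primary key value is rejected.
try:
    con.execute("INSERT INTO customer VALUES (10010, 'Dunne')")
except sqlite3.IntegrityError as e:
    print("entity integrity:", e)

# Referential integrity: a foreign key must point at an existing row.
con.execute("INSERT INTO invoice VALUES (1001, 10010)")      # valid reference
try:
    con.execute("INSERT INTO invoice VALUES (1002, 99999)")  # no such customer
except sqlite3.IntegrityError as e:
    print("referential integrity:", e)
```

Note the PRAGMA: SQLite in particular does not enforce foreign keys unless it is set for the connection.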


Why are entity integrity and referential integrity important in a database?

Entity integrity and referential integrity are important because they are the basis for expressing and implementing relationships in the entity relationship model. Entity integrity ensures that each row is uniquely identified by the primary key. Therefore, entity integrity means that a proper search for an existing tuple (row) will always be successful. (And the failure to
find a match on a row search will always mean that the row for which the search is conducted does
not exist in that table.)

Referential integrity means that, if the foreign key contains a value, that value refers to an existing valid tuple (row) in another relation. Therefore, referential integrity ensures that it is impossible to assign a non-existing foreign key value to a table.


What two conditions must be met before an entity can be classified as a weak entity? Give an example of a weak entity.

1. The entity must be existence-dependent on its parent entity.
2. The entity must inherit at least part of its primary key from its parent entity.
For example, the (strong) relationship depicted in the text’s Figure 4.10 shows a weak CLASS entity:
CLASS is clearly existence-dependent on COURSE. (You can’t have a database class unless a
database course exists.)
The CLASS entity’s PK is defined through the combination of CLASS_SECTION and
CRS_CODE. The CRS_CODE attribute is also the PK of COURSE.


What is a strong (or identifying) relationship, and how is it depicted in a Crow’s Foot ERD?

A strong relationship exists when an entity is existence-dependent on another entity and inherits at least part of its primary key from that entity. In a Crow's Foot ERD, the strong relationship is shown as a solid line. In other words, a strong relationship exists when a weak entity is related to its parent entity.


What is a recursive relationship? Give an example.

A recursive relationship exists when an entity is related to itself. For example, a COURSE may be a
prerequisite to a COURSE.


What is an entity supertype, and why is it used?

An entity supertype is a generic entity type that is related to one or more entity subtypes, where the
entity supertype contains the common characteristics and the entity subtypes contain the unique
characteristics of each entity subtype. The reason for using supertypes is to minimize the number of nulls and to minimize the likelihood of redundant relationships.


What kinds of data would you store in an entity subtype?

An entity subtype is a more specific entity type that is related to an entity supertype, where the entity supertype contains the common characteristics and the entity subtypes contain the unique characteristics of each entity subtype. The entity subtype stores the data that are specific to the entity; that is, attributes that are unique to the subtype.


What is a specialization hierarchy?

A specialization hierarchy depicts the arrangement of higher-level entity supertypes (parent entities) and lower-level entity subtypes (child entities).
E.g., an EMPLOYEE supertype with PILOT, MECHANIC, and ACCOUNTANT subtypes.


What is a subtype discriminator? Give an example of its use.

A subtype discriminator is the attribute in the supertype entity that is used to determine to which entity subtype the supertype occurrence is related. For any given supertype occurrence, the value of the subtype discriminator will determine which subtype the supertype occurrence is related to. For example, an EMPLOYEE supertype may include the EMP_TYPE value “P” to indicate the PROFESSOR subtype.


What is an overlapping subtype? Give an example.

Overlapping subtypes are subtypes that contain non-unique subsets of the supertype entity set; that is, each entity instance of the supertype may appear in more than one subtype. For example, at a university a person may be an employee, a student, or both. In turn, an employee may be a professor as well as an administrator.

Overlapping subtypes are signalled by the letter O in diagrams.


What is the difference between partial completeness and total completeness?

Partial completeness means that not every supertype occurrence is a member of a subtype; there can be some supertype occurrences that are not members of any subtype. Total completeness means that every supertype occurrence must be a member of at least one subtype.


Under what circumstances are composite primary keys appropriate?

Composite primary keys are most useful in two cases:

- As identifiers of composite entities, where each primary key combination is allowed only once in the M:N relationship.

- As identifiers of weak entities, where the weak entity has a strong identifying relationship with the parent entity.
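The first case can be sketched with Python's built-in sqlite3 module, using an invented student/class/enroll schema as the composite (bridge) entity of an M:N relationship:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE student (stu_num INTEGER PRIMARY KEY);
CREATE TABLE class   (class_code TEXT PRIMARY KEY);
CREATE TABLE enroll (
    stu_num      INTEGER REFERENCES student(stu_num),
    class_code   TEXT    REFERENCES class(class_code),
    enroll_grade TEXT,
    PRIMARY KEY (stu_num, class_code));  -- each (student, class) pair only once

INSERT INTO student VALUES (321452);
INSERT INTO class   VALUES ('10014');
INSERT INTO enroll  VALUES (321452, '10014', 'A');
""")

try:
    # The same student/class combination a second time violates the composite PK.
    con.execute("INSERT INTO enroll VALUES (321452, '10014', 'B')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```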


What is a surrogate primary key, and when would you use one?

A surrogate primary key is an "artificial" PK that is used to uniquely identify each entity occurrence when there is no good natural key available or when the "natural" PK would include multiple attributes. A surrogate PK is also used if the natural PK would be a long text value. Surrogate PKs are used to ensure entity integrity, to simplify application development (by making queries simpler), and to ensure query efficiency.
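A DBMS-generated surrogate key can be sketched with Python's built-in sqlite3 module. In SQLite, an INTEGER PRIMARY KEY column is auto-assigned a unique integer when no value is supplied; the property table and its columns are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE property (
    prop_id INTEGER PRIMARY KEY,   -- surrogate key: no business meaning
    address TEXT NOT NULL,
    sqft    REAL)""")

cur = con.execute("INSERT INTO property (address, sqft) VALUES ('155 Main St', 1200)")
print(cur.lastrowid)  # 1 -- the DBMS generated the key, not the application
cur = con.execute("INSERT INTO property (address, sqft) VALUES ('12 Oak Ave', 850)")
print(cur.lastrowid)  # 2
```

The application never composes or supplies the key value, which is exactly what makes surrogate-key queries simple and uniform.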


What is the most common design trap, and how does it occur?

The most common design trap is known as a fan trap. A fan trap occurs when you have one entity in two 1:M relationships to other entities, thus producing an association among the other entities that is not expressed in the model.


What is normalization?

Normalization is the process for assigning attributes to entities. Properly executed, the
normalization process eliminates uncontrolled data redundancies, which eliminates the data
anomalies and the data integrity problems that are produced by those redundancies.
Normalization does not eliminate data redundancy; instead, it produces the carefully controlled redundancy that lets us properly link database tables.


When is a table in 1NF?

A table is in 1NF when:

- all key attributes are defined,
- there are no repeating groups (each row/column intersection contains a single value), and
- all attributes are dependent on the primary key.

A 1NF table may still contain partial dependencies.


When is a table in 2NF?

A table is in 2NF when:

- it is in 1NF, and
- it contains no partial dependencies (these are removed by placing each partial dependency in its own table, with its determinant as that table's primary key).

A 2NF table may still contain transitive dependencies, i.e., dependencies based on attributes that are not part of the primary key.


When is a table in 3NF?

A table is in 3NF when it is in 2NF and it contains no transitive dependencies. This is achieved by moving each transitive dependency (a non-key attribute determined by another non-key attribute) into its own table.
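The 2NF-to-3NF step can be sketched with Python's built-in sqlite3 module, using an invented employee/job example: in the unnormalized table, chg_hour would depend on job_class rather than on the PK emp_num (a transitive dependency), so (job_class, chg_hour) moves into its own table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- 3NF decomposition: chg_hour is determined by job_class alone,
-- so it is stored once per job class, not once per employee.
CREATE TABLE job (
    job_class TEXT PRIMARY KEY,
    chg_hour  REAL NOT NULL);
CREATE TABLE employee (
    emp_num   INTEGER PRIMARY KEY,
    emp_name  TEXT NOT NULL,
    job_class TEXT NOT NULL REFERENCES job(job_class));

INSERT INTO job VALUES ('EE', 84.50), ('CT', 60.50);
INSERT INTO employee VALUES (101, 'John News', 'EE'), (102, 'David Senior', 'EE');
""")

# The original view of the data is recovered with a join:
row = con.execute("""SELECT e.emp_name, j.chg_hour
                     FROM employee e JOIN job j USING (job_class)
                     WHERE e.emp_num = 101""").fetchone()
print(row)  # ('John News', 84.5)
```

Updating a job class's hourly charge now touches one row in job instead of every matching employee row, which is the anomaly the decomposition removes.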


What is an information system? What is its purpose?

An information system is a system that

- provides the conditions for data collection, storage, and retrieval
- facilitates the transformation of data into information
- provides management of both data and information.

An information system is composed of hardware, software (DBMS and applications), the database(s), procedures, and people.

The purpose of an information system is to facilitate good decision making by making relevant and timely information available to the decision makers.


What is the minimal data rule in conceptual design? Why is it important?

The minimal data rule specifies that all the data defined in the data model are actually required to fit present and expected future data requirements. This rule may be phrased as "All that is needed is there, and all that is there is needed."


What are business rules? Why are they important to a database designer?

Business rules are narrative descriptions of the business policies, procedures, or principles that are derived from a detailed description of operations. Business rules are particularly valuable to database designers, because they help define:

- Entities
- Attributes
- Relationships (1:1, 1:M, M:N, expressed through connectivities and cardinalities)
- Constraints

To develop an accurate data model, the database designer must have a thorough and complete understanding of the organization's data requirements. The business rules are very important to the
designer because they enable the designer to fully understand how the business works and what role
is played by data within company operations.


Explain the following statement: a transaction is a logical unit of work.

A transaction is a logical unit of work that must be entirely completed or aborted; no intermediate states are accepted. In other words, a transaction, composed of several database requests, is treated by the DBMS as a single unit of work in which all transaction steps must be fully completed if the transaction is to be accepted by the DBMS. Acceptance of an incomplete transaction would yield an inconsistent database state. To avoid such a state, the DBMS ensures that all of a transaction's database operations are completed before they are committed to the database.
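The all-or-nothing property can be sketched with Python's built-in sqlite3 module, using an invented funds transfer between two accounts. The CHECK constraint makes the second step fail, and the rollback undoes the first step too:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (acc_num TEXT PRIMARY KEY, balance REAL CHECK (balance >= 0))")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100.0), ("B", 50.0)])
con.commit()

try:
    con.execute("UPDATE account SET balance = balance + 200 WHERE acc_num = 'B'")  # step 1 succeeds
    con.execute("UPDATE account SET balance = balance - 200 WHERE acc_num = 'A'")  # step 2 overdraws A
    con.commit()
except sqlite3.IntegrityError:
    con.rollback()  # step 1 is undone as well: all or nothing

rows = con.execute("SELECT acc_num, balance FROM account ORDER BY acc_num").fetchall()
print(rows)  # [('A', 100.0), ('B', 50.0)]
```

Had the DBMS kept the half-applied transfer, money would have been created out of nothing; the rollback returns the database to its prior consistent state.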


What is a consistent database state, and how is it achieved?

A consistent database state is one in which all data integrity constraints are satisfied. To achieve a consistent database state, a transaction must take the database from one consistent state to another.


List and discuss the five transaction properties.

Atomicity requires that all parts of a transaction must be completed or the transaction is aborted. This property ensures that the database will remain in a consistent state.

Consistency indicates the permanence of the database's consistent state: a transaction takes the database from one consistent state to another.

Isolation means that the data required by an executing transaction cannot be accessed by any other transaction until the first transaction finishes. This property ensures data consistency for concurrently executing transactions.

Durability indicates that the database will remain in a permanent consistent state after the execution of a transaction. In other words, once a consistent state is reached, it cannot be lost.

Serializability means that a series of concurrent transactions will yield the same result as if they were executed one after another.

All five transaction properties work together to make sure that a database maintains data integrity
and consistency for either a single-user or a multi-user DBMS.


What does serializability of transactions mean?

Serializability of transactions means that a series of concurrent transactions will yield the same result as if they were executed one after another.


What is a scheduler, what does it do, and why is its activity important to concurrency control?

The scheduler is the DBMS component that establishes the order in which concurrent database
operations are executed. The scheduler interleaves the execution of the database operations
(belonging to several concurrent transactions) to ensure the serializability of transactions. In other
words, the scheduler guarantees that the execution of concurrent transactions will yield the same
result as though the transactions were executed one after another. The scheduler is important because
it is the DBMS component that ensures transaction serializability. In other words, the scheduler
allows the concurrent execution of transactions, giving end users the impression that they are the DBMS's only users.


What is a lock, and how, in general, does it work?

A lock is used in concurrency control to guarantee the exclusive use of a data element to the transaction that owns the lock. For example, if the data element X is currently locked by transaction T1, transaction T2 will not have access to the data element X until T1 releases its lock.

A data item can be in only one of two states: locked (being used by some transaction) or unlocked (not in use by any transaction). To access a data element X, a transaction T1 must first request a lock from the DBMS. If the data element is not in use, the DBMS will lock X for T1's exclusive use. No other transaction will have access to X while T1 executes.
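The same request/hold/release cycle can be sketched with Python's threading.Lock as a stand-in for a DBMS lock manager; the shared variable x plays the role of data element X, and the transaction function and counts are invented for illustration:

```python
import threading

x = 0
x_lock = threading.Lock()  # guards the data element x

def transaction(amount, times):
    global x
    for _ in range(times):
        with x_lock:      # request the lock; blocks while another transaction holds it
            x += amount   # exclusive use of x while the lock is held

t1 = threading.Thread(target=transaction, args=(1, 100_000))
t2 = threading.Thread(target=transaction, args=(1, 100_000))
t1.start(); t2.start(); t1.join(); t2.join()
print(x)  # 200000 -- without the lock, interleaved updates could lose increments
```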


What are the different levels of lock granularity?

Lock granularity refers to the size of the database object that a single lock is placed upon. It can be:

- Database-level, meaning the entire database is locked by one lock.

- Table-level, meaning a table is locked by one lock.

- Page-level, meaning a disk page is locked by one lock.

- Row-level, meaning one row is locked by one lock.

- Field-level, meaning one field in one row is locked by one lock.


What is concurrency control, and what is its objective?

Concurrency control is the act of coordinating the simultaneous execution of transactions in a multiprocessing or multiuser database management system.

The objective of concurrency control is
to ensure the serializability of transactions in a multiuser database management system. (The
DBMS's scheduler is in charge of maintaining concurrency control.)

Because it helps to guarantee data integrity and consistency in a database system, concurrency control is one of the most critical activities performed by a DBMS. If concurrency control is not maintained, three serious problems may be caused by concurrent transaction execution: lost updates, uncommitted data, and inconsistent retrievals.


What is a deadlock, and how can it be avoided? Discuss several strategies for dealing with deadlocks.

Although locks prevent serious data inconsistencies, their use may lead to two major problems:

1. The transaction schedule dictated by the locking requirements may not be serializable, thus
causing data integrity and consistency problems.

2. The schedule may create deadlocks. Database deadlocks are the equivalent of a traffic gridlock in a big city and are caused by two transactions waiting for each other to unlock data.

Three strategies can be used to deal with deadlocks:

- Deadlock prevention: a transaction requesting a new lock is aborted (and rescheduled) if there is a possibility that a deadlock could occur.
- Deadlock detection: the DBMS periodically tests the database for deadlocks; if one is found, a "victim" transaction is aborted and rolled back so the other can continue.
- Deadlock avoidance: a transaction must obtain all of the locks it needs before it can execute, avoiding the rollback of conflicting transactions.
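One classic prevention technique, acquiring locks in a fixed global order so a circular wait can never form, can be sketched with Python's threading module; the two locks and the transfer function are invented for illustration:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def transfer(first, second):
    # Impose a single global acquisition order (here: by object id), no matter
    # which direction the caller names the resources in. Two transactions can
    # then never each hold one lock while waiting for the other.
    lo, hi = (first, second) if id(first) < id(second) else (second, first)
    with lo:
        with hi:
            pass  # ... perform the two-resource update here ...

# Without the ordering, these opposite-direction calls could deadlock.
t1 = threading.Thread(target=transfer, args=(lock_a, lock_b))
t2 = threading.Thread(target=transfer, args=(lock_b, lock_a))
t1.start(); t2.start(); t1.join(); t2.join()
print("completed without deadlock")
```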