Database Terms And Internet Terms Flashcards

1
Q

Data

A
  • The smallest unit of information
  • e.g., an employee's hourly wage rate
  • Processed and stored as bits and bytes
  • Data are raw facts and figures; information is processed data
2
Q

Record

A
  • A collection of related data items (fields), e.g., an employee's hourly wage rate and hours worked in a week
3
Q

File

A

A collection of related records. An example is a payroll file containing payroll records.

4
Q

Directory

A

The list of files stored on a disk (e.g., payroll directory).

5
Q

Dataset

A

A collection of related bytes, or characters, of secondary storage. For example, a dataset may be a file of payroll records or a library of payroll programs.

In z/OS, a data set is a named collection of related data records that is stored and retrieved by an assigned name. A data set is equivalent to a file in other operating systems. Data sets are stored on tape or disks.

6
Q

Database

A

A collection of interrelated data stored together, using a common and controlled approach (e.g., payroll)

7
Q

Database System

A

Data is maintained independently of the application programs.
Data can be shared by many programs and users.
Database management system (DBMS) software manages and controls the data and the database software.

8
Q

Data Dictionary

A

Contains attributes and characteristics of each data element or field in a computer record. It also includes file organization and structure and edit and validation rules.

9
Q

Schema

A

A set of specifications that defines a database.

Specifically, it includes entity names, sets, groups, data items, areas, sort sequences, access keys, and security locks.

A logical view of an entire database is called a schema. Schemas may be external, conceptual, or internal. A synonym for the word “schema” is “view.”

10
Q

Subschema

A

A subset of a schema. It represents a portion of a database as it appears to a user or application program.

A subschema is a part of a schema. In other words, a schema is made up of one or more subschemas.

11
Q

Subject

A

A person who is using a computer system (e.g., employee, contractor, and consultant).

12
Q

Object

A

A passive entity that contains or receives information. Examples of objects are data, records, blocks, files, and programs.

13
Q

Access

A

A specific type of interaction between a subject (e.g., user) and an object (e.g., data) that results in the flow of information from one to the other.

14
Q

Check-point

A

A point, generally taken at regular intervals, at which a program’s intermediate results are dumped to a secondary storage (e.g., disk) to minimize the risk of work loss. Databases operate on checkpoints.

15
Q

Deadlock

A

A consequence of poor resource management.
Occurs when two programs each control a resource (e.g., printer, data file, database, or record) needed by the other and neither is willing to give up its resource.
Databases can run into deadlocks.
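
One common way to reason about deadlock is a "wait-for" graph: if following the chain of who waits for whom leads back to the starting program, those programs are deadlocked. The sketch below is a minimal illustration, not how any particular DBMS detects deadlocks; the function name `find_deadlock` and the assumption that each program waits on at most one other program are hypothetical simplifications.

```python
def find_deadlock(waits_for):
    """Return the programs forming a wait cycle, or None if there is none.

    `waits_for` maps a program name to the single program it is waiting on
    (a simplifying assumption made for this sketch).
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        current = waits_for.get(start)
        while current is not None:
            if current == start:
                return path          # the chain led back to where it began
            if current in seen:
                break                # a cycle exists, but it does not include `start`
            seen.add(current)
            path.append(current)
            current = waits_for.get(current)
    return None


# Program A holds the printer and waits for B's data file;
# program B holds the data file and waits for A's printer.
cycle = find_deadlock({"A": "B", "B": "A"})
no_cycle = find_deadlock({"A": "B"})
```

Here `cycle` names both programs, while `no_cycle` is `None` because B is not waiting on anything.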

16
Q

Tuple

A

A row of a relational table in a relational database.

17
Q

Rollback

A

Restores the database from one point in time to an earlier point.
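
As a small illustration of rollback at the transaction level, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The `payroll` table and its values are made up for the example, and real DBMS products expose rollback in their own ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payroll (employee TEXT PRIMARY KEY, hourly_rate REAL)")
conn.execute("INSERT INTO payroll VALUES ('alice', 20.0)")
conn.commit()  # the earlier, known-good point

# An erroneous update made after the commit.
conn.execute("UPDATE payroll SET hourly_rate = 2000.0 WHERE employee = 'alice'")

# Rollback discards the uncommitted change and returns to the committed state.
conn.rollback()

rate_after_rollback = conn.execute(
    "SELECT hourly_rate FROM payroll WHERE employee = 'alice'"
).fetchone()[0]
```

After the rollback, the stored rate is the committed value of 20.0 again.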

18
Q

Roll forward

A

Restores the database from a point in time when it is known to be correct to a later time.
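
Roll forward can be pictured as restoring a known-correct backup and then reapplying logged changes up to the desired point. The sketch below assumes a hypothetical log format of `(sequence_number, key, value)` entries in ascending order; real database logs are considerably richer.

```python
def roll_forward(backup, log, up_to):
    """Rebuild state from a backup by replaying log entries numbered <= up_to."""
    state = dict(backup)              # work on a copy; the backup stays intact
    for seq, key, value in log:
        if seq > up_to:
            break                     # stop at the chosen later point in time
        state[key] = value
    return state


backup = {"alice": 20.0}              # known-correct copy
log = [(1, "bob", 18.0), (2, "alice", 21.0), (3, "bob", 0.0)]
recovered = roll_forward(backup, log, up_to=2)
```

Entry 3 is not replayed, so `recovered` reflects the database as of log entry 2.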

19
Q

Recovery

A

The process of reconstituting a database to its correct and current state following a partial or complete hardware, software, network, operational, or processing error or failure.

20
Q

What is a DBMS comprised of?

A

Software, hardware and procedures

21
Q

What should the DBMS be compatible with?

A

The OS

22
Q

Advantages of a DBMS

A

1) Minimal data redundancy, resulting in data consistency
2) Data independence from application programs except during computer processing
3) Consistent and quality information for decision making
4) Adequate security and integrity controls
5) Shared access to data
6) Single storage location for each data item
7) Built-in backup and recovery procedures

23
Q

Disadvantages of a DBMS

A

1) Can be expensive to acquire, operate, and maintain.
2) Requires additional main memory.
3) Requires additional disk storage.
4) Requires knowledgeable and technically skilled staff (e.g., database administrator [DBA] and data administrator).
5) Results in additional system overhead, thereby slowing down the system response time.
6) Needs additional CPU processing time.
7) Requires sophisticated and efficient security mechanisms.
8) Is difficult to enforce security protection policies.

24
Q

When is redundancy of data sometimes necessary?

A

When high system performance and high data availability are required.

25
Q

What is the trade-off with regards to the redundancy of data?

A

The trade-off here is the cost of collecting and maintaining the redundant data and the system overhead it requires to process the data. Another concern is synchronization of data updates in terms of timing and sequence. Ideally, the synchronization should be done at the system level rather than the application level.

26
Q

What is the primary function of a DBMS?

A

To store data and to provide operations on the database.

Operations usually include create, delete, update, and search of data.

27
Q

What are the essential features supported by most DBMS?

A

1) Persistence - the property wherein the state of the database survives the execution of a process so that it can be reused later in another process.
2) Data sharing - the property that permits simultaneous use of the database by multiple users.
3) Recovery - the capability of the DBMS to return its data to a consistent and coherent state after a hardware or software failure.
4) Database language - permits external access to the DBMS.
5) Security and integrity - security and authorisation control, integrity checking, utility programs, backup/archiving, versioning, and view definition.

28
Q

A DBMS that permits sharing must provide what control?

A

Concurrency control (locking) mechanism that prevents users from executing inconsistent actions on the database.
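
The idea of locking can be sketched outside a DBMS with ordinary threads: several workers update one shared value, and a lock ensures each read-modify-write happens alone. The `Account` class and `worker` function are hypothetical names for this illustration; a real DBMS uses far more elaborate lock managers.

```python
import threading


class Account:
    """A shared data item guarded by a lock (hypothetical example class)."""

    def __init__(self):
        self.balance = 0
        self._lock = threading.Lock()

    def deposit(self, amount):
        # The lock serialises the read-modify-write so no update is lost.
        with self._lock:
            self.balance += amount


def worker(account, times):
    for _ in range(times):
        account.deposit(1)


account = Account()
threads = [threading.Thread(target=worker, args=(account, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, the final balance equals the total number of deposits made by all four threads.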

29
Q

Name database languages

A

Data definition language (DDL) - used to define database schema and subschema

Data manipulation language (DML) - used to examine and manipulate contents of the database

Data control language (DCL) - used to specify parameters needed to define the internal organisation of the database such as indexes and buffer size.

Ad hoc query language - is provided for interactive specification of queries.
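
In SQL-based systems these roles are usually played by different groups of SQL statements. The sketch below uses Python's standard `sqlite3` module to show DDL, DML, and an ad hoc query; the table and view names are invented for the example. SQLite has no `GRANT`-style statements, so the data control language is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute(
    "CREATE TABLE payroll (employee TEXT PRIMARY KEY, hourly_rate REAL, hours REAL)"
)
# DDL: define a subschema-like view that exposes only part of the table.
conn.execute("CREATE VIEW hours_only AS SELECT employee, hours FROM payroll")

# DML: manipulate the contents of the database.
conn.execute("INSERT INTO payroll VALUES ('alice', 20.0, 40.0)")
conn.execute("UPDATE payroll SET hours = 38.0 WHERE employee = 'alice'")

# Ad hoc query: a one-off question asked interactively.
gross_pay = conn.execute(
    "SELECT hourly_rate * hours FROM payroll WHERE employee = 'alice'"
).fetchone()[0]

# The view hides the hourly_rate column from its users.
view_columns = [col[0] for col in conn.execute("SELECT * FROM hours_only").description]
```

The view plays the part of a subschema: a user of `hours_only` sees `employee` and `hours` but not the wage rate.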

30
Q

Name 2 types of integrity checking

A

Semantic - declaration of semantic and structural integrity rules and the enforcement of these rules. May be automatically enforced at program run time or at compile time or may be performed only when a message is sent.

Referential - No record may contain a reference to the primary key of a nonexistent record.
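
Referential integrity is typically enforced with foreign keys. The sketch below, using Python's standard `sqlite3` module, shows an insert being rejected because it refers to a department that does not exist; the `department` and `employee` tables are assumptions made for the example. SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (1, 'Payroll')")
conn.execute("INSERT INTO employee VALUES (100, 1)")  # valid: department 1 exists

try:
    conn.execute("INSERT INTO employee VALUES (101, 99)")  # department 99 does not exist
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Only the valid employee row is stored; the dangling reference raises `sqlite3.IntegrityError`.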

31
Q

Describe the relationship among database models

A

User needs -> conceptual model (user views) -> logical/external model -> physical/internal model

  • User requirements are first specified in a conceptual model (user views).
  • When the conceptual model is presented to the DBMS, it becomes a logical model/external model/schema/subschema.
  • The logical model is converted to a physical model (internal model) in terms of physical storage media such as magnetic disks, tapes, and disk arrays.
32
Q

Describe dependencies between DBMS and conceptual model.

A

The type of DBMS is not a factor in designing a conceptual model, but the design of a logical model is dependent on the type of DBMS to be used.

This means that the conceptual model is, or should be, independent of a DBMS.

33
Q

Logical Database Design

A
  • process of determining an information system structure that is independent of software or hardware considerations.
  • It produces logical data structures consisting of a number of entities connected by one-to-one or one-to-many relationships, subject to appropriate integrity checking.
  • The objective is to improve the effectiveness of an information system by maximizing the accuracy, consistency, integrity, security, and completeness of the database.
34
Q

Physical Database Design

A
  • the implementation of a logical design in a particular computer system environment.
  • deals with retrieval and update workloads for the system and the parameters required (i.e., average time required for random/sequential access to a track, length of a track, and disk cylinder sizes) for the hardware environment.
  • The objective is to improve information system performance by minimizing the data entry time, data retrieval time, data update time, data query time, and storage space and costs.
35
Q

For large, logically complex databases, physical design is an extremely difficult task. Typically, an enormous number of alternatives must be explored in searching for a good physical design. Often optimal or near-optimal designs cannot be discovered, resulting in the creation of inefficient and costly databases. Suggested action steps required in a physical database design are listed next.

A

1) Analyze workload complexity and characteristics.
2) Translate the relationships specified in the logical data structures into physical records and hardware devices, and determine their relationships. This includes consideration of symbolic and direct pointers. Symbolic pointers contain the other’s logical identifier. Direct pointers contain the other’s physical address. Both pointers can coexist.
3) Fine-tune the design by determining the initial record loading factors, record segmentations, record and file indexes, primary and secondary access methods, file block sizes, and secondary memory management for overflow handling.

36
Q

Compare the physical data model with the logical data model using the following criteria:

  • concerned with schemas
  • concerned with entities
  • describes how data…
  • Nature
A

Physical

  • concerned with physical storage of data (internal schema)
  • concerned with entities for which data are collected
  • describes how data are arranged in the defined storage media (eg disk) from program and programmer viewpoints
  • physical in nature - describes the way data is physically located in the database

Logical

  • concerned with user-oriented data views (external schema)
  • concerned with entities for which data are collected
  • describes how data can be viewed by the designated end user
  • conceptual in nature (conceptual schema) - describes overall logical view of the database
37
Q

What is a data model?

A

Describes relationships between the data elements and is used as a tool to represent the conceptual organization of data.

38
Q

A relationship within a data model can be….

A

One to one - one bed assigned to one patient
One to many - one hospital room, many patients
Many to Many - one surgeon may attend to many patients, and a patient may be attended by more than one surgeon
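
In a relational database, a one-to-many relationship is usually a foreign key on the "many" side, and a many-to-many relationship is usually a separate junction table. The sketch below models the hospital examples with Python's standard `sqlite3` module; all table and column names are assumptions for illustration. A one-to-one bed assignment could be modelled the same way as the room link but with a uniqueness constraint, which is omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE surgeon (surgeon_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE room    (room_id INTEGER PRIMARY KEY);
-- One-to-many: each patient row points at one room; a room can appear in many rows.
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT,
                      room_id INTEGER REFERENCES room(room_id));
-- Many-to-many: a junction table pairs surgeons with patients.
CREATE TABLE attends (surgeon_id INTEGER REFERENCES surgeon(surgeon_id),
                      patient_id INTEGER REFERENCES patient(patient_id),
                      PRIMARY KEY (surgeon_id, patient_id));
INSERT INTO room VALUES (1);
INSERT INTO surgeon VALUES (1, 'Dr. A'), (2, 'Dr. B');
INSERT INTO patient VALUES (10, 'P1', 1), (11, 'P2', 1);
INSERT INTO attends VALUES (1, 10), (1, 11), (2, 10);
""")

patients_in_room_1 = conn.execute(
    "SELECT COUNT(*) FROM patient WHERE room_id = 1").fetchone()[0]
surgeons_for_p1 = conn.execute(
    "SELECT COUNT(*) FROM attends WHERE patient_id = 10").fetchone()[0]
patients_for_dr_a = conn.execute(
    "SELECT COUNT(*) FROM attends WHERE surgeon_id = 1").fetchone()[0]
```

One room holds two patients, patient P1 is attended by two surgeons, and Dr. A attends two patients, so both directions of the many-to-many link are exercised.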

39
Q

A data model can be considered as consisting of 3 components:

A

1) data structure - basic building blocks describing the way data are organised
2) Operators - set of functions that can be used to act on the data structures
3) Integrity rules - the valid states in which the data stored in the database may exist

40
Q

What is the primary purpose of a data model?

A

Is to provide a formal means of representing information and a formal means of manipulating the representation.

41
Q

Name 6 data model types

A

1) Relational
2) Hierarchical
3) Network
4) Inverted file
5) Object
6) Distributed

42
Q

Relational Data Model

A
  • Consists of columns = data fields, and rows = data records, represented in a table.
  • Data is stored in tables with keys or indexes outside the program.
  • Columns of tables are called attributes, rows are called tuples.
43
Q

Properties of a relational data model

A

1) all “key” values are defined
2) duplicate rows do not exist
3) column order is not significant
4) row order is not significant

44
Q

Advantages of relational model

A

simplicity in use and true data independence from data storage structures and access methods.

45
Q

Disadvantages of relational model

A

low system performance and operational efficiency compared to other data models

46
Q

Hierarchical data model

A

Can be related to a family tree concept
A number of trees, or data records, form a database.
Every branch has a number of leaves, or data fields.
Consists of nodes and branches. The highest node is called the “root” (parent, level 1), and each of its occurrences begins a logical database record.
The dependent nodes are at the lower levels (children, levels 2, 3, ...).

47
Q

Properties of a hierarchical data model

A
  • model always starts with a root node
  • a parent node must have at least one dependent node
  • every node except the root must be accessed through its parent node
  • except at level 1, the root node, the dependent node can be added horizontally as well as vertically with no limitations
  • there can be a number of occurrences of each node at each level
  • every node occurring at level 2 must be connected with one and only one node occurring at level 1, and the same rule repeats down the levels.
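
These properties can be pictured with a small tree structure in which every node has exactly one parent and is reached by walking down from the root. The `Node` class below is a hypothetical in-memory sketch, not the storage format of any hierarchical DBMS.

```python
class Node:
    """One node of a hierarchy; each node has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add_child(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def remove_child(self, child):
        # Detaching a node also detaches everything beneath it.
        self.children.remove(child)
        child.parent = None

    def path_from_root(self):
        """Names visited when reaching this node through its chain of parents."""
        names, node = [], self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]

    def size(self):
        return 1 + sum(child.size() for child in self.children)


root = Node("department")            # level 1 (root)
employee = root.add_child("employee")  # level 2
pay = employee.add_child("pay_record")  # level 3
root.add_child("budget")               # another level 2 node

access_path = pay.path_from_root()
size_before = root.size()
root.remove_child(employee)
size_after = root.size()
```

Removing the `employee` node takes its `pay_record` child with it, which is the deletion behaviour that makes maintenance of hierarchical databases awkward.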
48
Q

Advantages of hierarchical model

A

proven performance, simplicity, ease of use and reduction of data dependency

49
Q

Disadvantages of hierarchical model

A

addition and deletion of parent/child nodes can become complex and deletion of parent results in deletion of children

50
Q

Network Data Model

A

Is depicted using blocks and arrows.
Block represents a record type or an entity.
Each record type is composed of zero, one or more data elements/fields or attributes.
An arrow linking two blocks shows the relationship between two record types.

51
Q

Network database

A

Consists of a number of areas.
An area contains records, which in turn contain data elements or fields.
A set (a grouping of records) may reside in an area or span a number of areas.
Each area can have its own unique physical attributes.
Areas can be operated independently of, or in conjunction with other areas.

52
Q

Inverted file data model

A

Each entity is represented by a file
Each record in the file represents an occurrence of the entity
Each attribute becomes a data field or element in the inverted file

53
Q

Properties of a network data model

A

A set is composed of related records
There is only a single owner in a set
There may be zero, one or many members in a set

54
Q

Advantages of network data model

A

Proven performance and accommodation of many-to-many relationships that occur quite frequently

55
Q

Disadvantages of network data model

A

Complexity in programming

Loss of data independence during database reorganisation and when sets are removed

56
Q

Properties of inverted file data model

A

Each entity is represented by a file
Each record in the file represents an occurrence of the entity
Each attribute becomes data field or element in the file
Data fields are inverted to allow efficient access to individual files
To accomplish this, an index file is created containing all the values taken by the inverted field and pointers to all records in the file.
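
The index file described above can be sketched as a mapping from each value of the inverted field to the positions of the records holding that value. The record layout and the `build_index` helper are assumptions made for this illustration.

```python
from collections import defaultdict

records = [
    {"employee": "alice", "dept": "payroll"},
    {"employee": "bob", "dept": "audit"},
    {"employee": "carol", "dept": "payroll"},
]


def build_index(records, field):
    """Map each value of `field` to the positions of the records containing it."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field]].append(position)
    return dict(index)


dept_index = build_index(records, "dept")

# Look up records through the index instead of scanning the whole file.
payroll_staff = [records[i]["employee"] for i in dept_index["payroll"]]
```

Any change to `records` must be mirrored in `dept_index`, which is the synchronisation burden associated with inverted files.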

57
Q

Advantages of inverted file data model

A

Simplicity
Data independence
Ease of adding new files and fields

58
Q

Disadvantages of inverted file data model

A

Difficulty in synchronising changes between database records/fields and index file.

59
Q

Object Data Model

A
  • Developed by combining the special nature of object-oriented programming languages with DBMS.
  • Objects, classes and inheritance form the basis for the structural aspects of the object data model.
  • Objects are basic entities that have data structures and operations.
  • Every object has an object ID that is a unique, system-provided identifier.
  • Classes describe generic object types.
  • All objects are members of a class.
  • Classes are related through inheritance.
  • Classes can be related to each other by superclass or subclass relationships.
  • Class definitions are the mechanism for specifying the database schema for an application.
60
Q

Advantages of object data model

A

System development efficiency and handling of complex data structures

61
Q

Disadvantages of object data model

A

New technology and new risks, which require training and involve a learning curve.

62
Q

Distributed Data Model

A

Data resides in more than one physical database in the network.
Location transparency, in which the user does not need to know where data are stored, is one major goal of distributed data model.
Similarly, programmers do not have to rewrite applications and can move data from one location to another, depending on need.

63
Q

Database Check Points

A

A technique used to start at certain points in the execution of a program after the system fails or detects an error.

64
Q

Advantage and disadvantage of check point

A

Advantage - relatively easy to implement in batch programs
Disadvantages - cumbersome to implement for online programs due to concurrent processing. Checkpoints also degrade system performance.

65
Q

Checkpoints - a trade off exists between the number of checkpoints and time interval between 2 checkpoints.

A
  • The database designer needs to balance the number of checkpoints against the time interval between two checkpoints.
  • The higher the number of checkpoints, the greater the degradation of performance, even though the recovery process is easier.
  • If the time interval between two checkpoints is long, however, performance degradation is reduced but recovery is more difficult.
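
The trade-off can be made concrete with a toy simulation: a program applies transactions, saves its state every `interval` transactions, and then fails. The function `run_with_checkpoints` and its parameters are hypothetical; the point is only to compare how many checkpoints are written against how much work must be redone after a restart.

```python
def run_with_checkpoints(transactions, interval, fail_at=None):
    """Apply transactions to a running balance, checkpointing every `interval` items.

    Returns (checkpoint, checkpoints_taken), where checkpoint is
    (next_index, balance) as of the last checkpoint written before the failure.
    """
    balance = 0
    checkpoint = (0, 0)
    taken = 0
    for i, amount in enumerate(transactions):
        if fail_at is not None and i == fail_at:
            break  # simulated failure: in-memory state is lost
        balance += amount
        if (i + 1) % interval == 0:
            checkpoint = (i + 1, balance)  # state saved to "secondary storage"
            taken += 1
    return checkpoint, taken


transactions = list(range(1, 11))  # ten transactions with amounts 1..10

frequent, frequent_count = run_with_checkpoints(transactions, interval=2, fail_at=7)
sparse, sparse_count = run_with_checkpoints(transactions, interval=5, fail_at=7)

# Work to redo after restart = transactions between the last checkpoint and the failure.
redo_frequent = 7 - frequent[0]
redo_sparse = 7 - sparse[0]
```

With an interval of 2 the run writes three checkpoints and has one transaction to redo; with an interval of 5 it writes one checkpoint and has two to redo.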
66
Q

Criteria for designing and implementing checkpoints are

A

1) Time interval
2) Operator action
3) Number of changes to the database
4) Number of records written to the log tape
5) Number of transactions processed

67
Q

Database Compression Techniques

A
  • It is common to find unused space in a database due to the deletion of many records.
  • Unused space widens the distance between active database records, resulting in longer data retrieval times.
  • Compression or compaction techniques can be used to reduce the amount of storage space required for a given collection of data records.
68
Q

Advantages and disadvantages of database compression techniques

A

Advantages - saves storage space and disk input/output operations
Disadvantages - CPU activity will increase to decompress the data after it has been retrieved
A trade-off exists between the input/output savings and the additional CPU activity.
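
The storage side of this trade-off is easy to observe with Python's standard `zlib` module. The record layout below is invented for the example, and general-purpose compression stands in for whatever technique a particular DBMS actually uses.

```python
import zlib

# 500 highly repetitive, made-up payroll records.
records = b"".join(
    b"employee=%04d;dept=payroll;rate=20.00\n" % i for i in range(500)
)

compressed = zlib.compress(records)    # fewer bytes to store and to read from disk
restored = zlib.decompress(compressed)  # extra CPU work on every retrieval

original_size = len(records)
compressed_size = len(compressed)
```

The compressed copy is smaller than the original, and decompression returns exactly the original bytes; the cost is the CPU time spent in `compress` and `decompress`.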

69
Q

Database Reorganisation - when is this required?

A

Deleting records from the database leaves fragmented or unused space. This can also happen during initial loading or after reloading of the database.

Other reorganisation efforts could result from changing block sizes, buffer pool sizes, prime areas and overflow areas.

70
Q

A normal practice is to reorganise the database by

A
  • copying the old database onto another device, such as disk or tape
  • reblocking the valid records
  • reloading the valid records
  • excluding the records marked “deleted” during this process.

Reorganisation can arrange records in such a way that their physical sequence is the same or nearly the same as their logical sequence.

It is also possible to arrange the records so that the more frequently accessed ones are stored on disk and rarely accessed ones are stored on tape.
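
A reorganisation pass can be sketched as copying only the valid records into a new file in key order, so that physical sequence matches logical sequence. The record layout (a `key` and a `deleted` flag, with `None` for unused slots) and the `reorganise` function are assumptions for this illustration.

```python
def reorganise(old_file):
    """Reload valid records in key order, skipping unused slots and deleted records."""
    valid = [r for r in old_file if r is not None and not r["deleted"]]
    return sorted(valid, key=lambda r: r["key"])


old_file = [
    {"key": 30, "deleted": False},
    None,                           # unused slot left behind by earlier activity
    {"key": 10, "deleted": True},   # marked "deleted" but still occupying space
    {"key": 20, "deleted": False},
]

new_file = reorganise(old_file)
new_keys = [r["key"] for r in new_file]
```

The new file holds only the two live records, stored in ascending key order with no gaps between them.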

71
Q

Database Restructuring Vs Database Reorganisation

A

Restructuring - major activity, affects existing application systems and procedures.

Reorganisation - minor activity, does not affect existing application systems and procedures

72
Q

3 types of changes in restructuring

A

1) Logical changes - adding or deleting data elements, combining a number of records, changing the relationships between records
2) Physical changes - in terms of channels and disk configuration to minimise contention (delays) by adding or removing some pointers.
3) Procedural changes - in terms of backup and recovery procedures and access control security rules

73
Q

Database Performance Monitoring

A

Often a performance monitoring tool and/or utility program is utilised to take internal readings of the database and its components.

Objective is to identify performance related problems and take corrective action.

74
Q

What is a data dictionary or directory?

A

Alphabetical listing that describes all the data elements in an application system and tells how and where they are used.

75
Q

Data editing and validation rules in the data dictionary can be used to

A

Prevent the entry of inaccurate data into the system
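
One way to picture this is a dictionary of per-field rules that every incoming record is checked against before it is accepted. The field names, rule keys, and `validate` function below are hypothetical; actual data dictionary products define their own rule formats.

```python
# Hypothetical data dictionary: one entry of edit and validation rules per field.
data_dictionary = {
    "employee_id": {"type": int, "min": 1, "max": 99999},
    "hourly_rate": {"type": float, "min": 0.0, "max": 500.0},
}


def validate(record):
    """Return a list of rule violations; an empty list means the record is accepted."""
    errors = []
    for field, rule in data_dictionary.items():
        if field not in record:
            errors.append(f"{field}: missing")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            errors.append(f"{field}: expected {rule['type'].__name__}")
        elif not rule["min"] <= value <= rule["max"]:
            errors.append(f"{field}: out of range")
    return errors


good = validate({"employee_id": 42, "hourly_rate": 20.0})
bad_rate = validate({"employee_id": 42, "hourly_rate": -5.0})
bad_type = validate({"employee_id": "42"})
```

Because the rules live in one place, every program that enters data can apply the same checks.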

76
Q

The Data Dictionary can be used as a

A

Corrective control because of its “where-used” information, which can be used to trace data backward and forward through the transaction, serving as an audit trail.

77
Q

Automated vs manual data dictionary

A

Usually automated software is used to manage and control the DD.
A manual DD can become inconsistent with what is actually in the system in a very short time.
Automated DD supports the objectives of minimum data redundancy, maximum data consistency and adequate data integrity and security.

78
Q

Data Dictionary can be dependent on a DBMS or it can be stand-alone

A

Dependent DD uses underlying DBMS to manage and control its data and it is a part of DBMS.

A stand-alone DD is a separate package from the DBMS package.

79
Q

Data dictionary may be active or passive with the DBMS software.

A

Active DD - requires all data descriptions for a database to be defined and available at one time

Passive DD - may or may not require a check for currency of data descriptions before a program is executed.

80
Q

Advantages of active Data Dictionary System

A
  • Provides quick access to the data in the database
  • Tracks database accesses and actions
  • Provides valuable statistics for improving system performance
  • Minimizes redundancy in storage of data descriptions
  • Facilitates system documentation
  • Improves data editing and validation controls
  • Works well with database files
81
Q

Advantages of Passive Data Dictionary System

A
  • Less risk of commitment to a DBMS
  • Easier to implement
  • Can describe data descriptions on a piecemeal basis
  • Works well with conventional data files
  • Serves as a documentation and communication tool
82
Q

The major reports that can be obtained from a data dictionary and its interface systems include:

A
  • access control reports
  • audit trail reports
  • cross-reference reports
  • data elements and their relationships with their usage frequencies
  • summary, change, error and ad hoc reports
83
Q

A data dictionary provides these benefits

A
  • It provides a consistent description of data as well as consistent data names for programming and data retrieval. This in turn provides consistent descriptive names and meanings.
  • It shows where-used information, such as which programs use the data items, which files contain the data items, and which printed reports display the data items.
  • It provides data integrity through data editing and validation routines.
  • It supports elimination of data redundancy.
  • It supports tracing of a data item’s path through several application programs.
  • It describes the relationships among the entities.