Database Architecture Flashcards

1
Q

Architecture

A

components of a computer system and the relationships between components

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Tools

A

Directly interact with database users and administrators, and send queries to the query processor

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Query Processor

A

Manages connections from multiple users and compiles queries into low-level instructions for the storage engine

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Storage Engine/Storage Manager

A

Executes instructions, manages indexes, and interacts with the file system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

File System

A

Accesses data on storage media. The file system contains both system and user data, such as log files, tables, and indexes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

MySQL Server/Enterprise

A

MySQL server - free, MySQL enterprise - paid

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Monitor

A

Collects and displays information on CPU, memory, and index utilization, as well as queries and results.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Audit

A

Keeps track of all database changes. It also tracks the time of change and who made the change.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Utility Programs

A

Program suite includes 30 tools grouped into five categories. Installation, Client, Administrative, Developer, and miscellaneous tools.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Connection

A

Link between tools and the query processor. Each connection specifies a database name, server address, logon name, and password.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Execution plan

A

Detailed, low-level sequence of steps that specify exactly how to process a query

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Query Parser

A

Check each query for syntax errors and converts valid queries to an internal representation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Query Optimizer

A

Reads the internal representation, generates alternative execution plans, estimated execution times, and selects the fastest plan. Estimates are based on heuristics and statistics about data, like the number of rows in each table and the number of values in each column. These statistics are maintained in the data dictionary described below.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Cache Manager

A

Query processor layer has a cache manager that stores reusable information in main memory

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Buffer Manager

A

Retains data blocks from the file system for possible reuse to reduce access time

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Buffer

A

Data blocks retained in an area of main memory

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Least Recently Used/LRU

A

LRU algorithm tracks the time each block was last accessed and, when space is needed, discards ‘stale’ blocksC

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Catalog, Data dictionary

A

Directory of tables, columns, keys, indexes, and other objects in a relational database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Tiers

A

Computers linked by a network are often grouped in layers, called tiers, and arranged in a hierarhcy.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Single-Tier Architecture

A

Consists of a personal or corporate computer connected directly to monitors

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Multi-tier Architecture

A

Complex corporate and government applications have increasingly been implemented in a multi-tier architecture

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Web Architecture

A

Consists of web browsers and web servers communicating over the internet. Web browsers on the top tier, webservers on the middle tier, application servers, and then services on the bottom

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

On-premise

A

Installed and run on a customer computers

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Cloud Services

A

Computer services on a lower tier of a web architecture. Cloud services are made available over the internet to customers for a monthly fee

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Infrastructure-as-a-service/IaaS

A

provides computer processing, memory, and storage media, as if the customer were renting a computer

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Platform-as-a-service/PaaS

A

provides tool and services, such as a databases, application development tools, and messaging services

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Virtual Machine/VM

A

Software layer that emulates a complete, independent computing environment.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

Cloud Database

A

database offered as a PaaS cloud service

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

Parallel Computer

A

Computer consists of multiple processor managed by a single operating system instance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

Shared Memory

A

Processor share the same memory and storage media

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

Shared Storage

A

Processor share storage media only

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

Shared nothing

A

Processor share neither memory nor storage media

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

Local Area Network

A

Consists of cable extending over a small area, typically within one facility. LAN usually use the Ethernet communication protocol.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

Wide Area Network

A

WAN spans multiple facilities in different geographic locations, separated by many miles. WAN communicate via cables, satellite, or telephone lines, often using internet communication protocols.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

CLuster

A

Group of nodes connected by a local area network, managed by separate operating systems instances, and coordinated by specialized cluster management software

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

Node

A

Group of computers connected by either a local or wide area network

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

Parallel Database

A

Runs on a parallel computer or cluster

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

Distributed database

A

runs on multiple computers connected by a wide area network

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

Distributed transcation

A

Updates data on multiple nodes of a distributed database. Either all nodes or no nodes must be successfully updated

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

Two-phase Commit

A

Phase 1.1, central transaction coordinator notifies all participating nodes of the required updates. 1.2. Participating nodes receive the notification, store the update in a local log, and send a confirmation message. Phase 2.1 begins when the transaction coordinator receives confirmation from all participating nodes. The transition coordinator now instructs all nodes to commit. 2.2 participating nodes receive the commit message, commit the update to the database, and notify the transaction coordinator of success.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

Local Transaction

A

Updates data on a single node of a distribute ddatabase

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

Synchronous

A

Distributed transaction are synchronous, since the updates occur at the same time from the perspective of the database user

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

Asynchronous

A

updates in separate local transactions are asynchronous

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

eventually consistent

A

Databases that use local rather than distributed transactions are called eventually consistent

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

Consistent

A

Database conforms to all rules at all times

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

Available

A

live nodes must respond to queries at all times

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

Network partition

A

When a network error prevents nodes from communicating

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

Partition-tolerant

A

Continues to function when a network partition occurs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

CAP theorem

A

Distributed datatbase cannot simultaneously be Consistent, Available, and Partition-tolerant. Distributed database can guarantee any two, but not all three. of these properties.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

Replica

A

Copy of an entire database, a table or a subset of table data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

Replicated Database

A

Maintains two or more replicas on seperate storage devices

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

Storage arrays

A

Manage replicas internally, without database intervention

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

Primary/Secondary

A

designates one node as primary. All updates are first applied to the primary node in local transactions, Secondary nodes are updated after the primary node commits, with independent local transactions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

Group replication

A

Applies updates to any node in a group. If a node detects a conflict, an algorithm determines which transaction commits and which rolls back

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
52
Q

Central Catalog

A

Entire catalog resides on a single node.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
53
Q

Replicated catalog

A

Copy of the catalog resides on each node

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
54
Q

Operational Data

A

Conduct daily business functions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
55
Q

Analytic Data

A

Data that is used to understand, manage, and plan the business

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
56
Q

Reporting Data/Decision support Data.

A

Data that is reporting data or decision support data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
57
Q

Data warehouse

A

Seperate database optimized for analytics rather than operations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
58
Q

Data mart

A

Data warehouse designed for a specific business area, such as sales, human resources, or product development

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
59
Q

Extract-transform-load/ETL

A

Five-step process commonly referred to as the extract-transform-load or ETL process

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
60
Q

ETL tools

A

ETL is a time-consuming process so most organizations use special software products to minimize coding.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
61
Q

Dimensional Design/Star Schema

A

consists of fact and dimension tables

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
62
Q

fact table

A

Contains numeric data used to measure business performance, such as sales revenue or number of employees. Each row in a fact table consists of numeric fact columns and foreign keys that reference dimension tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
63
Q

Dimension table

A

Contains textual data that described the fact data, such as product line, organizational unit, and geographical region

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
64
Q

Dimension Hierarhhcy

A

Sequence of columns in which each column has a one-many relationship to the next column

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
65
Q

Date dimension

A

Each row corresponds to a day. Contains 36500 rows (100 years x 365days per year)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
66
Q

Time Deimension

A

Each row corresponds to a minute of the day. Contains 1440 rows (24 hours x 60 minute per hour)

67
Q

Type 2 Design for slowly chanign dimension

A

Start and end foreign keys to the fact table

68
Q

In-memory Database

A

Database that stores data in main memory, instead of or in addition to storage media

69
Q

Embedded Database/In-Process Database

A

Database that is packaged with a programming language. Embedded database and application program execute together in a single software process

70
Q

SQLite

A

dominant embedded relational database

71
Q

SQL Server Compact

A

Embedded database form microsoft, discontinued support after 2021

72
Q

Libmysqld

A

Discontinued version that configured MySQL as embedded.

73
Q

Federated Database

A

Collection of two or more participating databases underneath a coordinating software layer. The participating databases are autonomous and heterogeneous

74
Q

Autonomous datbase

A

Operates independently of other participating databases. It is administered and can be queried as if the database were not part of a federated database

75
Q

Heterogeneous databases

A

Run under different database systems or have incompatible schema

76
Q

Middleware

A

coordinating sfotware layer, since the software lies between application programs and database software

77
Q

Global Catalog

A

Directory of participating database objects, such as tables, columns, and indexes

78
Q

Databse Wrapper

A

Converts the decomposed queries to the appropriate syntax for each participating database

79
Q

SQL/Management of External Data/SQL/MED

A

Extension of the SQL standard for federated databases

80
Q

Nickname

A

Database name for a participating database object, such as tables and columns

81
Q

User Mapping

A

Associates a federated database user with a participating database user

82
Q

Data Lake

A

Analytic database of raw, unprocessed data copied from multiple data sources. Data lakes share some characteristics of data warehouses and some characteristic of federated databases

83
Q

BackTick

A

Delimits literals that represent identifiers which allows spaces and reserved words to be used as identifiers.

84
Q

components of a computer system and the relationships between components

A

Architecture

85
Q

Directly interact with database users and administrators, and send queries to the query processor

A

Tools

86
Q

Manages connections from multiple users and compiles queries into low-level instructions for the storage engine

A

Query Processor

87
Q

Executes instructions, manages indexes, and interacts with the file system

A

Storage Engine/Storage Manager

88
Q

Accesses data on storage media. The file system contains both system and user data, such as log files, tables, and indexes

A

File System

89
Q

MySQL server - free, MySQL enterprise - paid

A

MySQL Server/Enterprise

90
Q

Collects and displays information on CPU, memory, and index utilization, as well as queries and results.

A

Monitor

91
Q

Keeps track of all database changes. It also tracks the time of change and who made the change.

A

Audit

92
Q

Program suite includes 30 tools grouped into five categories. Installation, Client, Administrative, Developer, and miscellaneous tools.

A

Utility Programs

93
Q

Link between tools and the query processor. Each connection specifies a database name, server address, logon name, and password.

A

Connection

94
Q

Detailed, low-level sequence of steps that specify exactly how to process a query

A

Execution plan

95
Q

Check each query for syntax errors and converts valid queries to an internal representation

A

Query Parser

96
Q

Reads the internal representation, generates alternative execution plans, estimated execution times, and selects the fastest plan. Estimates are based on heuristics and statistics about data, like the number of rows in each table and the number of values in each column. These statistics are maintained in the data dictionary described below.

A

Query Optimizer

97
Q

Query processor layer has a cache manager that stores reusable information in main memory

A

Cache Manager

98
Q

Retains data blocks from the file system for possible reuse to reduce access time

A

Buffer Manager

99
Q

Data blocks retained in an area of main memory

A

Buffer

100
Q

LRU algorithm tracks the time each block was last accessed and, when space is needed, discards ‘stale’ blocksC

A

Least Recently Used/LRU

101
Q

Directory of tables, columns, keys, indexes, and other objects in a relational database

A

Catalog, Data dictionary

102
Q

Computers linked by a network are often grouped in layers, called tiers, and arranged in a hierarhcy.

A

Tiers

103
Q

Consists of a personal or corporate computer connected directly to monitors

A

Single-Tier Architecture

104
Q

Complex corporate and government applications have increasingly been implemented in a multi-tier architecture

A

Multi-tier Architecture

105
Q

Consists of web browsers and web servers communicating over the internet. Web browsers on the top tier, webservers on the middle tier, application servers, and then services on the bottom

A

Web Architecture

106
Q

Installed and run on a customer computers

A

On-premise

107
Q

Computer services on a lower tier of a web architecture. Cloud services are made available over the internet to customers for a monthly fee

A

Cloud Services

108
Q

provides computer processing, memory, and storage media, as if the customer were renting a computer

A

Infrastructure-as-a-service/IaaS

109
Q

provides tool and services, such as a databases, application development tools, and messaging services

A

Platform-as-a-service/PaaS

110
Q

Software layer that emulates a complete, independent computing environment.

A

Virtual Machine/VM

111
Q

database offered as a PaaS cloud service

A

Cloud Database

112
Q

Computer consists of multiple processor managed by a single operating system instance

A

Parallel Computer

113
Q

Processor share the same memory and storage media

A

Shared Memory

114
Q

Processor share storage media only

A

Shared Storage

115
Q

Processor share neither memory nor storage media

A

Shared nothing

116
Q

Consists of cable extending over a small area, typically within one facility. LAN usually use the Ethernet communication protocol.

A

Local Area Network

117
Q

WAN spans multiple facilities in different geographic locations, separated by many miles. WAN communicate via cables, satellite, or telephone lines, often using internet communication protocols.

A

Wide Area Network

118
Q

Group of nodes connected by a local area network, managed by separate operating systems instances, and coordinated by specialized cluster management software

A

CLuster

119
Q

Group of computers connected by either a local or wide area network

A

Node

120
Q

Runs on a parallel computer or cluster

A

Parallel Database

121
Q

runs on multiple computers connected by a wide area network

A

Distributed database

122
Q

Updates data on multiple nodes of a distributed database. Either all nodes or no nodes must be successfully updated

A

Distributed transcation

123
Q

Phase 1.1, central transaction coordinator notifies all participating nodes of the required updates. 1.2. Participating nodes receive the notification, store the update in a local log, and send a confirmation message. Phase 2.1 begins when the transaction coordinator receives confirmation from all participating nodes. The transition coordinator now instructs all nodes to commit. 2.2 participating nodes receive the commit message, commit the update to the database, and notify the transaction coordinator of success.

A

Two-phase Commit

124
Q

Updates data on a single node of a distribute ddatabase

A

Local Transaction

125
Q

Distributed transaction are synchronous, since the updates occur at the same time from the perspective of the database user

A

Synchronous

126
Q

updates in separate local transactions are asynchronous

A

Asynchronous

127
Q

Databases that use local rather than distributed transactions are called eventually consistent

A

eventually consistent

128
Q

Database conforms to all rules at all times

A

Consistent

129
Q

live nodes must respond to queries at all times

A

Available

130
Q

When a network error prevents nodes from communicating

A

Network partition

131
Q

Continues to function when a network partition occurs

A

Partition-tolerant

132
Q

Distributed datatbase cannot simultaneously be Consistent, Available, and Partition-tolerant. Distributed database can guarantee any two, but not all three. of these properties.

A

CAP theorem

133
Q

Copy of an entire database, a table or a subset of table data

A

Replica

134
Q

Maintains two or more replicas on seperate storage devices

A

Replicated Database

135
Q

Manage replicas internally, without database intervention

A

Storage arrays

136
Q

designates one node as primary. All updates are first applied to the primary node in local transactions, Secondary nodes are updated after the primary node commits, with independent local transactions.

A

Primary/Secondary

137
Q

Applies updates to any node in a group. If a node detects a conflict, an algorithm determines which transaction commits and which rolls back

A

Group replication

138
Q

Entire catalog resides on a single node.

A

Central Catalog

139
Q

Copy of the catalog resides on each node

A

Replicated catalog

140
Q

Conduct daily business functions

A

Operational Data

141
Q

Data that is used to understand, manage, and plan the business

A

Analytic Data

142
Q

Data that is reporting data or decision support data

A

Reporting Data/Decision support Data.

143
Q

Seperate database optimized for analytics rather than operations

A

Data warehouse

144
Q

Data warehouse designed for a specific business area, such as sales, human resources, or product development

A

Data mart

145
Q

Five-step process commonly referred to as the extract-transform-load or ETL process

A

Extract-transform-load/ETL

146
Q

ETL is a time-consuming process so most organizations use special software products to minimize coding.

A

ETL tools

147
Q

consists of fact and dimension tables

A

Dimensional Design/Star Schema

148
Q

Contains numeric data used to measure business performance, such as sales revenue or number of employees. Each row in a fact table consists of numeric fact columns and foreign keys that reference dimension tables.

A

fact table

149
Q

Contains textual data that described the fact data, such as product line, organizational unit, and geographical region

A

Dimension table

150
Q

Sequence of columns in which each column has a one-many relationship to the next column

A

Dimension Hierarhhcy

151
Q

Each row corresponds to a day. Contains 36500 rows (100 years x 365days per year)

A

Date dimension

152
Q

Each row corresponds to a minute of the day. Contains 1440 rows (24 hours x 60 minute per hour)

A

Time Deimension

153
Q

Start and end foreign keys to the fact table

A

Type 2 Design for slowly chanign dimension

154
Q

Database that stores data in main memory, instead of or in addition to storage media

A

In-memory Database

155
Q

Database that is packaged with a programming language. Embedded database and application program execute together in a single software process

A

Embedded Database/In-Process Database

156
Q

dominant embedded relational database

A

SQLite

157
Q

Embedded database form microsoft, discontinued support after 2021

A

SQL Server Compact

158
Q

Discontinued version that configured MySQL as embedded.

A

Libmysqld

159
Q

Collection of two or more participating databases underneath a coordinating software layer. The participating databases are autonomous and heterogeneous

A

Federated Database

160
Q

Operates independently of other participating databases. It is administered and can be queried as if the database were not part of a federated database

A

Autonomous datbase

161
Q

Run under different database systems or have incompatible schema

A

Heterogeneous databases

162
Q

coordinating sfotware layer, since the software lies between application programs and database software

A

Middleware

163
Q

Directory of participating database objects, such as tables, columns, and indexes

A

Global Catalog

164
Q

Converts the decomposed queries to the appropriate syntax for each participating database

A

Databse Wrapper

165
Q

Extension of the SQL standard for federated databases

A

SQL/Management of External Data/SQL/MED

166
Q

Database name for a participating database object, such as tables and columns

A

Nickname

167
Q

Associates a federated database user with a participating database user

A

User Mapping

168
Q

Analytic database of raw, unprocessed data copied from multiple data sources. Data lakes share some characteristics of data warehouses and some characteristic of federated databases

A

Data Lake

169
Q

Delimits literals that represent identifiers which allows spaces and reserved words to be used as identifiers.

A

BackTick