Chapter 6 Database Architecture Flashcards

1
Q

What are the components of MySQL architecture?

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What are the four layers of MySQL architecture?

A

The four layers are Tools Layer, Query Processor, Connectors, and Storage Engine.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is the Tools Layer in MySQL architecture?

A

The Tools Layer includes user interfaces and applications for database interaction.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What does the Query Processor do?

A

The Query Processor manages connections and compiles SQL queries.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are Connectors in MySQL?

A

Connectors are APIs that link applications to the query processor.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is the purpose of APIs in MySQL?

A

APIs are older programming interfaces for database interaction.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is MySQL Workbench?

A

MySQL Workbench is a desktop application for managing MySQL databases.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are Utility Programs in MySQL?

A

Utility Programs are tools for database administration and management.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the Command-Line Client?

A

The Command-Line Client is an interactive tool for processing SQL queries.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What are Database Transactions?

A

Database Transactions are operations that ensure data integrity and consistency.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is System Data in MySQL?

A

System Data in MySQL is the information stored in the mysql system schema. This schema contains tables with metadata about the server itself, including:

  • Data Dictionary Tables: Store information about database objects like tables and columns.
  • System Tables: Used for operational purposes like user accounts and server logs.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How do users interact with User Data in MySQL?

A

User Data is created and managed by database users.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are High-End Installations?

A

High-End Installations are advanced setups requiring additional Enterprise features.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a Connection in MySQL?

A

A Connection specifies the database name and credentials.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What does the Connection Manager do?

A

The Connection Manager creates and manages connections to databases.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a Query Compilation?

A

Query Compilation generate a detailed execution plan for queries.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is the CSV Storage Engine?

A

The CSV Storage Engine stores data in comma-separated values format.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is the MEMORY Storage Engine?

A

The MEMORY Storage Engine stores data in RAM for fast access.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is a Storage Engine?

A

A Storage Engine manages how data is stored in tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Who is a Database Administrator?

A

A Database Administrator manages database configurations and storage engines.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is Block Replacement?

A

Block Replacement is the process of removing stale blocks from the buffer.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is a Data Update?

A

A Data Update is the modification of existing data in a block.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is Buffer Capacity?

A

Buffer Capacity is the maximum number of blocks the buffer can hold.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is Block Discarding?

A

Block Discarding is removing blocks from the buffer to free space.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
What is the File System Layer?
The File System Layer **organizes data** stored on physical storage media.
26
What are Database Queries?
Database Queries are requests for data retrieval or manipulation.
27
Multi-Tier Architecture is used how in management?
Multi-Tier Architecture *groups computers* in layers for **application management**.
28
What is the Top Tier in architecture?
The Top Tier interacts directly with **end-users**.
29
What does the Bottom Tier manage?
The Bottom Tier *manages resources* like databases and email.
30
What functions does the Middle Tier execute?
The Middle Tier **executes functions** like user authorization and business logic.
31
What is Web Architecture?
Web Architecture is a **multi-tier architecture** for web applications.
32
What role do Web Browsers play in web architecture?
Web Browsers are the top tier managing **user interaction**.
33
What do Web Servers do?
Web Servers are the middle tier *generating web pages* and *handling requests*.
34
What is the function of Application Servers?
Application *Servers process requests and communicate with* databases.
35
What does the Services Tier provide?
The Services Tier provides database and *authentication services*.
36
What is Single-tier architecture?
Single-tier architecture contains *one layer of software* components.
37
What does Multi-tier architecture contain?
Multi-tier architecture contains two or more *interconnected software* tiers.
38
What is On-premise software?
On-premise software is installed and run on *customer computers*.
39
What are Cloud _services_?
Cloud services are *software provided* over the internet for a fee.
40
What is Infrastructure-as-a-Service (IaaS)?
IaaS involves renting virtual computing resources from cloud providers. ## Footnote [Learn more](https://g.co/gemini/share/281879238c72)
41
What does Platform-as-a-Service (PaaS) provide?
PaaS provides a platform for developers to create and manage applications
42
What is Software-as-a-Service (SaaS) & Platform as a Service?
1. SaaS is about delivering *ready-to-use software* applications (like renting a furnished apartment) 2. PaaS provides a platform for developers to create and manage applications (like leasing land to build a house).
43
What is Elastic Compute Cloud (EC2)?
EC2 is Amazon's IaaS offering for virtual computing.
44
What is Microsoft Azure?
Microsoft Azure is Microsoft's **PaaS** offering with various cloud services.
45
What is Salesforce?
Salesforce is a **SaaS** for sales management and customer relations.
46
What is a Cloud database?
A Cloud database is offered as a **PaaS** cloud service.
47
What is a Virtual machine (VM)?
A VM emulates a complete *computing environment* on a server.
48
Who is a Customer in cloud services?
A Customer is an end-user accessing cloud services.
49
What is Storage media?
Storage media are physical **devices** for storing *data in the cloud*.
50
What is a Network?
A Network connects multiple computers for data exchange.
51
What is an Operating system?
An Operating system is **software managing** computer *hardware and resources*.
52
What is a Database?
A Database is a *structured collection of data* for storage and retrieval.
53
What is a **Server** in a computer network?
A Server is **a computer** providing data or services *to other* computers.
54
What is a Shared memory computer?
A Shared memory computer has processors that share the *same memory and storage*.
55
What is a Shared storage computer?
A Shared storage computer has processors that share storage but *have private memory*.
56
What is a Shared nothing computer?
A Shared nothing computer has processors that share neither memory nor storage.
57
What are Parallel computers?
Parallel computers have *multiple processors* working under a single OS.
58
What is a Local area network?
A Local area network consists of cables connecting devices within a small area.
59
What is the Ethernet protocol?
The Ethernet protocol is a common *communication protocol* for local area networks.
60
What are Data transmission risks?
Data transmission risks are potential vulnerabilities when sending data online.
61
What is a *corp* Cloud service provider?
Cloud service providers are companies *offering database management* in the cloud.
62
What is Scalable storage?
Scalable storage is adjustable storage capacity based on customer needs.
63
What are Peak processing loads?
Peak processing loads refer to *high demand periods* requiring additional resources.
64
What is a Wide Area Network?
A Wide Area Network spans multiple facilities across large distances.
65
What is a Node in networking?
A Node is a computer connected in a network.
66
What is Shared Memory?
Shared Memory is when **multiple processors** share both memory and storage.
67
What is Shared Storage?
Shared Storage is when **multiple processors** share storage *but not memory*.
68
What is Shared Nothing?
Shared Nothing means each processor has **private** memory and storage.
69
What is the Ethernet Protocol?
The Ethernet Protocol is a standard for connecting computers in a network.
70
What is a Parallel Database?
A Parallel Database **runs on** parallel computers or clusters.
71
What is a Distributed Database?
A Distributed Database operates *on multiple computers* via a wide area network.
72
What is Query Decomposition?
Query Decomposition is breaking queries into parts for concurrent execution.
73
What is a Local Area Network?
A Local Area Network covers a small geographic area.
74
What are Technical Challenges in databases?
Technical Challenges arise from *slow* wide area networks.
75
What is a Unified View of Data?
A Unified View of Data **provides** *consistent data representation* for users and programmers.
76
What is Data Location Impact?
Data Location Impact refers to the significance of **data placement** on *query processing*.
77
What are Remote Queries?
Remote Queries access data from different locations.
78
What is Cluster Management Software?
This software coordinates nodes in a cluster. ## Footnote Node: computer on network
79
What are Memory Access Limitations?
Memory Access Limitations indicate **local networks** can't support shared memory access.
80
What is Data Visibility?
Data Visibility means only administrators see *physical data locations*.
81
What is Concurrency in databases?
Concurrency is the simultaneous execution of processes or queries.
82
What is Query Execution Speed?
Query Execution Speed is the *time taken to process* database queries.
83
What is Employee Data Storage?
Employee Data Storage refers to **local nodes storing data** for geographically distributed employees.
84
What are Communication Protocols?
Communication Protocols are *rules for data exchange* between networked devices.
85
What is a Processor?
A Processor is a component that executes instructions and processes data.
86
What is the Two-Phase Commit protocol?
The Two-Phase Commit protocol *ensures all nodes* commit or rollback updates.
87
What is a Transaction Coordinator?
- A transaction coordinator is a *software component that manages* distributed transaction processes. - It ensures that all parts of a transaction are completed successfully or that the entire transaction is rolled back if there is an error.
88
What is a Local Log?
A Local Log is temporary storage *for pending* updates on nodes.
89
What is a Commit Message?
A Commit Message is an **instruction** to finalize updates in a database.
90
What is Rollback?
Rollback is the reversal of changes made during a transaction.
91
What is a Confirmation Message?
A Confirmation Message is an acknowledgment *from nodes* to the transaction coordinator.
92
What is Data Assignment?
Data Assignment is the distribution of data **across nodes** in a database.
93
What is Node Availability?
Node Availability refers to the status of *nodes being operational* during transactions.
94
Who are Database Administrators?
Database Administrators are professionals *managing and optimizing* database performance. ## Footnote Aka root account
95
What are Distributed Database Characteristics?
Distributed Database Characteristics *indicate nodes are not visible* to users or programmers.
96
What is Two-phase locking?
*Two-phase locking* controls lock acquisition in transactions. ## Footnote [πŸ“–](https://share.evernote.com/note/ab75bacb-a62a-5c1f-a234-9749052a2c93)
97
What is Two-phase locking **Phase 1**?
In Phase 1, the coordinator notifies the nodes of updates.
98
What is Two-phase locking; **Phase 1 / 2**?
1. Phase 1 (Growing): Transactions _acquire_ all necessary locks. 2. Phase 2 (Shrinking): Transactions _release_ all locks and commit.
99
What is a Commit message?
- A Commit message instructs nodes to finalize updates. - The commit message flows from the [coordinator to the nodes](https://share.evernote.com/note/89fd9210-e3e2-d0d7-ca30-0916ede86dff), confirming the update.
100
What is a Distributed transaction?
A Distributed transaction involves multiple nodes *for data updates*.
101
What is a Local transaction?
A Local transaction updates data on a single node only.
102
What is Temporary inconsistency?
Temporary inconsistency occurs when nodes are *updated at different times*, causing discrepancies.
103
What is a Debit Node?
A Debit Node is responsible for processing debit transactions.
104
What is a Credit Node?
A Credit Node is responsible for processing credit transactions.
105
What is the CAP Theorem?
The CAP Theorem **limits** simultaneous *consistency, availability, and partition* **tolerances**.
106
What is a Network Partition?
A Network Partition is a failure; preventing nodes from communicating with each other.
107
What is a Partition-Tolerant Database?
A Partition-Tolerant Database **continues functioning** despite network partitions.
108
What is a Consistent Database?
A Consistent Database requires data across nodes to match and conform.
109
What is an SQL Update Statement?
An SQL Update Statement is a command to modify existing data in a database.
110
What is an SQL Select Statement?
An SQL Select Statement is π“ͺ 𝓬𝓸𝓢𝓢π“ͺ𝓷𝓭 𝓽𝓸 **retrieve data** from a database.
111
What is Percentage Availability?
Percentage Availability is the *proportion of time a database is responsive* to users.
112
What is a Node in a distributed database?
A Node is an individual unit within a distributed database system.
113
What is Table Data Partition?
Table Data Partition is a subset of data within a database table.
114
What is a Live Node?
A Live Node is a node that actively responds to queries.
115
What is a Replicated Database?
A Replicated Database maintains multiple copies on different devices.
116
What is a Simple Backup?
A Simple Backup involves *backing up one replica* while others are active.
117
What is Enhanced Security?
Enhanced Security refers to improved data protection through multiple replicas.
118
What is Shared Nothing Architecture?
Shared Nothing Architecture is where nodes do not share storage.
119
What is Shared _Storage_?
Shared Storage allows multiple **nodes** to access common storage.
120
What is Shared _Memory_?
Shared Memory is an architecture where **processors** share a common memory.
121
What is Complex Server Administration?
Complex Server Administration involves *managing update propagation* across replicas.
122
What is Replication?
Replication is used in parallel and distributed databases.
123
What are Frequent Reads?
Frequent Reads are a common scenario for using database replication. ## Footnote [Learn more](https://share.evernote.com/note/8d97a231-e0a3-df5b-c658-6ff7ff733d70)
124
What are Infrequent Updates?
- Infrequent Updates occur less often than reads. - The frequency of data reads is usually much higher than the frequency of data updates.
125
What is Temporary Inconsistency in replication?
Temporary Inconsistency is an acceptable state during data replication. ## Footnote [Learn more](https://share.evernote.com/note/f27d317a-6a36-ec66-cb24-8cf039532127)
126
What is a Catalog in databases?
A Catalog is a *directory of information* describing database **objects**.
127
What is a Central Catalog?
A Central Catalog is **catalog information** stored on a single node.
128
What is Group Replication?
Group Replication is a technique allowing *updates from any node* in the **group**.
129
What do Double-Headed Arrows indicate?
Double-Headed Arrows indicate **bidirectional communication** between nodes and the cloud.
130
What are Animation Captions?
Animation Captions are visual aids explaining the *database update process*.
131
What is Transaction Information?
Transaction Information is data sent to nodes *before commit* to prevent conflicts.
132
What is a Database Management System?
A Database Management System is software for managing databases and processing queries.
133
What are Replicated Databases?
Replicated Databases maintain copies of data across nodes.
134
What is a Central Catalog?
A Central Catalog contains the entire catalog on a single node.
135
What are Local Transactions?
Local Transactions are transactions affecting only one node's data.
136
What is an Out-of-Date Replica?
An Out-of-Date Replica is not updated with the latest changes.
137
What is an Error Display in databases?
An *Error Display* advises resubmitting queries if **outdated**.
138
What is Operational Data?
Operational Data is used for daily business functions.
139
What is Analytic Data?
Analytic Data is used for business analysis and planning.
140
What is Volatility in data?
- Volatility refers to the **rate of change** in operational data. - Data volatility is about the information. It's about how often the *meaning of the data* changes.
141
What is Extracted Data?
Extracted Data is pulled *from operational databases* for processing.
142
What is Data Mart Derivation?
Data Mart Derivation is the creation of a data mart *from operational or warehouse* data.
143
What are Performance Issues?
Performance Issues are problems affecting the speed of database **queries**.
144
What is Data Refresh Frequency?
Data Refresh Frequency refers to regular intervals at which data is updated in the **warehouse**.
145
What is Query Conflict?
Query Conflict refers to *discrepancies in reports* generated by simultaneous queries.
146
What is Metric System Conversion?
Metric System Conversion is changing *measurements to metric* for consistency.
147
What is a system Catalog?
1. A collection of tables within the database that store **metadata** about the database itself. 2. Includes information about tables, columns, data types, constraints, and other _structural details_.
148
What is a Learning Management System?
A software application used to create, deliver, track, and manage online courses and training programs
149
What is a Temporary Database?
A Temporary Database is short-term storage for data processing.
150
What is Data Standardization?
Data Standardization applies _consistent_ formats and definitions to data.
151
What is a Fact Table?
A Fact Table stores quantitative data for **analysis**.
152
What is a Dimension Table?
A Dimension Table contains *descriptive attributes* related to facts.
153
What is Summary Data?
Summary Data is **precomputed** data stored for fast access.
154
What is a Dimensional Model?
A Dimensional Model is a structure used *to organize data* in a data warehouse.
155
What is Nonvolatile Data?
1. Nonvolatile Data remains **unchanged** (via power outage) until updated or replaced. 2. It's stored in non-volatile storage mediums, such as hard drives. flash drives, and optical discs
156
What is Query Submission?
Query Submission is the *process of requesting information* from a data warehouse.
157
What are Ad Hoc Queries?
Ad Hoc Queries are *spontaneous queries* for specific data analysis needs.
158
What are Operational Systems?
Operational Systems are designed for executing business transactions.
159
What is Analysis Paralysis?
Analysis Paralysis is overthinking that *hinders timely decision-making*.
160
What are Reporting and Analytics?
Reporting and Analytics are *processes for interpreting data*, to inform business decisions.
161
Who are BI Consumers?
BI Consumers are individuals *providing requirements* for data analysis.
162
What is Data Profiling?
Data Profiling is *assessing data quality* of candidate sources.
163
What are Dimensional Models?
Dimensional Models are structures designed for analytical data representation.
164
What is Data Extraction?
Data Extraction is the process of retrieving data from sources.
165
What is Data Cleansing?
Data Cleansing involves removing inaccuracies from data sets.
166
What is Data Replication?
Data Replication is copying data from one database to another.
167
What is *Change Data Capture*?
Change Data Capture is tracking changes in data sources.
168
What are Computational Requirements?
Computational Requirements are resources needed for *processing data _aggregations_.* Which includes: 1. The hardware (CPU, GPU, memory, storage) 2. software (operating system, databases, algorithms) that enable data manipulation and analysis
169
What are Parallel Machine Architectures?
Parallel Machine Architectures are systems designed to handle large data processing. ## Footnote [πŸ“–](https://share.evernote.com/note/0cf50478-e505-0a5a-80e9-7f3d6cc07738)
170
What is Storage Space Calculation?
Storage Space Calculation is determined by dimensions and their sizes.
171
What is Data Migration?
Data Migration is moving data from one system to another. ## Footnote [πŸ“–](https://share.evernote.com/note/b3bcc7fd-8a37-a648-7141-997bbb09f59f)
172
What is a Staging Area?
A Staging Area is infrastructure for _aggregating_ raw data sets.
173
What is Data Validation?
Data Validation ensures *data accuracy and quality* before processing. ## Footnote [πŸ“–](https://share.evernote.com/note/f7982682-682b-165e-9fc5-b809fa0c4cdf)
174
What is Data Cleaning?
Data Cleaning involves removing inaccuracies and inconsistencies from data.
175
What is Data Transformation?
Data Transformation modifies data into a **suitable format** for analysis.
176
What is *Data Requirements* Analysis?
Data Requirements Analysis *identifies data needed* for business intelligence applications.
177
What is Automated Extraction?
Automated Extraction refers to tools that *automate data retrieval processes*.
178
What are SQL Queries?
SQL Queries are structured queries for data **extraction**.
179
What are Adapters?
Adapters are connectors for different data sources. ## Footnote [πŸ“–](https://share.evernote.com/note/06c5adcb-2d22-08d6-4228-33a6624232d3)
180
What are Code Generation Techniques?
Code Generation Techniques are methods for creating executable *data extraction programs*.
181
What is a Source System?
A Source System is **the original** system from which data is extracted.
182
What is Warehouse Population?
*Warehouse Population* is filling the data warehouse with processed data.
183
What is Metadata?
Metadata is data that describes other data.
184
What is Renormalization?
Renormalization is reverting denormalized data back to normalized form.
185
What is Aggregation?
Aggregation is *summarizing data* for reporting or analysis.
186
What is Audit Information?
Audit Information is *data used to verify* integrity and accuracy.
187
What are Row Counts?
Row Counts refer to the total number of rows in a dataset.
188
What is Null Conversion?
Null Conversion transforms various null representations *into a standard form*.
189
What is Data Augmentation?
Data augmentation is a **statistical technique** that *creates new data from existing* data to add to data records.
190
What does *the architecture* of a database system describe?
The internal components and the relationships between components ## Footnote This includes understanding how different parts of the database system interact with each other.
191
What is the role of the **query processor** in a database system?
1. Interprets queries 2. creates a plan to *modify* the database or *retrieve data* 3. Returns query results to the application ## Footnote The query processor optimizes queries for efficiency.
192
How does the *storage manager function* in a database system?
**Translates** the query processor instructions *into low-level file system commands* that modify or retrieve data ## Footnote It also uses indexes to quickly locate data due to the varying sizes of databases.
193
What is the function of the transaction manager?
Ensures transactions are properly executed and **prevents conflicts** between concurrent transactions ## Footnote It also restores the database to a consistent state in case of failure.
194
What is contained in the log file of a database?
A complete record of all inserts, updates, and deletes processed by the database ## Footnote The transaction manager writes log records before applying changes for recovery purposes.
195
What is the purpose of the catalog in a database system?
A directory of tables, columns, indexes, and other database objects ## Footnote Other components use catalog information to process and execute queries.
196
True or False: All database systems support transactions.
False ## Footnote Some database systems do not support transactions and therefore lack a transaction manager.
197
Fill in the blank: The database sizes range from _______ to many terabytes.
megabytes
198
What does the transaction manager do in the event of a failure?
Uses log records to restore the database ## Footnote This ensures that the database can return to a consistent state after a failure.
199
What happens to a database transaction if a server failure occurs before completion?
The database must reverse any partial changes ## Footnote This is necessary to maintain data integrity.
200
What is query optimization in the context of a database?
The process of ensuring the most efficient instructions are executed on the data ## Footnote It is performed by the query processor.
201
What might vary significantly between different database systems?
Capabilities and component details ## Footnote For example, the physical structure of data on storage media affects the storage manager's implementation.
202
What are the main components of a database system?
A query processor, storage manager, transaction manager, log, and catalog.
203
What role does the query processor play in a database system?
It receives queries from applications and performs query optimization using information from the catalog.
204
How does the storage manager interact with the query processor?
It translates the query processor's instructions into file-system commands and uses an index to locate requested data quickly.
205
What does the transaction manager do in a database system?
It logs insert, update, and delete queries, sending the result back to the application.
206
Does the query processor has direct access to the database data on storage media.
No, all access to the database data must go through the storage manager
207
Without query optimization, the storage manager cannot $ ?
1. The storage manager can retrieve the data with *efficient or inefficient* instructions 2. Query optimization ensures the *most efficient instructions* are sent to the storage manager.
208
The catalog allows the storage manager to $ ?
Catalogs allow quick locations to the requested data.
209
What must be done in order to correctly restore database changes? 2
1. All inserts, updates, and deletes must be *logged in order* to correctly restore database changes. 2. Reads are not needed to restore database changes, however, and thus are not logged.
210
What is metadata in the context of a database?
Data about the database, such as column names and the number of rows in each table.
211
Where is metadata stored in a database system?
In the catalog.
212
The _______ helps locate data quickly by providing information about where on the storage media certain pieces of data can be found.
[index]
213
What types of queries does the transaction manager log?
Insert, update, and delete queries.
214
What is the purpose of query optimization?
To ensure the most efficient instructions are sent to the storage manager.
215
What is a **concurrent** execution?
1. Concurrent execution is when multiple processes are active at the same time, rather than one after the other. 2. It can also refer to the execution of multiple transactions at the same time
216
What is query optimization?
A crucial aspect of database management focused on enhancing the efficiency and speed of data retrieval operations. ## Footnote It involves techniques and strategies aimed at minimizing resources for executing queries while maximizing result quality.
217
What is the first step in the query optimization process?
The analysis of queries to understand their execution plans. ## Footnote Execution plans are the specific steps the database engine takes to retrieve requested data.
218
How does indexing contribute to query optimization?
Allows for quicker data access and reduces the amount of data the database must scan. ## Footnote Proper indexing speeds up response times significantly.
219
What are optimal join methods in query optimization?
Techniques for combining data from multiple tables, including: * Nested loops * Hash joins * Merge joins ## Footnote Selecting the appropriate join method based on data characteristics is vital for performance.
220
Why are statistics about data distribution and cardinality important?
They help the query optimizer make better decisions regarding execution plans. ## Footnote Keeping these statistics updated is essential for maintaining optimal performance as data evolves.
221
What is query rewriting?
A strategy where queries are reformulated to improve performance without altering the returned results. ## Footnote This can include simplifying expressions and consolidating multiple queries.
222
How do caching mechanisms enhance query performance?
By storing previously retrieved results to quickly return cached results instead of recalculating them. ## Footnote This saves time and resources during query execution.
223
What is the overall goal of query optimization?
To ensure that queries run as efficiently as possible, improving performance and user satisfaction in data-driven applications.
224
True or False: Query optimization involves only the analysis of queries.
False. ## Footnote Query optimization also involves techniques like indexing, join methods, query rewriting, and caching.
225
What is the memory hierarchy?
A structured arrangement of different types of memory in a computer system, organized by speed and cost.
226
What are registers in the memory hierarchy?
Small, fast storage locations within the CPU that hold the most frequently used data and instructions for quick access.
227
What is the function of the L1 cache?
Stores copies of frequently accessed data from main memory, allowing for quick retrieval.
228
What is the difference between L1 and L2 cache?
L1 is on-chip and smaller, while L2 is off-chip and larger, serving as an intermediary between L1 cache and main memory.
229
What is the primary storage area in a computer called?
Main Memory (DRAM).
230
What types of devices are considered local secondary storage?
Hard Drives (HDDs) and Solid State Drives (SSDs).
231
What is remote secondary storage?
Storage located on networks, often accessed via the internet, providing access to large amounts of data stored off-site.
232
Fill in the blank: Each level in the memory hierarchy provides a trade-off between _______ and cost.
speed.
233
What is the function of local secondary storage?
Holds data and programs that are not currently in use; slower than main memory.
234
What is the role of access patterns in memory hierarchy?
Programs are designed to access data in a way that takes advantage of the hierarchy, using faster memory for frequently accessed data.
235
What manages database connections and compiles SQL queries?
The query processor ## Footnote The query processor is responsible for managing connections and translating SQL into low-level storage engine instructions.
236
What authorizes a user to access a database?
A connection ## Footnote Connections are essential for user access to databases.
237
What is MySQL Workbench?
A desktop application that enables database administrators and users to interact with the database ## Footnote MySQL Workbench does not directly interact with the query processor.
238
How does the file system interact with the query processor?
Indirectly, via the storage engine ## Footnote The file system does not interact directly with the query processor.
239
Is the query processor aware of data addresses and structures on storage media?
False ## Footnote The query processor executes instructions compiled from SQL queries without knowledge of data addresses.
240
What does the storage engine execute?
Low-level instructions generated by the query processor from SQL queries ## Footnote This execution is crucial for the actual data operations within the database.
241
What do storage engines that support transactions manage?
Concurrency and recovery ## Footnote This management is essential for maintaining data integrity in transactional systems.
242
What are the five categories of utility programs?
Installation, client, administrative, developer, miscellaneous tools
243
How many tools do utility programs include approximately?
Approximately 30 tools
244
Who are the primary users of utility programs?
Database administrators or programmers
245
What is one example function of utility programs?
Upgrade existing databases to a new MySQL release ## Footnote Other functions include backing up databases, importing data, inspecting log files, and administering database servers.
246
Fill in the blank: Utility programs are intended primarily for _______.
[database administrators or programmers]
247
What are application programs built by?
Programmers using MySQL
248
Are application programs part of MySQL?
No
249
What is Workbench?
A desktop application with broad capabilities
250
What can Workbench do?
Assign data access privileges and monitor database performance ## Footnote Many Workbench capabilities are administrative functions.
251
What is mysql_config_editor?
A tool in the administrative group of utility programs
252
What does mysql_config_editor help database administrators do?
Configure MySQL
253
Are MySQL APIs older or newer than Connectors? ## Footnote A connection is a link between tools and the query processor.
Older
254
What are many Connectors built on top of?
The C API
255
What does the query parser do?
Checks each SQL statement for correct syntax and converts the statement to an internal representation ## Footnote The query parser is essential for ensuring that SQL commands are valid before execution.
256
What is the role of the connection manager?
Processes connection requests from the tools layer ## Footnote The connection manager is responsible for establishing links between the user applications and the database.
257
What information does a database connection include?
* Database name * Server address * Logon credentials ## Footnote These elements are necessary for successful communication with the database.
258
What is the cache in the context of database management?
An area of main memory that contains recent queries, execution plans, and results for possible reuse ## Footnote Caching improves performance by reducing the need to reprocess frequent queries.
259
What is the function of the cache manager?
Stores reusable information in main memory ## Footnote The cache manager optimizes access to frequently used data, enhancing overall system efficiency.
260
Fill in the blank: The cache is an area of main memory that contains recent queries, execution plans, and _______.
[results]
261
True or False: The connection manager handles the actual execution of SQL statements.
False ## Footnote The connection manager is involved in managing connections, not executing queries.
262
What does the file system layer consist of?
Data stored on storage media and organized in files ## Footnote The file system layer is integral to managing how data is stored and organized in a database.
263
What are the three types of data contained in a database file system?
* User data * Log files * Data dictionary ## Footnote These three types of data serve specific functions within the database management system.
264
What does user data include in a database?
* Tables * Indexes ## Footnote User data is crucial for storing actual information and facilitating efficient data retrieval.
265
What is the purpose of log files in a database?
To contain a detailed, sequential record of each change applied to a database ## Footnote Log files are essential for the recovery system to restore data after failures.
266
How does the recovery system utilize log files?
To restore data in the event of a transaction, system, or storage media failure ## Footnote Log files help ensure data integrity and reliability.
267
What is a catalog in the context of a relational database?
A directory of tables, columns, keys, indexes, and other objects ## Footnote The catalog, or data dictionary, is critical for managing database structure.
268
What term does MySQL use for its data dictionary?
'Data dictionary' ## Footnote MySQL's implementation includes a structured approach to managing metadata.
269
How many tables does the MySQL data dictionary contain?
Roughly 30 tables ## Footnote These tables provide comprehensive information about the database schema.
270
What does the 'tables' data dictionary table describe?
All tables ## Footnote This table provides metadata about the structure of each table in the database.
271
What information does the 'table stats' data dictionary table contain?
Table statistics, such as the number of rows in each table ## Footnote This information is useful for optimization and performance tuning.
272
What does the 'columns' data dictionary table describe?
All columns ## Footnote This table details the structure and attributes of each column in the database.
273
What information does the 'foreign keys' data dictionary table provide?
All foreign keys ## Footnote This table is important for understanding relationships between tables.
274
What does the 'indexes' data dictionary table describe?
All indexes ## Footnote Indexes are crucial for improving query performance.
275
What does the 'routines' data dictionary table contain?
All stored procedures and stored functions ## Footnote This table helps manage executable database logic.
276
What information does the 'triggers' data dictionary table describe?
All triggers ## Footnote Triggers are automated actions that occur in response to certain events in the database.
277
Can data dictionary tables be accessed directly with SELECT, INSERT, UPDATE, and DELETE queries?
No ## Footnote Direct access is restricted to maintain the integrity of the data dictionary.
278
How can the contents of data dictionary tables be accessed?
Indirectly ## Footnote This is done through specific commands like SHOW, which compile as SELECT queries.
279
What does the SHOW COLUMNS command generate?
A SELECT query against the columns table ## Footnote This command is used to retrieve column information for a specific table.
280
What type of query does CREATE generate against dictionary tables?
INSERT ## Footnote This reflects the addition of new objects to the database schema.
281
What type of query does ALTER generate against dictionary tables?
UPDATE ## Footnote This reflects modifications to existing database objects.
282
What type of query does DROP generate against dictionary tables?
DELETE ## Footnote This reflects the removal of objects from the database schema.
283
How can multiple nodes can be updated independently with local transactions?
1. The transaction coordinator notifies participating nodes of required updates. 2. Nodes commit immediately and confirm with the transaction coordinator. 3. If a node is unavailable, the transaction coordinator repeats the update message until confirmation is received.
284
What are updates done in a distributed transactions and in separate local transactions called?
1. Updates in separate local transactions are asynchronous. 2. Updates in a distributed transaction are synchronous.
285
What must most distributed databases always function as?
Partition-tolerant ## Footnote This means they can continue to operate even when there are network issues.
286
In a distributed database, what do consistency and availability guarantee?
Either consistency or availability, but not both ## Footnote This is a key principle in distributed database design.
287
What does a consistent database conform to?
All rules at all times ## Footnote Consistency ensures that data integrity is maintained across all nodes.
288
Give an example of a rule that may govern data in a distributed database.
Foreign key values on one node must match primary key values on another node ## Footnote This ensures relational integrity across nodes.
289
What must be true about copies of data on multiple nodes in a consistent database?
They must be identical ## Footnote This is crucial for maintaining data consistency.
290
What must live nodes do in an available database?
Respond to queries at all times ## Footnote This means that even if some nodes are down, the database remains operational.
291
What occurs when a network error prevents nodes from communicating?
A network partition forms ## Footnote This can disrupt the functioning of distributed databases.
292
Why do distributed databases occasionally experience network partitions?
Because nodes are connected by wide area networks that occasionally fail ## Footnote This vulnerability is inherent in distributed systems.
293
What characterizes a partition-tolerant database?
It continues to function when a network partition occurs ## Footnote This is essential for maintaining operational integrity in distributed systems.
294
What is the process for updating replicas in a database on a single node?
It is a simple process where either storage solutions handle replication internally or the database updates all replicas in a single local transaction.
295
What is required for synchronizing replicas in a database?
Synchronizing replicas does not necessitate any special database features.
296
What complicates the process of updating replicated data in a distributed database?
It is more complicated because updating all replicas in a distributed transaction is generally slower and fails if any replica is down.
297
What is the primary/secondary method in managing replicated data?
It assigns one node as the primary where all updates are made initially, and secondary nodes are updated afterward.
298
What happens if the primary node fails in the primary/secondary method?
The system automatically identifies a new primary node to maintain availability.
299
How does the group replication method work?
Updates can be applied to any node within a group, which shares transaction details with other nodes before committing.
300
What occurs if a conflict arises during the group replication method?
An algorithm decides which transaction will proceed and which will be rolled back.
301
What can influence the complexity of the algorithm used in group replication?
It can range from simple, allowing the first transaction to commit, to more complex solutions.
302
What happens during a network partition in group replication?
Processing will be paused temporarily.
303
How does participation activity differ among relational databases?
Support for primary/secondary and group replication methods can differ.
304
Which database supports both primary/secondary and group replication methods?
MySQL with the InnoDB storage engine.
305
What advantage does primary/secondary replication provide for read queries?
Each node can handle read queries independently without needing to connect with other nodes across the wide area network.
306
What is the process of reading data from operational systems and writing it to a temporary database called?
'extraction' ## Footnote This is the initial step in the data processing pipeline.
307
What is 'cleansing' in the context of data management?
Correcting errors in operational data and converting to a standard format ## Footnote Cleansing ensures data quality and consistency.
308
What issue arises from data in different operational systems?
Incompatible or missing keys ## Footnote This can complicate data integration efforts.
309
What is one example of data integration?
Creating uniform primary and foreign keys ## Footnote This facilitates consistent data relationships across systems.
310
What is the term for converting data from a design optimized for operations to a design optimized for analytics?
'restructuring' ## Footnote Restructuring is crucial for effective data analysis.
311
What do dimensional designs in data warehouses usually include?
Date and time dimension tables ## Footnote These tables are essential for tracking historical data.
312
How many rows would the date dimension table contain if an organization tracks data for 100 years?
36,500 rows ## Footnote This is calculated as 100 years x 365 days per year.
313
How many rows does the time dimension table contain?
1,440 rows ## Footnote This is calculated as 24 hours x 60 minutes per hour.
314
What do fact tables contain in relation to date and time dimensions?
Foreign keys referencing date, time, or both dimensions ## Footnote This establishes the time of a fact.
315
What is a dimension hierarchy?
A sequence of columns where each column has a one-many relationship to the next column ## Footnote This structure helps in organizing data in a meaningful way.
316
What is a data lake?
An analytic database of raw, unprocessed data copied from multiple data sources. ## Footnote Data lakes share characteristics of both data warehouses and federated databases.
317
What role does the global query processor play in a federated database?
It decomposes federated queries and merges the results into a federated result. ## Footnote Individual databases process decomposed queries independently.
318
When is referential integrity enforced in InfoSphere?
When primary and foreign keys are in the same database. ## Footnote Referential integrity is not enforced when they are in different databases.
319
What is a federated database?
A collection of two or more participating databases under a coordinating software layer. ## Footnote The participating databases are autonomous and heterogeneous.
320
Define an autonomous database.
A database that operates independently of other participating databases. ## Footnote It can be administered and queried as if it were not part of a federated database.
321
What does heterogeneous mean in the context of databases?
Databases that run under different database systems or have incompatible schema. ## Footnote This may include inconsistent primary and foreign keys or similar tables with different designs.
322
What is the function of middleware in a federated database?
Middleware serves as the coordinating software layer between application programs and database software. ## Footnote Many federated database middleware products are available.
323
Name two examples of federated database middleware products.
* InfoSphere Federated Server from IBM * WebLogic Server from Oracle
324
What is a global catalog in a federated database?
A directory of participating database objects, such as tables, columns, and indexes. ## Footnote It is one of the components found in most federated database middleware products.
325
What does a database wrapper do?
Converts decomposed queries to the appropriate syntax for each participating database.
326
True or False: A federated database can include databases with compatible schemas only.
False ## Footnote Federated databases may include heterogeneous databases with incompatible schemas.
327
What is the typical nature of web click data?
Rapidly collected and high volume
328
How is social media data typically characterized?
Unstructured
329
What type of data analysis is often appropriate for high-volume, unstructured data?
Data lake supported by data scientists
330
What solution can make three operational databases appear as one?
Federated database
331
What types of data are usually highly structured and suitable for fact tables?
Sales revenue and manufacturing costs
332
What do analysis of revenue and cost by product line and region suggest?
Dimension tables
333
What is required for quarterly reports?
Periodic extracts at a fixed period
334
What do the requirements for periodic extracts suggest?
Data warehouse or data mart
335
What can easily handle data that is local to one software layer and not persisted on storage media?
Embedded database
336
When is an embedded database not appropriate?
If the data must be shared or persisted