DATABASE - FINALS Flashcards

(179 cards)

1
Q

allows you to create database and table structures, perform basic data management chores (add, delete and modify) and perform complex queries designed to transform raw data into useful information.

A

Database Language

2
Q

must perform such basic functions with minimal user effort, and its command structure and syntax must be easy to learn.

A

Database Language

3
Q

must be portable; that is, it must conform to some basic standard, so a person does not have to relearn the basics when moving from one RDBMS to another.

A

Structured Query Language

4
Q

SQL Major Components:

A
  1. Data Definition Language (DDL)
  2. Data Manipulation Language (DML)
  3. Transaction Control Language (TCL)
  4. Data Control Language (DCL)
5
Q

SQL includes commands to create database objects such as tables, indexes, and views, as well as commands to define access rights to those database objects.

A

Data Definition Language (DDL)

6
Q

SQL includes commands to insert, update, delete and retrieve data within the database tables.

A

Data Manipulation Language (DML)
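
Example (a minimal sketch; the STUDENT table and its columns are illustrative, not from the deck):

-- DDL: define a table structure
CREATE TABLE STUDENT (
    STU_NUM   INTEGER      PRIMARY KEY,
    STU_LNAME VARCHAR(30)  NOT NULL,
    STU_GPA   DECIMAL(3,2)
);

-- DML: insert, retrieve, update, and delete rows
INSERT INTO STUDENT (STU_NUM, STU_LNAME, STU_GPA) VALUES (1001, 'Cruz', 3.25);
SELECT STU_NUM, STU_LNAME FROM STUDENT WHERE STU_GPA >= 3.00;
UPDATE STUDENT SET STU_GPA = 3.50 WHERE STU_NUM = 1001;
DELETE FROM STUDENT WHERE STU_NUM = 1001;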

7
Q

DML commands in SQL are executed within the context of a transaction, which is a logical unit of work composed of one or more SQL statements, as defined by business rules.

A

Transaction Control Language (TCL)

8
Q

SQL provides commands to control the processing of these statements as an indivisible unit of work.

A

Transaction Control Language (TCL)

9
Q

commands are used to control access to data objects, such as giving one user permission to only view a certain table, and giving another user permission to change the data in that table.

A

Data Control Language (DCL)
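
Example (a sketch; the user names and table are illustrative, and privilege syntax varies slightly by RDBMS):

GRANT SELECT ON STUDENT TO clerk_user;           -- view-only access
GRANT SELECT, UPDATE ON STUDENT TO registrar;    -- may also change the data
REVOKE UPDATE ON STUDENT FROM registrar;         -- withdraw a privilege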

10
Q

it groups the data from the SELECT table(s) and produces a single summary row for each group.

A

Group by Clause
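
Example (illustrative table and column names):

-- One summary row per department
SELECT DEPT_CODE, COUNT(*) AS NUM_EMP, AVG(EMP_SALARY) AS AVG_SALARY
FROM EMPLOYEE
GROUP BY DEPT_CODE
HAVING COUNT(*) > 5;   -- optional condition applied to each group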

11
Q

is used to combine records from two or more tables in a database.

A

SQL Joins

12
Q

To combine records based on a common field between them, other keywords are combined with the SELECT statement. These keywords are:

A

INNER JOIN
OUTER JOIN
- Left Join
- Right Join
- Full Join

13
Q

Inner Join keyword

A

EQUIJOIN

14
Q

returns rows when there is at least one match in both tables

A

Inner Join or Equijoin
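
Example (illustrative CUSTOMER and INVOICE tables joined on a common field):

-- Only customers that have at least one matching invoice are returned
SELECT C.CUS_CODE, C.CUS_LNAME, I.INV_NUMBER
FROM CUSTOMER AS C
INNER JOIN INVOICE AS I ON C.CUS_CODE = I.CUS_CODE;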

15
Q

is an extension of the Inner Join, but it does not require each record in the two joined tables to have a matching record.

A

Outer Join

16
Q

the joined table retains each record - even if no other matching record exists.

A

Outer Join

17
Q

Outer joins subdivide further into:

A
  • left outer joins
  • right outer joins
  • full outer joins
18
Q

returns all rows from the left table, even if there are no matches in the right table

A

Left Join

19
Q

returns all the rows from the right table, even if there are no matches in the left table

A

Right Join

20
Q

both tables are secondary (or optional), such that if rows are being matched in table A and table B, then all rows from table A are displayed even if there is no matching row in table B, and vice versa

A

Full Outer Join

21
Q

returns all possible combinations of rows from the two tables

A

Cross Join
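
Example (same illustrative CUSTOMER and INVOICE tables; note that FULL JOIN is not supported by every RDBMS, MySQL for example lacks it):

-- LEFT JOIN: every customer, even those without invoices
SELECT C.CUS_CODE, I.INV_NUMBER
FROM CUSTOMER AS C
LEFT JOIN INVOICE AS I ON C.CUS_CODE = I.CUS_CODE;

-- RIGHT JOIN: every invoice, even without a matching customer row
SELECT C.CUS_CODE, I.INV_NUMBER
FROM CUSTOMER AS C
RIGHT JOIN INVOICE AS I ON C.CUS_CODE = I.CUS_CODE;

-- FULL JOIN: all rows from both tables, matched where possible
SELECT C.CUS_CODE, I.INV_NUMBER
FROM CUSTOMER AS C
FULL JOIN INVOICE AS I ON C.CUS_CODE = I.CUS_CODE;

-- CROSS JOIN: every possible combination of rows from the two tables
SELECT C.CUS_CODE, I.INV_NUMBER
FROM CUSTOMER AS C
CROSS JOIN INVOICE AS I;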

22
Q

SELECT statement embedded within another SELECT statement

A

Subqueries

23
Q

the results of this inner SELECT statement (or subselect) are used in the outer statement to help determine the contents of the final result.

A

Subqueries

24
Q

Types of Subqueries

A
  1. Scalar Subquery
  2. Row Subquery
  3. Table Subquery
25
returns a single column and a single row, that is, a single value
Scalar Subquery
26
returns multiple columns but only a single row
Row Subquery
27
returns one or more columns and multiple rows
Table Subquery
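Example (illustrative EMPLOYEE table; row subqueries are not supported by every RDBMS):

-- Scalar subquery: one row, one column
SELECT EMP_LNAME
FROM EMPLOYEE
WHERE EMP_SALARY > (SELECT AVG(EMP_SALARY) FROM EMPLOYEE);

-- Row subquery: one row, several columns
SELECT EMP_LNAME
FROM EMPLOYEE
WHERE (DEPT_CODE, JOB_CODE) = (SELECT DEPT_CODE, JOB_CODE
                               FROM EMPLOYEE
                               WHERE EMP_NUM = 100);

-- Table subquery: several rows and columns, used here as a derived table
SELECT D.DEPT_CODE, D.AVG_SAL
FROM (SELECT DEPT_CODE, AVG(EMP_SALARY) AS AVG_SAL
      FROM EMPLOYEE
      GROUP BY DEPT_CODE) AS D
WHERE D.AVG_SAL > 50000;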
28
used to improve the efficiency of searches and to avoid duplicate column values; they can be created on the basis of any selected attribute.
SQL Indexes
29
used to delete an index
Drop Index
30
the syntax of Drop Index
DROP INDEX indexname;
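Example (illustrative names; some products, such as MySQL, also require ON tablename in DROP INDEX):

CREATE INDEX EMP_LNAME_NDX ON EMPLOYEE (EMP_LNAME);          -- speed up searches
CREATE UNIQUE INDEX EMP_EMAIL_NDX ON EMPLOYEE (EMP_EMAIL);   -- avoid duplicate values
DROP INDEX EMP_LNAME_NDX;                                    -- delete the index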
31
a virtual table based on a SELECT query
View
32
the query can contain columns, computed columns, aliases and aggregate functions from one or more tables.
View
33
syntax for CREATE VIEW
CREATE VIEW viewname AS SELECT query
34
data definition command that stores the subquery specification
CREATE VIEW
35
used to generate the virtual table; it is stored in the data dictionary.
SELECT statement
36
is the keyword to modify a created view
ALTER VIEW
37
is used to delete a view that was previously created
DROP VIEW
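Example (illustrative view and table names; some products, such as Oracle, use CREATE OR REPLACE VIEW instead of ALTER VIEW):

CREATE VIEW HIGH_GPA_STUDENTS AS
    SELECT STU_NUM, STU_LNAME, STU_GPA
    FROM STUDENT
    WHERE STU_GPA >= 3.00;

ALTER VIEW HIGH_GPA_STUDENTS AS
    SELECT STU_NUM, STU_LNAME, STU_GPA
    FROM STUDENT
    WHERE STU_GPA >= 3.50;

DROP VIEW HIGH_GPA_STUDENTS;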
38
a system that provides for data collection, storage and retrieval; facilitates the transformation of data into information; and manages both data and information.
Information System
39
An information system is composed of:
- hardware - DBMS and other software - people - procedures
40
is the process that establishes the need for an information system and its extent
System Analysis
41
is the process of creating an information system
System Development
42
Generating Information for decision making:
1. Data 2. Application Code 3. Information 4. Decisions
43
The performance of an information system depends on three factors:
1. Database design and implementation 2. Application design and implementation 3. Administrative procedures
44
is the process of database design and implementation
Database Development
45
Its goal is to create complete, normalized, nonredundant (to the greatest extent possible), and fully integrated conceptual, logical, and physical database models.
Database Design
46
this phase includes creating the database storage structure, loading data into the database and providing for data management.
Implementation Phase
47
is a cycle that traces the history of an information system.
System Development Life Cycle (SDLC)
48
provides the big picture within which database design and application development can be mapped out and evaluated
System Development Life Cycle (SDLC)
49
Phases of System Development Life Cycle (SDLC)
- Planning - Analysis - Detailed Systems Design - Implementation - Maintenance
50
phase yields a general overview of the company and its objectives
Planning
51
problems defined during the planning phase are examined in greater detail during the ----
Analysis Phase
52
the designer completes the design of the system's processes
Detailed Systems Design
53
the design includes all the necessary technical specifications for the screens, menu, reports and other devices that might help make the system a more efficient information generator.
Detailed Systems Design
54
the hardware, the DBMS software and application programs are installed and the database design is implemented
Implementation
55
during the initial stages of ---, the system enters a cycle of coding, testing, and debugging until it is ready to be delivered
Implementation
56
the actual database is created, and the system is customized by the creation of tables and views, user authorizations, and so on.
Implementation
57
The system is in full operation at the end of this phase, but it will be continuously evaluated and fine-tuned
Implementation
58
almost as soon as the system is operational, end users begin to request changes in it.
Maintenance
59
Three types of Maintenance
1. Corrective Maintenance 2. Adaptive Maintenance 3. Perfective Maintenance
60
a cycle that traces the history of a database within an information system
Database Life Cycle
61
6 Phases of Database Life Cycle
1. Database Initial Study 2. Database Design 3. Implementation and loading 4. Testing and evaluation 5. Operation 6. Maintenance and evolution
62
the designer's job is to make sure that his or her database system objectives correspond to those envisioned by the end user.
Database Initial Study
63
It focuses on the design of the database model that will support company operations and objectives.
DATABASE DESIGN
64
The most critical DBLC phase since it will make sure that the final product meets the user and system requirements.
DATABASE DESIGN
65
Two views of the data within the systems (Database Design)
1. Business View 2. Designer's View
66
This includes creation of tables, attributes, domains, views, indexes, security constraints and storage and performance guidelines
IMPLEMENTATION AND LOADING
67
is a technique that creates logical representations of computing resources that are independent of the underlying physical computing resources.
Virtualization
68
The DBA tests and fine-tunes the database to ensure that it performs as expected.
TESTING AND EVALUATION
69
Database backups can be performed at different levels:
1. Full Backup 2. Differential Backup 3. Transaction Log Backup
70
A dump of the entire database, in which all database objects are backed up in their entirety.
Full Backup
71
in which only the objects that have been updated or modified since the last full backup are backed up.
Differential Backup
72
which backs up only the transaction log operations that are not reflected in a previous backup copy of the database.
Transaction Log Backup
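Example (a sketch in SQL Server's T-SQL; the database name and file paths are illustrative, and backup syntax differs in other products):

BACKUP DATABASE SalesDB TO DISK = 'D:\backup\salesdb_full.bak';                    -- full backup
BACKUP DATABASE SalesDB TO DISK = 'D:\backup\salesdb_diff.bak' WITH DIFFERENTIAL;  -- differential backup
BACKUP LOG SalesDB TO DISK = 'D:\backup\salesdb_log.trn';                          -- transaction log backup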
73
The database, its management, its users, and its application programs constitute a complete information system.
OPERATION
74
The database administrator must be prepared to perform routine maintenance activities within the database.
MAINTENANCE AND EVOLUTION
75
It is the first stage in the database design process.
Conceptual Design
76
A process that uses data modelling techniques to create a model of a database structure that represents real world objects as realistically as possible.
Conceptual Design
77
Conceptual Design Data Rule:
“All that is needed is there, and all that is there is needed.”
78
Conceptual Design Steps
1. Data Analysis and requirements 2. Entity relationship modeling and normalization 3. Data model verification 4. Distributed database design
79
The first step in conceptual design is to discover the characteristics of the data elements
DATA ANALYSIS AND REQUIREMENTS
80
The process of defining business rules and developing the conceptual model using ER diagrams.
Entity Relationship Modeling and Normalization
81
one of the last steps in the conceptual design stage, and it is one of the most critical.
Data Model Verification
82
If the database data and processes will be distributed across the system, portions of a database, known as database fragments, may reside in several physical locations.
Distributed Database Design
83
is a subset of a database that is stored at a given location. The database fragment may be a subset of rows or columns from one or multiple tables.
Database Fragment
84
Factors that affect the purchasing decision vary from company to company.
DBMS Software Selection
85
is the second stage in the database design process.
Logical Design
86
is the process of determining the data storage organization and data access characteristics of the database to ensure its integrity, security, and performance. This is the last stage in the database design process.
Physical Design
87
is any action that reads from or writes to a database.
transaction
88
A sequence of database requests that accesses the database.
Transaction
89
is a logical unit of work; that is, it must be entirely completed or aborted—no intermediate ending states are accepted.
Transaction
90
ACID
Atomicity, Consistency, Isolation, Durability
91
requires that all operations (SQL requests) of a transaction be completed; if not, the transaction is aborted.
Atomicity
92
indicates the permanence of the database’s consistent state.
Consistency
93
means that the data used during the execution of a transaction cannot be used by a second transaction until the first one is completed
Isolation
94
ensures that once transaction changes are done and committed, they cannot be undone or lost, even in the event of a system failure.
Durability
95
Transaction support is provided by two SQL statements:
COMMIT and ROLLBACK
96
statement is reached, in which case all changes are permanently recorded within the database. The COMMIT statement automatically ends the SQL transaction.
COMMIT statement
97
statement is reached, in which case all changes are aborted and the database is rolled back to its previous consistent state.
ROLLBACK statement
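Example (illustrative ACCOUNT table; the two updates form one logical unit of work):

UPDATE ACCOUNT SET ACC_BALANCE = ACC_BALANCE - 500 WHERE ACC_NUM = 1001;
UPDATE ACCOUNT SET ACC_BALANCE = ACC_BALANCE + 500 WHERE ACC_NUM = 2002;
COMMIT;      -- both changes are permanently recorded; the transaction ends

-- If a business rule is violated partway through, abort every change instead:
ROLLBACK;    -- the database returns to its previous consistent state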
98
to keep track of all transactions that update the database.
transaction log
99
A DBMS feature that coordinates the simultaneous execution of transactions in a multiprocessing database system while preserving data integrity
Concurrency Control
100
Its objective is to ensure the serializability of transactions in a multiuser database environment.
Concurrency Control
101
is important because the simultaneous execution of transactions over a shared database can create several data integrity and consistency problems.
Concurrency Control
102
Three Main Problems of Concurrency Control
1. Lost Updates 2. Uncommitted Data 3. Inconsistent Retrieval
103
This problem occurs when two concurrent transactions, T1 and T2, are updating the same data element and one of the updates is lost (overwritten by the other transaction).
Lost Updates
104
It occurs when two transactions, T1 and T2, are executed concurrently and the first transaction (T1) is rolled back after the second transaction (T2) has already accessed the uncommitted data—thus violating the isolation property of transactions.
Uncommitted Data
105
It occurs when a transaction accesses data before and after one or more other transactions finish working with such data.
Inconsistent Retrieval
106
is a special DBMS process that establishes the order in which the operations are executed within concurrent transactions.
Scheduler
107
interleaves the execution of database operations to ensure serializability and isolation of transactions
Scheduler
108
one of the most common techniques used in concurrency control because they facilitate the isolation of data items used in concurrently executing transactions.
Locking methods
109
guarantees exclusive use of a data item to a current transaction.
lock
110
The use of locks based on the assumption that conflict between transactions is likely is usually referred to as
Pessimistic Locking
111
All lock information is handled by a -----, which is responsible for assigning and policing the locks used by the transactions.
Lock Manager
112
is issued when a transaction requests permission to update a data item and no locks are held on that data item by any other transaction.
Exclusive Lock
113
An ---- does not allow other transactions to access the database.
Exclusive Lock
114
A --- lock allows other read-only transactions to access the database.
Shared Lock
115
is a condition in which only one transaction at a time can own an exclusive lock on the same object.
Mutual Exclusive Rule
116
approach to scheduling concurrent transactions assigns a global, unique time stamp to each transaction.
Time Stamping
117
Time stamps must have two properties:
1. Uniqueness 2. Monotonicity
118
ensures that no equal time stamp values can exist
Uniqueness
119
ensures that time stamp values always increase.
Monotonicity
120
Two Schemes for Time Stamping Method:
1. Wait/Die Scheme 2. Wound/Wait Scheme
121
A concurrency control scheme in which an older transaction must wait for the younger transaction to complete and release the locks before requesting the locks itself. Otherwise, the newer transaction dies and is rescheduled
Wait/Die Scheme
122
A concurrency control scheme in which an older transaction can request that a younger transaction holding a needed lock be preempted (wounded), rolled back, and rescheduled; otherwise, the younger transaction waits until the older transaction releases its locks.
Wound/Wait Scheme
123
is based on the assumption that the majority of database operations do not conflict.
Optimistic Approach
124
approach requires neither locking nor time stamping techniques.
Optimistic Approach
125
Using an optimistic approach, each transaction moves through two or three phases, referred to as ---, ----, and ----
read, validation, and write
126
During the --- phase, the transaction reads the database, executes the needed computations, and makes the updates to a private copy of the database values.
Read Phase
127
During the --- phase, the transaction is validated to ensure that the changes made will not affect the integrity and consistency of the database.
Validation Phase
128
During the --- phase, the changes are permanently applied to the database
Write Phase
129
defines transaction management based on transaction isolation levels.
ANSI SQL standard
130
refer to the degree to which transaction data is “protected or isolated” from other concurrent transactions.
Transaction isolation levels
131
Types of Read Operation
1. Dirty read 2. Nonrepeatable read 3. Phantom read
132
a transaction can read data that is not yet committed
Dirty read
133
a transaction reads a given row at time t1, and then it reads the same row at time t2, yielding different results. The original row may have been updated or deleted.
Nonrepeatable read
134
a transaction executes a query at time t1, and then it runs the same query at time t2, yielding additional rows that satisfy the query.
Phantom read
135
will read uncommitted data from other transactions.
Read Uncommitted
136
At this isolation level, the database does not place any locks on the data, which increases transaction performance but at the cost of data consistency.
Read Uncommitted
137
forces transactions to read only committed data.
Read Committed
138
isolation level ensures that queries return consistent results. This type of isolation level uses shared locks to ensure other transactions do not update a row after the original query reads it.
Repeatable Read
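Example (ANSI SQL syntax; default levels and exact behavior vary by DBMS):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;   -- dirty reads possible
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;     -- only committed data is read
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;    -- rereading a row gives the same result
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;       -- strictest level; also prevents phantom reads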
139
restores a database from a given state (usually inconsistent) to a previously consistent state.
Database recovery
140
all portions of the transaction must be treated as a single, logical unit of work in which all operations are applied and completed to produce a consistent database.
Atomic Transaction Property
141
Four important concepts that affects the recovery process:
1. Write-ahead-log Protocol 2. Redundant transaction logs 3. Database buffers 4. Database checkpoints
142
ensures that transaction logs are always written before any database data is actually updated.
Write-ahead-log Protocol
143
(several copies of the transaction log) ensure that a physical disk failure will not impair the DBMS’s ability to recover data
Redundant transaction logs
144
are temporary storage areas in primary memory used to speed up disk operations
Database buffers
145
are operations in which the DBMS writes all of its updated buffers in memory (also known as dirty buffers) to disk.
Database checkpoints
146
refers to a set of activities and procedures designed to reduce the response time of the database system—that is, to ensure that an end-user query is processed by the DBMS in the minimum amount of time
Database performance tuning
147
the objective is to generate a SQL query that returns the correct answer in the least amount of time, using the minimum amount of resources at the server end.
Client Side
148
On the client side, the objective is to generate a SQL query that returns the correct answer in the least amount of time, using the minimum amount of resources at the server end. The activities required to achieve that goal are commonly referred to as ----
SQL performance tuning
149
the DBMS environment must be properly configured to respond to clients’ requests in the fastest way possible, while making optimum use of existing resources.
Server Side
150
On the server side, the DBMS environment must be properly configured to respond to clients’ requests in the fastest way possible, while making optimum use of existing resources.
DBMS performance tuning
151
It is the component of a database management system that attempts to select the most efficient way to execute a query.
Query Optimizer
152
It determines the most efficient way to execute a SQL statement
Query Optimizer
153
The optimizer generates a set of potential plans for the SQL statement based on available access paths and hints.
Query Optimizer
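Example (the keyword for viewing the optimizer's chosen plan varies: EXPLAIN in MySQL and PostgreSQL, EXPLAIN PLAN FOR in Oracle; the table is illustrative):

EXPLAIN
SELECT EMP_LNAME
FROM EMPLOYEE
WHERE EMP_LNAME = 'Smith';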
154
refers to a set of data that displays the characteristics of volume, velocity, and variety (the 3 Vs) to an extent that makes the data unsuitable for management by a relational database management system
Big Data
155
Three Characteristics of Big Data:
1. Volume 2. Velocity 3. Variety
156
the quantity of data to be stored
Volume
157
the speed at which data is entering the system
Velocity
158
the variations in the structure of the data to be stored
Variety
159
According to --- 2018, the amount of data we produce every day is truly mind-boggling. There are 2.5 quintillion bytes of data created each day at our current pace, but that pace is only accelerating with the growth of the Internet of Things (IoT).
Forbes
160
According to Forbes 2018, the amount of data we produce every day is truly mind-boggling. There are ----- of data created each day at our current pace, but that pace is only accelerating with the growth of the Internet of Things (IoT).
2.5 quintillion bytes
161
According to Statista.com 2023, approximately ---- of data is created each day.
328.77 million terabytes
162
Two options for handling Big Data
Scale up Scale out
163
meaning we keep the same number of systems to store and process data, but migrate each system to a larger system.
Scale up
164
meaning we increase the number of systems, but do not migrate to larger systems
Scale Out
165
refers to the speed at which data is entered into a system and must be processed.
Velocity
166
refers to the vast array of formats and structures in which the data can be captured.
Variety
167
Types of Big Data Variety:
1. Structured Data 2. Semi-structured Data
168
It is organized, tagged, and easily searchable, often stored in traditional databases.
Structured Data
169
This type of data contains some structured elements but lacks a rigid structure.
Semi-structured Data
170
is a Java-based framework for distributing and processing very large data sets across clusters of computers.
Hadoop
171
If you have a huge file, it is broken down into smaller chunks and stored on various machines. When breaking up the file, copies are also made and placed on different nodes; this way, big data is stored in a distributed way.
Hadoop Distributed File System.
172
Instead of using one large computer to store and process the data, Hadoop allows clustering multiple computers to analyze massive datasets in parallel more quickly.
MapReduce
173
are non-tabular databases and store data differently than relational tables.
NoSQL
174
Four Category of NoSQL:
1. Key-value database 2. Document database 3. Column-oriented database 4. Graph database
175
It is the simplest of the NoSQL data models. It stores data as a collection of key-value pairs. The key acts as an identifier for the value.
Key-value Database
176
can refer to traditional, relational database technologies that use column-centric storage instead of row-centric storage
Column-oriented Database
177
Stores the data in relational tables not per row but per column, which is more efficient for optimizing read operations.
Column-oriented Database
178
are based on graph theory and represent data through nodes, edges, and properties.
Graph Database
179
are conceptually similar to key-value databases, and they can almost be considered a subtype of KV databases.
Document-oriented Database