Query Processing Flashcards

(83 cards)

1
Q

is a request posed to a database
(or data system) for data retrieval,
updating, deletion, insertion, etc.

A

Query

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

is the stage in a Database Management System (DBMS) where the system interprets and executes
a user’s query (usually written in SQL) to retrieve or modify data efficiently from the database

A

Query Processing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

It ensures all table names, columns, and keywords are valid.

A

Parsing and Translation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

The query is then translated into an internal form (like relational
algebra).

A

Parsing and Translation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

finds the most efficient way to execute the query.

A

Query Optimizer

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

It considers multiple query plans (e.g., using indexes, join
methods, or sorting strategies).

A

Optimization

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

The goal is to minimize cost, such as CPU time, disk I/O, and memory use

A

Optimization

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

runs the optimized query plan

A

Execution

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

It retrieves or modifies the data from the database storage

A

Execution

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Checks SQL correctness.

A

Parsing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Chooses whether to use an index on age or scan the whole table

A

Optimization

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Retrieves matching records and returns the name field.

A

Execution

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

refer to the factors a DBMS (Database Management System) uses to estimate how expensive a query plan is

A

Measures of Query Cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

to decide the most efficient way to run a query

A

Measures of Query Cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Most important factor in cost estimation

A

Disk I/O Cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Measures how many disk accesses (reads/writes) are needed

A

Disk I/O Cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Measures the time spent by the processor

A

CPU Cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Usually smaller than disk I/O, but becomes important for in-memory
databases.

A

CPU Cost

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Measures how much RAM is used during query execution.

A

Memory Usage (Buffer Cost)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Affects how much data can be processed in memory without writing to disk

A

Memory Usage (Buffer Cost)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

High memory use may cause paging or spilling to disk, increasing
total cost

A

Memory Usage (Buffer Cost)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

data may be stored on multiple servers.

A

Communication Cost (in Distributed Databases)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Critical in distributed query processing and cloud databases.

A

Communication Cost (in Distributed Databases)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

The total time taken from query submission to result delivery.

A

Query Response Time

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
It’s the end-user perspective measure.
Query Response Time
25
Accessing data on disk
Disk I/O Cost
26
Processing tuples and computations
CPU Cost
27
Space used for intermediate results
Memory Cost
28
Data transfer between servers
Communication Cost
29
Total elapsed time for query result
Query Response Time
30
is a relational algebra operation used to choose rows (tuples) from a table (relation) that satisfy a specific condition (predicate).
selection operation
31
It’s one of the most basic and essential operations in query processing.
selection operation
32
Combines multiple conditions using AND (∧), OR (∨), or NOT (¬)
Compound Selection
33
Uses comparison operators like =, ≠, <, ≤, >, ≥.
Theta (θ) Selection
34
is the step where the DBMS arranges tuples (rows) of a relation (table) in a specific order — usually ascending or descending — based on one or more attributes (columns).
Sorting
35
It is a key operation used in many SQL queries and internal DBMS processes.
Sorting
36
The DBMS divides the data into chunks (runs) that fit into main memory.
Sort Phase
37
Each run is sorted in memory using an internal sorting algorithm (like quicksort).
Sort Phase
38
The sorted runs are then written back to disk.
Sort Phase
39
The sorted runs are merged together into larger sorted files.
Merge Phase
40
The process continues until one fully sorted file remains.
Merge Phase
41
is one of the most important — and often most expensive — operations in query processing.
join operation
42
It combines tuples (rows) from two or more relations (tables) based on a related attribute between them
join operation
43
A theta join where the condition uses equality (=) only.
Equi-Join
44
Automatically joins tables based on common attribute names.
Natural Join
45
Keeps all tuples from left table.
Left Outer Join
46
Keeps all tuples from right table.
Right Outer Join
47
Keeps all tuples from both tables, filling missing values with NULLs
Full Outer Join
48
For every tuple in relation R, compare it with every tuple in S.
Nested-Loop Join
49
Used when: Relations are small or one fits in memory.
Nested-Loop Join
50
Processes one block of R at a time and compares it with all blocks of S.
Block Nested-Loop Join
51
Used when: One relation is much smaller than the other.
Block Nested-Loop Join
52
Used when: Join attribute in one relation is indexed.
Indexed Nested-Loop Join
53
Both relations are sorted on the join attribute.
Sort-Merge Join
54
Used when: Both relations are already sorted or can be sorted easily.
Sort-Merge Join
55
divides both R and S into buckets based on the join attribute
Hash Join
56
Used when: Join attribute is not sorted or indexed.
Hash Join
57
refers to how the database system executes a relational algebra expression (or an SQL query) to produce the final result efficiently
evaluation of expression
58
It’s the stage where the DBMS translates the logical query (what to do) into a physical plan (how to do it)
evaluation of expression
59
is a combination of relational algebra operations
query expression
60
Each operation produces an intermediate table stored temporarily
Materialized Evaluation
61
Output of one operation is passed directly as input to the next
Pipelined Evaluation
62
efers to the process of executing database queries directly in main memory (RAM) instead of relying primarily on slower disk-based storage
IN MEMORY QUERY PROCESSING
63
This approach is central to modern high-performance DBMSs, especially in memory databases such as SAP HANA, Redis, and MemSQL.
IN MEMORY QUERY PROCESSING
64
store data on disk and move it into memory buffers only when needed.
Traditional DBMSs
65
Data is stored in main memory (RAM) instead of disk
Data Storage
66
Usually column-oriented (columnar) rather than row-oriented for faster analytics.
Data Storage
67
Data is periodically backed up to disk for durability
Data Storage
68
Queries are processed entirely in memory, using CPU caches and vectorized operations
Query Execution
69
The query engine uses algorithms optimized for memory (e.g., in memory joins, scans).
Query Execution
70
Uses write-ahead logging (WAL) or checkpointing to ensure recovery in case of failure.
Transaction Management
71
ACID properties are still maintained
Transaction Management
72
SQL query is parsed and validated
Query Parsing
73
Plan is generated and optimized
Query Optimization
74
Operators run directly on memory data structures
Query Execution
75
Results returned to user
Result Output
76
Uses cost models for memory, not disk
Query Optimization
77
Uses memory-resident indexes and vectorized execution
Query Execution
78
Minimal delay since data is already in memory
Result Output
79
dominant cost (since disk I/O is minimal).
CPU Time
80
how fast data can be transferred between memory and CPU.
Memory Bandwidth
81
how efficiently CPU cache is used.
Cache Utilization
82
multiple queries sharing memory resource
Concurrency Overhead