Transactions, Concurrency, & Isolation Flashcards

(75 cards)

1
Q

What is a transaction in Databricks?

A

A transaction is a sequence of operations performed as a single logical unit of work, ensuring data integrity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

True or False: Databricks supports ACID transactions.

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What does ACID stand for in the context of transactions?

A

Atomicity, Consistency, Isolation, Durability.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Fill in the blank: In Databricks, ________ ensures that transactions are processed reliably.

A

ACID properties

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is concurrency control?

A

Concurrency control is the management of simultaneous operations without conflicting with each other.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Which isolation level in Databricks prevents dirty reads?

A

Read Committed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Multiple Choice: Which of the following is NOT an isolation level supported by Databricks? A) Serializable B) Read Uncommitted C) Read Committed D) Snapshot

A

B) Read Uncommitted

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is the default isolation level in Databricks?

A

Serializable

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

True or False: Snapshot isolation allows for reading the last committed version of data.

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the purpose of optimistic concurrency control in Databricks?

A

To allow multiple transactions to proceed without locking resources, checking for conflicts before committing.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Fill in the blank: In Databricks, ________ provides a mechanism to manage concurrent access to data.

A

Concurrency control

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What happens during a rollback in a Databricks transaction?

A

All changes made during the transaction are undone.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Multiple Choice: Which of the following can lead to a deadlock? A) Two transactions locking the same resource B) One transaction waiting for a resource held by another C) Both A and B

A

C) Both A and B

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is the result of a dirty read?

A

Reading uncommitted changes made by another transaction.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

True or False: Serializable isolation level can lead to increased contention.

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What does the term ‘phantom read’ refer to?

A

When a transaction reads a set of rows that satisfy a condition, but another transaction inserts or deletes rows, causing the first transaction to see different data if it re-queries.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Fill in the blank: In Databricks, a ________ is a logical grouping of one or more SQL statements.

A

transaction

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is the purpose of the ‘BEGIN TRANSACTION’ statement?

A

To initiate a new transaction.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Multiple Choice: Which isolation level allows for non-repeatable reads? A) Read Committed B) Serializable C) Snapshot D) Read Uncommitted

A

A) Read Committed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What does the term ‘commit’ mean in a transaction?

A

To make all changes made during the transaction permanent.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

True or False: Isolation levels can be changed during a transaction.

A

False

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is the difference between optimistic and pessimistic concurrency control?

A

Optimistic assumes conflicts are rare, while pessimistic locks resources to prevent conflicts.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Fill in the blank: A ________ is a sequence of operations that can be rolled back if not completed successfully.

A

transaction

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What happens during a commit in a Databricks transaction?

A

All changes made during the transaction are saved and made visible to other transactions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Multiple Choice: Which isolation level is the strictest? A) Read Committed B) Serializable C) Snapshot D) Read Uncommitted
B) Serializable
26
What is a 'write skew'?
A phenomenon where two transactions read the same data and then write different values, leading to inconsistencies.
27
True or False: Isolation levels can have an impact on performance.
True
28
What is the purpose of the 'ROLLBACK' statement?
To undo changes made during the current transaction.
29
Fill in the blank: In Databricks, ________ ensures that transactions are executed in a way that maintains data integrity.
isolation
30
What is 'serializability' in the context of transactions?
A property that ensures transactions can be executed in a sequence that produces the same result as if they were executed one after the other.
31
Multiple Choice: Which of the following is a characteristic of Serializable isolation? A) Allows dirty reads B) Prevents phantom reads C) Allows non-repeatable reads D) None of the above
B) Prevents phantom reads
32
What is the impact of using a higher isolation level?
Increased data integrity but potentially lower performance due to higher contention.
33
True or False: Databricks uses a multi-version concurrency control (MVCC) approach.
True
34
What does the term 'conflict' mean in concurrency control?
When two transactions interfere with each other by trying to access the same resource.
35
Fill in the blank: A transaction that does not follow ACID properties is considered ________.
unsafe
36
What is a 'lock' in the context of concurrency control?
A mechanism to prevent other transactions from accessing a resource while it is being modified.
37
Multiple Choice: Which type of lock allows read access but prevents write access? A) Shared lock B) Exclusive lock C) Update lock D) None of the above
A) Shared lock
38
What is the difference between a 'shared lock' and an 'exclusive lock'?
A shared lock allows multiple transactions to read a resource, while an exclusive lock prevents any other transaction from accessing it.
39
True or False: Locks can lead to deadlocks if not managed properly.
True
40
What is the purpose of a 'timeout' in a transaction?
To limit the duration a transaction can hold locks before it is forcibly rolled back.
41
Fill in the blank: The ________ isolation level allows transactions to read the last committed state of the data without blocking.
Snapshot
42
What is an 'abort' in a transaction?
An action that causes the transaction to be terminated without committing any changes.
43
Multiple Choice: Which isolation level provides the highest performance but the lowest consistency? A) Read Committed B) Serializable C) Snapshot D) Read Uncommitted
D) Read Uncommitted
44
What is the role of a transaction log in Databricks?
To record all changes made during transactions for recovery and auditing purposes.
45
True or False: Using higher isolation levels can lead to increased contention and reduced throughput.
True
46
What does 'eventual consistency' mean?
A model where updates to data will propagate and become consistent over time, rather than immediately.
47
Fill in the blank: In Databricks, ________ is used to manage the order of transaction execution.
scheduling
48
What are 'transaction conflicts'?
Situations where two or more transactions interfere with each other, leading to inconsistent results.
49
Multiple Choice: Which isolation level allows the highest degree of concurrency? A) Serializable B) Snapshot C) Read Committed D) None of the above
B) Snapshot
50
What is the purpose of the 'set transaction' command?
To specify the isolation level for the current transaction.
51
True or False: Isolation levels can be set at the session level in Databricks.
True
52
What is a 'write conflict'?
When two transactions attempt to write to the same resource simultaneously.
53
Fill in the blank: The ________ isolation level avoids dirty reads and non-repeatable reads.
Read Committed
54
Multiple Choice: Which of the following isolation levels prevents all anomalies? A) Read Uncommitted B) Read Committed C) Serializable D) Snapshot
C) Serializable
55
What is 'two-phase locking'?
A concurrency control method where a transaction must acquire all its locks before it can release any.
56
True or False: Two-phase locking guarantees serializability.
True
57
What is the impact of isolation levels on data consistency?
Higher isolation levels increase data consistency but may reduce concurrency and performance.
58
Fill in the blank: The ________ isolation level allows for both dirty reads and non-repeatable reads.
Read Uncommitted
59
Multiple Choice: Which isolation level is typically used for reporting queries? A) Serializable B) Snapshot C) Read Committed D) Read Uncommitted
B) Snapshot
60
What is the significance of 'commit timestamp' in Databricks?
It helps to determine the visibility of changes made by transactions.
61
True or False: Databricks automatically manages transaction conflicts.
True
62
What is a 'transaction boundary'?
The point at which a transaction begins and ends, defined by 'BEGIN' and 'COMMIT' or 'ROLLBACK'.
63
Fill in the blank: In Databricks, ________ is used to prevent inconsistent reads.
isolation
64
What is a 'concurrency anomaly'?
An inconsistency that arises from the concurrent execution of transactions.
65
Multiple Choice: Which isolation level is least restrictive? A) Serializable B) Read Committed C) Snapshot D) Read Uncommitted
D) Read Uncommitted
66
What does 'repeatable read' isolation level guarantee?
It guarantees that if a transaction reads a row, it will read the same row again if it queries the same data before it commits.
67
True or False: Snapshot isolation can improve performance in read-heavy workloads.
True
68
What is the main goal of transaction management in Databricks?
To ensure data integrity and consistency across concurrent transactions.
69
Fill in the blank: In Databricks, ________ is essential for maintaining data accuracy during concurrent transactions.
isolation
70
What is a 'commit operation'?
An operation that finalizes changes made during a transaction.
71
Multiple Choice: Which of the following is a result of using lower isolation levels? A) Increased performance B) Increased consistency C) Increased contention D) Both A and C
D) Both A and C
72
What is the role of a 'transaction coordinator'?
To manage and oversee the execution of transactions to ensure they adhere to ACID properties.
73
True or False: Isolation levels can be set per transaction in Databricks.
True
74
What is a 'read lock'?
A lock that allows multiple transactions to read a resource but prevents writing.
75
Fill in the blank: The ________ isolation level allows transactions to read uncommitted changes.
Read Uncommitted