Transaction Management/transactSQL Flashcards

1
Q

What are transactions and what causes complications to arise?

A

A series of queries. The complication arises from that many might happen at the same time (concurrency) and that computers can fail (partial execution)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What can concurrency lead to?

A
  • Inconsistent data in the database
  • (for example two people being able to book the same seat in a cinema or the following person overriding the first person’s booking)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are transactions?

A
  • They state that a group of SQL statements must be executed together so that no conflicts arise
  • Transactions mean that specific events have to happen before another event
  • meaning that we can’t for example have two people book the same seat as we’ll have a transaction that means once the seat has been picked, another person can’t pick it.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

how do we ensure serialisable behaviour?

A
  • By telling a DBMS that a sequence of SQL statements forms a transaction.
  • It then knows that the transaction is to be executed as if in isolation from all other transactions
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is partial execution?

A
  • Partial execution is when only part of a transaction gets carried out
  • This could happen due to a failure or a power outage
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How do DBMSs deal with partial execution?

A
  • Transactions make sure that the whole set of instructions in a transaction is carried out otherwise the actions done are dropped/not permanently implemented. Either none or all of the commands are done.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How do we state in SQL that a transaction should be executed as a whole or not at all?

A

Using the keywords ‘START TRANSACTION’ and ‘COMMIT’

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What do we do if we have to abort a transaction due to partial execution due to a failure?

A

Use the keyword ROLLBACK - if something fails and you don’t want any of the transaction to happen.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Why can we ignore operations such a calculations when looking at transactions?

A
  • Because the OS does those operations
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What does read(X) do?

A
  • reads a database item X into a program variable (also named x for simplicity)
  • finds address of disk block that contains item x
  • copies that disk block into a buffer in main memory
  • copy item x from buffer to program variable
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What does write(x)

A
  • finds address of disk block that contains item x
  • copies that disk block into a buffer in main memory
  • the program copies the new value into the correct location of item x in the buffer
  • either immediately or some time after it writes the item in the bugger to the correct disk block
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is a schedule?

A
  • a bunch of transactions in a specific order
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are the two main types of schedules?

A
  • Serial schedules: executes transactions one after another
  • Concurrent schedules: interleaves operations from different transactions, while still preserving that the operation in each transaction happen in the right order
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What do each of the symbols represent?

A

Sn = id of the schedule
ri(x) = read(x) in transaction i
wi(x) = write(x) in transaction i
ci = commit in transaction i
ai = abort (“rollback”) in transaction i

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Does order matter in serial schedules?

A

Yes because the data stored in shared variables between transactions can be different depending on transaction order

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

what is true about serial and concurrent schedules?

A
  • All serial schedules are concurrent schedules
  • But not all concurrent schedules are serial schedules
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

When is something not a concurrent schedule?

A
  • When the order of the SQL statements inside the same transaction are executed out of order
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Why can it be a problem that two transactions are not isolated from each other?
How can we solve this?

A
  • issue of concurrent access
  • if two transaction are accessing the same item then the outcome can be inconsistent with the real world.
  • can solve this by undoing the second transaction
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What does it mean for a transaction should be atomic?

A

the transaction is indivisible

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Why can it be a problem if there’s a failure in the middle of a transaction?
How can we solve this?

A
  • issue of partial execution
  • leads to inconsistencies in the real world
  • can solve by just undoing the transaction when the computer comes back online
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is durability?

A

That any later changes did not disappear after having finished based on something someone else did in the database
- so one transaction did not incorrectly overwrite an item that another one wrote to previously

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What do we use to make ensure that we’re never in the situation of having errors made due to a failure (partial execution) and concurrent access?

A

ACID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What are the properties of ACID?

A
  • A: Atomicity C: Consistency I: Isolation D: Durability
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is atomicity?

A

A transaction is an atomic unit of processing
- an indivisible unit of execution
- executed in it’s entirety or not at all

Deals with failure by aborting a transaction
- we undo the word done up to the error point
- system recreates state of database before start of aborted transaction

Commit - no errors, entire transaction is executed, system is updated appropriately

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What is consistency?

A
  • correct execution of a transaction takes the database from one consistent state to another
  • when transactions don’t violate any constraints
  • database accurately reflects state of real world
  • if we have to abort then the database is not in a consistent state and we have to recreate the consistent state (which is the state of the database before the aborted transaction started)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Why are serial schedules consistent?

A
  • A serializable schedule always leaves the database in a consistent state. A serial schedule is always a serializable schedule because, in a serial Schedule, a transaction only starts when the other transaction has finished execution
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

What are serialisable schedules?

A

schedules that are equivalent to serial schedules
(though there are multiple definitions of serialisability)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

What is isolation?

A
  • a particular transaction should see itself as the only transaction in the database
  • the levels of isolation refer to how isolated a transaction should be from other transactions operations (such as modifying items used in both)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

What are the levels of isolation

A
  1. Read uncommitted
  2. read committed
  3. Repeatable read
  4. serialisable
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

What is the read uncommitted isolation level?

A
  • No isolation at all, you can read data which has not been committed
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

What is the read committed isolation level?

A
  • every item you read must have been committed before you can see it
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What is the repeatable read isolation level?

A
  • every item you read must have been committed before you can see it
  • if you read the same thing twice in a transaction, you must get the same return value
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

What is the serialisable isolation level?

A
  • Has all the levels above
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

What is a serialisable schedule

A
  • A non-serial schedule is called a serializable schedule if it can be converted to its equivalent serial schedule
  • You have serial schedule TA then TB, and item X = 15 after both transactions have run
  • if you have some of TA then TB then TA then TB and item X still = 15, this schedule is not serial but it serialisable
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

What is durability?

A
  • Once a transaction commits and changes the database, then these changes cannot be lost because of failure
  • the effect of a transaction on the database should not be lost after the commit point
  • we REDO the transaction if there are any problems after the update
  • durability means we can deal with media failure
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

What are some relational DBMS Components

A
  • User/application
  • transaction manager
  • logging and recovery
  • concurrency control
  • Buffers
  • Storage
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

What is transaction management formed of?

A
  • Concurrency control
  • Logging and recovery
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

Which components of transaction management satisfy ACID?

A

A - via recovery control (logging and recovery)
C - via scheduler - concurrency control
I - via scheduler - concurrency control
D - via recovery control (logging and recovery)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

Advantages of serial schedules?

A
  • maintains consistency
  • maintains isolation
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

Advantages of concurrent schedules (non serial)?

A
  • more efficient in multi-user environments (transactions don’t have to wait for others to finish before starting)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

What do concurrent schedules not guarantee?

A

Consistency or isolation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

How come concurrent schedules don’t guarantee consistency or isolation?

A
  • Because they can accidentally overwrite values in the buffer
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

How much concurrency can we allow while satisfying Isolation and Consistency?

A

A schedule S is serializable if there is a serial schedule S’ that has the same effect as S on every initial database state.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

What do seralisable schedules guarantee?

A
  • consistency and correctness
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

Why is serialisability hard to test?

A
  • because it depends on reads, writes, commits and non-database operations
  • non-database operations can be complex
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

How much concurrency can we allow while satisfying Isolation and Consistency … while being able to check it fast?

A

by having conflict seralisability

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

What is a conflict in a schedule?

A
  • a pair of operations from different transactions that cannot be swapped without changing the behavior of at least one transaction
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

What can be defined as a conflict in a schedule?

A

A pair of operations from different transactions that access the same item and at least one of them is a write operation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

How can we tell if a schedule is conflict serialisable?

A
  • If it is conflict equivalent to a serial schedule
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

How can we tell if a schedule is conflict-equivalent?

A
  • two schedules S and S’ are conflict-equivalent if S can be obtained from S’ by swapping any number of (1) consecutive (2) non conflicting operations from (3) different transactions
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

Describe which schedules fit into other schedules

A
  • All serial, conflict-serialised and serialised schedules are concurrent schedules
  • All serial and conflict-serialised schedules are serialisable
  • all serial schedules are conflict serialisable
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

How do we quickly show that something is not conflict-serialisable

A
  • we create a precedence graph with each transaction as a node and each conflict as a link between nodes
  • if there is a cycle within this graph then the schedule is not conflict-serialisable
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
52
Q

How do we quickly show that something is not conflict-serialisable

A
  • we create a precedence graph with each transaction as a node and each conflict is a link between nodes (only if op1 appears before op2) and we set the nodes out in order T1,T2,T3
  • if there is a cycle within this graph then the schedule is not conflict-serialisable
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
53
Q

why are cycles in the precedence graph bad?

A
  • They means that a contradiction has arisen (which could cause a deadlock)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
54
Q

What are the steps of finding a serial schedule from a precedence graph?

A
  • Find a transaction with only outgoing edges
  • you put that first in the schedule and remove the transaction from the graph
  • you repeat this process until there’s no nodes left.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
55
Q

How does transaction scheduling work in a DBMS?

A
  • The scheduler gets fed operations and it can either execute them or delay them happening.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
56
Q

How can we enforce conflict serialisability?

A

Using locks with a simple locking mechanism
- a transaction has to lock an item before it accesses it
- locks are requested from and granted by the scheduler
- each item is locked by at most one transaction
- each lock must eventually be released

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
57
Q

What are the symbols for locking and unlocking an item(x)

A

l1(X) for locking
u1(x) for unlocking

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
58
Q

What are the rules for locking in a schedule

A

Every lock must be followed by an unlock
An item(x) must be unlocked by a transaction before being locked again by a different transaction

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
59
Q

Why may not every schedule with simple locking may be serialisable?

A

There’s only one lock so other transactions have to wait to run whilst the first transaction has locked an item it wants to use

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
60
Q

What is 2PL?

A
  • 2 Phase Locking is where we follow the pattern, all locks then unlocks, then the schedule is serialisable
61
Q

When are the two phases in 2PL?

A
  • Phase one is until the first unlock, phase 2 is from that unlock to the last unlock
62
Q

Why does 2 phase locking work?

A
  • Because before T1 unlocks the first item that T2 needs T1 locks the next item it needs so that T2 can do all the operations up until it needs the item that T1 is currently using
  • It ensures that T1’s transactions all occur first!
63
Q

what does two phase locking ensure?

A

Conflict serialisability

64
Q

What is an issue caused by 2PL?

A

Deadlocking because two transactions may be stuck waiting for the lock that the other transaction has.

65
Q

How can we make 2PL more flexible to avoid deadlocks?

A

Solution: Different lock modes
- use shared and exclusive locks

66
Q

What is a shared lock?

A
  • multiple transactions can have this at the same time
  • it means the transaction has access to read the item only
67
Q

What is a exclusive lock?

A
  • only one transaction can have this at a time
  • allows transaction to read and write to an item
68
Q

What are the rules regarding accessing locks

A
  • a shared lock is granted if no other transactions hold an exclusive lock on the item
  • an exclusive lock is granted if no other transaction holds a lock of any kind on the item.
69
Q

What are the symbols for shared and exclusive locks with item X and transaction 1

A

sl1(X) - shared lock
xl1(X) - exclusive lock

70
Q

What happens when a transaction has a shared lock on an item and requests to get an exclusive lock (being it is the only transaction w a lock of any kind on the item)?

A

The shared lock gets updated to an exclusive lock

71
Q

When is there a risk of deadlock with shared & exclusive locks?

A
  • if a shared lock on an item X can be upgraded later to an exclusive lock on X in order to be friendly to other transactions
72
Q

What are update locks?

A
  • they are requested by items to read (not write) an item
  • may be upgraded later to an exclusive lock (at that point then no other shared locks can be upgraded)
  • This is granted to at most one transaction at a time
  • Helps to avoid deadlocking
73
Q

Why would T3’s request for a shared lock be denied?

A

Because another transaction holds an update lock

74
Q

What are the 3 levels of using locks with multiple granularity?

A
  • May lock relations (whole table)
  • May lock disk blocks
  • May lock tuples (couple of rows/row)
75
Q

What is the problem with having too coarse granularity?

A
  • less concurrency which may cause unnecessary delays
76
Q

What is the problem with having too fine granularity?

A
  • high overhead (need to keep track of all the locked items)
77
Q

Need to prevent issues such as the following to guarantee conflict serialisability

A
  • one transaction having a shared lock on a tuple
  • another transaction having an exclusive lock for the relation
78
Q

What are Intention Locks (a.k.a. Warning Locks)

A

If a transaction wants to lock something it has to put intention locks on super items (higher levels that contain item x) so conflict serialisability is ensured
-to get access to an item you need to have the intention locks on the levels above

79
Q

What is the notation for the two intention locks?

A
  • IS1(x) = intention to request a shared lock on a sub item
  • IX1(x) = intention to request an exclusive lock on a sub item
80
Q

Why might a transaction abort?

A
  • Errors whilst executing transactions
  • deadlocking
  • Explicit request
81
Q

What events are beyond the DBMS’s Control?

A
  • Media failures: A crash is caused if the rotating head in a Hard disk drive crashes
  • Catastrophic events: need to store databases/copies of it in a physically different location and hope that not all of the locations get physically damaged.
  • System failures: power failures etc, information about the active state of the database is lost
82
Q

What is logging in DBMSs?

A
  • writing activities into a log so that a desired database state can be recovered later
83
Q

What are important activities that commonly feature in a log?

A
  • starts of transactions, commits, aborts
  • modification of database items
84
Q

What techniques can be use to do logging in DBMSs?

A
  • REDO logging
  • UNDO logging
  • Combinations of the two REDO/UNDO logging
85
Q

Where are the database and log stored?

A

On the hard disk = secondary storage

86
Q

What does stored locally mean?

A

It means stored in RAM

87
Q

Where is the buffer?

A

In main memory = RAM

88
Q

What is UNDO logging?

A
  • logs activities with the goal of restoring a previous consistent database state
  • maintains atomicity
89
Q

What are typical entries stored in an UNDO log?

A
  • <START>: Transaction T has started
    </START>
  • <COMMIT>: Transaction T has committed
    </COMMIT>
  • <ABORT>: Transaction T was aborted
    </ABORT>
  • <T, X, v>: Transaction T has updated the value of database item X, the old value of X was v
90
Q

Describe the UNDO logging procedure?

A

1: if T1 updates database item X and old value = v then <T,X,v> must be written to log on disk before X is written to disk
2: If T1 commits, then <COMMIT> must be written to disk as soon as all database elements changed by T have been written to disk</COMMIT>

91
Q

What is the simplified version of the UNDO logging procedure?

A
  • read/write/lock/unlock operations
  • write values to go into log in buffer
  • flush pervious values of items to log file on disk
  • write changes of item values to disk
  • once done, write COMMIT T to buffer in main memory
  • flush COMMIT T to log file on disk
92
Q

What if a system failure occurs? (The Recovery Manager)

A
  • If T has committed successfully then no recovery process is needed
  • if T has not committed before the failure then we must undo all the updates to the database items that were written to disk (we can do this by looking at what the old value of the item was in the log), we undo up to the last START T or COMMIT T
93
Q

Describe recovery with Undo logging?

A
  • if an error occurs, the recovery manager restores the last consistent database state
  • it traverses the log backward to do this.
94
Q

When do we write <ABORT> in the log file?</ABORT>

A
  • We replace each <T,X,v> statement with <ABORT> for each uncommitted transaction T that was not previously aborted and call flush log</ABORT>
95
Q

If a schedule has 3 transactions in it, how many <COMMIT> statements should there be in the Log file if all of the transactions successfully wrote all their item updates to the disk and there was no failures?</COMMIT>

A

There should be 3 <COMMIT> statements</COMMIT>

96
Q

What if a transaction aborts in UNDO logging?

A
  • Use the UNDO log like before to recover all the changes that were made
  • but we only change the values for one specific transaction not all of them.
97
Q

What is REDO logging?

A
  • logs activities with the goal of restoring committed transactions
  • it ignores incomplete transactions
  • maintains durability
  • log stores same commands except <T, X, v>: stores the new value of the item not the old one
98
Q

What are typical entries stored in an REDO log?

A
  • <START>: Transaction T has started
    </START>
  • <COMMIT>: Transaction T has committed
    </COMMIT>
  • <ABORT>: Transaction T was aborted
    </ABORT>
  • <T, X, v>: Transaction T has updated the value of database item X, the new value of X was v
99
Q

What is the aim of REDO logging

A

To restore things for transactions that have finished, even if they haven’t been changed on the disk yet

100
Q

Describe the REDO logging procedure?

A
  1. T1 writes all log records for all updates of items to log on disk
  2. T1 writes <COMMIT> to log on disk</COMMIT>
  3. T1 writes all committed updates to database on disk
101
Q

When are we allowed to output things during UNDO logging as opposed to REDO

A
  • In UNDO logging we can output items values before they have been committed
  • In REDO logging items have to have been committed before we can output them
102
Q

What is the simplified version of the REDO logging procedure?

A
  • read/write/lock/unlock operations
  • write new values to go into log in buffer
  • once done, write COMMIT T to buffer in main memory
  • flush new values of items stored in buffer to log file on disk
  • flush COMMIT T to log file on disk
  • write changes of item values to disk for committed items only
103
Q

Describe recovery with Redo logging?

A
  • Need to check whether <COMMIT> has been written to the log file on disk or if it's only in the log in the buffer in main memory</COMMIT>
  • if <COMMIT> has been written to the log file on disk then we know all the transactions have been written to the disk</COMMIT>
  • if <COMMIT> is only in the log in the buffer then T hasn't written anything to the disk</COMMIT>
104
Q

Describe the recovery procedure with Redo logging?

A
  • traverse the log from first to last item
  • if we see <T,X,v> and T has a <COMMIT> log record then change the value of X on the disk to v</COMMIT>
  • for each incomplete transaction T, write <ABORT> into the log on disk</ABORT>
105
Q

What if a transaction aborts in REDO logging?

A
  • So we write <ABORT> at the end of a transaction that hasn't committed</ABORT>
  • we don’t need to take any further actions since none of the changes were written to the disk yet.
106
Q

What is Undo/Redo logging?

A
  • does a combination of undo and redo logging
  • this way it maintains atomicity and durability
  • instead of writing <T,X,v> to the log file it writes <T,X,v,w> where v is the old value and w is the new value of X
107
Q

Why does Undo/Redo logging provide more flexibility?

A
  • Because it stores enough information that we can do undo logging and redo logging and therefore we can do both things when we need to
  • so it doesn’t matter what order you do the COMMIT T command now. This means you’re more free to do things in the order that you like.
  • Say one that makes it faster because it’s better to make bigger updates on the disk and by doing this you’re allowed to do it whenever you want and therefore you can make bigger updates.
108
Q

Which logging process is it better to do big updates to the disk with?

A
  • Redo logging, but we need to be careful only to use the last committed value instead of the last value
  • because we can do these updates to the disk whenever and not have to worry about
  • advantage of doing this later is that you can do bigger updates at the same time which is faster on a real hard disk.
109
Q

Describe the UNDO/REDO logging procedure?

A
  1. write all log records for all updates of item values to buffer in main memory
  2. write all updates to disk
  3. <COMMIT> can be written to disk before or after all the changes to the database have been written to the disk depending on what the DBMS prioritises
    </COMMIT>
110
Q

What is No Steal with Redo logging?

A
  • It ensures atomicity (as well as durability)
  • it means that uncommitted data may not overwrite committed data on disk
111
Q

In practice if you want Steal/No force what logging procedure should you use?

A

Undo/Redo

112
Q

What is simple checkpointing for undo logging?

A
  • the log is periodically checkpointed
  • so every t mins we put a checkpoint in the log
  • we don’t need to undo transactions before the checkpoint
113
Q

What is the procedure for simple checkpointing for undo logging?

A
  1. Stop accepting new transactions
  2. wait until all active transactions finish and have written <COMMIT> or <ABORT> record to log in buffer</ABORT></COMMIT>
  3. flush log updates to hard disk
  4. write a log record <CHECKPOINT></CHECKPOINT>
  5. Flush the log to disk again (with checkpoint command)
  6. Resume accepting transactions
114
Q

What is ARIES checkpointing?

A

does undo/redo logging but transactions do not write to buffers until they are sure they want to commit

115
Q

What is the procedure for ARIES checkpointing?

A
  • write <CHECKPOINT(T1, T2…) in log and flush it
  • T1,T2,… are the transactions in progress so they haven’t committed or aborted yet
  • write the content of the buffer to hard disk
  • write <END> to the log in the buffer and then flush it to the hard disk</END>
116
Q

How is recovery carried out with ARIES checkpointing?

A
  • process the undo/redo log as before
  • only redo (part of) the committed transactions in T1,T2… after <CHECKPOINT(T1,T2,…)>
  • undo all of the uncommitted transactions that come before <CHECKPOINT(T1,T2,…)>
  • as the ones before will have been written to the disk and the ones after won’t have
117
Q

What are the properties of Conflict-Serialisability
and how can they be enforced?

A
  • equivalent to serial schedules
  • ensures consistency and correctness
  • enforced by 2PL
118
Q

What are the properties of recovery and how can they be enforced?

A
  • ensure consistency as we can recover data base states
  • robust, works even after system failures
  • enforced using undo logging or redo logging or undo/redo logging or simple checking pointing with undo logs or aries checkpointing with undo/redo logs
119
Q

What are dirty reads?

A
  • when the isolation property is not fully enforced by setting isolation level to READ UNCOMMITTED
  • gains more parallelism by executing some transactions that would have to wait to prevent dirty reads
  • however dirty reads can slow down the system when transactions have to abort
120
Q

What is a cascading rollback?

A
  • if transaction T aborts, find all the transactions that have read items that were written by T
  • recursively abort all transactions that have read items written by an aborted transaction
121
Q

What can we break if we do a cascading rollback and abort transactions?

A
  • if we do not abort all the transactions that interacted with aborted transaction T, then we can risk breaking consistency and isolation
  • if we do abort them all we can break durability
122
Q

When is a schedule recoverable?

A
  • if a transaction T1 commits and has read an item X that was written before a different transaction T2
  • then T1 must commit before T2 commits (T2 must delay it’s commit)
123
Q

Which transactions will be included in a cascading rollback if a transaction aborts but the schedule is recoverable?

A
  • Only active transactions can be forced to commit
  • so transactions that have committed before T1 aborts won’t have to abort
124
Q

Why are non-recoverable schedules serialisable and recoverable schedules are non-serialisable?

A

recoverable schedules have to be in a specific order - that specifies R2(X), C1, C2 meaning that this is not serialisable because it’s not equivalent to a serial schedule

125
Q

What is an implicit assumption about Recoverable Schedules?

A

that all log records have to reach the hard disk in the order in which they were written (if they don’t this could mean a recoverable schedule becomes non-recoverable)

126
Q

What is a cascadeless schedule?

A
  • If each transaction in it reads only values that were written by transactions that have already committed
127
Q

What do cascadeless schedules ensure?

A
  • No reading of dirty data
  • no cascading rollbacks
128
Q

What are cascadeless schedules?

A
  • non-serialisable, because they have to be in a certain order, the reads of other transactions have to be after others commit, so the order matters so they’re not serialisable
  • they are recoverable because the item doing a dirty read is doing it from a transaction that has already committed, therefore it commits after the first transaction
129
Q

What is a strict schedule?

A

A schedule where each transaction in it reads and writes only values that were written by transactions that have already committed

130
Q

What do strict schedules achieve?

A
  • Cascadeless
  • serialisability
  • recoverable
131
Q

What is Strict Two-Phase Locking (Strict 2PL)

A
  • variant of 2PL
  • a transaction T must release any lock (that allows T to write data) until T has committed or aborted and the commit/abort log record had been written to disk
132
Q

What does Strict Two-Phase Locking (Strict 2PL) ensure?

A
  • conflict serialisability
  • strict schedules
133
Q

How is 2PL and strict 2PL different?

A
  • in 2PL the commit command comes after the unlocks
  • in 2PL the commit comes before the unlocks
  • this is so that no other transactions can read/write to uncommitted data
134
Q

How the Types of Schedules are Related?

A
  • All strict and 2PL strict schedules are serialisable and conflict serialisable
  • some cascadeless and recoverable schedules are serialisable and conflict serialisable
135
Q

What problem is still not solved with strict 2PL?

A

Deadlocking

136
Q

What are two approaches to deadlock prevention?

A
  • Detect deadlocks and fix them (rollback/ restart transactions)
  • Enforce deadlock-free schedules
137
Q

What are the deadlock detection approaches?

A
  • Timeouts: assume a transaction is in deadlock if it exceeds a given time limit
  • waits-for-graphs: shows transactions and dependencies, cycles mean deadlocks
  • Timestamp-based
138
Q

Describe timestamping for deadlock detection

A
  • each transaction T is assigned a unique integer TS(T) upon arrival at the scheduler
  • if T1 arrived earlier than T2, we require the TS(T1) < TS(T2)
  • Time stamps do not change even after restart
  • if transactions arrive at the same time they still get different timestamps
139
Q

How Are Timestamps Used?

A
  • we use time stamps to decide which transactions can wait longer (for a lock/or to start)
  • we want to prevent cyclic dependencies because as we saw before this creates deadlocking
140
Q

What is the wait to die scheme?

A

(“older transactions always wait for unlocks”)
- If T1 is older than T2, and requests an item then it waits for T2 to be finished with it
- If T2 is younger than T1 and requests an item, it dies (rollsback and starts again)
- this makes sense because if a transaction is younger it means it’s done less work, so it less of a waste to restart

141
Q

What is the wound-wait scheme?

A

(“older transactions never wait for unlocks”)
- if T1 is older than T2 and requests an item from T2, T2 is immediately rolled back unless it has committed
- if T2 is younger than T1 and requests an item from T1, it’s allowed to wait for T1 to unlock the item

142
Q

What is the same for wait-to-die and wound wait time stamping?

A

In both methods, the older transaction never dies or rollsback

143
Q

Why does Wound-Wait Work?

A
  • At all times the oldest transaction keeps running
  • hence when that finishes the same occurs for the transaction that arrives directly before it
144
Q

Why does the younger transaction die in wait-to-die and wound-wait?

A

wait-to-die: dies of timeout
wound-wait: dies because it’s holding a lock the older one needs

145
Q

how can we use time stamping to enforce deadlock free schedules?

A
  • schedule transactions as if they are executing each transaction instantly
  • we should think of transactions as being completely isolated so each transaction starts and finishes before the next
  • if two arrive at the scheduler at the same time we just randomly pick one to start first
  • equivalent to serial schedules
146
Q

What is different in timestamping to create deadlock free schedules?

A
  • Here each transaction is assigned a new time stamp number whenever it restarts
  • this means transactions only go in reverse consecutive order
  • holds information about the last transaction to read from and write to an item
147
Q

What are the advantages and disadvantages of time based scheduling?

A

advantages: conflict-serialisable schedules, no deadlocks

disadvantages: cascading roll backs, starvation, many restarts

148
Q

How do we ensure strictness and not get deadlocks with timestamping?

A
  • we delay read or write requests until the youngest transaction who wrote X has committed or aborted
  • Just lock the transaction until the earlier one has finished
  • So we won’t get deadlocks because the oldest one can always move forwards using this principal
149
Q

What is Multiversion concurrency control

A
  • A more advanced version of timestamping
  • just have multiple versions of each item in your database
  • So you don’t overwrite the values in an item, you just write a new item for each new timestamp.
  • you can just discard the old items when they stop being relevant
    -we only need to restart transactions if you try to write and the RT is later than the write time stamp
  • Before we had to restart if either the read or write stamp was too young, now we only have to restart if the read timestamp is too young.