Data Intensive Ch7 - Transactions Flashcards

1
Q

Why we need transactions - what can go wrong

A
  1. Database software/hardware can fail in the middle of a write
  2. App can crash in the middle of a series of operations (e.g. due to running out of memory)
  3. Network interruption can cut off the app from the DB
  4. Several clients can write to the DB at the same time and overwrite each other's writes
  5. Client may read a partial, unfinished update
  6. Race conditions between concurrent client writes can cause bugs
2
Q

Transaction

A

Mechanism to simplify handling “what can go wrong” scenarios
Way for an application to group several reads & writes into one logical unit - all operations are executed as one operation which either
- succeeds (commit)
- fails (abort, rollback)
If failed - app can safely retry
App need not worry about partial failure

“A mechanism that lets us stop worrying that a disaster will strike in the middle of a series of operations - e.g. subtracting an amount from account A and adding it to account B”

“Simplify the programming model for apps accessing the DB”
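
A minimal SQL sketch of the transfer from the quote above (the accounts table and amounts are assumed for illustration):

  BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 'A';
  UPDATE accounts SET balance = balance + 100 WHERE id = 'B';
  COMMIT;  -- both writes take effect together; on failure, ROLLBACK discards both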

3
Q

ACID

A

Atomicity, Consistency, Isolation, Durability - a set of safety guarantees provided by transactions

Nowadays partly a marketing term, because one implementation might not equal another (isolation in particular is often ambiguous)

4
Q

Atomicity

A

Atomic - cannot be broken down into smaller parts
NOT about concurrency (it does not address one client seeing another’s half-finished transaction - that’s isolation)!
It’s about “all or nothing” - the ability to abort a transaction on error and have all its writes discarded
“Abortability” would be a better word than atomicity

5
Q

Consistency

A

App-specific notion of the DB being in a “good state”
Idea - certain statements about the data (INVARIANTS) must always be true (e.g. credits and debits always balance)
It’s the app’s responsibility to define its invariants and preserve them; the DB only stores the data (and can enforce a few simple constraints)
Consistency is really a property of the application, not the database
Joe Hellerstein remark - the C in ACID was “tossed in to make the acronym work”

6
Q

Isolation

A

Concurrency issues arise when several clients access the same record at the same time
Concurrently executing transactions must not “step on each other’s toes”
Serializability - each transaction can pretend it’s the only one running on the entire DB; the DB ensures that the result on commit is the same as if the transactions had actually run one after another, even though they ran concurrently
In practice it’s expensive, so weaker isolation levels are often used instead

7
Q

Durability

(p. 226)

A

Promise that a committed transaction will not lose data even if a failure occurs
Usually involves a write-ahead log (for recovery in case the data structures on disk are corrupted)
In a replicated DB it usually means the value has been replicated to N nodes
Perfect durability is not possible - what if all disks and backups are destroyed at the same time?

8
Q

Dirty read

(p. 229)

A

Violation of isolation
One trx reads another trx’s uncommitted writes
- Trx updates multiple objects - a dirty read means another trx may see only some of them updated (user sees a new unread email but not the updated counter)
- Trx aborts - all its writes are rolled back - a dirty read means another trx may have seen data that was never actually committed
Example: listing unread emails while storing the count of unread emails separately (sketch below)
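
A sketch of the two writes that belong together in this example (schema assumed). A dirty read would let another trx see the new email while the counter is still stale:

  BEGIN;
  INSERT INTO emails (user_id, subject, read) VALUES (1, 'hi', false);
  UPDATE mailboxes SET unread_count = unread_count + 1 WHERE user_id = 1;
  COMMIT;
  -- without the trx, a concurrent reader could see the email but a stale counter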

9
Q

Why retrying an aborted transaction is not a perfect error handling mechanism

A
  • Trx succeeds but the network fails to deliver the ACK - the client thinks it failed, and the retry causes the trx to be performed twice (requires app-level de-duplication, e.g. unique IDs for trxs)
  • If the error was due to overload, retrying will only add fuel to the fire (use exponential backoff and limit retries)
  • Some retries are pointless (e.g. a constraint violation)
  • If the trx has side effects outside the DB, they can happen even if the trx was aborted (e.g. sending an email)
  • If the client process fails while retrying, the data it was trying to write is lost anyway…
10
Q

Why concurrency bugs are hard to find by testing (so you’d better use appropriate isolation levels)

A

They are rare - you need to be unlucky with the timing. Hard to reproduce.
It’s hard to reason about concurrency

11
Q

Weak isolation levels

A

Non-serializable isolation levels used in practice when serializable is too costly performance-wise

12
Q

Read committed

A

The most basic transaction isolation level
No dirty reads (you only see committed data)
No dirty writes (you only overwrite committed data)
Default setting in Postgres and many other databases

Implementation of read committed:
Against dirty writes - most commonly row-level locks, held until the trx is committed/aborted; other trxs wanting to write the same row must wait
Most DBs use MVCC anyway, as they also support snapshot isolation

Against dirty reads - could use the same lock (acquire and release it around every read, so a read can’t happen while an object is dirty)
That doesn’t work well - a short, read-only trx can get stuck waiting for a long-running trx to complete
So most DBs instead remember the old committed value and return it until the trx holding the write lock commits
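
A minimal two-session demo in Postgres, where read committed is the default (table assumed):

  -- Session A
  BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  -- not yet committed

  -- Session B, concurrently
  SELECT balance FROM accounts WHERE id = 1;
  -- returns the old committed value: no dirty read, and no waiting

  -- Session A
  COMMIT;
  -- from now on Session B's reads return the new value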

13
Q

No dirty writes

A

For concurrent updates to the same object we can assume the later write overwrites the earlier one. But if the earlier write is part of a trx that hasn’t committed yet, the later write clobbers an uncommitted value - a dirty write. Prevention - delay the later write until the first writer’s trx has committed or aborted

  • trx updating multiple objects - Alice and Bob both buy the same car, which requires 2 DB writes. Bob wins at the listings table, Alice at the invoices table - the sale goes to Bob but the invoice to Alice (fig. 7-5, p. 236)
  • Read committed does NOT prevent the race condition between two counter increments - the second write happens after the first trx committed, so it’s a lost update, not a dirty write

Fun fact: in PG, read committed reads the latest committed value
https://www.postgresql.org/files/developer/concurrency.pdf

14
Q

What is read skew

A

Example of a nonrepeatable read.
Can happen at the read committed isolation level.
Cannot be tolerated for:
- backups
We make a copy of the entire DB while writes are still being accepted. It’s not acceptable for some parts of the backup to be in the old state and some in the new - after a restore, the inconsistency becomes permanent.
- Analytical queries, integrity checks
These must see a consistent snapshot of the database or they will report invalid results
For the above: snapshot isolation is recommended

Example:
Alice has 2 accounts, 500 each, 1000 total.
Some trx transfers 100 from one to the other.
Alice is unlucky enough to read the balance of one account before the transfer trx commits and the other’s after it’s done. So she sees 900 dollars total (or 1100, depending on whether the “from” or the “to” account is read first).
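
The unlucky sequence as SQL under read committed (schema assumed): each SELECT alone sees committed data, but together they straddle the transfer:

  SELECT balance FROM accounts WHERE id = 'checking';  -- 500, before the transfer
  -- ...transfer trx commits here: -100 checking, +100 savings...
  SELECT balance FROM accounts WHERE id = 'savings';   -- 600, after the transfer
  -- Alice's total: 1100; wrapping both reads in one REPEATABLE READ trx
  -- (snapshot isolation in PG) makes them see a single consistent snapshot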

15
Q

How to implement the snapshot isolation level?

A

Dirty writes:
prevented with write locks (as in read committed)

Dirty reads:
the DB keeps several different committed versions of each object/row
Transactions see the consistent DB state as of the point in time the snapshot was taken
Multi-version concurrency control (MVCC)

16
Q

MVCC

A

Multi-version concurrency control
How it’s implemented in PG:
1. Each transaction is given a unique, always-increasing ID
2. When a trx writes to the DB, the data is tagged with the writer’s trx ID
3. created_by - internal field on each row in a table
4. deleted_by - internal field on each row, used for deletes: initially empty; when a delete happens, the ID of the requesting trx is stored there; garbage collection removes deletion-marked rows once no trx can access them anymore
5. An update is translated into a delete & a create
(fig. 7-7, p. 240)
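
In Postgres the created_by/deleted_by fields are the system columns xmin and xmax, which can be inspected on any table (table name assumed):

  SELECT xmin, xmax, * FROM accounts;
  -- xmin = ID of the trx that created this row version
  -- xmax = 0, or the ID of the (possibly uncommitted) trx that deleted/updated it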

17
Q

Trivia: Postgres transaction IDs

A
Unique
Always increasing
32-bit integers
Overflow (wraparound) after ~4 billion transactions
PG's vacuum process cleans up old transaction IDs to prevent the wraparound from affecting the data
18
Q

Snapshot isolation visibility rules for a consistent snapshot

A

When a trx starts - the DB makes a list of all other trxs in progress at that moment.
Any writes those trxs have made are ignored, even if the trxs subsequently commit
Writes made by aborted trxs are ignored.
Writes made by trxs with a greater trx ID (started after this one) are ignored.
All other writes are visible.

If a row was deleted by a not-yet-committed trx or by one that started later - the row is still visible to the running trx
If a row was created by a not-yet-committed trx or by one that started later - the row is not visible (demo below)
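
A sketch of these rules in action in Postgres, where REPEATABLE READ provides snapshot isolation (table and counts assumed; note PG takes the snapshot at the first query, not at BEGIN):

  -- Session A
  BEGIN ISOLATION LEVEL REPEATABLE READ;
  SELECT count(*) FROM emails WHERE read = false;  -- snapshot taken: returns 2

  -- Session B (autocommit)
  INSERT INTO emails (read) VALUES (false);        -- commits after A's snapshot

  -- Session A
  SELECT count(*) FROM emails WHERE read = false;  -- still 2: B's write is ignored
  COMMIT;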

19
Q

Dealing with index updates with MVCC

A
  • Index points to all versions of an object
    Index queries filter out versions not visible to the current trx
    Garbage collection removes no-longer-visible row versions and their index entries

Optimizations exist - if different versions of an object fit on the same page, write them there -> no need to update the index

  • Append-only/copy-on-write B-trees
    Create a copy of each modified page
    Parent pages up to the root are copied and updated
    Pages not affected are left untouched
    Every writing trx creates a new B-tree root - each root is a consistent snapshot of the DB at the moment it was created
    A background process for compaction/GC is required
20
Q

Lost updates

A

Concurrent read-modify-write conflict
Read a value -> modify it -> write it back; two trxs do this concurrently and one modification clobbers the other, which is lost

Incrementing a counter
- Easy fix - atomic write operations

Dataset keywords example
-> adding an element to a list inside a JSON document (which a jsonb column essentially is) requires such a read-modify-write cycle (sketch below)
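
A sketch of the jsonb case (schema assumed). The atomic in-database append avoids the read-modify-write cycle entirely:

  -- vulnerable: app reads keywords, appends locally, writes the whole array back
  -- atomic alternative, using PG's jsonb concatenation operator:
  UPDATE datasets
    SET keywords = keywords || '["new-keyword"]'::jsonb
    WHERE id = 7;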

21
Q

Lost updates prevention

A

Atomic write operations
UPDATE … SET x = x + 1 WHERE …; usually works (SQL sketches of three of these techniques below)

Explicit locking
SELECT … FOR UPDATE

Auto-detection of lost updates
Allow trxs to execute in parallel; abort any trx whose commit would cause a lost update. The client must retry the read-modify-write cycle
PG’s snapshot isolation (repeatable read) does this

Compare-and-set
The update only happens if the compared value (which the update was based on) is still the current value

Conflict resolution, replication
Siblings (multiple conflicting versions returned to the client to merge)
Last write wins
Auto-merging (e.g. incrementing counters in Riak)
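
Sketches of three of these techniques in Postgres SQL (tables assumed):

  -- Atomic write operation: the read-modify-write happens inside the DB
  UPDATE counters SET value = value + 1 WHERE key = 'page_views';

  -- Explicit locking: take the lock before the app does its thinking
  BEGIN;
  SELECT * FROM figures WHERE game_id = 42 FOR UPDATE;  -- blocks concurrent writers
  -- app validates the move, then:
  UPDATE figures SET position = 'c4' WHERE id = 1234;
  COMMIT;

  -- Compare-and-set: write only if the value the update was based on is still current
  UPDATE wiki_pages SET content = 'new text'
    WHERE id = 1234 AND content = 'old text';  -- 0 rows updated => retry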

22
Q

Write skew

A

Example:
2 doctors on call.
Both take a leave at the same time.
The consistency check (min 1 doctor on call) won’t catch it, as each trx checks the condition against a snapshot taken before the other’s write

Other examples:

  • booking a room (select bookings for the room in a given time period and insert a booking if none exists)
  • Changing a username (a unique username claimed by 2 users concurrently)

Not a lost update or dirty write - the two trxs modify 2 separate rows

Pattern: trxs read some objects and modify some of them (different trxs modify different objects)

Usually there are no multi-object constraints to catch this (they can be simulated with triggers or materialized views)

Serializable isolation level is one remedy
Explicitly locking the rows the trx depends on is the second best (SQL sketch after the list below)

Generalization pattern:

  • a SELECT checks some requirement by searching for matching rows
  • app code decides, based on the result, whether to go ahead
  • if it goes ahead - an INSERT/UPDATE/DELETE that changes the precondition checked in step 1
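
The doctors example as SQL under snapshot isolation (schema as in the book’s example; both trxs interleave):

  BEGIN;  -- Alice's trx; Bob's runs the same statements concurrently
  SELECT count(*) FROM doctors
    WHERE on_call = true AND shift_id = 1234;  -- both trxs see 2, check passes
  UPDATE doctors SET on_call = false
    WHERE name = 'Alice' AND shift_id = 1234;  -- Bob updates his own row instead
  COMMIT;  -- both commit: zero doctors left on call
  -- adding FOR UPDATE to the SELECT would have serialized the two trxs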
23
Q

Phantoms causing write skew

A

In the write skew pattern: the SELECT finds no conflicting rows, and in the meantime another trx adds a row that would have matched (sketch below).

What’s a phantom:
a write in one trx changes the result of a search query in another.
Snapshot isolation prevents phantoms in read-only queries; with read-write trxs we have write-skew potential
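
The booking check as SQL (schema and times assumed): the conflicting rows don’t exist yet, so there is nothing for a row lock to attach to:

  BEGIN;
  SELECT count(*) FROM bookings
    WHERE room_id = 123
      AND end_time > '2024-01-01 12:00' AND start_time < '2024-01-01 13:00';
  -- if 0, the app goes ahead:
  INSERT INTO bookings (room_id, start_time, end_time)
    VALUES (123, '2024-01-01 12:00', '2024-01-01 13:00');
  COMMIT;  -- two such trxs can both pass the check: a phantom-driven write skew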

24
Q

Issues with isolation levels

A

Hard to understand and INCONSISTENTLY IMPLEMENTED in different DBs (“repeatable read” in particular is ambiguous)

In large apps it’s difficult to tell whether running a given trx at a given isolation level is safe (other business processes happen concurrently and might interfere)

Static analysis may help, but it’s limited
Testing for concurrency bugs - hard and nondeterministic

Serializable isolation is the only foolproof option

25
Q

Serializable isolation level

A

Strongest level
Trxs running in parallel are guaranteed to produce the same end result as if they had run serially, one after another

Costly to implement performance-wise - otherwise there would be no need for weaker isolation levels ;)

Implementation techniques:

  • literal serial execution
  • two phase locking
  • optimistic concurrency control techniques (serializable snapshot isolation)
26
Q

Serializable - literal serial execution implementation

A

Single-threaded execution loop
Wasn’t practical before - RAM was too expensive; now the active dataset can fit in memory
OLTP trxs are usually short; OLAP queries are long but can run on a consistent snapshot outside the serial loop (repeatable reads!)

No interactive multi-statement transactions -> too much time would be spent on network communication and waiting -> throughput would be terrible
Instead, the app sends the entire trx code to the DB as a stored procedure (sketch below)

Stored procedures have downsides though
- each DB has its own syntax
- code running inside a DB is harder to debug, monitor, and test
- a DB is more performance-sensitive than a web server -> a badly written stored procedure could impact all app instances
+ some modern DBs use general-purpose languages like Java instead

Other idea - partitioning
One single-threaded worker per partition, assuming the stored procedure doesn’t need cross-partition data; cross-partition trxs are possible but add significant coordination overhead

Summary:

  • trxs must be small and fast or one slow trx will stall all trx processing
  • the dataset must fit in memory
  • write throughput must be low enough for a single CPU core, or we need partitioning
  • cross-partition transactions are possible but much slower
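
A sketch of the “whole transaction as a stored procedure” idea, here in PL/pgSQL purely for illustration (serial-execution systems such as VoltDB use their own procedure languages, e.g. Java):

  CREATE FUNCTION transfer(from_id int, to_id int, amount numeric)
  RETURNS void LANGUAGE plpgsql AS $$
  BEGIN
    -- all the logic runs inside the DB: one round trip, no client think-time
    UPDATE accounts SET balance = balance - amount WHERE id = from_id;
    UPDATE accounts SET balance = balance + amount WHERE id = to_id;
  END;
  $$;

  SELECT transfer(1, 2, 100.00);  -- the app sends just this single call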
27
Q

Serializable - 2 phase locking implementation

A

Writers block both writers and readers
Readers also block writers
(Snapshot isolation’s mantra - readers never block writers, writers never block readers - is abandoned here)

A lock on each object in the DB, with shared and exclusive modes
When a trx reads - it acquires the lock in shared mode
- if some trx holds the lock in exclusive mode, the shared lock must wait
When a trx writes - it acquires the lock in exclusive mode
- it waits until all existing locks (of either mode) are released
When a trx reads and then writes, it can upgrade its shared lock to exclusive (same rules as acquiring an exclusive lock)
Once acquired, a lock is held until the end of the transaction
- hence the two phases: acquiring locks while the trx executes, and releasing them all at the end (lock-mode sketch below)
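
Postgres implements its isolation levels with MVCC rather than 2PL, but its explicit row locks illustrate the two modes (table assumed):

  BEGIN;
  SELECT * FROM accounts WHERE id = 1 FOR SHARE;  -- shared: other readers OK,
                                                  -- writers must wait
  UPDATE accounts SET balance = 0 WHERE id = 1;   -- upgrade to an exclusive lock
  COMMIT;  -- second phase: every lock is released at once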

28
Q

2 phase locking - why it’s not a silver bullet?

A

Many cons, like:

  • deadlocks can happen (the DB detects them and aborts one trx, which must be retried)
  • performance issues - overhead of acquiring and checking locks
    • unstable latencies - no one knows how long a trx will have to wait for another
  • performance issues - limited concurrency
    • hence reduced throughput and increased latency
29
Q

What’s a predicate lock

A

Similar to a shared/exclusive lock
Acquired not per object but per set of objects matching some search condition (predicate)
Think of all bookings for a given room within a given time period

Applies even to objects that do not yet exist in the DB - which is how it stops phantoms

Cons:
poor performance - checking for matching locks is time-consuming when there are many active locks

30
Q

Index-range locks

A

Simplified predicate locks that match a greater (superset) range of objects
Instead of locking one bookable room in a given time period - lock the whole room for all times and dates, or lock all rooms in the given time period
Implemented by attaching the lock to index entries (e.g. the index on room id, or on start/end time of bookings)

Lower overhead than predicate locks, but they lock a bigger range of objects than strictly needed

31
Q

Serializable Snapshot Isolation

A

Optimistic concurrency control technique
Instead of blocking anything potentially dangerous, the trx continues, hoping that everything will turn out all right.
When the trx commits, the DB checks for any isolation violations - if found, ABORT
Performs poorly under high contention (lots of aborted trxs) - wasted work, which is especially bad if the system is already near capacity
Under low contention it performs better than pessimistic methods

Based on snapshot isolation, with an added algorithm for detecting serialization conflicts among writes and determining which trx needs to be aborted

Key question: did the trx act on an outdated premise? - as with the doctors on call

More predictable latency
No waiting for locks
Not limited to the throughput of a single CPU core

Consider the rate of aborts - long read-write trxs are likely to conflict (client-side sketch below)
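
A sketch of the client-side pattern in Postgres, whose SERIALIZABLE level is an SSI implementation (doctors schema as above):

  BEGIN ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM doctors WHERE on_call = true AND shift_id = 1234;
  UPDATE doctors SET on_call = false WHERE name = 'Alice' AND shift_id = 1234;
  COMMIT;
  -- no locks, no waiting; if a concurrent trx invalidated the premise, the
  -- trx fails with SQLSTATE 40001 (serialization_failure) and the client
  -- must retry the whole trx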

32
Q

Pessimistic concurrency mechanism

A

Based on the principle that if anything might go wrong, it’s better to wait until the situation is safe again
The indicator is a lock held on an object the trx is trying to access
Serial execution is pessimistic to the extreme - in effect, total mutual exclusion

33
Q

Decisions based on an outdated premise - detecting query result change

A

1. Detecting stale MVCC reads
Track when a trx ignores another trx’s writes due to the MVCC visibility rules
If any of the ignored writes have been committed by the time the trx commits - abort it
Don’t abort at read time, because the DB doesn’t yet know:
- whether the reading trx is going to do any writes at all
- whether the other trx (that made the write) might still abort

2. Detecting writes that affect prior reads
A trx modifies data after another trx has read it
When a trx wants to commit and a conflicting write that affected its prior reads has already been committed - the trx must abort