Distributed data concepts Flashcards
(31 cards)
what is a transaction
a logical unit of work which either completes in its entirety or not at all, its aim is to keep the database consistant
what do transactions end with
COMMIT – commit the changes and successfully end
ROLLBACK – an error has occurred so undo everything from the transaction
Programmatic SQL – normal termination of the program is a COMMIT
Programmatic SQL – abnormal/error termination is a ROLLBACK
ACID properties of a transaction
Atomicity, Consistency, Isolation, Durability
What does Atomicity mean
completes in its entirety or not at all
what does consistency mean
the database must be in a consistent state - a DBMS or application developers need to ensure this
what does Isolation mean
each transaction executes independently of others
what does durability mean
changes made by a transaction must persist - recovery systems must ensure this
OLTP
Online transaction processing
Quick real-time access to data to read or modify it
OLAP
Online Analytical Processing
involves fewer, more intensive transactions than OLTP
Problems with transactions
The lost update problem
The dirty read or uncommitted dependency problem
the inconsistent analysis problem
The lost update problem
an update done to a data item by a transaction is lost as it is overwritten by the update done by another transaction
The dirty read problem (uncommitted dependancy problem)
A dirty read occurs when a transaction reads data that has not yet been committed
The inconsistent analysis problem
when one user is reading the several pieces of data when another is updating them data the user reading the data may end up with a mix of old and new values
How to fix the problems with transactions
making all transactions run serially, this causes performance issues if only one transaction is run at once
read operations can be run in parallel
if a transaction is writing to one part of the database, it can still access other parts which arent affected by the update
what is locking
transactions lock part of the database before updating
what is a shared lock
the data item can only be read by the transaction
what is an exclusive lock
read or write
what is a deadlock
it is possible that 2 transactions are each waiting for the other to release a lock
how can a deadlock be resolved
timeouts - transaction rolls back after a certain amount of time
deadlock detection - rollback the transaction which would cost the least to stop
deadlock prevention - try look for the problem in advance (not common as is tricky to do)
what is a distributed database
the data is stored in different physical locations but the DBMS makes it invisible to the end-user
this causes more performance issues but has more storage
CAP theorem
Consistency
Availability
Partition tolerance
Consistency(CAP)
the database must be in a consistent state, transactions ensure this by rolling back if an error occurs when in an inconsistent state
availability
every query request gets a response
partition tolerance
distributed databases can cope with network failures / network delays