Lecture 12 & 13 Flashcards
(32 cards)
What is NoSQL Database?
Databases that do not follow the traditional Relational Model.
Why do we need NoSQL Databases?
To handle Big Data, by making it easier to scale the database both horizontally and vertically.
What is Horizontal scaling?
Adding more nodes to a system
What is Vertical scaling?
Adding more resources to a node
What is Availability?
The database always responds to queries.
What is Consistency?
The database gives the same response to queries made at the same time.
What is a Distributed Database?
A database that splits its computational load among different nodes.
What is a Cluster?
The set of computers that co-operate to manage the Database
What is Two-phase Commit?
An algorithm used to enforce consistency across the nodes of a cluster during transactions.
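A minimal sketch of the idea, assuming a hypothetical `Participant` class and `two_phase_commit` coordinator function (neither name comes from the lecture). The coordinator first asks every node to vote, then commits only if all of them voted yes:

```python
class Participant:
    """Hypothetical cluster node taking part in a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "initial"

    def prepare(self):
        # Phase 1: vote yes only if the change can be applied locally.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (completion): commit only if every vote was yes.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    # At least one participant voted no, so everyone rolls back.
    for p in participants:
        p.rollback()
    return False
```

A real implementation would also need logging and timeouts to survive a coordinator or node failure, which this sketch leaves out.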
What is a Transaction?
A set of changes to a database that is treated as a single change (either all of them are applied or none are).
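A small illustration using Python's built-in `sqlite3` module (a relational database, chosen here only because it ships with Python; the `accounts` table and the simulated failure are made up for the example). The update inside the `with conn:` block is rolled back because the block does not finish:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back on an exception.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'"
        )
        # Simulated failure before bob is credited: the debit must not persist.
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# Both balances are unchanged because the whole transaction was undone.
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```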
What does MVCC stand for?
Multi-Version Concurrency Control
What is MVCC?
A method that stores data in multiple versions to ensure availability and to recover from a partition by reconciling the individual databases using revisions (data isn't replaced, just given a new revision number).
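A simplified sketch of the revision idea, assuming a hypothetical in-memory `VersionedStore` class. CouchDB's real revision identifiers and conflict handling are more involved than this:

```python
class VersionedStore:
    """Hypothetical store that keeps every revision of each document."""

    def __init__(self):
        self._revisions = {}  # doc_id -> list of (revision_number, data)

    def put(self, doc_id, data, expected_rev=0):
        history = self._revisions.setdefault(doc_id, [])
        current_rev = history[-1][0] if history else 0
        if expected_rev != current_rev:
            # The writer based its change on an outdated revision.
            raise ValueError("revision conflict")
        # Old data is kept; the new data just gets the next revision number.
        history.append((current_rev + 1, data))
        return current_rev + 1

    def get(self, doc_id, rev=None):
        history = self._revisions[doc_id]
        if rev is None:
            return history[-1]  # latest revision
        return next(entry for entry in history if entry[0] == rev)
```

Because writers must name the revision they started from, two nodes that changed the same document during a partition can be detected as a conflict instead of one silently overwriting the other.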
What is Partition/Node Failure?
When a node in a cluster fails, causing its data to become outdated.
What is MVCC used for?
Coarse-grained DBMS models, such as document-oriented DBMSs (e.g. CouchDB).
What is Sharding?
The horizontal partitioning of a database, i.e. the rows are partitioned into subsets that are stored on different servers.
Why do we use Sharding?
It allows better performance by distributing the computing load.
What kinds of Sharding are there?
Hash Sharding and Range Sharding
What is Hash sharding?
Distributing rows evenly across the cluster by hashing each row's key.
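A minimal sketch, assuming a hypothetical `hash_shard` helper and string keys. MD5 is used only as a stable hash, since Python's built-in `hash()` for strings changes between runs:

```python
import hashlib


def hash_shard(key, num_nodes):
    """Return the index of the node that stores the row with this key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    # A good hash spreads keys roughly evenly over the node indices.
    return int(digest, 16) % num_nodes
```

The same key always maps to the same node, but neighbouring keys usually land on different nodes.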
What is Range Sharding?
Similar rows (e.g. tweets from the same area) are stored on the same node.
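A minimal sketch, assuming rows are keyed by a lowercase area name and a hypothetical `BOUNDARIES` list splits the key space alphabetically across three nodes:

```python
import bisect

# Hypothetical split points: node 0 holds keys < "h",
# node 1 holds "h" <= key < "p", node 2 holds everything else.
BOUNDARIES = ["h", "p"]


def range_shard(key, boundaries=BOUNDARIES):
    """Return the index of the node whose key range contains this key."""
    return bisect.bisect_right(boundaries, key)
```

Keys that sort close together end up on the same node, which makes range queries cheap but can overload one node if the keys are unevenly spread.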
What is Replication?
Storing the same row on different nodes to provide fault tolerance.
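One simple placement scheme, shown only as a sketch: a hypothetical `replica_nodes` helper puts the first copy on the node chosen by a hash of the key and the remaining copies on the following nodes. Real systems use a variety of placement strategies.

```python
import hashlib


def replica_nodes(key, num_nodes, copies=2):
    """Return the indices of the distinct nodes that store this row."""
    if copies > num_nodes:
        raise ValueError("cannot store more copies than there are nodes")
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    first = int(digest, 16) % num_nodes
    # Place each extra copy on the next node, wrapping around the cluster.
    return [(first + i) % num_nodes for i in range(copies)]
```

If one of the returned nodes fails, the row can still be read from the others.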
What are MapReduce Algorithms?
A paradigm suited to parallel computing of the Single-Instruction, Multiple-Data type.
How does MapReduce Work?
1. Map: distribute data across machines.
2. Reduce: hierarchically summarise the data until a result is obtained.
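A single-machine word-count sketch of the two steps. The `chunks` list stands in for data already distributed across machines, and `map_phase` and `reduce_phase` are hypothetical names:

```python
from collections import Counter
from functools import reduce


def map_phase(chunk):
    # Map: each machine counts the words in its own chunk of the data.
    return Counter(chunk.split())


def reduce_phase(left, right):
    # Reduce: merge two partial counts into one summary.
    return left + right


chunks = ["a b a", "b c", "a c c"]
partial_counts = [map_phase(chunk) for chunk in chunks]
total = reduce(reduce_phase, partial_counts)
```

In a real cluster the map calls run in parallel on different nodes, and the partial results are merged in stages rather than in one loop.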
Example of MapReduce?
See Billy Boy Diagram in Lecture 12
What is a document database? (e.g. CouchDB)
A type of database where data is stored in documents expressed as JSON.
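A sketch of what one such document might look like, written with Python's `json` module. The field names other than `_id` (which CouchDB uses as the document identifier) are invented for the example:

```python
import json

# One self-contained document; there is no fixed table schema.
doc = {
    "_id": "card-1",
    "type": "flashcard",
    "question": "What is Sharding?",
    "tags": ["nosql", "scaling"],
}

text = json.dumps(doc)       # the JSON text that would be stored
restored = json.loads(text)  # reading it back gives the same structure
```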