Exam 3. Presentation 4 Flashcards
(41 cards)
What is cloud computing?
Cloud Computing (CC) is a model for enabling
ubiquitous, convenient, on-demand network
access to a shared pool of configurable
computing resources (e.g., networks, servers,
storage, applications, and services) that can be
rapidly provisioned and released with minimal
management effort or service provider
interaction
public cloud
A cloud is public when it is made available to the general public
in a pay-as-you-go manner.
private cloud
A cloud is private when the cloud infrastructure is
operated solely for a business or an organization.
Cloud Computing services
Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), Infrastructure-as-a-Service (IaaS)
Software-as-a-Service (SaaS)
Applications are accessible from several client devices.
The provider is responsible for the application. Examples: email, CRM, collaborative applications, ERP.
Platform-as-a-Service (PaaS)
The client is responsible for the end-to-end life cycle of
developing, testing, and deploying applications.
The provider supplies all the systems (operating systems,
applications, and development environment).
Examples: application development, decision support, web, streaming.
Infrastructure-as-a-Service (IaaS)
In this type of service, the client manages the
storage and development environments for Cloud
Computing applications, such as the Hadoop
Distributed File System (HDFS) and the MapReduce
development framework.
Examples: caching, legacy, networking, etc.
NoSQL
Not Only SQL; refers to an eclectic and increasingly familiar
group of non-relational data management systems, where
databases are not built primarily on tables and
generally do not use SQL for data manipulation.
NoSQL Focus
NoSQL databases focus on
analytical processing of large scale datasets,
offering increased scalability over commodity
hardware.
Why NoSQL Databases?
1. The exponential growth of the volume of data
generated by users, systems and sensors.
2. The increasing interdependency and complexity of
data, accelerated by the Internet, Web 2.0, social
networks and open and standardized access to
data sources from a large number of different
systems.
The CAP-Theorem
postulates that only two of
the following three different aspects of scaling
out can be achieved fully at the same time
Aspects of scaling
Strong Consistency, High Availability, Partition-tolerance
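As a study aid, the "only two of three" statement of the CAP theorem can be enumerated in a few lines. This is a minimal sketch; the names `CAP_ASPECTS` and `cap_pairs` are made up for illustration and the short labels (CA, CP, AP) are simply the initials of each pair.

```python
from itertools import combinations

# The three aspects named in the CAP theorem.
CAP_ASPECTS = ("Consistency", "Availability", "Partition-tolerance")

def cap_pairs():
    """Return every pair of aspects, keyed by the initials of the pair."""
    return {
        "".join(name[0] for name in pair): pair
        for pair in combinations(CAP_ASPECTS, 2)
    }

# Print each combination a system could aim to achieve fully.
for label, pair in cap_pairs().items():
    print(label, "->", " + ".join(pair))
```

The loop lists the three possible pairs; the third aspect in each case is the one that has to be relaxed.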
CAP theorem and NoSQL
Many NoSQL databases have, above all, loosened
the requirements on Consistency in order to achieve
better Availability and Partition-tolerance (AP).
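One way to picture loosened consistency is a write that is acknowledged before every replica has received it. The sketch below is a toy model, not how any particular NoSQL database works; `Replica`, `write`, `read_from_secondary`, and `sync` are hypothetical names.

```python
class Replica:
    """A single copy of the data, held in memory."""
    def __init__(self):
        self.data = {}

primary, secondary = Replica(), Replica()
pending = []  # writes accepted by the primary but not yet copied over

def write(key, value):
    # The write succeeds immediately (availability) without waiting
    # for the secondary replica to catch up.
    primary.data[key] = value
    pending.append((key, value))

def read_from_secondary(key):
    # May return stale data (None here) until sync() has run.
    return secondary.data.get(key)

def sync():
    # Copy pending writes to the secondary, in the order they arrived.
    while pending:
        key, value = pending.pop(0)
        secondary.data[key] = value

write("x", 1)
stale = read_from_secondary("x")   # secondary has not seen the write yet
sync()
fresh = read_from_secondary("x")   # replicas now agree
```

Between `write` and `sync` the two replicas disagree; that window is the consistency given up in exchange for staying available.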
primary uses of NoSQL Database
- Large-scale data processing (parallel processing over distributed systems).
- Embedded I-R (basic machine-to-machine information look-up & retrieval).
- Exploratory analytics on semi-structured data (expert level).
- Large volume data storage (unstructured, semi-structured, small-packet structured).
Classification of NoSQL Databases
Key-Value stores, Document databases, Wide-Column, Graph databases
Key-Value store
These Data Management Systems (DMS) store items as alphanumeric identifiers (keys) and associated values in
simple, standalone tables (referred to as "hash
tables"). The values may be simple text strings or
more complex lists and sets.
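A minimal sketch of the idea, assuming a Python dict as the underlying hash table. The class `KeyValueStore` and the key naming scheme (`user:1:name`) are illustrative only and do not come from any specific product.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a dict (a hash table)."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        # Values can be plain strings or richer structures (lists, sets).
        self._table[key] = value

    def get(self, key, default=None):
        return self._table.get(key, default)

    def delete(self, key):
        # Return True if the key existed and was removed.
        if key in self._table:
            del self._table[key]
            return True
        return False

store = KeyValueStore()
store.put("user:1:name", "Ada")                # simple text string
store.put("user:1:tags", ["admin", "dev"])     # list value
store.put("user:1:groups", {"ops", "eng"})     # set value
```

Lookups go only through the key; the store knows nothing about the structure of the value.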
Document databases
designed to manage
and store documents which are encoded in a
standard data exchange format such as XML,
JSON or BSON
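To make the idea concrete, here is a small sketch that stores documents as JSON text and queries them by a field. It uses only Python's standard `json` module; the helper names `insert` and `find` and the sample documents are assumptions for illustration.

```python
import json

documents = {}  # document id -> JSON-encoded text

def insert(doc_id, doc):
    # Encode the document in a standard exchange format (JSON).
    documents[doc_id] = json.dumps(doc)

def find(field, value):
    """Return decoded documents whose top-level field equals value."""
    results = []
    for raw in documents.values():
        doc = json.loads(raw)
        if doc.get(field) == value:
            results.append(doc)
    return results

# Documents in the same collection need not share the same fields.
insert("u1", {"name": "Ada", "city": "London", "skills": ["math", "code"]})
insert("u2", {"name": "Linus", "city": "Helsinki"})
```

Unlike a plain key-value store, the database can look inside the stored value, which is what makes a query such as `find("city", "London")` possible.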
Key-Value store values
The values may be simple text strings or
more complex lists and sets.
Document Databases Value
The values may be simple text strings or
more complex lists and sets.
Wide-Column
This type of
NoSQL Database employs a distributed,
column-oriented data structure that
accommodates multiple attributes per key
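A rough sketch of "multiple attributes per key": each row key maps to its own set of columns, and different rows may carry different columns. The `family:qualifier` column naming below is an assumption borrowed for illustration; `put` and `get_row` are hypothetical helpers.

```python
from collections import defaultdict

# row key -> {column name -> value}
table = defaultdict(dict)

def put(row_key, column, value):
    table[row_key][column] = value

def get_row(row_key):
    # Return a copy so callers cannot modify the stored row by accident.
    return dict(table.get(row_key, {}))

put("user1", "info:name", "Ada")
put("user1", "info:city", "London")
put("user2", "info:name", "Linus")
put("user2", "contact:email", "linus@example.com")  # column only user2 has
```

Rows are sparse: `user1` simply has no `contact:email` column, with no placeholder stored for it.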
Graph Databases
Graph databases replace
relational tables with structured relational
graphs of interconnected key-value pairings
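As a small illustration, nodes can be modeled as key-value property maps and relationships as labeled edges between them. This is a toy in-memory sketch; the node ids, edge labels, and the `neighbors` helper are made up for the example.

```python
# Each node is a set of key-value properties.
nodes = {
    "alice": {"type": "person", "age": 30},
    "bob": {"type": "person", "age": 25},
    "hbase": {"type": "project"},
}

# Each edge is (source, label, destination).
edges = [
    ("alice", "KNOWS", "bob"),
    ("alice", "USES", "hbase"),
    ("bob", "USES", "hbase"),
]

def neighbors(node_id, label=None):
    """Return nodes reachable from node_id, optionally filtered by edge label."""
    return [
        dst for src, lbl, dst in edges
        if src == node_id and (label is None or lbl == label)
    ]
```

A query such as `neighbors("alice", "KNOWS")` follows relationships directly instead of joining tables.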
Hadoop HBase
HBase is a column-oriented
database management system that runs on top
of the Hadoop Distributed File System (HDFS).
The reason to store values on a per-column
basis
For specific queries, not all of the values are
needed, so only the relevant columns have to be read.
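The difference can be sketched by holding the same data in two layouts. This is an in-memory illustration only, with invented sample records; it does not reflect how HBase lays out data on disk.

```python
# Row-oriented layout: one full record per item.
rows = [
    {"id": 1, "name": "Ada", "age": 36},
    {"id": 2, "name": "Linus", "age": 54},
    {"id": 3, "name": "Grace", "age": 45},
]

# Column-oriented layout: one list of values per attribute.
columns = {key: [row[key] for row in rows] for key in rows[0]}

def average_age_row_layout():
    # Has to walk every full record even though only "age" is used.
    return sum(row["age"] for row in rows) / len(rows)

def average_age_column_layout():
    # Reads the single "age" column; "id" and "name" are never touched.
    ages = columns["age"]
    return sum(ages) / len(ages)
```

Both functions return the same average; the column layout gets there while reading only the values the query needs.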
HBase Architecture
Consistency, Atomic Read and Write, Sharding, High Availability, Client API, Scalability, Distributed Storage, HDFS/Hadoop integration, Data Replication, Load sharing and Support for Failure, API Support, MapReduce Support, Sorted Row Keys, Real Time Processing of Data