Exam 3. Presentation 4 Flashcards

(41 cards)

1
Q

What is cloud computing?

A

Cloud Computing (CC) is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

2
Q

public cloud

A

When the cloud infrastructure is made available in a pay-as-you-go manner to the general public.

3
Q

private cloud

A

When the cloud infrastructure is operated solely for a single business or organization.

4
Q

Cloud Computing services

A

Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), Infrastructure-as-a-Service (IaaS)

5
Q

Software-as-a-Service (SaaS)

A

Applications are accessible from several client devices. The provider is responsible for the application. For example: email, CRM, collaboration tools, ERP.

6
Q

Platform-as-a-Service (PaaS)

A

The client is responsible for the end-to-end life cycle of developing, testing and deploying applications. The provider supplies all the systems (operating systems, applications, and development environment). For example: application development, decision support, web, streaming.

7
Q

Infrastructure-as-a-Service (IaaS)

A

In this type of service the client manages the storage and development environments for Cloud Computing applications, such as the Hadoop Distributed File System (HDFS) and the MapReduce development framework. For example: caching, legacy, networking, etc.

8
Q

NoSQL

A

Not Only SQL; refers to an eclectic and increasingly familiar group of non-relational data management systems, where databases are not built primarily on tables and generally do not use SQL for data manipulation.

9
Q

NoSQL Focus

A

NoSQL databases focus on analytical processing of large-scale datasets, offering increased scalability over commodity hardware.

10
Q

Why NoSQL Databases?

A

1. The exponential growth of the volume of data generated by users, systems and sensors.
2. The increasing interdependency and complexity of data, accelerated by the Internet, Web 2.0, social networks and open and standardized access to data sources from a large number of different systems.

11
Q

The CAP-Theorem

A

Postulates that only two of the following three different aspects of scaling out can be achieved fully at the same time.

12
Q

Aspects of scaling

A

Strong Consistency, High Availability, Partition-tolerance

13
Q

CAP theorem and NoSQL

A

Many of the NoSQL databases have loosened the requirements on Consistency in order to achieve better Availability and Partition-tolerance (AP).

14
Q

primary uses of NoSQL Database

A
1. Large-scale data processing (parallel processing over distributed systems).
2. Embedded I-R (basic machine-to-machine information look-up & retrieval).
3. Exploratory analytics on semi-structured data (expert level).
4. Large-volume data storage (unstructured, semi-structured, small-packet structured).
15
Q

Classification of NoSQL Databases

A

Key-Value stores, Document databases, Wide-Column stores, Graph databases

16
Q

Key-Value store

A

These Data Management Systems (DMS) store items as alphanumeric identifiers (keys) and associated values in simple, standalone tables (referred to as "hash tables"). The values may be simple text strings or more complex lists and sets.
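As a minimal sketch (not tied to any particular product), a key-value store's interface can be modeled with a hash table; the class and key names below are illustrative assumptions:

```python
# Minimal key-value store sketch backed by a hash table (Python dict).
# Keys are alphanumeric identifiers; values may be simple strings
# or more complex lists and sets, as the card describes.
class KeyValueStore:
    def __init__(self):
        self._table = {}

    def put(self, key, value):
        self._table[key] = value

    def get(self, key, default=None):
        return self._table.get(key, default)

    def delete(self, key):
        self._table.pop(key, None)

store = KeyValueStore()
store.put("user:42:name", "Alice")            # simple string value
store.put("user:42:tags", {"admin", "beta"})  # more complex set value
print(store.get("user:42:name"))              # Alice
```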

17
Q

Document databases

A

designed to manage
and store documents which are encoded in a
standard data exchange format such as XML,
JSON or BSON
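A sketch of what such a document looks like in JSON, round-tripped with Python's standard `json` module; the field names here are illustrative, not from any particular product:

```python
import json

# A document as it might be stored in a document database,
# encoded in a standard data exchange format (JSON).
doc = {
    "_id": "order-1001",
    "customer": {"name": "Alice", "email": "alice@example.com"},
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

encoded = json.dumps(doc)           # serialize to the exchange format
decoded = json.loads(encoded)       # round-trips to the same nested structure
print(decoded["customer"]["name"])  # Alice
```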

18
Q

Key-Value store values

A

The values may be simple text strings or
more complex lists and sets.

19
Q

Document Databases Value

A

The values may be simple text strings or
more complex lists and sets.

20
Q

Wide-Column

A

This type of NoSQL Database employs a distributed, column-oriented data structure that accommodates multiple attributes per key.
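A nested-dict sketch of the wide-column layout: each row key maps to column families, and each family holds multiple qualifier-to-value attributes. The family names ("info", "stats") are illustrative assumptions:

```python
# Wide-column layout sketch: row key -> column family -> qualifier -> value.
# Rows are sparse: each row may populate only some families/qualifiers.
table = {
    "row-001": {
        "info":  {"name": "Alice", "city": "Lima"},
        "stats": {"visits": 12},
    },
    "row-002": {
        "info":  {"name": "Bob"},
        # no "stats" family for this row
    },
}

# Reading one attribute follows the key path: row -> family -> qualifier.
print(table["row-001"]["info"]["city"])  # Lima
```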

21
Q

Graph Databases

A

Graph databases replace relational tables with structured relational graphs of interconnected key-value pairings.
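A sketch of that idea: nodes and edges both carry key-value properties, and queries become traversals over the graph rather than table joins. All names below are illustrative:

```python
# Graph sketch: nodes and edges are interconnected key-value pairings.
nodes = {
    "alice": {"label": "Person", "age": 30},
    "acme":  {"label": "Company"},
}
edges = [
    # (source, relationship, target, edge properties)
    ("alice", "WORKS_AT", "acme", {"since": 2019}),
]

# Traversal instead of a join: find where alice works.
employers = [dst for src, rel, dst, props in edges
             if src == "alice" and rel == "WORKS_AT"]
print(employers)  # ['acme']
```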

22
Q

Hadoop HBase

A

HBase is a column-oriented database management system that runs on top of the Hadoop Distributed File System (HDFS).

23
Q

The reason to store values on a per-column
basis

A

For specific queries, not all of the values are needed.
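A sketch of why that helps, using illustrative in-memory data: with a per-column layout, a query over one attribute reads only that column's values instead of whole rows:

```python
# Row-oriented layout: every query touches whole records.
rows = [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob",   "age": 25},
]

# Column-oriented layout: values stored per column, so a query over
# "age" reads only the "age" values, not the other attributes.
columns = {
    "id":   [1, 2],
    "name": ["Alice", "Bob"],
    "age":  [30, 25],
}

avg_age = sum(columns["age"]) / len(columns["age"])
print(avg_age)  # 27.5
```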

24
Q

HBase Architecture

A

Consistency, Atomic Read and Write, Sharding, High Availability, Client API, Scalability, Distributed Storage, HDFS/Hadoop integration, Data Replication, Load sharing and Support for Failure, API Support, MapReduce Support, Sorted Row Keys, Real Time Processing of Data

25
Q

Consistency

A

HBase provides consistent read and write operations and thus can be used for high-speed requirements. This also helps to increase the overall throughput of the system.

26
Q

Atomic Read and Write

A

Atomic read and write means that only one process can perform a given task at a given time. For example, when one process is performing a write operation, no other process can perform a write operation on that data.
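That mutual-exclusion semantics can be sketched with a lock in plain Python (this illustrates the concept only, not HBase's actual implementation):

```python
import threading

# Sketch of atomic write semantics: a lock ensures only one
# thread can mutate the value at a time (mutual exclusion).
class AtomicCounter:
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # no other writer can enter while held
            self._value += 1

    def read(self):
        with self._lock:
            return self._value

counter = AtomicCounter()
threads = [
    threading.Thread(target=lambda: [counter.increment() for _ in range(1000)])
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.read())  # 4000 -- no lost updates
```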
27
Q

Sharding

A

HBase offers automatic and manual splitting of regions. This means that if a region reaches its threshold size, it automatically splits into smaller sub-regions.

28
Q

High Availability

A

HBase provides Local Area Network (LAN) and Wide Area Network (WAN) support for failure recovery. There is a master server which monitors all the regions and metadata of the cluster.

29
Q

Client API

A

HBase offers access through the Java API, which helps to programmatically access HBase.

30
Q

Scalability

A

This is one of the important characteristics of non-relational databases. HBase supports scalability in both linear and modular form.

31
Q

Distributed Storage

A

This feature of HBase enables the use of distributed storage such as HDFS.

32
Q

HDFS/Hadoop integration

A

HBase can run on top of various systems such as Hadoop/HDFS.

33
Q

Data Replication

A

The data in HBase are replicated over a number of clusters. This helps to recover data in case of any loss and provides high availability of data.

34
Q

Load sharing and Support for Failure

A

HDFS is internally distributed and supports automatic recovery. As HBase runs on top of HDFS, it also recovers automatically.

35
Q

API Support

A

HBase supports a Java API, which makes it easily accessible programmatically using Java.

36
Q

MapReduce Support

A

HBase supports MapReduce, which helps in parallel processing of data.

37
Q

Sorted Row Keys

A

HBase uses keys and stores them in lexicographical order, thus optimizing requests.
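Lexicographical order compares byte by byte, which is not the same as numeric order; a common consequence (sketched below with illustrative keys) is that numeric parts of a row key need zero-padding:

```python
# Row keys sort lexicographically as bytes, not numerically.
keys = [b"row-10", b"row-2", b"row-1"]
print(sorted(keys))  # [b'row-1', b'row-10', b'row-2'] -- 10 before 2!

# Zero-padding the numeric part makes byte order match numeric order.
padded = [b"row-010", b"row-002", b"row-001"]
print(sorted(padded))  # [b'row-001', b'row-002', b'row-010']
```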
38
Q

Real Time Processing of Data

A

HBase performs real-time processing of data and supports block cache and Bloom filters.

39
Q

Denormalization, Duplication, and Intelligent Keys (DDI)

A

It is about rethinking how data is stored in Bigtable-like storage systems, and how to make use of it in an appropriate way.

40
Q

table row keys

A

Row keys are byte arrays, so almost anything can serve as a row key, from strings to binary representations of longs or even serialized data structures.
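A sketch of a long serving as a row key, using Python's standard `struct` module: packing in big-endian order keeps byte-wise comparison consistent with numeric order (for non-negative values):

```python
import struct

# Row keys are plain byte arrays, so a long can be serialized directly.
# Big-endian packing keeps byte order consistent with numeric order
# for non-negative values.
def long_key(n):
    return struct.pack(">q", n)  # 8-byte big-endian signed long

ids = [300, 2, 45]
keys = sorted(long_key(n) for n in ids)  # byte-wise (lexicographic) sort
print([struct.unpack(">q", k)[0] for k in keys])  # [2, 45, 300]
```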
41
Q

Design Aspects

A

1. Tables are declared up front at schema definition time.
2. Rows are lexicographically sorted, with the lowest order appearing first in a table.
3. Columns are grouped into column families.
4. The column family prefix must be composed of printable characters.
5. The qualifying tail, the column family qualifier, can be made of any arbitrary bytes.
6. Column families must be declared up front at schema definition time.
7. A {row, column, version} tuple exactly specifies a cell in HBase.