Lecture 2 Flashcards

1
Q

Scaling methods

A
  • Scaling up - Vertical scaling
  • Scaling out - Horizontal scaling
2
Q

Sharding?

A

Sharding is a database architecture pattern related to horizontal partitioning: the practice of separating one table’s rows into
multiple different tables, known as partitions.
The concept of sharding does not sit well with relational databases.
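
The row-splitting idea above can be illustrated with a minimal sketch of hash-based shard routing. The shard count, the function names (`shard_for`, `insert_row`, `get_row`), and the use of in-memory dicts in place of separate tables or servers are all assumptions made for illustration, not something taken from the lecture.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a row key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Each dict stands in for one partition (a separate table or server).
shards = [dict() for _ in range(NUM_SHARDS)]

def insert_row(key: str, row: dict) -> None:
    """Store the row in the shard that owns its key."""
    shards[shard_for(key)][key] = row

def get_row(key: str):
    """Look the row up in the single shard that can contain it."""
    return shards[shard_for(key)].get(key)

for user_id in ["alice", "bob", "carol", "dave"]:
    insert_row(user_id, {"id": user_id})
```

Because the same key always hashes to the same shard, a lookup only touches one partition. Queries that span many keys, such as joins, would have to visit several shards, which is one reason the pattern is awkward for relational workloads.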

3
Q

NoSQL database

A

“Not Only SQL”
- Do not use SQL as their primary query language
- Provide access by means of Application Programming Interfaces (APIs)

4
Q

Types of NoSQL database

A

Four main types, each with its own data model:

  • Key-value
  • Document
  • Wide column stores
  • Graph
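
To make the four models concrete, here is a rough sketch that represents the same small piece of data in each style using plain Python structures. The sample record and the `neighbors` helper are hypothetical; real NoSQL systems expose these models through their own APIs.

```python
import json

# Key-value: an opaque value retrieved by a single key.
key_value = {"user:1": '{"name": "Ada", "city": "London"}'}

# Document: a nested, self-describing record that can be queried by field.
document = {"_id": 1, "name": "Ada", "address": {"city": "London"}, "tags": ["math"]}

# Wide column: row key -> column family -> column -> value.
wide_column = {
    "user:1": {
        "profile": {"name": "Ada"},
        "location": {"city": "London"},
    }
}

# Graph: nodes plus labeled edges between them.
nodes = {1: {"name": "Ada"}, 2: {"name": "Charles"}}
edges = [(1, "KNOWS", 2)]

def neighbors(node_id, label):
    """Return the nodes reachable from node_id over edges with the given label."""
    return [dst for src, lbl, dst in edges if src == node_id and lbl == label]

# The key-value store cannot see inside the value; the caller must decode it.
city_from_kv = json.loads(key_value["user:1"])["city"]
```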
5
Q

Data warehouse

A
  • A system used for reporting and data analysis
  • Core component of business intelligence
6
Q

ETL (Extract, Transform, Load)

A

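The process of extracting data from source systems, transforming it into a consistent format, and loading it into a target store such as a data warehouse.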
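
The three ETL stages can be sketched end to end with the Python standard library. The CSV sample, the `orders` table, and the cleaning rules (trim whitespace, drop rows with no amount, upper-case the currency) are assumptions chosen for illustration only.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with stray whitespace, mixed case, and a missing value.
RAW_CSV = "order_id,amount,currency\n1, 10.50 ,usd\n2,3.00,USD\n3,,usd\n"

def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean values, normalize types, and drop incomplete rows."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        cleaned.append(
            (int(row["order_id"]), float(amount), row["currency"].strip().upper())
        )
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database standing in for a warehouse
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```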

7
Q

Data mart

A

A simple form of a data warehouse that is focused on a single subject (or functional area).

8
Q

Data Lakes

A

A storage repository that holds a vast amount of raw data in its native
format until it is needed. There is no hierarchy or organization among
the individual pieces of data.

9
Q

Data Integration

A

A set of processes used to retrieve and combine
data from disparate sources into meaningful and valuable
information.

10
Q

Big Data Integration techniques

A
  • Schema Mapping
  • Record Linkage
  • Data Fusion
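
As a rough illustration of how the three techniques fit together, the sketch below maps one source onto a shared schema, decides whether two records describe the same entity, and merges them. The sample records, the field mapping, the 0.8 similarity threshold, and the "first non-null value wins" fusion rule are all assumptions, not rules from the lecture.

```python
from difflib import SequenceMatcher

# Two hypothetical sources describing the same person with different schemas.
source_a = [{"full_name": "Jon Smith", "mail": "jon@example.com", "town": "Leeds"}]
source_b = [{"name": "John Smith", "email": "jon@example.com", "city": None}]

# Schema mapping: translate source A's field names to the shared schema.
MAPPING_A = {"full_name": "name", "mail": "email", "town": "city"}

def map_schema(record, mapping):
    """Rename fields according to the mapping, leaving unmapped fields as-is."""
    return {mapping.get(k, k): v for k, v in record.items()}

# Record linkage: decide whether two records refer to the same entity.
def same_entity(r1, r2, threshold=0.8):
    """Match on identical email, else fall back to fuzzy name similarity."""
    if r1["email"] and r1["email"] == r2["email"]:
        return True
    ratio = SequenceMatcher(None, r1["name"].lower(), r2["name"].lower()).ratio()
    return ratio >= threshold

# Data fusion: merge linked records into one, resolving missing values.
def fuse(preferred, other):
    """Take each field from `preferred` unless it is None, then from `other`."""
    keys = preferred.keys() | other.keys()
    return {
        k: preferred.get(k) if preferred.get(k) is not None else other.get(k)
        for k in keys
    }

a = map_schema(source_a[0], MAPPING_A)
b = source_b[0]
merged = fuse(b, a) if same_entity(a, b) else None
```

Here source B is treated as the preferred source, so its spelling of the name is kept while the missing city is filled in from source A.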
11
Q

Steps in Data Science Process

A
  • Exploring Data
  • Data Pre-processing
12
Q

CRISP-DM: Cross Industry Standard Process for Data Mining

A

A well-adopted methodology for data mining.
Six phases:
- Business understanding
- Data understanding
- Data preparation
- Modeling
- Evaluation
- Deployment
