ETL Flashcards

1
Q

Define Extraction

A

Extraction is the collection of data from multiple targeted
sources, such as SQL or NoSQL databases, cloud platforms, or
XML files.

2
Q

Why is extraction the most complicated task in ETL?

A

Because many sources provide data of unsatisfactory
quality or quantity, and determining which data is
eligible for extraction is not an easy process.

3
Q

What are the two types of extraction?

A

Logical extraction and
physical extraction

4
Q

What are the two kinds of logical extraction?

A

Full extraction and
Incremental extraction

5
Q

What’s the difference between full extraction and incremental extraction?

A

Full extraction is used when the system can’t identify which data
has been updated, so the whole data set is extracted. Incremental
extraction extracts and loads only the new or changed data, not the
whole data set.
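
Example (a minimal sketch, not a definitive implementation): the
two logical extraction kinds against an in-memory SQLite table.
The table `customers`, its `updated_at` column, and both function
names are hypothetical and chosen only for illustration. It assumes
the source records a last-modified timestamp, which is what makes
incremental extraction possible.

```python
import sqlite3

def full_extract(conn):
    """Full extraction: read every row, since changes cannot be identified."""
    return conn.execute(
        "SELECT id, name, updated_at FROM customers ORDER BY id"
    ).fetchall()

def incremental_extract(conn, last_run):
    """Incremental extraction: read only rows changed since the previous run."""
    return conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY id",
        (last_run,),
    ).fetchall()

# Hypothetical source table, built in memory for this illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", "2024-01-01"), (2, "Grace", "2024-02-15"), (3, "Linus", "2024-03-10")],
)

all_rows = full_extract(conn)                          # all three rows
changed_rows = incremental_extract(conn, "2024-02-01")  # only rows 2 and 3
```

ISO-formatted dates compare correctly as plain strings, which is
why the `updated_at > ?` filter works here.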

6
Q

What are the two kinds of physical extraction?

A

Online extraction and Offline extraction

7
Q

What’s the difference between the two kinds of physical extraction?

A

In online extraction, data is extracted directly from the
source systems. In offline extraction, data isn’t extracted directly
from the source systems: it is first copied to an external file, and
the extraction process then connects to that file and starts
processing.
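
Example (a minimal sketch under stated assumptions): online
extraction queries the source directly, while offline extraction
reads a file that was dumped from the source beforehand. The
`orders` table, the CSV file name, and all function names are
hypothetical.

```python
import csv
import os
import sqlite3
import tempfile

# Hypothetical source system, built in memory for this illustration only.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

def online_extract(conn):
    """Online: the extraction process connects to the source itself."""
    return conn.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()

def dump_to_file(conn, path):
    """Step outside the extraction process: copy the data to an external file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "amount"])
        writer.writerows(conn.execute("SELECT id, amount FROM orders ORDER BY id"))

def offline_extract(path):
    """Offline: the extraction process reads only the external file."""
    with open(path, newline="") as f:
        return [(int(row["id"]), float(row["amount"])) for row in csv.DictReader(f)]

with tempfile.TemporaryDirectory() as tmp:
    dump_path = os.path.join(tmp, "orders.csv")
    dump_to_file(source, dump_path)
    offline_rows = offline_extract(dump_path)

online_rows = online_extract(source)
```

Both paths return the same rows; the difference is whether the
source system is touched while the extraction process runs.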

8
Q

What do we do in the transformation stage of ETL?

A

The data is transformed to meet the schema and
requirements of the destination. Data transformation means converting
the structure or format of a data set to match that of the target system.
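
Example (a minimal sketch): converting one source record to a
target schema. The source field names (`CustID`, `First`, `Last`,
`Signup`), the target field names, and the date formats are all
assumptions made up for this illustration.

```python
from datetime import datetime

def transform(record):
    """Map a raw source record onto the (hypothetical) target schema."""
    return {
        # Type conversion: the source stores the id as text.
        "customer_id": int(record["CustID"]),
        # Structural change: two source fields become one target field.
        "full_name": f'{record["First"].strip()} {record["Last"].strip()}',
        # Format change: day/month/year text becomes an ISO 8601 date.
        "signup_date": datetime.strptime(record["Signup"], "%d/%m/%Y")
        .date()
        .isoformat(),
    }

raw = {"CustID": "42", "First": " Ada ", "Last": "Lovelace", "Signup": "10/12/2023"}
clean = transform(raw)
```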

9
Q

What do we do in the loading stage of ETL?

A

The data is placed into the target system,
typically a cloud data warehouse, where it is ready
to be analyzed by BI tools or data analytics tools.

10
Q

Name the types of load.

A

Initial load, incremental load, and full refresh
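
Example (a minimal sketch): the three load types against an
in-memory SQLite target. It assumes the usual meanings: an initial
load populates an empty target, an incremental load applies only
new or changed rows, and a full refresh erases the target and
reloads it. The `dim_product` table and the function names are
hypothetical.

```python
import sqlite3

# Hypothetical target table, built in memory for this illustration only.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE dim_product (id INTEGER PRIMARY KEY, name TEXT)")

def initial_load(conn, rows):
    """Initial load: populate the empty target for the first time."""
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)", rows)

def incremental_load(conn, changed_rows):
    """Incremental load: insert new rows and overwrite changed ones."""
    conn.executemany("INSERT OR REPLACE INTO dim_product VALUES (?, ?)", changed_rows)

def full_refresh(conn, rows):
    """Full refresh: erase the target contents, then reload everything."""
    conn.execute("DELETE FROM dim_product")
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)", rows)

def snapshot(conn):
    return conn.execute("SELECT id, name FROM dim_product ORDER BY id").fetchall()

initial_load(target, [(1, "pen"), (2, "ink")])
after_initial = snapshot(target)

incremental_load(target, [(2, "blue ink"), (3, "paper")])
after_incremental = snapshot(target)

full_refresh(target, [(1, "pen")])
after_refresh = snapshot(target)
```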

11
Q

What are semantic models for?

A

Semantic models help business users abstract away relationship
complexities, making it easier to analyze data quickly.

12
Q

What’s the difference between ETL and ELT?

A

Unlike ETL, where data transformation takes place in a staging
area before the data is loaded into the target system, ELT loads
the raw data directly into the target system and transforms it
there.
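
Example (a minimal sketch): the same cleanup done both ways. In
`etl` the rows are transformed in Python (standing in for the
staging area) before loading; in `elt` the raw rows are loaded
first and transformed with SQL inside the target. The table names,
function names, and sample rows are hypothetical.

```python
import sqlite3

raw_rows = [("  ada ", "10"), ("grace", "25")]

def etl(conn, rows):
    """ETL: transform outside the target, then load the cleaned rows."""
    staged = [(name.strip(), int(amount)) for name, amount in rows]
    conn.execute("CREATE TABLE sales (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", staged)

def elt(conn, rows):
    """ELT: load the raw rows as-is, then transform inside the target."""
    conn.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales AS "
        "SELECT TRIM(name) AS name, CAST(amount AS INTEGER) AS amount "
        "FROM raw_sales"
    )

def read_sales(conn):
    return conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()

etl_target = sqlite3.connect(":memory:")
etl(etl_target, raw_rows)

elt_target = sqlite3.connect(":memory:")
elt(elt_target, raw_rows)

etl_result = read_sales(etl_target)
elt_result = read_sales(elt_target)
```

Both targets end up with the same `sales` table; only the ELT
target also keeps the untransformed `raw_sales` data.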

13
Q

OLAP

A

Online analytical processing (OLAP) is a technology that organizes
large business databases and supports complex analysis. It can be
used to perform complex analytical queries without negatively
affecting transactional systems.

14
Q

OLTP

A

The databases that a business uses to store all its transactions and
records are called online transaction processing (OLTP) databases.

15
Q

What’s MDX?

A

Multidimensional Expressions (MDX): a calculation and query language
for expressing online analytical processing (OLAP) queries in a database
management system. It is used to define, use, and retrieve data from
multidimensional objects.
