Ch 5.4 Data Warehouses and Data Marts Flashcards

1
Q

If a manaageer wants to know the profit margin on used books, how would they find this?

A

use SQL or query by example

this allows managers to look up things in the database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is QBE

A

making database queries that allow user to search the doc based on an example (string of text )

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Why do we need data warehouses and data marts if be have QBE or SQL

A

this is because if we want to look at trends, we need a complicated system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

what do data warehouses and data marts have that sql doesnt

A

1) info is organized better than sql is
2) these databases are read only!!! no need for updating or processing
3) SQL is used to access one record at a time; Data warehouss and mart is designed to access large groups of related records

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Data warehouse

A

holds historiccal data organized by subject

-> very expensive

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

data mart

A

cheaper alternative to data warehouses,

low cost and scaled down
-> implemented in less than 90 days

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

how are data warehouses and marts organized?

A

1) organized by business dimension or subject (customer, vendor, product, price etc)
2) use online analytical processing (analyzes data from end users)
3)integrated: data collected from multiple systems and integrated around subjects
4) time variant: have HISTORICAL data
5) non volatile: users cant change or update data
6) multidimensional: store data in more than 2 dimensions “data cube”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

what are components of the data warehouse environment:

A

1) source systems that provide data
2) data integration tech
3) diff architectures for storing data
4) diff tools and applications for users
5) metadata (Data about the data)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Source systems

of data warehouses and data marts

A

“organizational pain point” motivates businesses to develop BI, this BI needs data to work

PROVIDE DATA TO THE WAREHOUSE OR MART

EX: operational data, erp, website data, 3rd party data,

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

relational database vs multidinensional data warehouses and data marts

A

Relational database is like ERD tables: separate tables w PKs and FKs

Data cube: its a matrix of 3 dimensiosns

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Data integration- data warehouse environment

A

USES ETL

tech that processes and prepares data for usage

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

steps of ETL

A

EXTRACTION: handwritted code or SQL queries
TRANSFORMATION: data integrated around a common key (student id, and all data around the student classes courses major parking)
- sometimes transformation is changing 0 and 1 to M and F
LOADED: loading data into the mart/warehouse (loading window)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

STORING DATA- environment of data warehouses/marts

A

1) most common: data warehouse
-> one single read-only version of truth

2) independant data mart: store data for a single/few applications (ex: marketing or finance) FUNCTIONAL AREA CENTRIC
-> not very effective, inconsistent data

3) Hub and spoke: a central data warehouse that has the data plus many dependant data martsMe

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

METADATA- environment of data warehouses/marts

A

data about the data, things like data quality and the data wrehouse status

INCLUDES UPDATE SCHEDULES

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

DATA QUALITY- environment of data warehouses/marts

A

QUALITY of data in the warehouse must be high!

DATA cleansing needs to be used to ensure data meets users needs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

BI governance

A

establishing people committees processes to mainatin the data warehouse and provide for changes in data type or processings

17
Q

users

A

business value for users rises when data can be accessed quickly and easily for analysis/consolidation

18
Q

data lake

A

storage of all org data regardless of its source or format

-recieve data in any format (unstrcuture/structure)

  • recieive inconsistent data/ overlapping data
19
Q

benefit of data lake

A

1) Organizations can derive value from unlimited types of data.

2) Organizations do not need to have all of the answers in advance.

3) Organizations have no limits on how they can query the data.

4) Organizations do not create silos. Instead, data lakes provide a single, unified view of data across the organization

20
Q

how is data loaded into a data lake

A

1) Define the incoming data from a business perspective.
2) Document the context, origin, and frequency of the incoming data.
3) Classify the security level (public, internal, sensitive, restricted) of the incoming data.
4) Document the creation, usage, privacy, regulatory, and encryption business rules that apply to the incoming data.
5) Identify the owner (sponsor) of the incoming data.
6) Identify the data steward(s) charged with monitoring the health of the specific datasets

21
Q

Schema on read

A

A data lake works on a principle called schema-on-read. This means that there is no predefined schema into which data needs to be fitted before storage. Only when the data is read during processing is it parsed and adapted into a schema as needed.

22
Q

most big orgs have an EDW… WHAT IS THIS?

A

Enterprise data warehouse

23
Q
A