D204 Flashcards

1
Q

An analyst defines the major questions of interest that need to be answered, determines the needs of the stakeholders, assesses the resource constraints of the project, and defines project outcomes.

A

Business Understanding/Discovery phase

2
Q

Data is collected and stored for easy retrieval, typically from a database (perhaps a component of a data warehouse) using a language like SQL. Web scraping and surveys can also be used to acquire data.

A

Data Acquisition / Collecting Data phase
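
As a rough illustration of retrieving stored data with SQL, the sketch below uses Python's built-in sqlite3 module. The in-memory database, the `sales` table, and its rows are hypothetical stand-ins for a real warehouse source.

```python
import sqlite3

# Hypothetical in-memory database standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("east", 40.0)],
)

# Retrieve the stored data with a SQL query (total amount per region).
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(rows)
```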

3
Q

Also known as data wrangling, data munging, and feature engineering. The analyst uses SQL, Python, R, or Excel to perform data modifications and transformations.

A

Data Cleaning phase
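
A minimal sketch of the kind of modification and transformation this phase involves, in plain Python. The records, the field names, and the `clean` helper are hypothetical and only illustrate trimming whitespace, normalizing case, handling a missing value, and dropping duplicates.

```python
# Hypothetical raw records with stray whitespace, inconsistent casing,
# a missing value, and a duplicate entry.
raw = [
    {"name": " Alice ", "age": "34"},
    {"name": "BOB", "age": ""},
    {"name": "alice", "age": "34"},
]


def clean(records):
    """Normalize names, convert ages to int (None if missing), drop duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        name = rec["name"].strip().title()
        age = int(rec["age"]) if rec["age"].strip() else None
        key = (name, age)
        if key not in seen:  # keep only the first copy of each record
            seen.add(key)
            cleaned.append({"name": name, "age": age})
    return cleaned


cleaned = clean(raw)
print(cleaned)
```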

4
Q

The analyst begins to understand the basic nature of the data, the relationships between data variables, the structure of the dataset, the presence of outliers, and the distribution of data values.

This phase uses data visualization tools and numerical summaries such as measures of central tendency and variability.

A

Data Exploration phase
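
The numerical summaries mentioned above can be sketched with Python's statistics module. The sample values are hypothetical, and the 1.5 * IQR rule is just one common convention for flagging outliers, assumed here for illustration.

```python
import statistics

# Hypothetical sample; 48 looks like a possible outlier.
values = [12, 15, 14, 13, 15, 48]

# Measures of central tendency and variability.
mean = statistics.mean(values)
median = statistics.median(values)
stdev = statistics.stdev(values)

# Flag outliers using the 1.5 * IQR rule (an assumed convention).
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
outliers = [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(mean, median, round(stdev, 2), outliers)
```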

5
Q

Allows the analyst to move beyond describing the data to creating models that enable predictions of outcomes of interest. Python and R are used in automating the training and use of models.

A

Predictive Modeling phase
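
To make "creating models that enable predictions" concrete, here is a minimal sketch of fitting a straight line by ordinary least squares and using it to predict a new value. Simple linear regression is only one possible model, and the `fit_line` helper and the ad-spend data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data that follows y = 2x + 1 exactly.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)

# Use the trained model to predict the outcome for an unseen input.
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```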

6
Q

Looks for patterns in large sets of data, using tools such as Python and R. Also called machine learning.

A

Data Mining phase
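
One simple way to picture "looking for patterns" is counting which items frequently appear together. The sketch below is a toy frequent-pair count over hypothetical shopping baskets, with an assumed support threshold; real data mining would run on far larger data with dedicated libraries.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions (sets of items bought together).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep pairs that occur in at least min_support baskets (assumed threshold).
min_support = 3
frequent = {pair: count for pair, count in pair_counts.items() if count >= min_support}
print(frequent)
```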

7
Q

The analyst tells the story of the data and uses graphs or interactive dashboards to inform others of the findings from the analyses.

A

Data Reporting phase

8
Q

Data Analytics Lifecycle phases in order

A
  1. Business Understanding/Discovery
  2. Data Acquisition / Collecting Data
  3. Data Cleaning
  4. Data Exploration
  5. Predictive Modeling
  6. Data Mining
  7. Data Reporting