C3 Data Science Methodology Flashcards

1
Q

Define Methodology

A

A system of methods used in a particular domain.

2
Q

10 stages of Data Science Methodology?

A

1. Business understanding.
2. Analytic approach.
3. Data requirements.
4. Data collection.
5. Data understanding.
6. Data preparation.
7. Modeling.
8. Evaluation.
9. Deployment.
10. Feedback.

3
Q

(1.) Business Understanding asks?

A

What is the problem that you’re trying to solve?

4
Q

(2.) Analytic Approach asks?

A

How can I use data to answer the question?

or,

What patterns address the question most effectively?

5
Q

Evaluating (3.) Data Requirements asks?

A

What data do you need to answer the question?

6
Q

Planning (4.) Data collection asks?

A

Where will I get the data that I need, and how will I ingest it?

7
Q

(5.) Data Understanding asks?

A

Does the data I’ve collected represent the problem to be solved?

8
Q

During (6.) Data Preparation ask?

A

What additional work is required to manipulate and work with the data?

9
Q

When (7.) Modeling, ask?

A

When you apply your data visualizations, do you see answers that address the business problem?

10
Q

During (8.) Evaluation ask?

A

Does the data model answer the initial business question or must you adjust the data?

11
Q

During (9.) Deployment ask?

A

Can you put the model into practice?

12
Q

When seeking (10.) Feedback, ask?

A

Can I get constructive feedback from the data and the stakeholder to answer the business question?

13
Q

What is a cohort?

A

A group that shares a common characteristic.

14
Q

What is CRISP-DM?

A

Cross-Industry Standard Process for Data Mining.

15
Q

Predictive models do what?

A

Tell us the probability of a future event based on historical data.

16
Q

In broad terms, Descriptive Models do what?

A

Summarize data without making predictions.

17
Q

What is a Feature?

A

A characteristic or attribute developed within the data that helps in solving the problem.

18
Q

What are Pairwise Correlations?

A

An analysis to determine the relationships and correlations between different variables.
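
As a rough illustration, the sketch below computes the Pearson correlation for every pair of columns in a small table. The column names and values are made up for the example, and `pearson` and `pairwise_correlations` are hypothetical helper names, not part of any library.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(columns):
    """Map each unordered pair of column names to its correlation."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Made-up data: three variables observed for five customers.
data = {
    "age": [25, 32, 47, 51, 62],
    "income": [30, 42, 60, 65, 80],
    "visits": [9, 8, 5, 4, 2],
}

correlations = pairwise_correlations(data)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Values near +1 or -1 suggest a strong linear relationship between the two variables; values near 0 suggest little linear relationship.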

19
Q

What is Text Analysis?

A

Analyzing and manipulating textual data to extract meaningful information and patterns.
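
One very simple form of this is counting word frequencies. The sketch below is only a minimal example: the sample sentence is made up, `word_frequencies` is a hypothetical helper, and the tokenizer (lowercase letters and apostrophes) is an assumption that real text would usually outgrow.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, split it into words, and count each word."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Made-up sample text.
text = "Data preparation takes time. Good data makes good models."
freq = word_frequencies(text)
print(freq.most_common(2))
```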

20
Q

Descriptive Analytics tells us?

A

What happened. Hindsight.

Low value & low difficulty.
Provides information.

21
Q

Diagnostic Analytics tells us?

A

Why it happened. Insight.

Middling value & difficulty.
Provides information.

22
Q

Predictive Analytics tells us?

A

What will happen. Foresight.

High value & high difficulty.
Helps us optimize.

23
Q

Prescriptive Analytics tells us?

A

How we can make it happen. Greater foresight.

Highest value & difficulty.
Helps us optimize.

24
Q

What does ROC curve stand for?

A

Receiver Operating Characteristic Curve.

First developed during World War II to detect enemy aircraft on radar.
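
An ROC curve plots the true positive rate against the false positive rate as the decision threshold changes. The sketch below shows one way to compute those points by sweeping the threshold over the observed scores. The labels and scores are made up, and `roc_points` is a hypothetical helper name.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs.

    labels: 1 for an actual positive, 0 for an actual negative.
    scores: model scores; a case is predicted positive when score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Made-up example: six cases with their true labels and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
roc = roc_points(labels, scores)
for fpr, tpr in roc:
    print(round(fpr, 2), round(tpr, 2))
```

The curve always starts at (0, 0) and ends at (1, 1); a better model bends further toward the top-left corner.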

25
Q

Sensitivity formula?

A

Sensitivity = TP / P = TP / (TP + FN)

Also known as:
Hit Rate
True Positive Rate
Recall
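
A small worked example of the formula, using made-up counts (80 true positives and 20 false negatives); the function name is just for illustration.

```python
def sensitivity(tp, fn):
    """Share of actual positives that were correctly identified."""
    return tp / (tp + fn)


# Made-up counts: 100 actual positives, 80 of them detected.
result = sensitivity(tp=80, fn=20)
print(result)  # 0.8
```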

26
Q

Specificity formula?

A

Specificity = TN/N = TN / (TN+FP)

Also known as:
Selectivity
True Negative Rate
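
A small worked example of the formula, using made-up counts (90 true negatives and 10 false positives); the function name is just for illustration.

```python
def specificity(tn, fp):
    """Share of actual negatives that were correctly identified."""
    return tn / (tn + fp)


# Made-up counts: 100 actual negatives, 90 of them correctly rejected.
result = specificity(tn=90, fp=10)
print(result)  # 0.9
```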

27
Q

What is PPV?

A

Positive Predictive Value.

28
Q

What is NPV?

A

Negative Predictive Value.

29
Q

PPV formula?

A

TP / (TP+FP) = Positive Predictive Value

29
Q

What is a Confusion Matrix?

A

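A matrix of prediction outcomes, with rows for the actual class and columns for the predicted class:

         | Pred. + | Pred. - |
Actual + |   TP    |   FN    |
Actual - |   FP    |   TN    |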
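
As a sketch, the four cells can be counted from paired lists of actual and predicted labels. The data below is made up, and `confusion_matrix` here is a hand-written helper that returns the cells in the same TP, FN / FP, TN layout as the card.

```python
def confusion_matrix(actual, predicted):
    """Return [[TP, FN], [FP, TN]] for binary labels (1 = positive, 0 = negative)."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1  # actual positive, predicted positive
        elif a and not p:
            fn += 1  # actual positive, predicted negative
        elif not a and p:
            fp += 1  # actual negative, predicted positive
        else:
            tn += 1  # actual negative, predicted negative
    return [[tp, fn], [fp, tn]]


# Made-up labels for eight cases.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
matrix = confusion_matrix(actual, predicted)
print(matrix)  # [[3, 1], [1, 3]]
```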

29
Q

NPV formula?

A

TN / (TN+FN) = Negative Predictive Value.
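
A short worked example of both predictive values, taking PPV as TP / (TP + FP), the share of positive predictions that are correct, and NPV as TN / (TN + FN), the share of negative predictions that are correct. The counts are made up and the function names are only illustrative.

```python
def positive_predictive_value(tp, fp):
    """Share of positive predictions that are actually positive."""
    return tp / (tp + fp)


def negative_predictive_value(tn, fn):
    """Share of negative predictions that are actually negative."""
    return tn / (tn + fn)


# Made-up counts: 40 positive predictions and 50 negative predictions.
ppv = positive_predictive_value(tp=30, fp=10)
npv = negative_predictive_value(tn=45, fn=5)
print(ppv)  # 0.75
print(npv)  # 0.9
```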

30
Q

Type I error?

A

A False Positive.

31
Q

What is a Type II Error?

A

A False Negative.
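
To tie the two error types back to predictions, the sketch below labels a single outcome given the actual and predicted class. The function name and return strings are hypothetical, chosen only for this example.

```python
def error_type(actual_positive, predicted_positive):
    """Name the kind of error, if any, for one binary prediction."""
    if predicted_positive and not actual_positive:
        return "Type I error (false positive)"
    if actual_positive and not predicted_positive:
        return "Type II error (false negative)"
    return "correct prediction"


print(error_type(actual_positive=False, predicted_positive=True))
print(error_type(actual_positive=True, predicted_positive=False))
```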

32
Q

List the CRISP-DM steps.

A

Business Understanding.
Data Understanding.
Data Preparation.
Modeling.
Evaluation.
Deployment.
