Use Case and Evaluation Flashcards

2
Q

What is a Data Science Use Case (DSUC)?

A

A scenario or project whose value comes specifically from data-driven insights.

3
Q

Why is identifying DSUCs important?

A

It helps organizations increase gain, reduce risk, and decrease effort.

4
Q

What are the key steps in identifying a DSUC?

A

Define the problem, collect ideas, structure the ideas, define success, and assess potential risks.

5
Q

What are operational-related DSUCs?

A

Use cases focused on optimizing operations, predicting failures, and improving product quality.

6
Q

What are fraud-related DSUCs?

A

Use cases detecting unauthorized access, fraudulent behavior, and security threats.

7
Q

What are customer-related DSUCs?

A

Use cases focused on improving customer experience, predicting churn, and optimizing marketing strategies.

8
Q

What are the two types of evaluation for DSUCs?

A

Model-centric evaluation and business-centric evaluation.

9
Q

What is model-centric evaluation?

A

Evaluating the predictive model’s performance using metrics like accuracy, precision, and recall.

10
Q

What is business-centric evaluation?

A

Evaluating the impact of a model on business KPIs such as revenue, customer retention, and operational efficiency.

11
Q

What is the Machine Learning Canvas?

A

A structured framework used to define, plan, and evaluate machine learning projects.

12
Q

What are the key components of the Machine Learning Canvas?

A

Prediction Task, Decisions, Value Proposition, Data Collection, Data Sources, Impact Simulation, Making Predictions, Building Models, Features, and Monitoring.

13
Q

What is customer churn?

A

The rate at which customers stop doing business with a company over a certain period.
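
As a rough illustration of the definition above, churn rate for a period can be computed as customers lost divided by customers at the start of the period. The function name and the figures below are hypothetical, and other conventions exist (for example, dividing by the average customer count).

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return the fraction of the starting customer base lost in the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


# Hypothetical month: 2,000 customers at the start, 50 of them cancel.
monthly_churn = churn_rate(customers_at_start=2000, customers_lost=50)
print(f"{monthly_churn:.1%}")  # 2.5%
```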

14
Q

Why is customer churn important to businesses?

A

Reducing churn helps retain valuable customers and improves profitability.

15
Q

What type of machine learning task is customer churn prediction?

A

A supervised learning binary classification problem.

16
Q

What features are used in churn prediction models?

A

Customer demographics, purchase history, subscription details, engagement levels, and payment history.

17
Q

What data sources are used for churn analysis?

A

CRM databases, payment records, and website analytics.

18
Q

How is model performance evaluated in churn prediction?

A

Using metrics such as accuracy, precision, recall, and F1-score.

19
Q

What are the key steps in the machine learning workflow?

A

Feature extraction, data splitting, model training, evaluation, and deployment.
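
The data splitting step can be sketched in a few lines. This is a minimal version that assumes a simple random hold-out split; the function name, the 80/20 ratio, and the seed are illustrative choices, not part of the card.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for ten feature rows
train, test = train_test_split(rows, test_fraction=0.2, seed=42)
print(len(train), len(test))  # 8 2
```

The model is trained only on `train`; `test` is held back so the evaluation step measures performance on data the model has not seen.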

20
Q

What is overfitting in machine learning?

A

When a model performs well on training data but poorly on unseen data because it has memorized the training examples, noise included, instead of learning patterns that generalize.
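
A toy sketch of the gap between training and test performance. `Memorizer` is a hypothetical model that stores every training example and falls back to the majority class for anything it has not seen; the data is invented so that the true rule is "even numbers are class 1".

```python
class Memorizer:
    """Hypothetical model that memorizes training pairs verbatim."""

    def fit(self, X, y):
        self.table = dict(zip(X, y))
        self.default = max(set(y), key=y.count)  # majority-class fallback
        return self

    def predict(self, X):
        return [self.table.get(x, self.default) for x in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


train_X, train_y = [1, 2, 3, 4, 5, 7], [0, 1, 0, 1, 0, 0]
test_X, test_y = [6, 8, 9, 10], [1, 1, 0, 1]

model = Memorizer().fit(train_X, train_y)
train_acc = accuracy(train_y, model.predict(train_X))
test_acc = accuracy(test_y, model.predict(test_X))
print(train_acc, test_acc)  # 1.0 0.25
```

Perfect training accuracy alongside poor test accuracy is the signature of overfitting: the model never learned the even/odd rule.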

21
Q

How can overfitting be prevented?

A

Using techniques like regularization, cross-validation, and reducing model complexity.
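
Of the techniques listed, cross-validation is the easiest to sketch without a library. The helper below is an illustrative k-fold index generator (its name and signature are assumptions); it uses contiguous folds, so in practice the data would usually be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, val_idx in folds:
    print(len(train_idx), val_idx)
```

Each sample is used for validation exactly once, so the averaged validation score is a less optimistic estimate than a score computed on the training data.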

22
Q

What is accuracy in model evaluation?

A

The proportion of correctly classified instances out of all predictions ((TP + TN) / (TP + TN + FP + FN)).

23
Q

What are the limitations of accuracy?

A

It does not account for class imbalance. In fraud detection, for example, a model that labels every transaction as legitimate can score very high accuracy while catching no fraud at all.
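
A small worked example of how accuracy misleads on imbalanced data. The numbers are invented: 1,000 transactions, 10 of them fraudulent, and a "model" that always predicts the majority class.

```python
# 1 = fraud, 0 = legitimate (hypothetical data)
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # always predicts "legitimate"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
frauds_caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = frauds_caught / sum(y_true)

print(accuracy)  # 0.99
print(recall)    # 0.0
```

Accuracy looks excellent while recall shows that not a single fraud case was detected.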

24
Q

What is a confusion matrix?

A

A table used to evaluate classification models by displaying true positives, false positives, true negatives, and false negatives.
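
The four cells can be counted directly from paired labels and predictions. This is a minimal sketch for the binary case; the function name and the sample data are illustrative.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```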

25
Q

What is a Type I error (False Positive)?

A

Incorrectly classifying a negative instance as positive.

26
Q

What is a Type II error (False Negative)?

A

Incorrectly classifying a positive instance as negative.

27
Q

What is precision in classification?

A

The proportion of true positives among all predicted positives (TP / (TP + FP)).

28
Q

What is recall in classification?

A

The proportion of actual positives that were correctly predicted (TP / (TP + FN)).
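
The two formulas translate directly into code. In this sketch, returning 0.0 when the denominator is zero is a convention chosen here, not something the cards specify, and the counts are made up.

```python
def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP); 0.0 if nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN); 0.0 if there are no actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 30 true positives, 10 false positives, 20 false negatives.
p = precision(tp=30, fp=10)
r = recall(tp=30, fn=20)
print(p, r)  # 0.75 0.6
```

Precision answers "of the cases flagged, how many were right?", while recall answers "of the real positives, how many were found?".
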
29
Q

What is the F1-score?

A

The harmonic mean of precision and recall (2 * precision * recall / (precision + recall)); it is high only when both metrics are high.
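
A short sketch of the harmonic mean, with illustrative inputs. The second example shows why it is preferred to a simple average: one very low metric drags the F1-score down.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


balanced = f1_score(0.75, 0.6)
lopsided = f1_score(1.0, 0.1)  # arithmetic mean would be 0.55
print(round(balanced, 3), round(lopsided, 3))  # 0.667 0.182
```
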
30
Q

What are techniques for improving model performance?

A

Dimensionality reduction, hyperparameter tuning, and ensemble methods.

31
Q

What is dimensionality reduction?

A

Reducing the number of features in a dataset to remove redundant or irrelevant information.

32
Q

What are common dimensionality reduction techniques?

A

Principal Component Analysis (PCA) and feature selection methods.
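
As one example of a feature selection method (not PCA), the sketch below drops features whose variance is close to zero, since a near-constant column carries almost no information. The function name, the threshold, and the data are assumptions made for illustration.

```python
from statistics import pvariance


def select_by_variance(rows, min_variance=1e-3):
    """Return (kept column indices, reduced rows) after a low-variance filter."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > min_variance]
    reduced = [[row[i] for i in keep] for row in rows]
    return keep, reduced


rows = [
    [1.0, 5.0, 0.2],
    [1.0, 3.0, 0.4],
    [1.0, 8.0, 0.1],
    [1.0, 6.0, 0.9],
]
keep, reduced = select_by_variance(rows)
print(keep)  # [1, 2]
```

The first column is constant, so it is removed and each row shrinks from three features to two.
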
33
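Q

What is hyperparameter tuning?

A

Searching for the best values of settings that are fixed before training rather than learned from the data (for example, learning rate or tree depth), judged by performance on validation data.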
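
A minimal grid search sketch. It assumes a toy one-dimensional k-nearest-neighbors classifier, where `k` is the hyperparameter, and picks the value with the best validation accuracy. All names and data points are invented for illustration.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    positive_votes = sum(label for _, label in nearest)
    return 1 if 2 * positive_votes > k else 0


def validation_accuracy(train, val, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in val)
    return hits / len(val)


# (value, label) pairs; the points at 3.0 and 8.0 are deliberately noisy.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0),
         (6.0, 1), (7.0, 1), (8.0, 0), (9.0, 1)]
val = [(2.6, 0), (3.4, 0), (7.6, 1), (8.4, 1), (5.2, 1)]

grid = [1, 3, 5]
scores = {k: validation_accuracy(train, val, k) for k in grid}
best_k = max(grid, key=scores.get)  # ties go to the first value in the grid
print(scores, best_k)
```

With `k=1` the classifier copies the noisy neighbors and scores poorly on validation data; larger values of `k` smooth the noise out.
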
34
Q

What are ensemble methods?

A

Techniques that combine multiple models to improve predictive accuracy, such as bagging and boosting.
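
The simplest way to combine classifiers is hard majority voting, sketched below with made-up predictions from three hypothetical models. Bagging aggregates its models in a similar way after training each on a bootstrap sample; boosting, which trains models sequentially, is not shown.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine equal-length prediction lists by taking the most common label."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1, 0, 1, 1, 0]
model_a = [0, 0, 1, 1, 0]  # wrong on the first item
model_b = [1, 1, 1, 1, 0]  # wrong on the second item
model_c = [1, 0, 1, 1, 1]  # wrong on the last item

ensemble = majority_vote([model_a, model_b, model_c])
print(accuracy(y_true, model_a), accuracy(y_true, ensemble))  # 0.8 1.0
```

Because the three models make their mistakes on different items, the vote corrects each one. The gain depends on the errors being largely uncorrelated.
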
35
Q

What is live evaluation in machine learning?

A

Continuously tracking model performance on real-world data to detect drift and degradation.
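
One possible shape for such tracking is a rolling-window accuracy check. The class below is a hypothetical sketch: the window size and alert threshold are arbitrary, and it assumes ground-truth labels eventually arrive for each prediction.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent predictions."""

    def __init__(self, window=100, alert_below=0.8):
        self.outcomes = deque(maxlen=window)  # old outcomes fall off automatically
        self.alert_below = alert_below

    def record(self, y_true, y_pred):
        self.outcomes.append(y_true == y_pred)

    @property
    def accuracy(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else None

    def degraded(self):
        # Only raise an alert once the window is full, to avoid noisy early readings.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy < self.alert_below


monitor = RollingAccuracyMonitor(window=5, alert_below=0.8)
for _ in range(5):
    monitor.record(y_true=1, y_pred=1)  # model starts out correct
healthy = monitor.degraded()
for _ in range(3):
    monitor.record(y_true=1, y_pred=0)  # then begins to miss
print(healthy, monitor.accuracy, monitor.degraded())  # False 0.4 True
```
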
36
Q

What is Return on Investment (ROI) in data science?

A

The net financial benefit of a data science solution relative to its cost, commonly computed as (benefit - cost) / cost.
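
A small sketch using the standard ROI formula, (benefit - cost) / cost. The monetary figures are hypothetical.

```python
def roi(benefit: float, cost: float) -> float:
    """Return on investment as a fraction of cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


# Hypothetical project: costs 100,000 and yields 150,000 in benefit.
project_roi = roi(benefit=150_000, cost=100_000)
print(f"{project_roi:.0%}")  # 50%
```
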
37
Q

Why is monitoring machine learning models important?

A

To ensure that model predictions remain accurate and aligned with business objectives.