Use Case and Evaluation Flashcards
What is a Data Science Use Case (DSUC)?
A scenario or project that creates value in a way that is only possible with data-driven insights.
Why is identifying DSUCs important?
It helps organizations increase gain, reduce risk, and decrease effort.
What are the key steps in identifying a DSUC?
Define the problem, collect ideas, structure the ideas, define success, and assess potential risks.
What are operations-related DSUCs?
Use cases focused on optimizing operations, predicting failures, and improving product quality.
What are fraud-related DSUCs?
Use cases focused on detecting unauthorized access, fraudulent behavior, and security threats.
What are customer-related DSUCs?
Use cases focused on improving customer experience, predicting churn, and optimizing marketing strategies.
What are the two types of evaluation for DSUCs?
Model-centric evaluation and business-centric evaluation.
What is model-centric evaluation?
Evaluating the predictive model’s performance using metrics like accuracy, precision, and recall.
What is business-centric evaluation?
Evaluating the impact of a model on business KPIs such as revenue, customer retention, and operational efficiency.
What is the Machine Learning Canvas?
A structured framework used to define, plan, and evaluate machine learning projects.
What are the key components of the Machine Learning Canvas?
Prediction Task, Decisions, Value Proposition, Data Collection, Data Sources, Impact Simulation, Making Predictions, Building Models, Features, and Monitoring.
What is customer churn?
The rate at which customers stop doing business with a company over a certain period.
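As a worked illustration, churn rate is often computed as the number of customers lost during a period divided by the number of customers at the start of that period. The sketch below assumes that definition; the function name `churn_rate` and the sample figures are hypothetical.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return the fraction of customers lost during the period.

    Assumes the common definition: lost / customers at period start.
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Hypothetical example: 1,000 customers at the start of the month, 50 lost.
monthly_churn = churn_rate(1000, 50)
print(f"Monthly churn: {monthly_churn:.1%}")  # Monthly churn: 5.0%
```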
Why is customer churn important to businesses?
Retaining existing customers is generally considered cheaper than acquiring new ones, so reducing churn protects revenue and improves profitability.
What type of machine learning task is customer churn prediction?
A supervised binary classification problem: each customer is labeled as either churned or not churned.
What features are used in churn prediction models?
Customer demographics, purchase history, subscription details, engagement levels, and payment history.
What data sources are used for churn analysis?
CRM databases, payment records, and website analytics.
How is model performance evaluated in churn prediction?
Using metrics such as accuracy, precision, recall, and F1-score.
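A minimal sketch of how precision, recall, and F1-score follow from the counts of true positives, false positives, and false negatives. The helper name `precision_recall_f1` and the counts are hypothetical, and churn is assumed to be the positive class.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for the positive (churn) class."""
    # Precision: of the customers flagged as churners, how many really churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the customers who really churned, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical counts: 40 churners caught, 10 false alarms, 20 churners missed.
precision, recall, f1 = precision_recall_f1(tp=40, fp=10, fn=20)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.80 recall=0.67 f1=0.73
```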
What are the key steps in the machine learning workflow?
Feature extraction, data splitting, model training, evaluation, and deployment.
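The steps above can be sketched end to end on a toy problem. Everything here is an assumption made for illustration: the records are synthetic, the single feature `logins_last_30_days` and the labeling rule are invented, the "model" is just a threshold, and deployment is represented only by a prediction function.

```python
import random

# 1. Feature extraction: turn hypothetical raw records into (feature, label) pairs.
raw_records = [{"customer_id": i, "logins_last_30_days": i % 20} for i in range(100)]
# Synthetic labeling rule for illustration only: low engagement means churn.
dataset = [(r["logins_last_30_days"], int(r["logins_last_30_days"] < 5))
           for r in raw_records]

# 2. Data splitting: shuffle with a fixed seed, then hold out 20% for testing.
rng = random.Random(42)
rng.shuffle(dataset)
split = int(0.8 * len(dataset))
train, test = dataset[:split], dataset[split:]


def accuracy(threshold, rows):
    """Share of rows where 'logins < threshold' matches the churn label."""
    return sum(int(x < threshold) == y for x, y in rows) / len(rows)


# 3. Model training: pick the threshold with the best training accuracy.
best_threshold = max(range(21), key=lambda t: accuracy(t, train))

# 4. Evaluation: measure performance on the held-out test set only.
test_accuracy = accuracy(best_threshold, test)
print(f"threshold={best_threshold} test_accuracy={test_accuracy:.2f}")


# 5. Deployment (stand-in): expose the trained rule as a prediction function.
def predict_churn(logins_last_30_days):
    return logins_last_30_days < best_threshold
```

Keeping the test set out of the training step is what makes the final accuracy an estimate of performance on unseen customers.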
What is overfitting in machine learning?
When a model performs well on training data but poorly on unseen data because it has memorized noise and quirks of the training set instead of learning generalizable patterns.
How can overfitting be prevented?
Using techniques like regularization, reducing model complexity, and cross-validation (which detects overfitting and guides model selection).
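To make cross-validation concrete, here is a minimal sketch of k-fold splitting in plain Python. The function name `k_fold_indices` is hypothetical, and the sketch does not shuffle the data first, which a real project would usually do.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, validation
        start += size


# Each sample is used for validation exactly once across the k folds.
for fold, (train_idx, val_idx) in enumerate(k_fold_indices(n_samples=10, k=5)):
    print(f"fold {fold}: validate on {val_idx}, train on {len(train_idx)} samples")
```

A model whose training score is much higher than its average validation score across folds is likely overfitting.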
What is accuracy in model evaluation?
The proportion of correctly classified instances out of all predictions.
What are the limitations of accuracy?
It does not account for class imbalance, so it can be misleading in tasks such as fraud detection, where a model that always predicts the majority class still achieves high accuracy.
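A small worked example of this limitation, using an invented data set in which 1% of transactions are fraudulent. The trivial "model" below is hypothetical and never flags fraud, yet it scores 99% accuracy.

```python
# Hypothetical fraud data set: 990 legitimate transactions (0), 10 fraudulent (1).
y_true = [0] * 990 + [1] * 10
# A trivial "model" that always predicts the majority class.
y_pred = [0] * len(y_true)

# Accuracy: correctly classified instances divided by all predictions.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# Recall on the fraud class: frauds caught divided by all actual frauds.
frauds_caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = frauds_caught / sum(y_true)

print(f"accuracy={accuracy:.2%} recall={recall:.2%}")
# accuracy=99.00% recall=0.00%
```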
What is a confusion matrix?
A table used to evaluate classification models by displaying true positives, false positives, true negatives, and false negatives.
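A minimal sketch of how the four cells of a binary confusion matrix can be counted from true and predicted labels. The helper name `confusion_counts` and the label lists are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary classification task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
        else:
            counts["fn" if actual == positive else "tn"] += 1
    return counts


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
c = confusion_counts(y_true, y_pred)

# Rows are actual classes, columns are predicted classes.
print("            predicted 1  predicted 0")
print(f"actual 1    {c['tp']:^11}  {c['fn']:^11}")
print(f"actual 0    {c['fp']:^11}  {c['tn']:^11}")
```

For these labels the counts are 3 true positives, 1 false positive, 3 true negatives, and 1 false negative.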