ML Ops Flashcards
(27 cards)
What is the main idea behind ML engineering?
Building usable, maintainable ML systems beyond just training models.
What are some components of a full ML system?
Data pipelines, trained models, APIs, monitoring tools, logs, infrastructure.
Why is training a model not enough in real-world ML?
The model must be deployed, monitored, updated, and supported in production.
What happens if ML engineering is ignored?
Models degrade, bugs go unnoticed, and systems fail in deployment.
What is one reason many ML projects fail?
They lack monitoring, robust pipelines, or stakeholder alignment.
What are the three major phases of the ML lifecycle?
Design, Model Development, and Operations.
What happens during the design phase of ML?
Use case discovery, feasibility analysis, stakeholder engagement.
What tasks are part of the model development phase?
Data cleaning, model training, hyperparameter tuning, evaluation.
What is included in ML operations?
Deployment, monitoring, infrastructure, maintenance, and decommissioning.
What does ‘data drift’ mean in ML?
The input data distribution changes over time.
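
Illustration: one common way to put a number on data drift for a single numeric feature is the Population Stability Index (PSI), which compares binned fractions of a reference (training-time) sample against a current (production) sample. The sketch below is a minimal pure-Python version. The function name, the bin count, and the `eps` smoothing value are assumptions made for this example, not something the lecture prescribes.

```python
import math

def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples of one numeric feature (higher = more drift)."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))

# Toy data: the "production" sample is the training sample shifted by 5.
training_sample = [i / 100 for i in range(1000)]
production_sample = [v + 5 for v in training_sample]
print(population_stability_index(training_sample, training_sample))
print(population_stability_index(training_sample, production_sample))
```

Thresholds such as 0.1 and 0.25 are often quoted as rules of thumb for "moderate" and "significant" shift, but a sensible cutoff depends on the feature and the application.
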
What is ‘concept drift’?
The relationship between inputs and outputs evolves over time.
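
Illustration: concept drift usually cannot be seen in the inputs alone, so one simple way to watch for it is to track accuracy on recently labeled predictions and compare it with the accuracy measured at validation time. The class below is a hypothetical sketch; its name, the window size, and the tolerance are assumptions for this example.

```python
from collections import deque

class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent labeled predictions."""

    def __init__(self, baseline_accuracy, window_size=500, tolerance=0.05):
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window_size)  # 1 = correct, 0 = wrong

    def record(self, prediction, actual):
        self.outcomes.append(1 if prediction == actual else 0)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def drift_suspected(self):
        # Only judge once the window is full, to avoid noisy early alerts.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance

monitor = RollingAccuracyMonitor(baseline_accuracy=0.9, window_size=10)
for prediction, actual in [(1, 1)] * 5 + [(1, 0)] * 5:
    monitor.record(prediction, actual)
print(monitor.rolling_accuracy(), monitor.drift_suspected())
```

A drop in rolling accuracy is only a signal: data drift or a pipeline bug can cause the same symptom, so the alert is a prompt to investigate rather than proof of concept drift.
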
What is ‘data skew’ in production ML?
A mismatch between production data and training data, such as sudden anomalies or shifts not seen during training.
Why is monitoring ML models essential?
To detect performance drops, data issues, and system failures.
What are examples of ML monitoring metrics?
Accuracy, latency, input distributions, feature statistics.
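
Illustration: the metrics on this card can be computed from a batch of prediction logs. The sketch below assumes a hypothetical log record layout (a dict with `prediction`, `label`, `latency_ms`, and one numeric `feature`) and a non-empty batch. Real systems will log more fields and usually rely on a monitoring service rather than a hand-written function.

```python
import math
import statistics

def summarize_batch(records):
    """Summarize a non-empty batch of prediction log records.

    Each record is a dict with hypothetical keys:
    'prediction', 'label' (None if not yet known), 'latency_ms', 'feature'.
    """
    # Accuracy can only use records whose true label has arrived.
    labeled = [r for r in records if r["label"] is not None]
    accuracy = (
        sum(r["prediction"] == r["label"] for r in labeled) / len(labeled)
        if labeled
        else None
    )
    latencies = sorted(r["latency_ms"] for r in records)
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    features = [r["feature"] for r in records]
    return {
        "accuracy": accuracy,
        "latency_p95_ms": latencies[p95_index],
        "feature_mean": statistics.fmean(features),
        "feature_stdev": statistics.pstdev(features),
    }

batch = [
    {"prediction": 1, "label": 1, "latency_ms": 10, "feature": 1.0},
    {"prediction": 0, "label": 1, "latency_ms": 20, "feature": 2.0},
    {"prediction": 1, "label": 1, "latency_ms": 30, "feature": 3.0},
    {"prediction": 0, "label": None, "latency_ms": 40, "feature": 4.0},
]
print(summarize_batch(batch))
```
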
What role does a data scientist play in ML engineering?
They select features, train models, and evaluate performance.
What does an ML engineer do?
Deploys models, manages APIs, and optimizes runtime performance.
What is the responsibility of a data engineer?
Builds and manages data pipelines and ETL systems.
Why is stakeholder collaboration important in ML?
To ensure the model solves the correct business problem.
What is the benefit of using platforms like Vertex AI or SageMaker?
They provide tools for scalable, automated ML workflows.
What can ML Ops platforms help with?
Versioning, deployment, monitoring, and drift detection.
What is one advantage of cloud-based ML tools?
They automate infrastructure provisioning and make it easier to scale model deployment.
What does the lecture say about learning cloud ML platforms?
Students should start learning them early to prepare for industry work.
What is included in a proper ML engineering workflow?
Feature extraction, model training, validation, deployment, monitoring, retraining.
Why is version control important in ML systems?
To track changes in models and datasets for debugging and reproducibility.
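
Illustration: a lightweight way to tie a model to the exact data and settings that produced it is to store content hashes in a manifest next to the model artifact. This is a minimal sketch using only the standard library. The function names and manifest keys are assumptions for this example, and dedicated tools cover the same need more completely.

```python
import hashlib
import json

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes."""
    return hashlib.sha256(data).hexdigest()

def build_manifest(dataset_bytes, model_bytes, hyperparameters):
    """Record which data and settings produced which model artifact."""
    return {
        "dataset_sha256": fingerprint(dataset_bytes),
        "model_sha256": fingerprint(model_bytes),
        "hyperparameters": hyperparameters,
    }

# Toy stand-ins for a real dataset file and a serialized model.
manifest = build_manifest(
    b"age,income\n34,52000\n",
    b"placeholder-model-weights",
    {"learning_rate": 0.1},
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

If a later run changes even one byte of the dataset, its hash changes, which makes it straightforward to tell which data a misbehaving model was trained on.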