Chapter 3 Flashcards
(120 cards)
A machine learning pipeline is a method for fully automating a machine learning task’s ______.
workflow
A general ML pipeline consists of data input, data models, ______, and predicted outcomes.
parameters
The data analysis process includes Data Extraction, Data Preparation, Data Exploration & Visualization, Predictive Modeling, Model Validation, and ______.
Deploy
A machine learning pipeline is a series of interconnected data processing and modeling steps designed to automate, standardize and streamline the process of building, training, evaluating and ______ machine learning models.
deploying
Types of Data Sources include Databases (SQL, NoSQL), APIs and Web Scraping, IoT Devices and Sensors, and ______ Data Sets.
Public
Key considerations for Data Collection are Data Relevance, Data Volume, and Data ______.
Quality
Data Validation involves ensuring data integrity, ______, and consistency before modeling.
accuracy
Verifying that numerical values fall within expected ranges is a part of Data ______.
Validation
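Example (range check): a minimal sketch of the range verification described in the card above. The function name, the sample ages, and the 0 to 120 bounds are illustrative assumptions, not part of the chapter.

```python
def out_of_range(values, low, high):
    """Return (index, value) pairs for values outside the closed range [low, high]."""
    return [(i, v) for i, v in enumerate(values) if not (low <= v <= high)]


# Hypothetical sample: ages are assumed to be valid between 0 and 120.
ages = [34, 27, -5, 61, 240]
bad = out_of_range(ages, 0, 120)
print(bad)
```

Flagged entries can then be corrected or dropped during Data Cleaning.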
Data Cleaning involves handling ______ values and correcting errors.
missing
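Example (missing values): one common way to handle missing values is to replace them with the mean of the observed values. This is a sketch under the assumption that missing entries are stored as None; the function name and the sample readings are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the non-missing entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)  # assumes at least one observed value exists
    return [fill if v is None else v for v in values]


# Hypothetical temperature readings with two gaps.
temps = [20.0, None, 22.0, 24.0, None]
cleaned = impute_mean(temps)
print(cleaned)
```

Mean imputation is only one option; dropping incomplete rows or using the median are equally valid choices depending on the data.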
Combining data from multiple sources is known as Data ______.
Integration
Normalization, standardization, and encoding categorical variables are parts of Data ______.
Transformation
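Example (transformation): a small sketch of the three transformations named in the card above, written in plain Python. All function names and sample values are illustrative assumptions; in practice a library would usually do this work.

```python
from statistics import mean, pstdev


def min_max(values):
    """Normalization: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]  # assumes hi != lo


def z_score(values):
    """Standardization: shift to mean 0 and scale to unit (population) std."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]  # assumes sigma != 0


def one_hot(labels):
    """Encoding: map each category to a 0/1 indicator vector (sorted order)."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical sample data.
heights = [150.0, 160.0, 170.0, 180.0]
colors = ["red", "green", "red"]

scaled = min_max(heights)
standardized = z_score(heights)
encoded = one_hot(colors)
print(scaled, standardized, encoded)
```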
Dimensionality reduction and feature selection are components of Data ______.
Reduction
Choosing the most appropriate machine learning model for the task is called Model ______.
Selection
Factors to consider in Model Selection include Type of Problem, Data Characteristics, Model Complexity vs. Interpretability, and ______ Metrics.
Performance
Linear Models include Logistic Regression and ______ Regression.
Linear
Decision Trees, Random Forests, and XGBoost are examples of ______-Based Models.
Tree
Simple NN, CNN, and RNN are types of ______ Networks.
Neural
Key metrics for classification model evaluation include Accuracy, Precision, Recall, F1 Score, and ______.
ROC-AUC
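Example (classification metrics): a sketch of how Accuracy, Precision, Recall, and F1 Score follow from the counts of true/false positives and negatives in a binary problem. The labels below are made up for illustration. ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp)  # assumes at least one predicted positive
    recall = tp / (tp + fn)     # assumes at least one actual positive
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```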
For regression model evaluation, common metrics are MSE and ______.
RMSE
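Example (regression metrics): MSE is the mean of the squared errors, and RMSE is its square root, which brings the error back to the units of the target. A minimal sketch with made-up values:

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical targets and predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 9.0, 9.0]

mse_value = mse(actual, predicted)
rmse_value = rmse(actual, predicted)
print(mse_value, rmse_value)
```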
______-Validation is important for validating model performance on unseen data.
Cross
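Example (cross-validation): a sketch of k-fold splitting, one common form of cross-validation. Each sample lands in the held-out fold exactly once, so every model is scored on data it was not trained on. The function name and the choice of 10 samples with 3 folds are assumptions for illustration; no shuffling is done here.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


splits = k_fold_indices(10, 3)
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```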
Identifying and mitigating overfitting and underfitting is crucial in Model ______.
Evaluation
Model ______ is a critical stage where the performance of a trained model is evaluated and interpreted.
Analysis
Model ______ involves evaluating the model’s performance on a validation dataset to ensure it generalizes well to unseen data.
Validation
Model ______ is the process of taking a trained machine learning model and making it available for use in a production environment.
Deployment