Model Evaluation, Hyperparameter Tuning, Classification & Regression Metrics Flashcards
What is the solution to model evaluation problems?
Split the data into training, validation, and test sets.
What is a Training Set?
Data the model learns its parameters (weights) from.
What is a Validation Set?
Used during training to tune hyperparameters.
What is a Test Set?
Used once, after training and tuning, to estimate final performance on unseen data.
What is the Holdout method in model validation?
One-time split: Train (e.g. 60%), Val (20%), Test (20%).
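A minimal sketch of the 60/20/20 holdout split, using only the standard library. The function name holdout_split and its parameters are illustrative assumptions, not part of any particular library.

```python
import random

def holdout_split(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle once, then cut into train / validation / test parts."""
    items = list(data)
    random.Random(seed).shuffle(items)  # fixed seed so the split is repeatable
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

# Toy data: the integers 0..99 stand in for 100 samples.
train, val, test = holdout_split(range(100))
print(len(train), len(val), len(test))
```

Each sample lands in exactly one of the three sets, so the test set stays unseen during training and tuning.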
When is the Holdout method best used?
For large datasets, where a single split still leaves enough data in each set.
What is K-Fold Cross Validation (KCV)?
Split data into k folds; train on k-1 folds and test on the remaining one, rotating so each fold is the test fold once, then average the k scores.
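The fold rotation can be sketched as an index generator. This is an illustrative stdlib version (k_fold_indices is a made-up name); it does not shuffle, which one would usually do first in practice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each fold is the test fold exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Toy example: 10 samples, 5 folds.
folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

A model would be trained and scored once per pair, and the k scores averaged.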
When is K-Fold Cross Validation best used?
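Best for small datasets: every sample is used for both training and testing, so the performance estimate is more reliable than a single split (at k times the training cost).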
What is Overfitting?
Performs very well on training data but poorly on new data.
What is Underfitting?
Performs poorly on both training data and new data.
What are the characteristics of Overfitting?
High variance; the model memorizes the training data (including noise) instead of learning general patterns.
What are the characteristics of Underfitting?
High bias; the model is too simple to capture the underlying pattern.
What is Early Stopping?
Stop training when validation loss stops improving (e.g. it rises for several epochs in a row), before the model overfits.
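One common way to make "stops improving" concrete is a patience counter. The sketch below is an assumption about how that might look, not a specific framework's API; the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop.

    Stops once the validation loss has failed to beat its best value for
    `patience` consecutive epochs; otherwise runs to the last epoch.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            bad_epochs = 0  # improvement: reset the counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1

# Made-up validation losses: improving for three epochs, then rising.
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
stop_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch)
```

In a real training loop one would typically also keep the weights from the best epoch rather than the last one.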
What is L2 Regularization?
Penalizes large weights to keep the model simple.
What is the L2 Regularization formula?
Penalty added to the loss: λ * Σ(weights²), where λ sets the penalty strength → encourages smaller weights.
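The formula translates directly into code. In this sketch, lam stands for λ, and data_loss is a placeholder for whatever loss the model already has; both names are illustrative.

```python
def l2_penalty(weights, lam):
    """Return lam * (sum of squared weights)."""
    return lam * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, lam):
    """Total loss = original data loss + L2 penalty."""
    return data_loss + l2_penalty(weights, lam)

weights = [3.0, -4.0]
penalty = l2_penalty(weights, lam=0.01)  # 0.01 * (9 + 16)
print(penalty, regularized_loss(1.0, weights, lam=0.01))
```

Because the penalty grows with the square of each weight, large weights are punished far more than small ones.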
What are Hyperparameters?
Settings you pick before training (not learned from data).
Give examples of Hyperparameters.
- Learning rate
- Batch size
- Number of layers
- Activation functions
What is Grid Search?
Try every combination of the candidate hyperparameter values.
When is Grid Search effective?
Good for small search spaces.
What is a disadvantage of Grid Search?
Very slow with many options: the number of combinations multiplies with every added hyperparameter.
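A minimal grid search sketch using itertools.product. The toy_score function is a hypothetical stand-in for "train a model with these settings and return its validation score"; the grid values are made up.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Score every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical scoring function: peaks at learning_rate=0.01, batch_size=32.
def toy_score(params):
    return -abs(params["learning_rate"] - 0.01) - abs(params["batch_size"] - 32) / 100

grid = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}
n_combos = len(list(product(*grid.values())))  # 3 values x 3 values
best_params, best_score = grid_search(grid, toy_score)
print(n_combos, best_params)
```

Adding a third hyperparameter with three values would triple the number of runs, which is why the method only suits small search spaces.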
What is Random Search?
Sample random combinations of hyperparameter values for a fixed number of trials.
When is Random Search better?
Better for large or continuous search spaces, where trying every combination is not feasible.
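Random search can be sketched the same way, with a sampler in place of a fixed grid. The ranges, the log-uniform learning rate, and toy_score are all illustrative assumptions.

```python
import random

def random_search(sample_params, score_fn, n_trials=20, seed=0):
    """Score n_trials randomly sampled settings; return (best_params, best_score)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def sample_params(rng):
    # Hypothetical ranges: learning rate log-uniform in [1e-4, 1e-1],
    # batch size drawn from a fixed list.
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),
        "batch_size": rng.choice([16, 32, 64, 128]),
    }

# Hypothetical stand-in for training a model and returning a validation score.
def toy_score(params):
    return -abs(params["learning_rate"] - 0.01) - abs(params["batch_size"] - 32) / 100

best_params, best_score = random_search(sample_params, toy_score, n_trials=20)
print(best_params, best_score)
```

The number of trials is fixed up front, so the cost does not grow with the size of the search space, and continuous values such as the learning rate need not be restricted to a few grid points.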
What are Classification Models used for?
To assign a class (label) to data.
What is an example of a Classification Model?
A spam filter (is this email spam?) or a tumor classifier (is the tumor benign or malignant?).