IFN580 Week 3: Supervised Learning (11%) Flashcards
(16 cards)
What is overfitting?
When the model memorises the training data (including its noise) rather than learning the underlying patterns, so it performs poorly on new data
What is underfitting?
When the model is too simple to capture the underlying patterns in the training data, so it performs poorly on both training and new data
What is bias?
Error due to overly simplistic assumptions in the model - leads to underfitting
What is variance?
The variability of the model - how much the model’s predictions change if it’s trained on a different training set. High variance leads to overfitting
An overfitting model will have _ bias, _ variance
Low bias, high variance
An underfitting model will have _ bias, _ variance
High bias, low variance
What is bias-variance tradeoff?
The balance between bias and variance that affects generalisation. Reducing one tends to increase the other; the ideal is to keep both as low as possible
What is training data?
The portion of the data used to train (fit) the model
What is test data?
Data held back from training and used to estimate the model’s performance on unseen data
What is validation?
Evaluating the model on held-out data to monitor its performance during training
What is batch testing?
Splitting the data once into a training set and a test set, training on the first and evaluating on the second
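A minimal sketch of a single train/test split, using only the Python standard library. The function name `split_train_test`, the 20% test fraction and the fixed seed are illustrative assumptions, not part of the course material.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it once into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical data: ten row ids standing in for ten samples.
train, test = split_train_test(range(10), test_fraction=0.2)
```

Because the rows are shuffled before splitting, this is also a random sample of the data.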
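What is N-fold cross validation?
Splitting the data into N parts, training the model on N-1 parts and testing on the remaining part; repeated N times so each part is used for testing once, then the results are averaged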
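The fold rotation can be sketched as below. This is an illustrative helper (`k_fold_indices` is a made-up name) that only produces index lists; it does not shuffle, which you would normally do first, and it leaves model training to the caller.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) once per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

# Hypothetical example: 10 samples, 5 folds (train on 4 parts, test on 1).
folds = list(k_fold_indices(n_samples=10, n_folds=5))
```

Each sample lands in the test part exactly once across the folds.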
What is random sampling, and when should it be used?
Randomly selecting data points for each set; use when the classes are balanced
What is stratified sampling, and when should it be used?
Sampling so that the class proportions are the same in both training and test data; use when the class distribution is skewed (imbalanced)
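A rough sketch of a stratified split, assuming class labels are available as a list. The function name `stratified_split` and the 90/10 label mix are hypothetical and chosen only to show that the proportions carry over to both sets.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    # Split each class separately so every class keeps the same test share.
    for members in by_class.values():
        rng.shuffle(members)
        n_test = int(round(len(members) * test_fraction))
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical skewed data: 90 negatives, 10 positives.
labels = ["neg"] * 90 + ["pos"] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
```

With these inputs both sets keep the 10% share of positives, which a purely random split does not guarantee.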
What’s the formula for Accuracy?
CORRECT PREDICTIONS / TOTAL PREDICTIONS
OR
(TP + TN) / (TP + FP + TN + FN)
Note: Precision is a different metric: TP / (TP + FP)
What’s the formula for RECALL?
TP / (TP + FN)
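A small worked example of accuracy, precision and recall computed from confusion-matrix counts. The function name `classification_metrics` and the counts are made up for illustration, and the sketch assumes none of the denominators is zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,  # correct predictions / all predictions
        "precision": tp / (tp + fp),    # of predicted positives, how many are real
        "recall": tp / (tp + fn),       # of real positives, how many were found
    }

# Hypothetical counts for 100 predictions.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Here accuracy is 85/100, precision is 40/50 and recall is 40/45, so the three metrics give different values for the same predictions.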