L5 - Improving Predictive Models Flashcards

Question 1

Q

What is an improved model?

Answer

A

"Improved" can mean different things:
* Simpler and faster to run
* More accurate estimates

Question 2

Q

How is simple model validation performed and what are its shortcomings?

Answer

A

Divide the data into training and testing sets (typically a 70-30 split).
Calculate the error by finding the difference between the model's predictions and the test data outputs.

Issue: the error estimate depends on a single split, so it may not show how well the model generalises.
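
A minimal sketch of hold-out validation, using only the Python standard library. The synthetic data, the straight-line model, and the helper names (`train_test_split`, `fit_line`, `mean_squared_error`) are assumptions made for illustration; they are not part of the original card.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Least-squares fit of y = a*x + b to (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mean_squared_error(model, rows):
    """Average squared difference between predictions and true outputs."""
    a, b = model
    return sum((y - (a * x + b)) ** 2 for x, y in rows) / len(rows)


# Synthetic data (assumed for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
data = [(x, 2 * x + 1 + rng.gauss(0, 0.5)) for x in range(50)]

train, test = train_test_split(data, test_fraction=0.3)
model = fit_line(train)                        # fit on the training set only
test_error = mean_squared_error(model, test)   # score on the held-out set
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, test MSE={test_error:.3f}")
```

Changing `seed` changes which rows land in the test set, and the reported error moves with it. That sensitivity to one split is the shortcoming noted above.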

Question 3

Q

What is cross validation?

Answer

A

Cross-validation is a technique used in machine learning to evaluate the performance of a model on unseen data. It involves dividing the available data into multiple folds or subsets, using one of these folds as a validation set, and training the model on the remaining folds. This process is repeated multiple times, each time using a different fold as the validation set. Finally, the results from each validation step are averaged to produce a more robust estimate of the model’s performance.

Question 4

Q

Outline the process of k-fold cross validation

Answer

A

1. Randomly divide the data set into k subsets (folds).
2. Reserve one fold for validation and train the model on the other folds.
3. Repeat with a different fold reserved for validation.
4. Continue until every fold has been used for validation exactly once.
5. Calculate the k-fold loss.

k-fold loss = (1/k) * SUM(loss_i), where loss_i is the validation loss for fold i
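
The steps above can be sketched in plain Python. The "model" here is a deliberately trivial stand-in that predicts the training mean, and the function names are hypothetical; any model with a fit step and a loss could be substituted.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly partition the indices 0..n-1 into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_mean(train_targets):
    """Stand-in model (an assumption): always predicts the training mean."""
    return sum(train_targets) / len(train_targets)


def squared_loss(prediction, targets):
    """Mean squared error of a constant prediction on the given targets."""
    return sum((t - prediction) ** 2 for t in targets) / len(targets)


def k_fold_loss(targets, k, seed=0):
    """Return (k-fold loss, list of per-fold validation losses)."""
    folds = k_fold_indices(len(targets), k, seed)
    losses = []
    for i, validation_fold in enumerate(folds):
        # Train on every fold except fold i.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit_mean([targets[j] for j in train_idx])
        # Validate on the reserved fold.
        losses.append(squared_loss(model, [targets[j] for j in validation_fold]))
    # k-fold loss = (1/k) * sum of the per-fold losses
    return sum(losses) / k, losses


targets = [float(v) for v in range(20)]
loss, per_fold = k_fold_loss(targets, k=5)
print("per-fold losses:", [round(value, 2) for value in per_fold])
print(f"k-fold loss: {loss:.2f}")
```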

Question 5

Q

What are hyperparameters?

Answer

A

Hyperparameters are configuration variables that are set before the training process of a machine learning model begins. They control the learning process itself, rather than being learned from the data. Hyperparameters are crucial for tuning the performance of a model and can significantly impact its accuracy, generalization, and other metrics.
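
As an illustration, the number of neighbours k in a k-nearest-neighbours regressor is a hyperparameter: it is fixed before fitting and is typically chosen by comparing validation error across candidate values. The sketch below assumes synthetic data and hypothetical helper names.

```python
import random


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def validation_mse(train, validation, k):
    """Mean squared error on the validation set for a given k."""
    errors = [(y - knn_predict(train, x, k)) ** 2 for x, y in validation]
    return sum(errors) / len(errors)


# Synthetic data (assumed for illustration): a noisy quadratic.
rng = random.Random(0)
data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.3)) for x in range(100)]
rng.shuffle(data)
train, validation = data[:70], data[70:]

# k is set by us before training; it is not learned from the data.
candidate_ks = [1, 3, 5, 9, 15, 31]
scores = {k: validation_mse(train, validation, k) for k in candidate_ks}
best_k = min(scores, key=scores.get)

for k in candidate_ks:
    print(f"k={k:2d}  validation MSE={scores[k]:.3f}")
print("best k:", best_k)
```

Very small k tends to follow the noise and very large k tends to oversmooth, which is why the choice can noticeably affect accuracy and generalisation.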

Question 6

Q

Why is it desirable to reduce the number of predictors in data?

Answer

A

Data can have hundreds or thousands of predictors. This makes learning algorithms computationally intensive and the resulting models complex.
Methods for reducing the number of predictors:
1. Transform Features
2. Select Features
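
A minimal sketch of the "select features" route only: rank predictors by the absolute value of their correlation with the target and keep the strongest few. This filter approach is one simple option among many, and the data, column names, and helper functions are assumptions for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def select_top_features(columns, target, n_keep):
    """Return the names of the n_keep predictors most correlated with target."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], target)),
        reverse=True,
    )
    return ranked[:n_keep]


# Synthetic data (assumed): six predictors, only x0 and x3 drive the target.
rng = random.Random(1)
n_rows = 200
columns = {f"x{i}": [rng.gauss(0, 1) for _ in range(n_rows)] for i in range(6)}
target = [
    3 * a - 2 * b + rng.gauss(0, 0.5)
    for a, b in zip(columns["x0"], columns["x3"])
]

selected = select_top_features(columns, target, n_keep=2)
print("selected predictors:", selected)
```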

Question 7

Q

What is ensemble learning?

Answer

A

Multiple weak learners (such as decision trees) are trained and their outputs are combined, for example by voting or averaging, to create a more robust model than any single learner.
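
A minimal sketch of one ensemble scheme, bagging with a majority vote. Each weak learner is a decision stump (a one-level decision tree) trained on a bootstrap sample. The data and function names are assumptions for illustration, and other ensemble schemes combine learners differently.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Weak learner: the single threshold on x that misclassifies fewest rows."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        for left_label in (0, 1):
            errors = sum(
                (left_label if x <= threshold else 1 - left_label) != y
                for x, y in rows
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    _, threshold, left_label = best
    return threshold, left_label


def stump_predict(stump, x):
    threshold, left_label = stump
    return left_label if x <= threshold else 1 - left_label


def fit_ensemble(rows, n_models=15, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_models)]


def ensemble_predict(stumps, x):
    """Combine the weak learners by majority vote."""
    votes = Counter(stump_predict(stump, x) for stump in stumps)
    return votes.most_common(1)[0][0]


# Synthetic data (assumed): label is 1 when x > 5, with 10% of labels flipped.
rng = random.Random(3)
rows = []
for _ in range(200):
    x = rng.uniform(0, 10)
    label = int(x > 5)
    if rng.random() < 0.1:
        label = 1 - label
    rows.append((x, label))

stumps = fit_ensemble(rows)
print("prediction at x=1.0:", ensemble_predict(stumps, 1.0))
print("prediction at x=9.0:", ensemble_predict(stumps, 9.0))
```

An odd number of stumps is used so that a two-class vote cannot tie.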

(7 cards)