Lecture 1 - Recap and Neural Networks Flashcards
(31 cards)
What best describes supervised learning?
Data points have a known, correct label.
What type of variable does regression predict?
A continuous variable.
What type of variable does classification predict?
A categorical variable.
What best describes unsupervised learning?
Data points have no label; patterns are discovered in the data.
What best describes semi-supervised learning?
Some data points have labels and others do not.
What best describes reinforcement learning?
An agent learns from its environment by receiving rewards for actions.
What is the purpose of a training set in ML?
To fit the model's parameters (for example, the weights of a neural network).
What is the purpose of a validation set in ML?
To tune hyperparameters and check for overfitting.
What is the purpose of a test set in ML?
To evaluate the final performance of the model.
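The three roles above can be illustrated with a minimal sketch of a random split. The function name `train_val_test_split`, the 60/20/20 proportions, and the fixed seed are assumptions chosen for illustration, not something the lecture prescribes.

```python
import random


def train_val_test_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle items and split them into train, validation, and test lists.

    The fractions and the seed are illustrative defaults (assumptions).
    """
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, so the split is reproducible
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]                # held back for the final evaluation
    val = shuffled[n_test:n_test + n_val]   # used for hyperparameter tuning
    train = shuffled[n_test + n_val:]       # used to fit the model
    return train, val, test


train, val, test = train_val_test_split(range(100))
```

The test list should be touched only once, after all tuning on the validation list is finished.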
What is k-fold cross-validation used for?
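To estimate how well a model generalizes: the dataset is split into k folds, and each fold serves once as the held-out test set while the model is trained on the remaining k-1 folds.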
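A minimal sketch of how the folds could be generated, assuming contiguous, unshuffled folds. The helper `k_fold_indices` is a hypothetical name; in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)  # the first `extra` folds get one more sample
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Every sample lands in exactly one test fold.
folds = list(k_fold_indices(10, 3))
```

A model would be trained and scored once per fold, and the k scores averaged.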
What does a large gap between training and test error indicate?
High variance in the model: it is overfitting the training data.
What can help reduce test error?
More training data.
What is bias in a model?
The error due to overly simplistic assumptions in the learning algorithm.
What is variance in a model?
The error due to sensitivity to small fluctuations in the training set.
What is accuracy in classification?
The ratio of correct predictions to total predictions.
What is precision in classification?
The ratio of true positives to all predicted positives.
What is recall in classification?
The ratio of true positives to all actual positives.
What does F1-score balance?
Precision and recall.
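The three metrics above can be computed from confusion-matrix counts, as in the following sketch. F1 is the harmonic mean of precision and recall. The function names and the toy labels are assumptions for illustration, and returning 0.0 for an empty denominator is one common convention, not the only one.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1); 0.0 is used when a denominator is zero."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy labels (made up): 3 TP, 1 FP, 1 FN, 3 TN.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```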
Why is recall more important than precision in disease diagnosis?
Because false negatives are riskier than false positives.
What are examples of supervised learning algorithms?
Linear regression, logistic regression, SVM, decision trees, random forests, neural networks.
What are examples of unsupervised learning algorithms?
K-means clustering, hierarchical clustering, PCA, autoencoders.
Which method is commonly used for semi-supervised learning?
Self-training (also called pseudo-labeling): 1) Train on the labeled data; 2) Predict labels for the unlabeled data; 3) Retrain on the combined dataset, including the newly labeled points.
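The three steps above can be sketched with a deliberately tiny classifier. The one-dimensional nearest-centroid model, the helper names `fit_centroids` and `predict`, and the data values are all assumptions for illustration. Practical variants often keep only high-confidence predictions in step 2.

```python
def fit_centroids(xs, ys):
    """Return {label: mean of the x values carrying that label}."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, xs):
    """Assign each x the label of its nearest centroid."""
    return [min(centroids, key=lambda label: abs(x - centroids[label])) for x in xs]


# Made-up data: a few labeled points and some unlabeled ones.
labeled_x = [0.0, 1.0, 9.0, 10.0]
labeled_y = [0, 0, 1, 1]
unlabeled_x = [2.0, 3.0, 7.0, 8.0]

# 1) Train on the labeled data.
model = fit_centroids(labeled_x, labeled_y)
# 2) Predict (pseudo-)labels for the unlabeled data.
pseudo_y = predict(model, unlabeled_x)
# 3) Retrain on labeled plus pseudo-labeled data.
model = fit_centroids(labeled_x + unlabeled_x, labeled_y + pseudo_y)
```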
Does a high number of correctly classified training samples mean that test samples will also be classified correctly?
Not necessarily. It holds only as long as the model does not overfit during training. In general, it is hard to find the ideal number of epochs.
What is the formula for accuracy?
Accuracy = (TP + TN) / (TP + FP + TN + FN)
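A short sketch of the formula above in code. The counts are hypothetical. The second call shows why accuracy alone can mislead on imbalanced data: predicting "negative" for everyone still scores high while missing every positive case.

```python
def accuracy(tp, fp, tn, fn):
    """Accuracy = (TP + TN) / (TP + FP + TN + FN)."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Hypothetical counts for a balanced problem.
balanced = accuracy(tp=40, fp=10, tn=45, fn=5)

# Hypothetical imbalanced case: 5 positives out of 100, all predicted negative.
all_negative = accuracy(tp=0, fp=0, tn=95, fn=5)
```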