Midterm 1 Flashcards

(43 cards)

1
Q

The model learns the relationship between inputs and outputs by minimizing _____ between predicted and actual values

A

the difference

2
Q

Supervised learning in regression tasks involves fitting the data to a _____ line using _____ data to predict an output y=h(x) from a given input x.

A

Straight, labeled

3
Q

The gradient descent algorithm is an optimization method used to minimize a cost function by iteratively updating model parameters in the direction of the _____, which is the _____ of the function

A

Steepest descent, negative gradient

4
Q

The update step size is controlled by the _____ , and the algorithm continues until convergence, which is typically defined by a sufficiently small change in the cost function

A

Learning rate

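The gradient-descent cards above can be sketched in a few lines of Python. The quadratic toy cost, starting point, and learning rate here are illustrative assumptions, not anything from the deck:

```python
# Minimal gradient descent sketch on the assumed toy cost f(w) = (w - 3)^2.
# Each step moves w in the direction of steepest descent (the negative
# gradient), scaled by the learning rate, until the update is tiny.

def gradient_descent(grad, w0, learning_rate=0.1, tol=1e-8, max_iters=10_000):
    w = w0
    for _ in range(max_iters):
        step = learning_rate * grad(w)
        w -= step                  # w := w - alpha * dJ/dw
        if abs(step) < tol:        # convergence: sufficiently small change
            break
    return w

# f(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

A learning rate that is too large makes the steps overshoot and diverge; one that is too small makes convergence needlessly slow.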
5
Q

(T/F) Logistic regression can only be used for binary classification problems

A

False

6
Q

(T/F) The output of logistic regression is a probability value between 0 and 1

A

True

7
Q

(T/F) Logistic regression does not assume any relationship between the input features and the output

A

False

8
Q

(T/F) Logistic regression uses the sigmoid function to model the probability of a class

A

True

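The sigmoid cards above can be illustrated directly; the decision threshold of 0.5 is the usual convention, not something stated in the deck:

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Logistic regression models P(y = 1 | x) = sigmoid(w . x + b),
# so its output is always a probability strictly between 0 and 1.
# sigmoid(0) = 0.5 is the usual classification threshold.
```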
9
Q

(T/F) SVMs aim to find a decision boundary that maximizes the margin between classes

A

True

10
Q

(T/F) The SVM cost function can be approximated by piecewise linear functions, though this increases computational complexity

A

False

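The SVM cards above refer to a cost built from piecewise-linear pieces; a minimal sketch of the hinge loss, assuming labels in {-1, +1} and `score` = w·x + b:

```python
def hinge_loss(score, y):
    """Piecewise-linear SVM loss: zero once the example is beyond the
    margin (y * score >= 1), and linear in the violation otherwise."""
    return max(0.0, 1.0 - y * score)

# A correctly classified point beyond the margin costs nothing;
# a point on the wrong side of the boundary is penalized linearly.
```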
11
Q

(T/F) The data points closest to the decision boundary are called support vectors

A

True

11
Q

(T/F) SVMs achieve better generalization by maximizing the margin of separation

A

True

12
Q

(T/F) Overfitting occurs when a model is too complex and captures noise in the training data

A

True

13
Q

(T/F) A model that overfits will have high training accuracy but poor test accuracy

A

True

14
Q

(T/F) Overfitting typically happens when the model has too few parameters relative to the training data

A

False

15
Q

(T/F) Regularization techniques like L1 or L2 can help prevent overfitting by simplifying the model

A

True

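The regularization cards describe adding a penalty term to the cost function; a minimal L2 sketch, with invented weights and data-fit values:

```python
def l2_regularized_cost(data_cost, weights, lam):
    """Total cost = data-fit term + lambda * sum of squared weights.
    Larger lam shrinks weights harder (risking underfitting); lam near
    zero leaves the model free to overfit."""
    return data_cost + lam * sum(w * w for w in weights)

# Illustrative numbers: same data-fit term, different regularization
# strengths. Sum of squared weights here is 3^2 + (-2)^2 = 13.
cost_weak = l2_regularized_cost(1.0, [3.0, -2.0], lam=0.01)   # 1 + 0.01 * 13
cost_strong = l2_regularized_cost(1.0, [3.0, -2.0], lam=10.0)  # 1 + 10 * 13
```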
16
Q

(T/F) A model that overfits will perform well on both training and test data

A

False

17
Q

(T/F) Underfitting typically happens when the model has too many parameters relative to the training data

A

False

18
Q

(T/F) Underfitting occurs when a model is too simple to capture the underlying patterns in the data

A

True

19
Q

(T/F) Underfitting occurs when a model learns only the noise in the training data

A

False

20
Q

(T/F) A model that underfits will have both low training accuracy and low test accuracy

A

True

21
Q

(T/F) Increasing model complexity can help address underfitting

A

True

22
Q

(T/F) Regularization techniques add a penalty term to the model cost function to reduce the risk of overfitting

A

True

23
Q

(T/F) The regularization parameter controls the strength of regularization applied to the model

A

True

24
Q

(T/F) Small regularization parameters may allow the model to overfit the training data

A

True

25
Q

(T/F) Too large regularization parameters may result in underfitting by making the model too simple

A

True

26
Q

(T/F) Regularization can help prevent overfitting even when the training data is small, though more data may be needed for optimal test results

A

True

27
Q

(T/F) Increasing regularization always improves model performance, regardless of the situation

A

False

28
Q

(T/F) Removing irrelevant or redundant features can help prevent overfitting and improve model generalization

A

True

29
Q

(T/F) Adding more training data can reduce overfitting and improve generalization, especially for high-variance models

A

True

30
Q

(T/F) Adding polynomial features is a good strategy to fix high-variance (overfitting) problems

A

False

31
Q

(T/F) Reducing the number of features by removing irrelevant ones is a method of reducing the risk of overfitting in a machine learning model

A

True

32
Q

(T/F) Increasing the regularization parameter (λ) is a method of reducing the risk of overfitting in a machine learning model

A

True

33
Q

(T/F) Collecting more training data is a method of reducing the risk of overfitting in a machine learning model

A

True

34
Q

(T/F) Using a more complex model with higher polynomial features is a method of reducing the risk of overfitting in a machine learning model

A

False

36
Q

Formula for accuracy?

A

(TP + TN) / (TP + FP + TN + FN)

37
Q

When do we use regression?

A

When the target value is continuous and we are trying to predict a numeric output

38
Q

When do we use classification?

A

When the data is labeled with discrete classes and we are trying to categorize inputs

39
Q

What are the evaluation metrics for regression?

A

MSE, RMSE, R-squared

40
Q

What are the evaluation metrics for classification?

A

Accuracy, Precision, Recall, F1-Score

41
Q

Formula for precision?

A

TP / (TP + FP)

42
Q

Formula for recall?

A

TP / (TP + FN)

43
Q

Formula for F1-Score?

A

2 * (Precision * Recall) / (Precision + Recall)
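The classification-metric formulas in the last cards can be checked with a short Python sketch; the confusion-matrix counts below are invented for illustration:

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)                  # TP / (TP + FP)
    recall = tp / (tp + fn)                     # TP / (TP + FN)
    f1 = 2 * (precision * recall) / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts: 8 true positives, 2 false positives,
# 85 true negatives, 5 false negatives.
acc, prec, rec, f1 = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

Note that accuracy can look good on imbalanced data (here the negatives dominate), which is why precision, recall, and F1 are tracked separately.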