L8 - Gradient Descent and Classifier Performance Flashcards

1
Q
  1. What is fitting a model?
A
  1. The process of finding parameters such that the model fits the data.
2
Q
  1. In classification, what are 2 sub-optimal approaches to fitting a model?
A
  1. Randomly sampling parameter values
  2. Grid search
3
Q
  1. Through what process can we improve parameter fitting?
A
  1. Learning
    1. Minimise error and maximise fit
4
Q
  1. When parameter fitting, what is the goal?
A
  1. To minimise error of the model
    1. I.e. find the minimum error point
5
Q
  1. When establishing the minimum error using a Generic Parameter Fitting model, when do we stop?
A
  1. When the error starts increasing again.
6
Q
  1. In Deterministic Parameter Fitting, when do we stop?
A
  1. After a fixed N steps, since the procedure is fully determined.
7
Q
  1. Describe the Stochastic Parameter Fitting method…
A
  1. Pick a random point P1 and calculate the L2 norm
    1. Pick a random point P2 near P1 and calculate its L2 norm
    2. Repeat until the L2 norm is less than an error threshold or N steps have been taken.
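The steps above can be sketched as a simple random local search in Python. This is an illustrative 1-D version (the function and parameter names are made up, not from the lecture):

```python
import random

def stochastic_fit(loss, p1, step=0.1, threshold=1e-3, max_steps=5000):
    """Random local search: keep a nearby candidate only if it lowers the loss."""
    best, best_loss = p1, loss(p1)
    for _ in range(max_steps):
        if best_loss < threshold:
            break  # error threshold reached
        candidate = best + random.uniform(-step, step)  # random point near the best so far
        cand_loss = loss(candidate)
        if cand_loss < best_loss:
            best, best_loss = candidate, cand_loss  # accept the better point
    return best

# Fit a single parameter so that its squared (L2) error against 3.0 is small
result = stochastic_fit(lambda p: (p - 3.0) ** 2, p1=0.0)
```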
8
Q
  1. What is the purpose of the gradient descent algorithm?
A
  1. Iteratively find the minimum of a function or model.
    1. In this context, to find the minimum error of the classification model.
9
Q
  1. When does the gradient descent algorithm stop?
A
  1. When the error reaches 0; in practice, when the loss falls below a threshold or N steps have been taken.
10
Q
  1. Describe the steps of the gradient descent algorithm
A
  1. Start at random point P1
    1. Calculate loss of P1
    2. Choose P2 in the direction where the loss decreases most steeply (the negative gradient)
    3. Repeat until loss value is below a threshold or N steps have been taken
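The steps above can be sketched as a minimal 1-D gradient descent in Python (function and parameter names are illustrative, not from the lecture):

```python
def gradient_descent(grad, p, lr=0.1, threshold=1e-6, max_steps=1000):
    """Repeatedly step against the gradient until its magnitude
    drops below a threshold or max_steps have been taken."""
    for _ in range(max_steps):
        g = grad(p)
        if abs(g) < threshold:
            break  # close enough to a minimum
        p = p - lr * g  # move in the direction of steepest descent
    return p

# Minimise loss(p) = (p - 2)^2, whose gradient is 2(p - 2)
minimum = gradient_descent(lambda p: 2 * (p - 2), p=10.0)  # → approx 2.0
```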
11
Q
  1. Why do we want to minimise the loss function of a model?
A
  1. In order for the model to better fit the data
12
Q
  1. What is Precision? And what does it mean if a model has High Precision?
A
  1. Positive Predictive Value -> I.e. how many of those labelled positive are indeed positive
    1. The confusion matrix shows a small number of False Positives
13
Q
  1. What is Recall? And what does it mean if a model has High Recall?
A
  1. Recall is sensitivity -> Out of all the actual positive instances, how many did the model correctly identify?
    1. Measures how well the model avoids false negatives (FNs).
    2. Model is good at finding positive instances, but may not be precise.
14
Q
  1. What are the equations for Precision and Recall?
A
  1. Precision = TP / ( TP + FP )
    1. Recall = TP / ( TP + FN )
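These formulas translate directly into code; the counts below are made-up example values:

```python
def precision(tp, fp):
    # Of everything predicted positive, how much really is positive
    return tp / (tp + fp)

def recall(tp, fn):
    # Of everything actually positive, how much was found
    return tp / (tp + fn)

# e.g. 8 true positives, 2 false positives, 4 false negatives
p = precision(8, 2)   # 0.8
r = recall(8, 4)      # ≈ 0.667
```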
15
Q
  1. What does it mean if a model has High Recall and Low Precision?
A
  1. Most positive instances are classified, but there’s likely to be many false positives.
16
Q
  1. What does it mean if a model has Low Recall and High Precision?
A
  1. Misses a lot of positive instances, but the positives it does predict (TPs) are likely correct.
17
Q
  1. What is the F1 measure?
A
  1. A single score that combines the Precision and Recall measures (their harmonic mean).
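The F1 score is the harmonic mean of precision and recall, which can be computed directly (the input values below are made-up examples):

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall: high only when both are high
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.8, 0.5)   # ≈ 0.615
```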
18
Q
  1. In a classification model, what is the difference between Specificity and Sensitivity?
A
  1. Sensitivity -> Measures the model's ability to identify positive instances (true positive rate).
    1. Specificity -> Measures the model's ability to identify negative instances (true negative rate).
19
Q
  1. What does the ROC curve show?
A
  1. Combines specificity and sensitivity to show the trade-off between the TP rate and the FP rate
20
Q
  1. What is the threshold of the ROC curve? How do we decide it?
A
  1. The threshold is a hyperparameter; each setting of it yields different TP and FP rates.
    1. The threshold is a design choice -> it asks the question: which mistake is worse, a false positive or a false negative?
21
Q
  1. Regarding TP and FP, what is the perfect classifier?
A
  1. High TP and Low FP
22
Q
  1. What is the ideal value of Area Under Curve?
A
  1. As close to 1 as possible -> indicates a higher TP rate for a given FP rate
23
Q
  1. How is a ROC curve created?
A
  1. Iteratively move the threshold across the classification graph, calculating the sensitivity and specificity at each point.
    1. Plot the results on a scatter graph.
    2. The result is the ROC curve
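The threshold sweep above can be sketched in Python. This toy version scores a handful of made-up examples (the function name and data are illustrative):

```python
def roc_points(scores, labels):
    """Sweep the threshold over the scores, computing (FPR, TPR) at each."""
    pos = sum(labels)                 # number of actual positives
    neg = len(labels) - pos          # number of actual negatives
    points = []
    for t in sorted(set(scores)):    # each distinct score is a candidate threshold
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        # FPR = 1 - specificity, TPR = sensitivity
        points.append((fp / neg, tp / pos))
    return points

scores = [0.1, 0.4, 0.35, 0.8]   # classifier outputs
labels = [0, 0, 1, 1]            # ground truth
curve = roc_points(scores, labels)
```

Plotting these (FPR, TPR) pairs gives the ROC curve.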
24
Q
  1. What is the main metric that can be calculated from the ROC curve?
A
  1. Area under the curve (AUC) -> tells us the model's ability to identify True Positives.