Logistic Regression Flashcards

1
Q

General characteristics of logistic regression

A

The predicted outcome (a probability) must lie between 0 and 1.
The sigmoid (logistic) function is used to model the dependent variable.
Coefficients are estimated through maximum likelihood estimation (MLE).

It is not possible to interpret the coefficients directly as effects on the probability, because they act on an exponential (log-odds) scale.

Logit = logarithm of the odds, i.e. the log-odds; each coefficient is the change in the log-odds for a one-unit increase in the corresponding predictor.
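
A minimal scikit-learn sketch of these points; the synthetic data and settings are illustrative assumptions, not part of the card:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary-outcome data (illustrative only)
X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# Coefficients are fitted by (penalised) maximum likelihood
model = LogisticRegression().fit(X, y)

# Predicted probabilities always lie between 0 and 1
print(model.predict_proba(X)[:3, 1])

# Each coefficient is the change in the log-odds per one-unit increase in the
# feature; exponentiating gives the multiplicative change in the odds (odds ratio).
print("log-odds coefficients:", model.coef_[0])
print("odds ratios:          ", np.exp(model.coef_[0]))
```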

2
Q

Odds Ratio

A

Divide the odds of the first group by the odds of the second group.

For example: the odds of being interested in the product if the client is active, divided by the odds of being interested in the product if the client is NOT active.
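
A small numeric sketch of that example; the counts are hypothetical, made up purely for illustration:

```python
# Hypothetical counts of interest in the product by client activity status
interested_active, not_interested_active = 80, 120
interested_inactive, not_interested_inactive = 30, 170

# Odds of being interested within each group
odds_active = interested_active / not_interested_active        # ~0.667
odds_inactive = interested_inactive / not_interested_inactive  # ~0.176

# Odds ratio: odds for active clients divided by odds for inactive clients
odds_ratio = odds_active / odds_inactive
print(odds_ratio)  # ~3.78: active clients have ~3.8 times the odds of being interested
```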

3
Q

Regularisation

A

Regularisation addresses overfitting, which is especially likely when the data is high-dimensional and sparse.

Regularisation penalises large values of the coefficient estimates. If the weights are large, a small change in a feature can lead to a large change in the prediction, so we maximise a penalised log-likelihood function instead of the plain log-likelihood.
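
As a sketch of how the penalty strength is set in practice, e.g. via scikit-learn's C parameter (the inverse of the regularisation strength); the data and the values of C are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# High-dimensional data with few informative features: prone to overfitting
X, y = make_classification(n_samples=100, n_features=50, n_informative=5,
                           random_state=0)

# Smaller C = stronger penalty on large coefficient values
weak_penalty = LogisticRegression(C=100.0, max_iter=1000).fit(X, y)
strong_penalty = LogisticRegression(C=0.01, max_iter=1000).fit(X, y)

print(abs(weak_penalty.coef_).max())    # larger weights
print(abs(strong_penalty.coef_).max())  # weights shrunk towards zero
```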

4
Q

L1 regularisation

A
  • Lasso regression (the same penalty used in lasso linear regression). It can also be used for feature selection (automatically selecting the important predictors in the model), because it forces some coefficients to exactly zero, as sketched below.
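
A short sketch of L1-regularised logistic regression in scikit-learn; the dataset and the choice of C are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=20, n_informative=4,
                           random_state=0)

# The L1 penalty needs a solver that supports it, e.g. 'liblinear' or 'saga'
lasso_logit = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)

# Many coefficients are driven exactly to zero: automatic feature selection
n_zero = (lasso_logit.coef_[0] == 0).sum()
print(f"{n_zero} of {X.shape[1]} coefficients are exactly zero")
```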
5
Q

L2 regularisation

A
  • Ridge regression. A “softer” penalty: coefficients are shrunk but not forced to zero, so there is no feature selection. If you think all predictors are important, use this one (see the sketch below).
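
For contrast with the L1 sketch above, the same setup with an L2 penalty; again the data and C value are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=20, n_informative=4,
                           random_state=0)

# The L2 penalty (scikit-learn's default) shrinks coefficients but leaves them nonzero
ridge_logit = LogisticRegression(penalty="l2", C=0.1).fit(X, y)

n_zero = (ridge_logit.coef_[0] == 0).sum()
print(f"{n_zero} coefficients are exactly zero")  # typically 0: no feature selection
```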
6
Q

Setting a threshold

A

Quantify the costs of the two types of misclassification (false positives and false negatives). This way you can allocate a budget for the model's misclassifications and use the ROC curve to pick the threshold that minimises the expected cost.
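
A sketch of picking a cost-minimising threshold from the ROC curve; the data, the train/test split, and the cost values are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, weights=[0.8], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

probs = LogisticRegression().fit(X_train, y_train).predict_proba(X_test)[:, 1]

# Hypothetical business costs of the two error types
COST_FP, COST_FN = 1.0, 5.0

fpr, tpr, thresholds = roc_curve(y_test, probs)
n_pos, n_neg = y_test.sum(), (y_test == 0).sum()

# Expected total cost at each candidate threshold on the ROC curve
cost = COST_FP * fpr * n_neg + COST_FN * (1 - tpr) * n_pos
print("cost-minimising threshold:", thresholds[np.argmin(cost)])
```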

7
Q

Techniques for dealing with unbalanced data

A

Oversampling the minority class, e.g. with SMOTE.
Undersampling the majority class.
Weighted logistic regression, which gives minority-class errors a larger weight (SMOTE and class weights are sketched below).
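
A sketch of two of these options; the synthetic imbalance ratio is an illustrative assumption, and SMOTE comes from the imbalanced-learn package:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from imblearn.over_sampling import SMOTE  # imbalanced-learn package

# Imbalanced synthetic data: roughly 5% positives
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=0)

# Weighted logistic regression: errors on the minority class get a larger
# weight in the (penalised) log-likelihood
weighted = LogisticRegression(class_weight="balanced").fit(X, y)

# Oversampling the minority class with SMOTE, then fitting as usual
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
oversampled = LogisticRegression().fit(X_res, y_res)
print(f"after SMOTE: {y_res.sum()} positives out of {len(y_res)} samples")
```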
