4. Classification Flashcards
(5 cards)
Purpose of Classification (and by extension Purpose of Logistic Regression)
Predicting a qualitative (categorical) outcome Y
(Linear regression would be possible with a dummy-coded outcome, but interpretation is difficult…)
(Linear regression assumes a continuous, quantitative outcome, so numeric codes for categories imply an ordering and spacing that are not real)
Classification Techniques
- Decision Tree
- Bagging, Random Forest (RF), Boosting
- Neural Networks
- Logistic Regression (predictors can be continuous or discrete)
- Support Vector Machines (SVM)
Logistic function and log odds function
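p(x) = e^(β0 + β1·x) / (1 + e^(β0 + β1·x))
and
log( p(x) / (1 - p(x)) ) = β0 + β1·x
(the log odds are linear in x; β0 and β1 are the logistic regression coefficients)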
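A minimal sketch of the two functions on this card, assuming a single predictor x. The coefficient values b0 and b1 are hypothetical and chosen only for illustration. It shows that taking the log odds of the logistic probability gives back the linear term b0 + b1*x.

```python
import math

def logistic(x, b0, b1):
    """Logistic function: p(x) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    z = b0 + b1 * x
    # Algebraically the same as e^z / (1 + e^z).
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Log odds (logit) of a probability p with 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 0.5

for x in (0.0, 2.0, 4.0):
    p = logistic(x, b0, b1)
    # The log odds should match the linear predictor b0 + b1*x.
    print(x, round(p, 4), round(log_odds(p), 4), b0 + b1 * x)
```

With these assumed coefficients, the probability is exactly 0.5 where b0 + b1*x = 0 (here x = 2), which is also where the log odds are 0.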
How to estimate coefficients in logistic regression
Maximum Likelihood method
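1. Model p(x) with the logistic function, which produces an S-shaped curve
2. Choose the coefficients that maximize the likelihood: Product over observations with y = 1 of p(xi) * Product over observations with y = 0 of (1 - p(xi))
( or alternatively maximize the sum of the log-likelihoods )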
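The maximum likelihood idea can be sketched as follows. This is an illustrative example, not a production fitting routine: the data are made up, the optimizer is plain gradient ascent (real libraries typically use Newton-type methods), and the names fit, lr, and steps are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Sum of log p(xi) for y = 1 and log(1 - p(xi)) for y = 0."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def fit(xs, ys, lr=0.1, steps=5000):
    """Gradient ascent on the average log-likelihood (assumed optimizer)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the log-likelihood: sum of (y - p) and (y - p) * x.
        g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
        g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys))
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Made-up data: larger x makes y = 1 more likely, with some overlap
# between the classes so the maximum likelihood estimate is finite.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit(xs, ys)
print("log-likelihood at (0, 0):", log_likelihood(0.0, 0.0, xs, ys))
print("log-likelihood at fit:   ", log_likelihood(b0, b1, xs, ys))
```

Working with the sum of logs instead of the raw product avoids numerical underflow when there are many observations, and both have their maximum at the same coefficients.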
Classification model performance measures w/ formula
- (P) Precision: TP / (TP + FP)
- (TPR) True positive rate / Recall / Sensitivity: TP / (TP + FN)
- (TNR) True negative rate / Specificity: TN / (TN + FP)
- (FPR) False positive rate: FP / (FP + TN)
- F-measure: (2 * Recall * Precision) / (Recall + Precision)
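As a quick check of the formulas above, here is a small sketch that computes each measure from confusion-matrix counts. The function name classification_metrics and the counts are hypothetical, and the sketch assumes every denominator is nonzero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the measures from confusion-matrix counts.

    Assumes all denominators are nonzero; no zero-division handling.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # true positive rate, sensitivity
    specificity = tn / (tn + fp)     # true negative rate
    fpr = fp / (fp + tn)             # false positive rate = 1 - specificity
    f_measure = (2 * recall * precision) / (recall + precision)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "fpr": fpr,
        "f_measure": f_measure,
    }

# Hypothetical counts, for illustration only.
m = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Note that the false positive rate and specificity share the same denominator (FP + TN), so they always sum to 1.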