# Multiple linear regression Flashcards

disadvantage of doing regressions separately?

each separate regression ignores the other predictors, so potential synergy (interaction) effects between them are missed, which can lead to misleading results

RSE?

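sqrt(RSS/(n-p-1)). The denominator n-p-1 = n-(p+1) is the residual degrees of freedom: n observations minus the p+1 estimated coefficients (p slopes plus the intercept)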
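A minimal sketch of the RSE formula above, using NumPy least squares on simulated data. The helper names `fit_ols` and `rse` and the simulated dataset are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients and RSS."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return beta, rss

def rse(X, y):
    """Residual standard error: sqrt(RSS / (n - p - 1))."""
    n, p = X.shape
    _, rss = fit_ols(X, y)
    return float(np.sqrt(rss / (n - p - 1)))

# Simulated data (assumption): two predictors, noise standard deviation 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

print(rse(X, y))  # expected to land close to the noise scale 0.5
```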

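why does R squared increase when another predictor (with a non-zero coefficient) is added?

least squares chooses the coefficients that minimise RSS. With one more parameter to estimate, the new fit can always set that coefficient to 0 and reproduce the old fit, so RSS can only stay the same or decrease. Since R squared = 1 - RSS/TSS and TSS is unchanged, R squared can only stay the same or increase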
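A small sketch of this effect: adding a predictor that is pure noise still does not lower R squared. The function name `r_squared` and the simulated data are assumptions made for illustration.

```python
import numpy as np

def r_squared(X, y):
    """R squared = 1 - RSS/TSS for a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return float(1 - rss / tss)

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
y = 3 * x1 + rng.normal(size=100)
noise = rng.normal(size=100)  # generated independently of y

r2_small = r_squared(x1.reshape(-1, 1), y)
r2_big = r_squared(np.column_stack([x1, noise]), y)
print(r2_small, r2_big)  # r2_big is never below r2_small
```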

what does adjusted R squared do?

adds a penalisation factor to account for the number of predictors included in the model

formula of adjusted R square?

1 - ((n-1)/(n-1-p)) * (RSS/TSS). Never larger than R squared (strictly smaller when p >= 1 and RSS > 0), and can be negative
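A worked instance of the formula, with made-up numbers chosen so that the adjusted value comes out negative. The function name `adjusted_r2` and the inputs are illustrative assumptions.

```python
def adjusted_r2(rss, tss, n, p):
    """Adjusted R squared = 1 - ((n-1)/(n-1-p)) * (RSS/TSS)."""
    return 1 - (n - 1) / (n - 1 - p) * rss / tss

# Hypothetical weak fit: 5 predictors, 11 observations, RSS barely below TSS.
r2 = 1 - 95.0 / 100.0
adj = adjusted_r2(rss=95.0, tss=100.0, n=11, p=5)
print(r2, adj)  # roughly 0.05 and -0.9
```

Here (n-1)/(n-1-p) = 10/5 = 2, so the penalty doubles RSS/TSS and pushes the adjusted value below zero.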

what is the null hypothesis H0?

all slope coefficients are 0 simultaneously: beta_1 = beta_2 = ... = beta_p = 0 (the intercept is not part of the hypothesis)

formula of the F statistic?

((TSS-RSS)/p)/(RSS/(n-1-p))

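what is the F statistic when there is no relationship between the predictors and the response?

close to 1 (much larger than 1 when at least one predictor is related to the response)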
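A simulation sketch of the F statistic, ((TSS-RSS)/p)/(RSS/(n-1-p)): it averages close to 1 when the response is pure noise and is far larger when a predictor really drives the response. The function name `f_statistic`, the sample sizes, and the simulated data are assumptions.

```python
import numpy as np

def f_statistic(X, y):
    """F = ((TSS - RSS)/p) / (RSS/(n - 1 - p)) for a fit with an intercept."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return float(((tss - rss) / p) / (rss / (n - 1 - p)))

rng = np.random.default_rng(2)
n, p = 100, 3

# No relationship: y is generated independently of X in every repetition.
null_fs = [f_statistic(rng.normal(size=(n, p)), rng.normal(size=n))
           for _ in range(1000)]
mean_null_f = float(np.mean(null_fs))

# Real relationship: the first predictor drives y.
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + rng.normal(size=n)
f_signal = f_statistic(X, y)

print(mean_null_f, f_signal)  # first value near 1, second far above 1
```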

when to reject null hypothesis?

when the p-value of the F statistic is below the chosen significance level (conventionally p-value < 0.05)

how does forward selection work?

- start with the null model: intercept but no predictors
- successively include the most informative remaining variable (the one whose addition gives the lowest RSS, i.e. the highest R squared)
- stop when the stopping rule is reached (e.g. no remaining variable would enter with p-value < 0.05)
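The steps above can be sketched as a greedy loop. To stay NumPy-only, this sketch swaps the p-value rule for a partial F threshold: a candidate enters only if its partial F statistic is at least `f_enter = 4.0`, which is a rough stand-in for p-value < 0.05 with one added variable and a large sample. The names `rss_of`, `forward_selection`, and `f_enter` and the simulated data are assumptions.

```python
import numpy as np

def rss_of(X, y, cols):
    """RSS of the least-squares fit using an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ beta) ** 2))

def forward_selection(X, y, f_enter=4.0):
    n, p = X.shape
    selected = []
    current_rss = rss_of(X, y, selected)  # null model: intercept only
    while len(selected) < p:
        remaining = [j for j in range(p) if j not in selected]
        trial = {j: rss_of(X, y, selected + [j]) for j in remaining}
        best = min(trial, key=trial.get)  # addition giving the lowest RSS
        df = n - len(selected) - 2        # residual df after adding `best`
        f_stat = (current_rss - trial[best]) / (trial[best] / df)
        if f_stat < f_enter:              # assumed stopping rule (see lead-in)
            break
        selected.append(best)
        current_rss = trial[best]
    return selected

# Simulated data (assumption): only columns 0 and 3 affect y.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + rng.normal(size=200)

selected = forward_selection(X, y)
print(selected)  # column 3 (strongest effect) is expected to enter first
```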

how does backward elimination work?

- start with the full model: intercept and all predictors
- successively remove the least informative variable (the one with the largest p-value, i.e. the one whose removal increases RSS the least)
- stop when the stopping rule is reached (all remaining variables have p-value < 0.05)
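A matching sketch for backward elimination. It drops the variable whose removal raises RSS the least, and stops once that variable's partial F statistic is at least `f_remove = 4.0`, again a rough NumPy-only stand-in for the p-value < 0.05 rule. The names `rss_of`, `backward_elimination`, and `f_remove` and the simulated data are assumptions.

```python
import numpy as np

def rss_of(X, y, cols):
    """RSS of the least-squares fit using an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ beta) ** 2))

def backward_elimination(X, y, f_remove=4.0):
    n, p = X.shape
    selected = list(range(p))  # full model: all predictors
    while selected:
        full_rss = rss_of(X, y, selected)
        df = n - len(selected) - 1  # residual df of the current model
        # RSS of each model obtained by dropping exactly one variable
        reduced = {j: rss_of(X, y, [k for k in selected if k != j])
                   for j in selected}
        weakest = min(reduced, key=reduced.get)  # smallest increase in RSS
        f_stat = (reduced[weakest] - full_rss) / (full_rss / df)
        if f_stat >= f_remove:  # even the weakest variable passes: stop
            break
        selected.remove(weakest)
    return selected

# Simulated data (assumption): only columns 0 and 3 affect y.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + rng.normal(size=200)

kept = backward_elimination(X, y)
print(kept)  # columns 0 and 3 are expected to survive
```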

how does cross validation work?

- split dataset into training and testing set
- train model using training set
- validate fitted model using testing set

how is validation error rate assessed?

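mean squared error on the testing set: (1/n) * sum of (y_i - yhat_i)^2 over the n testing observations, i.e. (1/n) * RSS computed on the testing set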
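A minimal sketch of a single train/test split followed by the test MSE. The 200/100 split and the simulated data are assumptions.

```python
import numpy as np

# Simulated data (assumption): noise standard deviation 0.5, so variance 0.25.
rng = np.random.default_rng(4)
n = 300
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Split the dataset at random into a training set and a testing set.
idx = rng.permutation(n)
train, test = idx[:200], idx[200:]

# Train the model using the training set only.
A_train = np.column_stack([np.ones(len(train)), X[train]])
beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)

# Validate on the testing set: MSE = (1/n_test) * RSS on the held-out rows.
A_test = np.column_stack([np.ones(len(test)), X[test]])
test_mse = float(np.mean((y[test] - A_test @ beta) ** 2))
print(test_mse)  # expected to be in the neighbourhood of 0.25
```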

process of leave one out CV?

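1. fit the model on the training data (obs = n-1)
2. validate the model on the testing set (obs = 1, the single held-out observation)
3. compute the test MSE for the first round
4. repeat steps 1-3 n times, holding out a different observation each time, to obtain n MSEs
5. construct the LOOCV estimate as the average of the n MSEs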
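The LOOCV steps above can be sketched as a plain loop that refits the model n times. The function name `loocv_mse` and the simulated data are assumptions.

```python
import numpy as np

def loocv_mse(X, y):
    """Leave-one-out CV estimate of the test MSE for a linear model."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    errors = []
    for i in range(n):
        keep = np.arange(n) != i  # training rows: all but observation i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        # With one test observation, the round's MSE is one squared error.
        errors.append((y[i] - A[i] @ beta) ** 2)
    return float(np.mean(errors))  # average of the n per-round MSEs

# Simulated data (assumption).
rng = np.random.default_rng(5)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=50)

loocv = loocv_mse(X, y)
print(loocv)
```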

K-fold CV?

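1. randomly split the observations into k groups (folds) of roughly equal size
2. fit the model on the training data (obs = n-n1, where n1 is the size of the held-out group)
3. validate the model on the testing set (obs = n1)
4. compute the test MSE for the first round
5. repeat steps 2-4 k times, holding out a different group each time, to obtain k MSEs
6. construct the k-fold CV estimate as the average of the k MSEs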
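The k-fold procedure above, sketched with NumPy. The function name `kfold_mse`, the default `k=5`, the `seed` argument, and the simulated data are assumptions.

```python
import numpy as np

def kfold_mse(X, y, k=5, seed=0):
    """k-fold CV estimate of the test MSE; also returns the k fold MSEs."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    # Randomly split the observation indices into k roughly equal groups.
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    mses = []
    for fold in folds:
        train = np.setdiff1d(np.arange(n), fold)  # the n - n1 training rows
        beta, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        mses.append(float(np.mean((y[fold] - A[fold] @ beta) ** 2)))
    return float(np.mean(mses)), mses

# Simulated data (assumption): noise variance 0.25.
rng = np.random.default_rng(6)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

cv_estimate, fold_mses = kfold_mse(X, y, k=5)
print(cv_estimate, fold_mses)
```

Setting k equal to n makes every fold a single observation, which is the leave-one-out case.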