MLR Flashcards

1
Q

Definition

A

Functional Form for f: Y = B_0 + B_1x_1 + … + B_px_p + e
–GLM with normal distribution and identity link

B’s are regression coefficients; they are free parameters that require estimation using training data. B_0 is the intercept.
–Objective function = SSE, which is the deviance for MLR
–Estimate B’s with b’s using OLS, which minimizes SSE = Sum (y - y^)^2 on the training data.

Assumes e is normally distributed with mean 0 and variance sigma^2 which leads to Y being normally distributed.

sigma^2:
–is a constant for all observations; known as homoscedasticity
–needs to be estimated as well; the estimate of sigma is the residual standard error = sqrt( Sum (y - y^)^2 / (n - p - 1) )
–>A lower residual standard error is preferred, indicating less influence from the random component.
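
A minimal sketch of the OLS fit and the residual standard error described above. The data values and variable names are made up for illustration, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical training data: n = 6 observations, p = 2 predictors.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0],
              [6.0, 5.0]])
y = np.array([6.1, 6.9, 13.2, 13.8, 20.1, 20.9])

n, p = X.shape
# Prepend a column of ones so the first coefficient is the intercept b_0.
X_design = np.column_stack([np.ones(n), X])

# OLS: choose b to minimize SSE = sum((y - y_hat)^2) on the training data.
b, *_ = np.linalg.lstsq(X_design, y, rcond=None)

y_hat = X_design @ b
sse = float(np.sum((y - y_hat) ** 2))

# s^2 = SSE / (n - p - 1) estimates sigma^2; its square root is the
# residual standard error.
s2 = sse / (n - p - 1)
rse = float(np.sqrt(s2))

print("b:", b, "SSE:", sse, "residual standard error:", rse)
```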

2
Q

Model Performance

A

RMSE is a common measure, but alternatives such as MSE can also be used.
Residual analysis is used to look for violations of the model assumptions.
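
A small sketch of the two measures, using made-up test values; the function and variable names are illustrative only.

```python
import math

def mse(y, y_hat):
    """Mean squared error: average of the squared prediction errors."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Root mean squared error: square root of MSE, in the units of y."""
    return math.sqrt(mse(y, y_hat))

# Hypothetical observed and predicted values on a test set.
y_test = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

# Residuals (observed minus predicted) are the starting point for residual analysis.
residuals = [a - b for a, b in zip(y_test, y_pred)]

print(mse(y_test, y_pred), rmse(y_test, y_pred), residuals)
```

Because RMSE is a monotone function of MSE, both rank competing models in the same order.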

3
Q

Interpretation

A

b_j is the change in y^ for every unit increase in predictor x_j, assuming all other predictors are held constant

b_0 is the intercept. It is the value of y^ when all factor variables are at their base level and all numeric variables are 0.
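
A toy illustration of both interpretations. The coefficients and the predictor names (age, urban) are hypothetical, not estimated from any data.

```python
# Hypothetical fitted coefficients (illustrative values only).
b0, b_age, b_urban = 10.0, 0.5, 3.0

def predict(age, urban):
    """y^ for numeric predictor `age` and dummy `urban` (0 = base level)."""
    return b0 + b_age * age + b_urban * urban

# A one-unit increase in age, with urban held constant, changes y^ by b_age.
delta = predict(age=41, urban=1) - predict(age=40, urban=1)

# The intercept is y^ when age is 0 and the factor is at its base level.
baseline = predict(age=0, urban=0)

print(delta, baseline)
```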

4
Q

Hypothesis Tests

A

Used in place of test RMSE when no test dataset is available; they can only compare MLR models, and the models being compared must be nested.

t:
–Used to check whether it is more plausible for a B to be 0 or non-0.
–A high p-value favors the ‘B is 0’ scenario; a low p-value favors the ‘B is non-0’ scenario
–Each factor-level coefficient is measured against the reference level, so a t-test cannot compare two non-reference levels with each other; consider the impact of dropping a level, which effectively combines it with the base level.
–Hierarchical principle states that if an interaction term should be included in the model, then its individual terms should also be included regardless of the individual terms’ p-values.
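
A sketch of how the t statistic for each coefficient can be computed, assuming NumPy and simulated data (the seed, sample size, and true coefficients are arbitrary choices). Each statistic is b_j divided by its standard error; the p-value would then come from a t distribution with n - p - 1 degrees of freedom, for example via scipy.stats.t.sf, which is not run here.

```python
import numpy as np

# Simulated data: x1 has a real effect on y, x2 does not.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])  # intercept, x1, x2
p = X.shape[1] - 1                         # number of predictors

# OLS estimates from the normal equations.
XtX = X.T @ X
b = np.linalg.solve(XtX, X.T @ y)

# s^2 = SSE / (n - p - 1) estimates sigma^2.
resid = y - X @ b
s2 = resid @ resid / (n - p - 1)

# Standard errors are the square roots of the diagonal of s^2 * (X'X)^-1.
se = np.sqrt(s2 * np.diag(np.linalg.inv(XtX)))

# One t statistic per coefficient: large |t| means a low p-value.
t_stats = b / se
print(t_stats)
```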

F:
–Used to check, for multiple B’s, whether it is more plausible for all of them to be 0 or at least one of them to be non-0.
–Does not identify which one/ones are predictive
–A high p-value favors the ‘all 0’ scenario; a low p-value favors the ‘at least one non-0’ scenario
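
A sketch of the F statistic for comparing two nested MLR models, assuming NumPy and simulated data (seed, sample size, and coefficients are arbitrary). The reduced model drops x2 and x3, so the test asks whether both of their B’s are 0. The p-value would come from an F distribution with (q, n - p - 1) degrees of freedom and is not computed here.

```python
import numpy as np

def sse(X, y):
    """SSE of the OLS fit of y on the design matrix X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return float(r @ r)

# Simulated data: only x1 truly affects y.
rng = np.random.default_rng(1)
n = 60
x1, x2, x3 = rng.normal(size=(3, n))
y = 0.5 + 1.5 * x1 + rng.normal(size=n)

ones = np.ones(n)
X_reduced = np.column_stack([ones, x1])        # nested inside the full model
X_full = np.column_stack([ones, x1, x2, x3])

q = X_full.shape[1] - X_reduced.shape[1]       # number of B's being tested
df_resid = n - X_full.shape[1]                 # n - p - 1 for the full model

sse_r, sse_f = sse(X_reduced, y), sse(X_full, y)

# F = (drop in SSE per tested coefficient) / (full-model estimate of sigma^2).
F = ((sse_r - sse_f) / q) / (sse_f / df_resid)
print(F)
```

A large F (low p-value) says at least one of the tested B’s is non-0, but it does not say which one.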
