Multiple Regression Flashcards

(17 cards)

1
Q

What is multiple linear regression?

A

A model predicting a continuous response (Y) using two or more predictors (X₁, X₂, …).

2
Q

What is the multiple linear regression equation?

A

Y = β₀ + β₁X₁ + β₂X₂ + ⋯ + βₚXₚ + ε
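
A minimal R sketch (simulated data with made-up coefficients, not from the deck's fish data) showing the equation in action:

set.seed(1)                                # simulated data, made-up coefficients
n  <- 100
X1 <- rnorm(n); X2 <- rnorm(n)
Y  <- 2 + 1.5 * X1 - 0.8 * X2 + rnorm(n)   # Y = β₀ + β₁X₁ + β₂X₂ + ε
coef(lm(Y ~ X1 + X2))                      # estimates should be close to 2, 1.5, -0.8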

3
Q

How are categorical variables (e.g., TrawlDepth) handled in regression?

A

Converted to dummy (0/1 indicator) variables, one per factor level.

One level is omitted as the baseline (e.g., “Bottom Trawl”); the coefficients for the remaining levels measure differences from that baseline.
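
A short R sketch (assuming the deck's fish data frame with a TrawlDepth column) showing how R builds the dummy columns:

fish$TrawlDepth <- factor(fish$TrawlDepth)       # first level becomes the baseline
levels(fish$TrawlDepth)                          # check which level is the baseline
head(model.matrix(~ TrawlDepth, data = fish))    # the 0/1 dummy columns R creates
fish$TrawlDepth <- relevel(fish$TrawlDepth, ref = "Bottom Trawl")  # set the baseline explicitly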

4
Q

What is multicollinearity, and how do you detect it?

A

Definition: Predictors are correlated, inflating standard errors.

Detection:

VIF > 5 (or > 10), computed with car::vif(model).

High pairwise correlations.
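
A minimal sketch (assuming the fitted model object from card 16, the fish data, and that the car package is installed):

library(car)
vif(model)                        # VIF > 5 (or 10) flags problematic predictors
cor(fish[, c("Depth", "SST")])    # pairwise correlations among numeric predictors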

5
Q

How do you check model assumptions in R?

A

plot(model) # produces four diagnostic plots:
1. Residuals vs. Fitted (linearity)
2. Q-Q Plot (normality of residuals)
3. Scale-Location (homoscedasticity)
4. Residuals vs. Leverage (influential points)

6
Q

What’s the difference between R² and Adjusted R²?

A

R²: Proportion of variance explained (inflates with more predictors).

Adjusted R²: Penalizes extra predictors. Use for model comparison!
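
In R, both values are reported by summary(); a sketch assuming the model object from card 16:

s <- summary(model)
s$r.squared       # R²: never decreases when a predictor is added
s$adj.r.squared   # Adjusted R²: can decrease if the new predictor adds little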

7
Q

How do you test if a factor variable (e.g., Sediment) is significant?

A

Use ANOVA F-test:
anova(lm(WeightA ~ Depth + SST + Sediment, data=fish))
H₀: All Sediment coefficients = 0.

Reject H₀ if p < 0.05.
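
Because anova() on a single model gives sequential (Type I) tests, an order-free alternative is to compare nested models with and without the factor (same fish data assumed):

m_full    <- lm(WeightA ~ Depth + SST + Sediment, data = fish)
m_reduced <- lm(WeightA ~ Depth + SST, data = fish)
anova(m_reduced, m_full)   # F-test of H₀: all Sediment coefficients = 0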

8
Q

What is the matrix form of the regression equation?

A

Ŷ = Xβ̂ (fitted values); the full model is Y = Xβ + ε, where X is the design matrix.
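
A sketch (assuming the model object from card 16) showing that the normal-equation solution β̂ = (XᵀX)⁻¹XᵀY matches lm()'s coefficients:

X <- model.matrix(model)                       # design matrix (column of 1s + predictors)
y <- model.response(model.frame(model))        # response vector
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y   # (XᵀX)⁻¹ Xᵀ y
cbind(beta_hat, coef(model))                   # the two columns should agree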

9
Q

How do you calculate a 95% CI for a coefficient?

A

β̂ ± t(α/2, df) × SE(β̂), where df = n − p − 1.
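
A sketch (model object from card 16 assumed) computing the 95% CI by hand and checking it against confint():

est <- coef(summary(model))[, "Estimate"]
se  <- coef(summary(model))[, "Std. Error"]
tcrit <- qt(0.975, df = df.residual(model))              # t(α/2, df) with df = n − p − 1
cbind(lower = est - tcrit * se, upper = est + tcrit * se)
confint(model)                                           # should match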

10
Q

What’s the difference between Type I and Type II SS?

A

Type I: Sequential (order matters).

Type II: Tests each predictor after accounting for others (order-independent).
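
In R, base anova() gives Type I (sequential) tests and car::Anova() gives Type II; a sketch assuming the model object from card 16 and the car package:

anova(model)                  # Type I: each term tested in the order entered
car::Anova(model, type = 2)   # Type II: each term tested after the others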

11
Q

How do you handle non-linear relationships in multiple regression?

A

Add polynomial terms (e.g., Depth + Depth²).

Use transformations (e.g., log(Y)).
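
A sketch (fish data assumed) adding a quadratic term and a log-transformed response:

m_quad <- lm(WeightA ~ Depth + I(Depth^2) + SST, data = fish)   # polynomial in Depth
m_log  <- lm(log(WeightA) ~ Depth + SST, data = fish)           # log-transformed response
# poly(Depth, 2) is an orthogonal-polynomial alternative to I(Depth^2)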

12
Q

What’s the key pitfall of stepwise selection?

A

Overfitting! It may capitalize on noise. Always validate with theory or a holdout dataset.
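
For reference, stepwise selection in R is usually run with step(); a sketch assuming the full fish model, with the caveat above:

full <- lm(WeightA ~ Depth + SST + Sediment, data = fish)
step(full, direction = "both")   # AIC-based stepwise search; validate the chosen model!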

13
Q

How do you compare non-nested models (e.g., Y ~ X1 + X2 vs. Y ~ X1 + X3)?

A

Use AIC/BIC (lower = better). F-tests only work for nested models.
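
A sketch comparing two hypothetical non-nested fish models by AIC and BIC:

m1 <- lm(WeightA ~ Depth + SST, data = fish)
m2 <- lm(WeightA ~ Depth + Sediment, data = fish)
AIC(m1, m2)   # lower is better
BIC(m1, m2)   # BIC penalizes extra parameters more heavily than AIC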

14
Q

What’s the interpretation of the intercept?

A

Expected Y when all predictors = 0. Often meaningless if X=0 is unrealistic (e.g., depth=0).
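
Centering predictors is one way to make the intercept meaningful; a sketch assuming the fish data (Depth_c and SST_c are hypothetical centered columns):

fish$Depth_c <- fish$Depth - mean(fish$Depth)   # center Depth at its mean
fish$SST_c   <- fish$SST - mean(fish$SST)       # center SST at its mean
m_c <- lm(WeightA ~ Depth_c + SST_c, data = fish)
coef(m_c)["(Intercept)"]   # expected WeightA at average Depth and average SST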

15
Q

Why might a predictor change significance when others are added?

A

Due to confounding or collinearity. Example: TrawlDepth was significant alone but not with SST/Depth.

16
Q

How do you fit a multiple regression model in R?

A

model <- lm(WeightA ~ Depth + SST + factor(TrawlDepth), data=fish)
summary(model) # Coefficients, p-values
confint(model) # 95% CIs

17
Q

How do you check for multicollinearity?

A

car::vif(model)