Lecture 3: Regression Assumptions Flashcards
(36 cards)
Why are statistical assumptions important in regression?
To ensure reliable and generalisable inferences.
Are all assumption violations equally serious?
No.
In large samples, are assumption violations usually a major concern?
Usually not, although some violations (such as non-independence) remain serious at any sample size.
What does the linearity assumption require?
A linear relationship between predictors and the outcome.
What can be used to model nonlinearity?
Transformations or polynomial terms.
What happens if the true relationship is nonlinear and uncorrected?
The model is misspecified, so its estimates and predictions can systematically misrepresent the data.
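The linearity cards above can be illustrated with a minimal sketch. It assumes made-up data that follow y = 2 + 3x^2 exactly, and a hypothetical helper named fit_line; neither comes from the lecture. A straight line in x explains none of the variation, while using x squared as the predictor fits perfectly.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return a, b, 1 - ss_res / ss_tot


# Hypothetical data with a purely quadratic relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [2 + 3 * xi ** 2 for xi in x]

# Uncorrected model: regress y on x directly.
_, slope_raw, r2_raw = fit_line(x, y)

# Corrected model: regress y on the squared term instead.
x_sq = [xi ** 2 for xi in x]
intercept_sq, slope_sq, r2_sq = fit_line(x_sq, y)

print(f"linear in x:   slope={slope_raw:.2f}, R^2={r2_raw:.2f}")
print(f"linear in x^2: slope={slope_sq:.2f}, R^2={r2_sq:.2f}")
```

With real data the squared term would normally be added alongside x in a multiple regression; the single-predictor version here only keeps the arithmetic short.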
What does the normality assumption refer to in regression?
The residuals are normally distributed (not the raw variables).
Is normality of residuals important for large samples?
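No; in large samples the central limit theorem keeps inference on the coefficients approximately valid.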
Do non-normal residuals usually bias regression estimates?
No.
What do simulation studies suggest about regression with skewed data?
Regression remains largely robust to skewed data.
What type of outlier is extreme on one variable?
Univariate outlier.
What type of outlier has an unusual combination of variable values?
Multivariate outlier.
What statistic is used to measure the influence of a data point?
Cook’s distance.
What value of Cook’s distance indicates high influence?
Greater than 1 (a common rule of thumb).
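As a rough sketch of the Cook's distance cards above, the code below computes the statistic by hand for a one-predictor regression, using the standard formula D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2 with leverage h_i = 1/n + (x_i - mean(x))^2 / Sxx. The data and the function name cooks_distances are assumptions made for illustration: nine points lie on y = x and the tenth has an extreme x with a y far below the trend.

```python
def cooks_distances(x, y):
    """Cook's distance for each point in a simple regression y = a + b*x."""
    n = len(x)
    p = 2  # number of estimated parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance
    distances = []
    for xi, e in zip(x, residuals):
        h = 1 / n + (xi - x_bar) ** 2 / sxx  # leverage of this point
        distances.append((e ** 2 / (p * s2)) * h / (1 - h) ** 2)
    return distances


# Hypothetical data: the last point combines an extreme x with an unusual y.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 5]

d = cooks_distances(x, y)
flagged = [i for i, di in enumerate(d) if di > 1]  # rule-of-thumb cutoff
print("Cook's distances:", [round(di, 2) for di in d])
print("Indices with D > 1:", flagged)
```

Only the last observation exceeds the cutoff here, because it has both high leverage and a large residual.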
In small samples, are outliers more problematic?
Yes.
What does homoskedasticity mean?
Equal variance of residuals across predictor values.
What is it called when variance of residuals is unequal?
Heteroskedasticity.
What kind of error is inflated by heteroskedasticity?
Type I error.
Name one test for heteroskedasticity.
White test.
What transformation can help with long-tailed data?
Log transformation of the dependent variable.
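A small sketch of why the log transformation helps with long-tailed data: the values below are made up so that their logarithms are perfectly symmetric, and skewness is a hypothetical helper computing the moment-based skewness coefficient. The raw values have a long right tail; after taking logs the skew disappears.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment over variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical long-tailed outcome: exponentials of symmetric values.
raw = [math.exp(k) for k in (-2, -1, 0, 1, 2)]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness before log: {skew_raw:.2f}")
print(f"skewness after log:  {skew_log:.2f}")
```

Real outcomes are rarely exactly log-symmetric, so in practice the transformation reduces skew rather than removing it.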
What is one way to correct inference under heteroskedasticity?
Heteroskedasticity-consistent standard errors.
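The heteroskedasticity cards above can be sketched for a one-predictor regression. The code compares the classical standard error of the slope, sqrt(s^2 / Sxx), with the HC0 (White) version, sqrt(sum((x_i - mean(x))^2 * e_i^2)) / Sxx. The data are an assumption made for illustration: two observations at each x, placed symmetrically around the line y = 1 + 2x, with a spread that grows with x. The function name slope_standard_errors is likewise hypothetical.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, classical SE, HC0 robust SE) for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Classical SE assumes one common residual variance s^2.
    s2 = sum(e ** 2 for e in residuals) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)
    # HC0 lets each observation contribute its own squared residual.
    meat = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, residuals))
    se_hc0 = math.sqrt(meat) / sxx
    return b, se_classical, se_hc0


# Hypothetical data: residual spread increases with x (heteroskedasticity).
x = [k for k in range(1, 6) for _ in range(2)]
noise = [sign * 0.05 * k ** 2 for k in range(1, 6) for sign in (1, -1)]
y = [1 + 2 * xi + ei for xi, ei in zip(x, noise)]

slope, se_classical, se_hc0 = slope_standard_errors(x, y)
print(f"slope={slope:.3f}, classical SE={se_classical:.4f}, HC0 SE={se_hc0:.4f}")
```

In this example the classical standard error is smaller than the robust one, which is the pattern that leads to too many false positives. HC0 is only one of several heteroskedasticity-consistent variants; others apply small-sample corrections.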
What is the most critical assumption in regression?
Independence of observations.
What are common causes of non-independence?
Clustering and repeated measures.
What does non-independence do to variability estimates?
It underestimates variability, so standard errors are too small.
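To make the non-independence cards concrete, here is a minimal sketch with made-up clustered data: four clusters of five observations each, where values within a cluster are much more alike than values across clusters. It compares a naive standard error of the mean, which treats all 20 observations as independent, with one computed from the four cluster means. Using cluster means is just one simple way to respect the clustering when cluster sizes are equal, and is an assumption of this sketch rather than a method named in the lecture.

```python
import math
import statistics

# Hypothetical clustered data: small within-cluster spread, large
# between-cluster differences.
offsets = (-0.5, -0.25, 0.0, 0.25, 0.5)
clusters = [[centre + d for d in offsets] for centre in (10.0, 12.0, 14.0, 16.0)]

# Naive approach: pool everything and pretend there are 20 independent values.
all_values = [v for cluster in clusters for v in cluster]
naive_se = statistics.stdev(all_values) / math.sqrt(len(all_values))

# Cluster-aware approach: the independent units are the 4 clusters.
cluster_means = [statistics.mean(cluster) for cluster in clusters]
cluster_se = statistics.stdev(cluster_means) / math.sqrt(len(cluster_means))

print(f"naive SE:         {naive_se:.3f}")
print(f"cluster-based SE: {cluster_se:.3f}")
```

The naive standard error is less than half the cluster-based one, showing how ignoring the clustering understates the uncertainty.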