What does the correlation coefficient r NOT tell you?
It does not describe slope, shape, or linearity of a relationship.
When is correlation misleading?
When the observations are not independent of each other (pseudoreplication).
Why is correlation inappropriate for non-linear data?
r only measures linear relationships.
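A minimal sketch of this point, using made-up data and a hypothetical helper `pearson_r` written from the textbook formula: y depends perfectly on x, but along a curve, and r still comes out as zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from first principles (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented data: y is fully determined by x, but the relationship is a parabola.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(r_curved)  # 0.0: no linear association, despite perfect dependence
```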
How do outliers affect correlation?
Outliers can strongly distort r.
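As a rough illustration with invented numbers (the helper `pearson_r` is hypothetical, written from the textbook formula), one extreme point can turn an r of zero into an r close to 1:

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from first principles (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Five invented points with no linear trend at all.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 2, 1, 3]
r_without = pearson_r(xs, ys)

# The same points plus a single outlier far from the rest.
r_with = pearson_r(xs + [20], ys + [20])

print(r_without)  # 0.0
print(r_with)     # about 0.97, driven almost entirely by the one outlier
```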
Why can subgroups/clusters mislead correlation?
Groups can artificially inflate or mask the true association.
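A small sketch with invented data for two hypothetical groups (again using an illustrative `pearson_r` helper): within each group the association is perfectly negative, yet pooling the groups gives a strongly positive r.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from first principles (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented data: y falls as x rises inside each group.
group_a_x, group_a_y = [1, 2, 3], [3, 2, 1]
group_b_x, group_b_y = [6, 7, 8], [8, 7, 6]

r_a = pearson_r(group_a_x, group_a_y)
r_b = pearson_r(group_b_x, group_b_y)

# Ignoring group membership reverses the apparent direction.
r_pooled = pearson_r(group_a_x + group_b_x, group_a_y + group_b_y)

print(r_a, r_b)   # -1.0 -1.0
print(r_pooled)   # about 0.81
```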
Why shouldn’t you extrapolate correlation beyond your data range?
r describes the association only within the observed range of x and y; the relationship may differ outside it.
Does correlation imply causation?
No. Correlation alone never proves causation.
What is a statistical model?
A mathematical representation of a relationship: outcome = model + error.
What is assumed normally distributed in regression models?
The error, not necessarily the outcome variable.
What are residuals?
Differences between each observed value and the model's prediction (e.g., the mean in the simplest model).
What does the mean minimise in modelling?
The sum of squared residuals (SSR).
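This can be checked numerically. The sketch below uses invented data and a hypothetical helper `ssr` to compare the sum of squared residuals around the mean with the sum around a few nearby values.

```python
def ssr(data, centre):
    """Sum of squared residuals when every value is predicted by `centre`."""
    return sum((value - centre) ** 2 for value in data)

data = [2, 4, 4, 5, 10]          # invented data
mean = sum(data) / len(data)     # 5.0

ssr_at_mean = ssr(data, mean)    # 36.0

# Any other single-number "model" gives a larger SSR, e.g. the median (4).
ssr_at_median = ssr(data, 4)     # 41
nearby = [mean + shift for shift in (-1.0, -0.5, 0.5, 1.0)]
ssr_nearby = [ssr(data, c) for c in nearby]

print(ssr_at_mean, ssr_at_median, ssr_nearby)
```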
What does simple linear regression model?
The linear relationship between y (dependent) and x (independent).
What does the slope coefficient b₁ represent?
Change in y for each 1-unit change in x.
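A minimal least-squares sketch on invented data, assuming the usual estimates b₁ = Sxy / Sxx and b₀ = ȳ − b₁x̄ (the intercept b₀ and the variable names are not taken from these notes):

```python
xs = [1, 2, 3, 4, 5]   # invented predictor values
ys = [2, 4, 5, 4, 5]   # invented outcomes

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sum of cross-products and sum of squares of x around their means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)

b1 = sxy / sxx               # slope: change in y per 1-unit change in x
b0 = mean_y - b1 * mean_x    # intercept: predicted y when x = 0

print(b1, b0)  # approximately 0.6 and 2.2
```

Here a 1-unit increase in x is associated with a predicted increase of about 0.6 in y.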
What does SST represent?
Total variability in the dependent variable.
What does SSM represent?
Variability explained by the regression model.
What does SSR represent?
Residual unexplained variability (error).
What is R²?
The proportion of total variability in y explained by x: R² = SSM / SST.
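The three sums of squares and R² can be sketched on a small invented data set. The fit is ordinary least squares; variable names are illustrative.

```python
xs = [1, 2, 3, 4, 5]   # invented predictor values
ys = [2, 4, 5, 4, 5]   # invented outcomes

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept.
b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
      / sum((x - mean_x) ** 2 for x in xs))
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * x for x in xs]

sst = sum((y - mean_y) ** 2 for y in ys)                  # total variability
ssm = sum((p - mean_y) ** 2 for p in predicted)           # explained by the model
ssr = sum((y - p) ** 2 for y, p in zip(ys, predicted))    # residual (unexplained)

r_squared = ssm / sst

print(sst, ssm, ssr)   # approximately 6.0, 3.6, 2.4, so SST = SSM + SSR
print(r_squared)       # approximately 0.6
```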
How is model fit tested?
Using an F-ratio: F = MS_model / MS_residual.
How is the significance of the slope b₁ tested?
t = b₁ / SE(b₁), with df = N − 2.
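Both test statistics can be sketched on invented data. The sketch assumes the standard simple-regression definitions, which these notes do not spell out: df_model = 1, df_residual = N − 2, and SE(b₁) = sqrt(MS_residual / Sxx). P-values are left out because they require the F and t distributions.

```python
import math

xs = [1, 2, 3, 4, 5]   # invented predictor values
ys = [2, 4, 5, 4, 5]   # invented outcomes

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * x for x in xs]

ssm = sum((p - mean_y) ** 2 for p in predicted)
ssr = sum((y - p) ** 2 for y, p in zip(ys, predicted))

df_model = 1              # one predictor
df_residual = n - 2       # N - 2, as in the notes

ms_model = ssm / df_model
ms_residual = ssr / df_residual
f_ratio = ms_model / ms_residual

se_b1 = math.sqrt(ms_residual / sxx)   # standard error of the slope
t_stat = b1 / se_b1

print(f_ratio)        # approximately 4.5
print(t_stat)         # approximately 2.12
print(t_stat ** 2)    # matches F when there is a single predictor
```

With one predictor the two tests are equivalent: t² equals F, so they lead to the same conclusion about the slope.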
What does homoscedasticity mean?
Residual variance is similar across all x values.
Name three ways to reduce bias in a study.
Double-blinding, coded treatments, coded analysis.