Correlation Flashcards
(14 cards)
What does Pearson’s correlation coefficient (r) measure?
The strength and direction of the linear relationship between two variables.
Range:
−1 (perfect negative) to
+1 (perfect positive).
Interpretation:
r = 0: No linear relationship.
r = 0.868 (e.g., LLL vs. height): Strong positive relationship.
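As a quick illustration, r can be computed from its definition, r = cov(X, Y) / (sd(X) · sd(Y)), with plain Python. The data below are made-up illustration values, not from the cards:

```python
def pearson_r(x, y):
    """Pearson correlation from raw sums of squares (no libraries)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# A perfectly linear relationship gives r = +1 (or -1 if decreasing):
r_up = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])    # 1.0
r_down = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])  # -1.0
```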
How is covariance different from correlation?
Covariance: Measures direction but not strength (units-dependent).
Correlation: Standardized covariance (unitless, −1≤r≤1).
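The unit-dependence is easy to see numerically: rescaling X (say, metres to centimetres) multiplies the covariance by 100 but leaves the correlation unchanged. A minimal sketch with made-up data:

```python
def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    # Sample covariance (divides by n - 1); units are units(x) * units(y)
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def corr(x, y):
    # Standardized covariance: unitless, always in [-1, 1]
    return cov(x, y) / (cov(x, x) * cov(y, y)) ** 0.5

x_m  = [1.0, 2.0, 3.0, 4.0]        # illustrative values in metres
x_cm = [100 * v for v in x_m]      # the same values in centimetres
y    = [2.0, 3.0, 5.0, 7.0]

# cov changes with the units of x; corr does not:
# cov(x_cm, y) == 100 * cov(x_m, y), while corr(x_cm, y) == corr(x_m, y)
```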
How do you test if a correlation is significant?
H0:ρ=0 (no correlation).
H1:ρ≠0
t = r × sqrt(n−2) / sqrt(1−r^2), with n−2 degrees of freedom
In R: cor.test(x, y)
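Outside R, the same t statistic is easy to compute by hand; a sketch using the cards' r = 0.868 with a hypothetical sample size of n = 20 (the p-value would then come from a t distribution with n − 2 = 18 df):

```python
import math

def t_stat(r, n):
    # t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t = t_stat(0.868, 20)   # ~7.42, far beyond the ~2.10 two-sided 5% cutoff at df = 18
```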
Does correlation imply causation?
No! Possible explanations:
Chance (spurious correlation).
Confounding (a third variable influences both X and Y).
True causation.
What is the simple linear regression equation?
Yi=β0+β1Xi+ϵi
What is the least squares criterion?
Minimizes the sum of squared residuals:
SSRes = ∑(yi−y^i)^2
Residuals: Vertical distances from data points to the regression line.
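The least-squares solution for simple regression has a closed form: β^1 = ∑(xi−x̄)(yi−ȳ) / ∑(xi−x̄)^2 and β^0 = ȳ − β^1·x̄. A minimal sketch with made-up data that lie exactly on y = 1 + 2x:

```python
def fit_ols(x, y):
    """Closed-form least squares for y = b0 + b1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

b0, b1 = fit_ols([1, 2, 3, 4], [3, 5, 7, 9])   # recovers b0 = 1, b1 = 2
```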
How is model variance (σe^2) estimated?
σ^e^2 = SSres / (n−2) = 1/(n−2) ∑(yi−y^i)^2
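Given fitted coefficients (here a hypothetical line y^ = 2x, not from the cards), the estimate is just the residual sum of squares over n − 2:

```python
def sigma2_hat(x, y, b0, b1):
    # SSres / (n - 2): two degrees of freedom are lost estimating b0 and b1
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return sum(r * r for r in residuals) / (len(x) - 2)

# Made-up data scattered around y = 2x (residuals +-0.1, +-0.2):
s2 = sigma2_hat([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8], b0=0.0, b1=2.0)
# SSres = 0.01 + 0.01 + 0.04 + 0.04 = 0.10, so s2 = 0.10 / 2 = 0.05
```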
What does R-squared (R^2) measure?
Proportion of variance in Y explained by X:
R^2 = SSmodel / SStotal = 1 − SSres / SStotal
Example: R^2=0.75 → 75% of variation in Y is explained by X
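A sketch computing R^2 directly from observed and fitted values (made-up numbers, not from the cards):

```python
def r_squared(y, y_hat):
    # R^2 = 1 - SSres / SStotal
    my = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1 - ss_res / ss_tot

r2 = r_squared([1, 3, 4, 6], [1.2, 2.8, 4.2, 5.8])   # ~0.988
```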
When is the intercept (β0) meaningless?
When X=0 is outside the data range (e.g., negative folate levels).
How do you predict values using regression?
Plug X into the fitted equation: Y^= β^0+β^1X
What are the assumptions of linear regression?
Linearity: Relationship between X and Y is linear.
Independence: Residuals are uncorrelated.
Homoscedasticity: Constant residual variance.
Normality: Residuals ∼ N(0, σ^2).
How do you check regression assumptions in R?
plot(model) # Check:
1. Residuals vs. Fitted (linearity).
2. Q-Q Plot (normality).
3. Scale-Location (homoscedasticity).
What is the F-test in regression used for?
Tests if the model explains significant variance:
H0: All slopes = 0.
H1 : At least one slope ≠0
R output: F-statistic and p-value.
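For simple regression (p = 1 slope), the F statistic is the model mean square over the residual mean square, and equals t^2 for the slope. A sketch with hypothetical sums of squares (not from the cards):

```python
def f_stat(ss_model, ss_res, n, p=1):
    # F = (SSmodel / p) / (SSres / (n - p - 1))
    return (ss_model / p) / (ss_res / (n - p - 1))

# Hypothetical decomposition: SStotal = 13.0 = 12.84 (model) + 0.16 (residual)
f = f_stat(ss_model=12.84, ss_res=0.16, n=4)   # 12.84 / (0.16 / 2) = 160.5
```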
Why avoid extrapolation in regression?
Predictions outside the observed X range may be invalid (e.g., SST = 150°C).