POWERPOINT 6 Flashcards
any function that takes X as input and outputs Y, the predicted response at X
Ex: least squares line
prediction rule
unpredictable effects: errors that are uncorrelated with X
inaccuracies in specifying Y’: would we get the same line if we had seen a different collection of houses?
sources of uncertainty
probable range for Y-values given X
prediction interval
Y = β0 + β1X + ε
β0, β1 (linear pattern)
σ (variation around the line)
simple linear regression model
about 95% of the probability of a normal distribution lies within μ ± 2σ, so PI = β0 + β1X ± 2σ
95% prediction interval
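A minimal sketch of the rule above, assuming β0, β1 and σ are known. The function name and the parameter values are hypothetical, chosen only for illustration.

```python
def prediction_interval_95(x, beta0, beta1, sigma):
    """Return (lower, upper) for the approximate 95% prediction interval at x."""
    center = beta0 + beta1 * x   # point on the regression line at x
    margin = 2 * sigma           # about 95% of a normal lies within 2 sd
    return center - margin, center + margin

# Hypothetical parameter values, not taken from any real data set.
low, high = prediction_interval_95(x=2.0, beta0=40.0, beta1=35.0, sigma=10.0)
print(low, high)
```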
means that knowing εi doesn’t affect your views about εj
independence
means that we are using the same normal distribution for every εi
identically distributed
- Mean of Y is linear in X.
- Error terms (deviations from line) are normally distributed (very few deviations are more than 2 sd away from the regression mean).
- Error terms have constant variance.
key characteristics of linear regression model
β̂1 = b1 = rxy × sy/sx
β̂0 = b0 = Ȳ − b1 X̄
we use least squares to estimate β0 and β1
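The two estimates can be computed directly from summary statistics. This is a sketch on a small made-up data set (the x and y values are hypothetical), using only the Python standard library.

```python
from statistics import mean, stdev

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)    # sample standard deviations (divide by n - 1)

# Sample correlation r_xy, computed from its definition.
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
r_xy = cov_xy / (s_x * s_y)

b1 = r_xy * s_y / s_x            # slope estimate
b0 = y_bar - b1 * x_bar          # intercept estimate
print(round(b1, 2), round(b0, 2))
```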
s^2 = (1/(n − 2)) × Σ ei^2 (sum over i = 1, ..., n), where ei = Yi − b0 − b1Xi are the residuals
estimation of variation
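As a sketch, s^2 is the sum of squared residuals divided by n − 2 (two degrees of freedom are used up by b0 and b1). The data below are hypothetical.

```python
# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares fit.
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

# Residuals: observed Y minus fitted value on the line.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

s2 = sum(e ** 2 for e in residuals) / (n - 2)   # estimate of the error variance
s = s2 ** 0.5                                   # estimate of sigma
print(round(s2, 4), round(s, 4))
```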
the standard deviation of an estimate; it determines how close b1 is to β1
sb1^2 = s^2 / Σ(Xi − X̄)^2 = s^2 / ((n − 1) sx^2)
standard error of b1
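A short sketch of the slope's standard error, assuming the usual formula sb1 = s / sqrt(Σ(Xi − X̄)^2). The function name and the input values are hypothetical.

```python
from math import sqrt

def standard_error_b1(s, x):
    """Standard error of the slope, given residual sd s and the X values."""
    x_bar = sum(x) / len(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)   # equals (n - 1) * sx^2
    return s / sqrt(sxx)

# Hypothetical residual standard deviation and X values.
print(standard_error_b1(s=0.5, x=[1, 2, 3, 4, 5]))
```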
intercept is also normal and unbiased
sampling distribution of b0
- normal and unbiased
- as the sample size n increases, we get more certain about b1
- as the error variance s^2 increases, we get less certain about b1
- as the spread of X increases (sx), we get more certain about b1
sampling distribution of b1
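One way to see the sampling distribution is a small simulation: draw many data sets from a model with known parameters, refit the slope each time, and look at the spread of the estimates. The "true" values below (β0 = 1, β1 = 2, σ = 1) and the X grid are assumptions made only for this sketch.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is reproducible

# Assumed true model for the simulation: Y = 1 + 2X + eps, eps ~ N(0, 1).
beta0, beta1, sigma = 1.0, 2.0, 1.0
x = list(range(1, 11))
x_bar = mean(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)

def simulate_b1():
    """Draw one data set from the model and return its least squares slope."""
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    y_bar = mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

slopes = [simulate_b1() for _ in range(2000)]
print("mean of b1:", round(mean(slopes), 3))              # should be near beta1
print("sd of b1:", round(stdev(slopes), 3))               # spread of the estimates
print("sigma / sqrt(Sxx):", round(sigma / sxx ** 0.5, 3)) # theoretical sd of b1
```

The average of the simulated slopes should sit close to β1 (unbiasedness), and their standard deviation should sit close to σ / sqrt(Σ(Xi − X̄)^2).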
sb0^2 = var(b0) = s^2 × (1/n + X̄^2 / ((n − 1) sx^2))
standard error of b0
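The intercept's standard error can be sketched the same way, using (n − 1) sx^2 = Σ(Xi − X̄)^2. The function name and the input values are hypothetical.

```python
from math import sqrt

def standard_error_b0(s, x):
    """Standard error of the intercept, given residual sd s and the X values."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)   # equals (n - 1) * sx^2
    return s * sqrt(1 / n + x_bar ** 2 / sxx)

# Hypothetical residual standard deviation and X values.
print(standard_error_b0(s=0.5, x=[1, 2, 3, 4, 5]))
```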
The confidence interval provides you with a set of plausible values for the parameters
Since b1 ∼ N(β1, sb1^2), thus:
68% Confidence Interval: b1 ± 1 × sb1
95% Confidence Interval: b1 ± 2 × sb1
99% Confidence Interval: b1 ± 3 × sb1
Same thing for b0:
95% Confidence Interval: b0 ± 2 × sb0
Confidence Intervals
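A minimal sketch of these intervals as "estimate ± k standard errors". The helper name and the values of b1 and sb1 are hypothetical.

```python
def confidence_interval(estimate, std_error, k=2):
    """Return (lower, upper) = estimate -/+ k standard errors."""
    return estimate - k * std_error, estimate + k * std_error

# Hypothetical slope estimate and standard error.
b1, sb1 = 1.99, 0.06

for level, k in [("68%", 1), ("95%", 2), ("99%", 3)]:
    low, high = confidence_interval(b1, sb1, k)
    print(f"{level} CI for the slope: ({low:.2f}, {high:.2f})")
```

The same helper applies to the intercept by passing b0 and sb0 instead.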