Regression Analysis uses a _______model to predict a ______variable (dv) by using one or more _______variables (iv).

Statistical

Response

Predictor

In regression analysis, β_{0} and β_{1} are called_______

parameters

What are the four steps of hypothesis testing?

**Step 1**: State the hypotheses.

__one-sided__: H_{0}: μ ≤ μ_{0} H_{a}: μ > μ_{0} (for a slope: H_{0}: β_{1} ≤ 0 H_{a}: β_{1} > 0)

__two-sided__: H_{0}: μ = μ_{0} H_{a}: μ ≠ μ_{0} (for a slope: H_{0}: β_{1} = 0 H_{a}: β_{1} ≠ 0; H_{0} says there is no linear association between x and y – x is not useful for predicting y)

**Step 2**: Compute the test statistic.

t* = (x̄ - μ_{0})/(s⁄√n) with df = n-1 (for a mean)

t* = b_{1}/s{b_{1}} with df = n-2 (for a slope)

**Step 3**: Find the critical value: t{1-α, df} OR t{1-α/2, df}

**Step 4**: If t* ≥ +crit val or t* ≤ -crit val, reject H_{0}
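The slope test in Steps 2–4 can be sketched in Python. Everything here is an assumption for illustration: the data (a tv-hours/GPA-style toy set), α = .05, and the tabled critical value t{.975, df = 3} = 3.182.

```python
import math

# Hypothetical data (made up for illustration)
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_xy / ss_xx               # estimated slope
b0 = y_bar - b1 * x_bar          # estimated intercept

# s = sqrt(SSE / (n - 2)) = sqrt(MSE)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Step 2: t* = b1 / s{b1}, where s{b1} = s / sqrt(SS_xx)
s_b1 = s / math.sqrt(ss_xx)
t_star = b1 / s_b1

# Step 4: reject H0 if |t*| >= critical value t{1 - alpha/2, n - 2}
t_crit = 3.182   # tabled value for alpha = .05, df = 3 (assumed two-sided test)
reject_h0 = abs(t_star) >= t_crit
```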

What is the simple linear regression model?

Y=β_{0}+β_{1}X_{1}+ε

In linear regression, E(ε)=

0

In linear regression, σ^{2} {ε}=

σ^{2}

In linear regression, ε’s __are/are not__ correlated and have covariance of ___.

ε’s are uncorrelated and have covariance of 0

Least Squares Estimates of betas _____ the sum

∑_{i=1}^{n} [y_{i} - (β_{0} + β_{1}x_{i})]^{2}

minimize
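A quick numeric check of the "minimize" answer, on made-up data: the least-squares (b_{0}, b_{1}) gives a sum of squares no larger than any nearby perturbed pair.

```python
# Check numerically that the least-squares estimates minimize
# sum_i [y_i - (beta0 + beta1 * x_i)]^2, using hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

def q(beta0, beta1):
    """Sum of squared deviations from the line beta0 + beta1 * x."""
    return sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))

# Perturbing either coefficient can only increase the criterion
assert all(q(b0, b1) <= q(b0 + d0, b1 + d1)
           for d0 in (-0.5, 0, 0.5) for d1 in (-0.5, 0, 0.5))
```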

Interpretation of β_{1}

Y=β_{0}+β_{1}X_{1}+ε

For each one-unit increase in x, the mean of y increases/decreases by β_{1} units.

(e.g., For each add’l hour a student watches tv, he loses .2 GPA points)

Interpretation of β_{0}

Y=β_{0}+β_{1}X_{1}+ε

The mean of y when x=0

(e.g., On average, first year students who don’t watch tv have a GPA of 3.9)

ŷ is the ____ regression line.

estimated

b_{1} and b_{0} are estimates for

β_{1} and β_{0}

What is the equation for b_{1}

(SS_{xy})/(SS_{xx})

What is the equation for b_{0}

ȳ - b_{1}x̄

What is the equation for SS_{xx}

All of the following equations are equal

∑(x_{i} - x̄)^{2}

(∑x_{i}^{2}) - n(x̄)^{2}

(n-1)s_{x}^{2}

SS_{xx} must be __positive/negative__.

positive

What is the equation for SS_{xy}

∑(x_{i} - x̄)(y_{i} - ȳ)

(∑x_{i}y_{i}) - n(x̄)(ȳ)

When creating a table for an estimated regression line, which 5 columns should you include?

x_{i} | y_{i } | x_{i}^{2 } | y_{i}^{2} | x_{i}y_{i}
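Those five columns give exactly the sums the shortcut formulas need. A sketch with hypothetical numbers (the data values are assumptions):

```python
# Build the column sums and use the shortcut formulas
# SS_xx = (sum x_i^2) - n*x_bar^2 and SS_xy = (sum x_i*y_i) - n*x_bar*y_bar.
x = [1, 2, 3, 4, 5]            # hypothetical predictor values
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # hypothetical responses
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(xi ** 2 for xi in x)              # the x_i^2 column
sum_y2 = sum(yi ** 2 for yi in y)              # the y_i^2 column
sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # the x_i*y_i column

x_bar, y_bar = sum_x / n, sum_y / n
ss_xx = sum_x2 - n * x_bar ** 2
ss_xy = sum_xy - n * x_bar * y_bar

b1 = ss_xy / ss_xx           # slope estimate
b0 = y_bar - b1 * x_bar      # intercept estimate
```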

What is the equation for the residual ε_{i}

ε_{i} = y_{i}-E(y_{i})

What is the equation for the residual e_{i}

e_{i} = y_{i} - ŷ_{i}

s^{2} is the ________

estimate of the error variance σ^{2}

What is the equation for s^{2}

All of the equations below are equal

(∑(y_{i} - ŷ_{i})^{2}) / (n-2)

SSE/(n-2)

MSE

s is the ____________

sample standard deviation

What is the equation for s

√MSE

√(SSE/(n-2))

SSE is

The sum of the squared errors

What is the equation for SSE

All of the equations below are equal

∑e_{i}^{2}

∑(y_{i} - ŷ_{i})^{2}

SS_{yy} - b_{1}^{2}SS_{xx}
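The three SSE forms can be checked numerically on made-up data, along with s = √(SSE/(n-2)); the data values are assumptions:

```python
import math

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]
e = [yi - yh for yi, yh in zip(y, y_hat)]   # residuals e_i = y_i - y_hat_i

sse_1 = sum(ei ** 2 for ei in e)                               # sum e_i^2
sse_2 = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))        # sum (y_i - y_hat_i)^2
sse_3 = ss_yy - b1 ** 2 * ss_xx                                # SS_yy - b1^2 * SS_xx

s = math.sqrt(sse_1 / (n - 2))   # sample estimate of sigma: sqrt(MSE)
```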

What does s^{2}=.045 and s= .212 mean?

If the distribution of GPA for people who watch x hours of tv is approximately normal, then about 95% of them are expected to have GPAs within 2(.212) units of the value predicted by the simple linear regression model

You should assume ____ for hypothesis testing and confidence intervals

normality

b_{1} and b_{0} are _______ for β_{1} and β_{0}

least squares estimators

Why do you want to have a large range of data?

The more variation you have in x, the better estimate of the slope you can get.

sampling distribution of b_{1}?

(b_{1} - β_{1})/s{b_{1}} has a t-distribution with df = n-2,

because we estimate b_{0} and b_{1}

What does it mean to have a 95% CI?

If we took 100 samples of size n and built a CI from each, we would expect 95% of the resulting intervals to contain the value β_{1}

Interpretation: we are 95% confident that the interval contains β_{1} (not that 95% of b_{1}'s fall in this range)
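The "95% of intervals contain β_{1}" reading can be checked by simulation. Everything below is an assumption for illustration: the true model, the design points, and the tabled value t{.975, df = 18} = 2.1009.

```python
import random

# Simulate many samples from a known model y = beta0 + beta1*x + eps and
# count how often the 95% CI for the slope actually contains beta1.
random.seed(1)
beta0, beta1, sigma = 3.9, -0.2, 0.3   # true model (made up)
x = [i % 5 for i in range(20)]         # fixed design, n = 20
n = len(x)
t_crit = 2.1009                        # t{.975, df = 18} from a t-table

hits, reps = 0, 2000
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_b1 = (sse / (n - 2)) ** 0.5 / ss_xx ** 0.5
    if b1 - t_crit * s_b1 <= beta1 <= b1 + t_crit * s_b1:
        hits += 1

coverage = hits / reps   # should come out close to 0.95
```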

What is Interval Estimation?

CI for mean of Y when x=x_{h}

SSTo

the error/variation when not using any model at all; it never changes when a different model or new variables are used; the total variation around ȳ

SSE

the error/variation left when using SLR; the variation in y not explained by x (if this is too high, the model has too much error)

SSR

The reduction in error from fitting the model; the chunk of variation in y explained by using x (we want this to be large)

What are the components of the ANOVA table?

| Source of Variation | SS | df | MS |
| --- | --- | --- | --- |
| Regression | SSR | 1 | MSR = SSR/1 |
| Error | SSE | n-2 | MSE = SSE/(n-2) |
| Total | SSTo | n-1 | |
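The table entries can be filled in numerically on made-up data, checking that SSTo = SSR + SSE (the data values are assumptions):

```python
# Fill in the ANOVA table for a hypothetical SLR fit.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

ssto = sum((yi - y_bar) ** 2 for yi in y)                      # Total, df = n-1
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # Error, df = n-2
ssr = ssto - sse                                               # Regression, df = 1

msr = ssr / 1
mse = sse / (n - 2)
f_star = msr / mse   # the F-statistic for model usefulness
```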

What does an F-test for model usefulness tell us

whether the model is statistically significant (evidence that β_{1} ≠ 0), but a significant result does not by itself mean the model predicts well

What are the four steps in conducting an F-test

**Step 1**:

__two-sided__: H_{0}:β_{1}=0 H_{a}: β_{1} ≠ 0

**Step 2**: F*=MSR/MSE =SSR/MSE (all are always positive; want F* to be >1)

**Step 3**: F {1-α, 1, n-2} (*numerator df always 1 in SLR)

**Step 4**: if F* > F {1-α, 1, n-2}, we reject H_{0}and we have evidence that the SLR model is useful

*****in SLR (one predictor variable), the t-test for β_{1}=0 is the same as the F-test

*****In SLR only: F* = (t*)^{2} and √(F_{crit}) = t_{crit}
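The F* = (t*)^{2} identity can be verified numerically; the data values below are assumptions:

```python
import math

# Check numerically that F* = (t*)^2 in SLR, using hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

t_star = b1 / (math.sqrt(mse) / math.sqrt(ss_xx))  # slope t-statistic
ssr = b1 ** 2 * ss_xx                              # SSR = b1^2 * SS_xx
f_star = (ssr / 1) / mse                           # F-statistic, MSR/MSE

assert abs(f_star - t_star ** 2) < 1e-9            # the SLR identity
```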