statistics - topic 7 - bivariate regression Flashcards
(34 cards)
what is regression analysis used to do?
Explain the impact of changes in an independent variable on a dependent variable
Predict the value of a dependent variable based on the value of at least one independent variable
what is the dependent variable?
The dependent variable is the variable we wish to explain (also called the endogenous variable)
what is the independent variable?
The independent variable is the variable used to explain the dependent variable (also called the exogenous or explanatory variable)
what does the population regression model show?
it shows the relationship between two variables
what are the components of a population regression model?
it has a dependent variable, population intercept, population slope coefficient, independent variable and an error term
what is the sample data used for?
Sample data is used to provide an estimate of the population regression model
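Written out, the two models might look as follows. This is a sketch in the standard textbook notation, which these cards are assumed to follow: $\beta_0$ and $\beta_1$ are the population intercept and slope, $\varepsilon_i$ is the error term, and $b_0$, $b_1$ are the sample estimates.

```latex
\begin{aligned}
y_i &= \beta_0 + \beta_1 x_i + \varepsilon_i && \text{(population regression model)} \\
\hat{y}_i &= b_0 + b_1 x_i && \text{(model estimated from sample data)}
\end{aligned}
```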
what are the assumptions required for the least squares estimation to be an accurate estimate?
The true relationship is linear ($Y$ is a linear function of $X$, plus a random error)
The error term, $\varepsilon_i$, is uncorrelated with the random variable, $X$
The error term, $\varepsilon_i$, has a mean of 0 and constant variance, $\sigma^2$ (the latter property is called homoscedasticity):
$E[\varepsilon_i] = 0$ and $E[\varepsilon_i^2] = \sigma^2$ for $i = 1, \dots, n$
The error terms, $\varepsilon_i$, are not correlated with one another, so that:
$E[\varepsilon_i \varepsilon_j] = 0$ for all $i \neq j$
what is the least squares method?
Least squares provides estimates of $\beta_0$ and $\beta_1$ by finding the values of $b_0$ and $b_1$ that minimize the sum of the squared errors (SSE):
$\min SSE = \min \sum_{i=1}^{n} e_i^2 = \min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \min \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i)]^2$
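One common way to carry out this minimization, sketched here as a reminder rather than as the method these cards necessarily use, is to set the partial derivatives of SSE with respect to $b_0$ and $b_1$ to zero:

```latex
\begin{aligned}
\frac{\partial SSE}{\partial b_0} &= -2 \sum_{i=1}^{n} \left[ y_i - (b_0 + b_1 x_i) \right] = 0 \\
\frac{\partial SSE}{\partial b_1} &= -2 \sum_{i=1}^{n} x_i \left[ y_i - (b_0 + b_1 x_i) \right] = 0
\end{aligned}
```

The first equation rearranges to $b_0 = \bar{y} - b_1 \bar{x}$. Substituting that into the second gives $b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$.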
what is the equation for b1 in the least squares coefficient estimator?
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{Cov(x, y)}{s_x^2} = r \cdot \frac{s_y}{s_x}$
where $r$ is $Corr(x, y)$
what is the intercept of the regression line after you have estimated b1?
$b_0 = \bar{y} - b_1 \bar{x}$
because the regression line goes through the sample means $(\bar{x}, \bar{y})$
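The slope and intercept formulas can be checked on a small example. The sketch below is illustrative only: the function name `fit_line` and the five data points are made up for this purpose and do not come from the course material.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    # The line passes through (x_bar, y_bar), which fixes the intercept.
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical sample, chosen so the arithmetic is easy to follow by hand.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(round(b0, 4), round(b1, 4))
```

For this sample the means are 3 and 4, the slope works out to 6/10 = 0.6, and the intercept to 4 - 0.6 × 3 = 2.2.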
what are the two parts of the variation in a dependent variable?
the total sum of squares (SST) is equal to the regression sum of squares (SSR) plus the error sum of squares (SSE)
what is the formula for the regression sum of squares?
$SSR = \sum (\hat{y}_i - \bar{y})^2$ where $\hat{y}_i$ = predicted value of the dependent variable given $X = x_i$ and $\bar{y}$ = sample mean of the dependent variable
what is the formula for the error sum of squares?
$SSE = \sum (y_i - \hat{y}_i)^2$ where $y_i$ = observed value of the dependent variable and $\hat{y}_i$ = predicted value of the dependent variable given $X = x_i$
what is the coefficient of determination?
it is the proportion of the total variation in the dependent variable that is explained by variation in the independent variable
what is the formula for the coefficient of determination?
$R^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
what range does the coefficient of determination lie in, and what is it equal to?
$0 \le R^2 \le 1$ and $R^2 = r^2$ where $r$ denotes the correlation coefficient
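The decomposition SST = SSR + SSE and the identity $R^2 = r^2$ can be verified numerically. This is a minimal sketch: the helper names `sums_of_squares` and `correlation` and the sample data are assumptions made for illustration.

```python
import math


def sums_of_squares(x, y):
    """Return (SST, SSR, SSE) for the least squares line fitted to x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation
    return sst, ssr, sse


def correlation(x, y):
    """Sample correlation coefficient r between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical sample data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, ssr, sse = sums_of_squares(x, y)
r_squared = ssr / sst
r = correlation(x, y)
print(round(sst, 4), round(ssr, 4), round(sse, 4), round(r_squared, 4))
```

With these numbers SST is 6, SSR is about 3.6 and SSE about 2.4, so roughly 60% of the variation in y is explained by x.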
what does it mean when R^2 is equal to 1?
When $R^2 = 1$, there is a perfect linear relationship between $X$ and $Y$: 100% of the variation in $Y$ is explained by variation in $X$
what does it mean when R^2 is equal to 0?
When $R^2 = 0$, there is no linear relationship between $X$ and $Y$: the value of $Y$ does not depend linearly on $X$
what is the standard deviation of e_i and why is this the case?
The standard deviation of $e_i$ is:
$s_e = \sqrt{\frac{\sum e_i^2}{n - 2}} = \sqrt{\frac{SSE}{n - 2}}$
Division is by $n - 2$ instead of $n - 1$ because the estimated regression model contains two estimated coefficients, $b_0$ and $b_1$
what is the standard deviation for b1?
The standard deviation of $b_1$ is:
$s_{b_1} = \sqrt{\frac{s_e^2}{\sum (x_i - \bar{x})^2}}$
what does the standard deviation of b1 show?
It is a measure of variation in the slope of regression lines from different samples
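Both standard deviations can be computed directly from the residuals. The sketch below assumes a hypothetical helper name, `slope_standard_error`, and made-up sample data.

```python
import math


def slope_standard_error(x, y):
    """Return (b1, s_e, s_b1) for the least squares line fitted to x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    # Residuals e_i = y_i - y_hat_i, squared and summed to give SSE.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # n - 2 degrees of freedom: two coefficients (b0, b1) were estimated.
    s_e = math.sqrt(sse / (n - 2))
    s_b1 = math.sqrt(s_e ** 2 / s_xx)
    return b1, s_e, s_b1


# Hypothetical sample data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, s_e, s_b1 = slope_standard_error(x, y)
print(round(s_e, 4), round(s_b1, 4))
```

Here SSE is about 2.4 with n = 5, so $s_e$ is roughly 0.894 and $s_{b_1}$ roughly 0.283.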
what is the test statistic for a hypothesis test of a slope?
$t = \frac{b_1}{s_{b_1}}$ (for testing $H_0: \beta_1 = 0$, with $n - 2$ degrees of freedom)
what is the decision rule for the hypothesis test of a slope?
Reject $H_0$ if $t < -t_{n-2, \alpha/2}$ or $t > t_{n-2, \alpha/2}$
what is the formula for the confidence interval of a slope?
$b_1 \pm t_{n-2, \alpha/2} \, s_{b_1}$
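The test statistic, decision rule and confidence interval can be put together in one short routine. This is a sketch under stated assumptions: `slope_inference` is a hypothetical name, the data are made up, and the critical value is passed in by the caller (3.182 is the usual table value of $t_{3, 0.025}$) rather than computed.

```python
import math


def slope_inference(x, y, t_crit):
    """Two-sided test of H0: beta_1 = 0 plus a confidence interval for the slope.

    t_crit is the critical value t_{n-2, alpha/2}, looked up by the caller.
    Returns (t_stat, reject_h0, (lower, upper)).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_b1 = math.sqrt((sse / (n - 2)) / s_xx)
    t_stat = b1 / s_b1
    # Reject H0 when t falls in either tail beyond the critical value.
    reject_h0 = t_stat < -t_crit or t_stat > t_crit
    interval = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
    return t_stat, reject_h0, interval


# Hypothetical sample data; n = 5 gives 3 degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_CRIT = 3.182  # assumed table value for t_{3, 0.025} (alpha = 0.05)
t_stat, reject_h0, (lower, upper) = slope_inference(x, y, T_CRIT)
print(round(t_stat, 3), reject_h0, round(lower, 3), round(upper, 3))
```

For this sample t is about 2.12, which lies inside the critical values, so $H_0$ is not rejected at the 5% level. Consistently, the 95% interval of roughly -0.30 to 1.50 contains 0.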