simple linear regression Flashcards
dependent variable - y
the variable we are seeking to explain
independent variable - x
the explanatory variable
linear regression
assumes a linear relationship between the DV and IV
variation of y
SST = sum(yi - ȳ)^2 = total variation of y around its mean
to test how well x explains y, we compare it with SSE = sum(yi - ŷi)^2
ŷi = predicted value of y on the regression line
if SSE = SST, the regression line explains none of the variation in y
if SSE < SST, x explains part of the variation in y through the regression line
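a minimal sketch (hypothetical numbers) comparing the two sums of squares; np.polyfit is just one standard least-squares fitter:

```python
import numpy as np

# hypothetical (x, y) observations
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1, b0 = np.polyfit(x, y, 1)        # least-squares slope and intercept
y_hat = b0 + b1 * x                 # predicted values on the line

sst = np.sum((y - y.mean()) ** 2)   # total variation around the mean
sse = np.sum((y - y_hat) ** 2)      # unexplained variation around the line
print(f"SST = {sst:.3f}, SSE = {sse:.3f}")  # SSE < SST: x helps explain y
```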
yi = b0 + b1*xi + ei
b0 = intercept
b1 = slope coefficient = cov(x,y)/var(x)
both are known as regression coefficients
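a sketch of the coefficient formulas on the same hypothetical data (sample covariance over sample variance):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # slope = cov(x,y)/var(x)
b0 = y.mean() - b1 * x.mean()                        # line passes through (x̄, ȳ)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```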
least squares method
residuals = the vertical distances of the observed y values from the regression line
best fit line = minimizes the sum of the squared deviations between the observed values of y and the predicted values of y
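a quick check that the least-squares slope really minimizes the sum of squared residuals: perturbing the slope in either direction only increases it (hypothetical data again):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b1, b0 = np.polyfit(x, y, 1)

def sse_of(slope):
    # refit the intercept for each candidate slope, then sum squared residuals
    intercept = y.mean() - slope * x.mean()
    return np.sum((y - (intercept + slope * x)) ** 2)

for s in (b1 - 0.2, b1, b1 + 0.2):
    print(f"slope {s:.3f}: sum of squared residuals = {sse_of(s):.3f}")
# the middle (least-squares) slope gives the smallest value
```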
intercept and slope (interpretation)
b0 = the predicted value of y when x = 0
b1 = the expected change in y for a one-unit change in x
linearity
the relationship between x and y is linear in the parameters; b0 and b1 cannot be multiplied or divided by any other regression parameter
homoskedasticity
the variance of the error term is the same for all observations; if the error variance differs across observations, the assumption is violated (heteroskedasticity)
independence
the (x, y) pairs should be independent of each other; we should not be able to predict one observation's error from another's
normality
error is normally distributed
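a rough way to eyeball these assumptions from fitted residuals; scipy's Shapiro-Wilk test is one common normality check (hypothetical data):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1])

b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)           # residuals = estimates of the error term

# homoskedasticity: residual spread should look similar across x
print("residuals:", np.round(resid, 2))
# normality: Shapiro-Wilk test, H0 = residuals are normally distributed
stat, p = stats.shapiro(resid)
print(f"Shapiro-Wilk p-value = {p:.3f}")
```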
analysis of variance
- SST = SSE + SSR
- sum(yi - ȳ)^2 = sum(yi - ŷi)^2 + sum(ŷi - ȳ)^2
- total sum of squares = sum of squared errors (unexplained) + regression sum of squares (explained)
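numerically verifying the decomposition SST = SSE + SSR on the same hypothetical fit:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total
sse = np.sum((y - y_hat) ** 2)         # unexplained
ssr = np.sum((y_hat - y.mean()) ** 2)  # explained
print(f"SST = {sst:.4f}, SSE + SSR = {sse + ssr:.4f}")  # equal up to rounding
```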
coefficient of determination
measure of fit, not a statistical test (R^2)
R^2 = SSR/SST = explained variation / total variation
how much of the variation in the dependent variable is explained by the variation in the independent variable
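R^2 from the decomposition; in simple regression it also equals the squared correlation between x and y, which makes a handy cross-check:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

r2 = np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)  # SSR/SST
print(f"R^2 = {r2:.4f}")
print(f"corr(x, y)^2 = {np.corrcoef(x, y)[0, 1] ** 2:.4f}")  # same number
```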
F-test of the coefficients
F = MSR/MSE (a ratio of two variances)
F = (SSR/k) / (SSE/(n - (k + 1)))
k = number of slope coefficients
k + 1 = number of regression coefficients (slopes plus the intercept)
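a sketch of the F-statistic with k = 1 slope; scipy.stats.f.sf gives the right-tail p-value (hypothetical data):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n, k = len(x), 1                     # one slope coefficient
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ssr = np.sum((y_hat - y.mean()) ** 2)
sse = np.sum((y - y_hat) ** 2)
msr = ssr / k                        # mean square regression
mse = sse / (n - (k + 1))            # mean square error
F = msr / mse
p = stats.f.sf(F, k, n - (k + 1))    # right-tail p-value
print(f"F = {F:.2f}, p = {p:.4f}")
```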
Standard error of the estimate
SEE = (sum(yi - ŷi)^2 / (n - 2))^(1/2) = MSE^(1/2)
the lower the SEE, the more accurate the regression
SEE is the standard deviation of the error term
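SEE is just the square root of MSE; a short computation on the same hypothetical fit:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b1, b0 = np.polyfit(x, y, 1)

sse = np.sum((y - (b0 + b1 * x)) ** 2)
see = np.sqrt(sse / (len(x) - 2))    # standard error of the estimate
print(f"SEE = {see:.4f}")            # lower SEE = more accurate regression
```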
hypothesis test of b1
t = (b̂1 - b1) / s_b̂1, with n - 2 degrees of freedom
s_b̂1 = standard error of b̂1 = SEE / (sum(xi - x̄)^2)^(1/2)
H0: b1 = 0
Ha: b1 ≠ 0
b̂1 = the estimated slope coefficient from the sample
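putting the t-test together (hypothesized b1 = 0, two-tailed, n - 2 degrees of freedom); scipy.stats.t supplies the p-value:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)
b1, b0 = np.polyfit(x, y, 1)

see = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
s_b1 = see / np.sqrt(np.sum((x - x.mean()) ** 2))   # standard error of b1

t = (b1 - 0) / s_b1                   # test H0: b1 = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)  # two-tailed p-value
print(f"t = {t:.2f}, p = {p:.4f}")    # small p => reject H0, slope is significant
```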