Midterm Flashcards

Flashcards in Midterm Deck (90)


What does the error term ui represent?

The variation in yi that is not captured or explained by xi. This could include both unsystematic predictors of yi (e.g., a job application randomly landing on the top or bottom of a stack of other applications) and systematic determinants of yi (e.g., years of prior experience) that are omitted from the model



PRF vs. SRF

PRF: E(yi | xi) = β0 + β1xi
SRF: ŷi = β̂0 + β̂1xi


Errors and residuals for SRF

Notice that the SRF itself contains no residual term: ŷi is, by definition, a point on the fitted regression line (the same logic applies to the PRF and the error term ui)

The estimates of the errors, which are called the residuals, are the differences between observed values of yi and the predicted values yˆi :

ûi = yi − ŷi = yi − β̂0 − β̂1xi



OLS is the most commonly used estimator in the social sciences (to find beta)

OLS will be our workhorse estimator in this course

OLS obtains estimates of the “true” population parameters β0 and β1, which we typically do not observe

The logic of the OLS estimation procedure: choose β̂0 and β̂1 to minimize the sum of squared residuals, Σ ûi²
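A minimal numerical sketch of this procedure (the data and the "true" coefficients 3.0 and 0.5 are made up for illustration): the closed-form bivariate OLS formulas give the β̂'s, and the resulting residuals sum to zero by construction.

```python
import numpy as np

# Made-up data: the "true" parameters 3.0 and 0.5 are assumptions for this sketch
rng = np.random.default_rng(0)
x = rng.normal(10, 2, size=200)
y = 3.0 + 0.5 * x + rng.normal(0, 1, size=200)

# Closed-form bivariate OLS: b1 = cov(x, y) / var(x), b0 = ybar - b1 * xbar
b1_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0_hat = y.mean() - b1_hat * x.mean()

residuals = y - (b0_hat + b1_hat * x)
print(b0_hat, b1_hat)    # estimates near the true 3.0 and 0.5
print(residuals.sum())   # OLS residuals sum to (numerically) zero
```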


Why minimize the sum of squared residuals, instead of the sum of residuals or the absolute value of residuals?

If we use the sum of residuals, then residuals with different signs but similar magnitudes will cancel each other out

Minimizing the sum of the absolute values of the residuals (least absolute deviations) is a viable alternative, but it does not yield closed-form formulas for the resulting estimators


Relationship between PRF and SRF through residuals

yi = ŷi + ûi



Total Sum of Squares (SST): measure of the total sample variation in y: SST = Σ(yi − ȳ)²



Explained Sum of Squares (SSE): measure of the part of the variation in y explained by x: SSE = Σ(ŷi − ȳ)²



Residual Sum of Squares (SSR): the part of the variation in y left unexplained by x: SSR = Σ ûi². These satisfy SST = SSE + SSR, and R² = SSE/SST
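The three sums of squares can be checked numerically; a sketch on made-up data (all coefficient values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 100)
y = 2 + 1.5 * x + rng.normal(0, 1, 100)   # illustrative coefficients

# Bivariate OLS fit
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x
uhat = y - yhat

SST = ((y - y.mean()) ** 2).sum()      # total variation in y
SSE = ((yhat - y.mean()) ** 2).sum()   # variation explained by x
SSR = (uhat ** 2).sum()                # unexplained variation

print(np.isclose(SST, SSE + SSR))      # the decomposition SST = SSE + SSR holds
print(SSE / SST)                       # R-squared
```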


R² and magnitude of relationship between y and x

As a measure of correlation, R² should not be confused with the magnitude of the relationship between a DV and an IV

You can have a bivariate relationship with a high R² (i.e., high correlation) but a slope that is close to 0

You can also have a bivariate relationship with a low R² (i.e., low correlation) but a slope that is large in magnitude
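A quick simulation of both cases (the slopes, noise levels, and sample size are arbitrary choices for this sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 500)

def fit(y, x):
    # Returns the bivariate OLS slope and the R-squared of the fit
    b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    b0 = y.mean() - b1 * x.mean()
    r2 = 1 - ((y - b0 - b1 * x) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return b1, r2

# Tiny slope, almost no noise: R^2 near 1 even though the effect is negligible
y_small = 0.001 * x + rng.normal(0, 1e-5, 500)
# Large slope, very noisy: R^2 is low even though the effect is large
y_big = 10 * x + rng.normal(0, 50, 500)

b1_small, r2_small = fit(y_small, x)
b1_big, r2_big = fit(y_big, x)
print(b1_small, r2_small)   # slope near 0.001, R^2 near 1
print(b1_big, r2_big)       # slope near 10, R^2 well below 1
```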


What changes when you transform a regressor?

Bottom line: if we transform a regressor, then only the slope coefficient for that regressor is transformed
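A numerical check of the bottom line above (the rescaling factor of 100, e.g., a change of units, and the data are illustrative): multiplying a regressor by a constant divides its slope by that constant and leaves the intercept unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(5, 1, 100)
y = 1 + 2 * x + rng.normal(0, 1, 100)   # illustrative coefficients

def ols(y, x):
    # Bivariate OLS intercept and slope
    b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y.mean() - b1 * x.mean(), b1

b0, b1 = ols(y, x)
b0_s, b1_s = ols(y, 100 * x)        # transform the regressor: x -> 100x

print(np.isclose(b1_s, b1 / 100))   # True: only this slope is rescaled
print(np.isclose(b0_s, b0))         # True: the intercept is unchanged
```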


What happens if the relationship between wage and education is non-linear?

These patterns can be nicely modeled by re-defining the dependent and/or independent variables as natural logarithms

The linear regression model must be linear in the parameters, but not necessarily linear in the variables, so logging y or x does not violate the requirement of a linear relationship between the dependent variable and its determinants


Which of the following are linear regression models (i.e., linear in the parameters)?

y = β0 + β1x + u
log(y) = β0 + β1x + u
log(y) = β0 + β1log(x) + u
y = log(β0 + β1x + u)
e^y = β0 + β1√x + u
y = β0 + (β1x1)/(1 + β2x2) + u

y = β0 + β1x + u → Yes
log(y) = β0 + β1x + u → Yes
log(y) = β0 + β1log(x) + u → Yes
y = log(β0 + β1x + u) → Yes (1)
e^y = β0 + β1√x + u → Yes
y = β0 + (β1x1)/(1 + β2x2) + u → No

(1) If we exponentiate both sides of this equation, we get e^y = β0 + β1x + u, which is linear in the parameters


wage = β0 + β1educ + u (level-level)

1 additional year of education is associated with an increase in wages of β1 units


wage = β0 + β1log(educ) + u (level-log)

A 1% increase in education is associated with an increase in wages of β1/100 units

decreasing returns


log(wage) = β0 + β1educ + u (log-level)

1 additional year of education is associated with an approximate (100 · β1)% increase in wages

increasing returns


log(wage) = β0 + β1log(educ) + u (log-log)

A 1% increase in education is associated with a β1% increase in wages (β1 is an elasticity)
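These interpretations can be verified numerically; a sketch of the log-log case (the parameter values and the education level are made up): a 1% increase in educ moves wage by approximately β1 percent.

```python
import numpy as np

# Hypothetical log-log model: log(wage) = b0 + b1 * log(educ)
b0, b1 = 1.0, 0.8   # illustrative parameter values
educ = 12.0         # illustrative education level

wage = np.exp(b0 + b1 * np.log(educ))
wage_up = np.exp(b0 + b1 * np.log(educ * 1.01))   # 1% more education

pct_change = 100 * (wage_up / wage - 1)
print(pct_change)   # approximately b1 percent, i.e., about 0.8
```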


Assumption 1

Linearity in the parameters

the population model can be non-linear in the variables but must be linear in the parameters


Assumption 2

Random Sampling

individual observations are independently and identically distributed (i.i.d.): observations are randomly selected from the population such that each observation has the same probability of being selected, independent of which other observations were selected


Assumption 3

Sample Variation in the explanatory Variable

the sample standard deviation of xi must be greater than 0 (we need some variation in x in order to estimate a slope)


Assumption 4

Zero Conditional Mean

E(u | X) = 0

if it holds, then the error term u has mean zero at every value of X, which implies that u is uncorrelated with the regressor X

this assumption is usually the biggest area of concern in empirical analysis


Assumption 5

Homoskedasticity assumption

Var(u | X) = σ²

the variance of the unobservable error term, conditional on x, is assumed to be constant

Var(u) is independent of x


If assumptions 1-4 hold....

the OLS estimator is unbiased, meaning that on average across repeated samples, E(β̂) = β


If assumptions 1-5 hold

if 1-5 hold, then we can derive a formula for the sampling variance of the coefficient estimates: Var(β̂1) = σ² / Σ(xi − x̄)²


What drives the variance of the OLS slope estimate? What makes it more precise?

the lower the variation in the errors,

or the greater the variation in the independent variable,

or the greater the sample size (related to the previous point, because the total sample variation in x increases with the sample size),

then the more precise the OLS estimates, on average


Standard errors measure... (relationship with precision)

the precision (efficiency) of the estimate β̂1

ŝe(β̂1) is lower (i.e., the estimate is more precise) when:

the residuals are small
the variation of the independent variable is large
the number of observations is large
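These drivers of precision can be seen in a small Monte Carlo simulation (all parameter values are arbitrary choices for this sketch): the sampling standard deviation of the slope rises with the error variance and falls with the variance of x and with the sample size.

```python
import numpy as np

rng = np.random.default_rng(4)

def slope_sd(n, x_sd, u_sd, reps=2000):
    # Sampling standard deviation of the bivariate OLS slope across repeated samples
    slopes = []
    for _ in range(reps):
        x = rng.normal(0, x_sd, n)
        y = 1 + 2 * x + rng.normal(0, u_sd, n)
        slopes.append(np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1))
    return np.std(slopes)

base = slope_sd(n=50, x_sd=1, u_sd=1)
noisy = slope_sd(n=50, x_sd=1, u_sd=3)    # noisier errors -> less precise
wide_x = slope_sd(n=50, x_sd=3, u_sd=1)   # more variation in x -> more precise
big_n = slope_sd(n=500, x_sd=1, u_sd=1)   # larger sample -> more precise
print(base, noisy, wide_x, big_n)
```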


Motivation for multiple regression analysis

1) controlling for other factors:
even if you are primarily interested in estimating one parameter, including others in the regression controls for potentially confounding factors, so the zero conditional mean assumption is more likely to hold

2) better predictions:
more independent variables can explain more of the variation in y, meaning a potentially higher R²

3) Estimating non-linear relationships:
by including higher order terms of a variable, we can allow for a more flexible, non-linear functional form between the dependent variable and an independent variable of interest

4) Testing joint hypotheses on parameters:
can test whether multiple independent variables are jointly statistically significant


How does OLS make estimates?

it minimizes the sum of the squared residuals: OLS chooses the combination of all the β̂'s that yields the lowest sum of squared residuals


How do you isolate the variation that is unique to x3?

regress x3 on all the other regressors and obtain the residuals ê; these residuals contain the variation in x3 not explained by the other regressors in the initial population model, effectively holding all else constant

then we conduct a bivariate regression of y on ê
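A sketch of this partialling-out procedure on made-up data (the regressor correlations and coefficients are illustrative): the bivariate slope of y on the residualized x3 matches the multiple-regression coefficient on x3 (the Frisch-Waugh-Lovell result).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
x1 = rng.normal(0, 1, n)
x2 = 0.5 * x1 + rng.normal(0, 1, n)   # correlated with x1 on purpose
x3 = 0.3 * x1 + rng.normal(0, 1, n)
y = 1 + 2 * x1 + 0.5 * x2 - 1.5 * x3 + rng.normal(0, 1, n)

# Full multiple regression of y on (1, x1, x2, x3)
X = np.column_stack([np.ones(n), x1, x2, x3])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Step 1: regress x3 on the other regressors and keep the residuals e_hat
Z = np.column_stack([np.ones(n), x1, x2])
e_hat = x3 - Z @ np.linalg.lstsq(Z, x3, rcond=None)[0]

# Step 2: a bivariate regression of y on e_hat recovers the coefficient on x3
b3 = (e_hat @ y) / (e_hat @ e_hat)
print(np.isclose(b3, beta[3]))   # True
```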


What does β̂k represent, in terms of slope?

the partial association between y and xk, holding x1, x2, ..., x(k−1) equal

β̂k is the slope of the multi-dimensional regression plane along the xk direction (i.e., the expected change in y when xk increases by one unit, holding all other x's constant)