OLS Flashcards
(20 cards)
Linear Regression Model (LRM) assumptions:
1: y=Xβ+ε (Linearity)
2: E(ε|X)=0 (exogeneity; the regressors contain no information on the deviation of Yi from its conditional expectation)
3: Var(ε|X)=σ^2In (homoskedasticity)
4: rank(X)=rank(X’)=rank(XX’)=rank(X’X)=k
5: ε|X~N(0,σ^2In)
6: {(Yi,Xi):i=1,…,n} are independent and identically distributed
When is assumption 4 (full rank) not satisfied?
When n<k or when there is an exact linear relationship among the columns of X (perfect multicollinearity)
How to use LIE to show that E(ε|X)=0 implies that E(εX)=0?
E(εixi)=E(E(εixi|xi))=E(xiE(εi|xi))=E(xi·0)=0
How to estimate β in OLS approximation?
Choose the β^ that minimizes the sum of squared residuals S(β^)=(y-Xβ^)’(y-Xβ^). Setting the first-order condition S’(β^)=0 gives the normal equations X’Xβ^=X’y –> β^OLS=(X’X)^-1X’y
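A minimal numpy sketch of this computation; the simulated X and y below are made-up illustration data, and the normal equations are solved directly rather than forming the inverse explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # design matrix with intercept
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

# beta_hat = (X'X)^-1 X'y, computed by solving the normal equations X'X b = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
residuals = y - X @ beta_hat
```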
Form of projection matrix P and form of projection matrix M:
P=X(X’X)^-1X’
M=I-P
Properties of P and M (a numeric check follows this list):
1: P=P’, M=M’ (Symmetry)
2: P’P=P, M’M=M (Idempotent)
3: MP’=(I-P)P’=0 (M and P are orthogonal to each other)
4: Py=X(X’X)^-1X’y=y^ (P produces the fitted values)
5: My=(I-P)y=y-Py=y-y^=ε^ (M is the residual maker)
6: MX=0 (hence My^=MXβ^=0) and Mε^=ε^
7: PX=X
8: y=Iy=Py+My=y^+ε^ (decomposition into fitted values and residuals)
9: With M2=I-X2(X2’X2)^-1X2’, M2X1 are the residuals from the regression of X1 on X2, and M2X1 is orthogonal to X2 (X2’M2X1=0)
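A short numpy check of these properties on simulated data (everything here is illustrative; np.allclose handles floating-point error):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)   # P = X (X'X)^-1 X'
M = np.eye(n) - P                       # M = I - P (residual maker)

assert np.allclose(P, P.T) and np.allclose(M, M.T)        # 1: symmetry
assert np.allclose(P @ P, P) and np.allclose(M @ M, M)    # 2: idempotent
assert np.allclose(M @ P, np.zeros((n, n)))               # 3: MP = 0
assert np.allclose(P @ X, X)                              # 7: PX = X (so MX = 0)
y_hat, resid = P @ y, M @ y                               # 4, 5: fitted values and residuals
assert np.allclose(M @ y_hat, 0) and np.allclose(M @ resid, resid)  # 6
assert np.allclose(y, y_hat + resid)                      # 8: decomposition
```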
Frisch-Waugh-Lovell Theorem (FWL):
β1 gives the effect of X1 on y while controlling for X2; M2 enters the formula for β1. The effect of X2 can be ignored because M2X1 is orthogonal to X2, which is exactly what "controlling for other factors" means when interpreting a coefficient. We obtain: β1^OLS=(X1’M2X1)^-1X1’M2y
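A numpy sketch of FWL on simulated data: the coefficients on X1 from the full regression coincide with those from the partialled-out formula (all numbers below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X1 = rng.normal(size=(n, 2))                                  # regressors of interest
X2 = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # controls, incl. intercept
X = np.column_stack([X1, X2])
y = X @ np.array([1.0, -2.0, 0.5, 3.0, -1.0]) + rng.normal(size=n)

# Full regression: the first two coefficients belong to X1
beta_full = np.linalg.solve(X.T @ X, X.T @ y)[:2]

# FWL: partial X2 out, then use beta1 = (X1' M2 X1)^-1 X1' M2 y
M2 = np.eye(n) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)
beta_fwl = np.linalg.solve(X1.T @ M2 @ X1, X1.T @ M2 @ y)

assert np.allclose(beta_full, beta_fwl)
```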
How to prove that β^ is an unbiased estimator of β? (E(β^)=β)
β^=(X’X)^-1X’y=(X’X)^-1X’(Xβ+ε)=(X’X)^-1X’Xβ+(X’X)^-1X’ε=β+(X’X)^-1X’ε
Then E(β^|X)=E(β|X)+E((X’X)^-1X’ε|X)=β+(X’X)^-1X’E(ε|X)=β
Then by LIE: E(β^)=E(E(β^|X))=E(β)=β
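A small Monte Carlo sketch of this result (made-up design and parameters): averaging β^ across many error draws with X held fixed recovers β up to simulation noise.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # fixed design across replications
beta = np.array([1.0, 2.0, -0.5])

estimates = np.empty((reps, 3))
for r in range(reps):
    y = X @ beta + rng.normal(size=n)                 # fresh error draw, E(eps|X) = 0
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(estimates.mean(axis=0))   # approximately [1.0, 2.0, -0.5]
```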
Is s^2=(Σ(Xi-X̄)^2/(n-1)) an unbiased estimator for σ^2?
Yes. Use E(Xi^2)=σ^2+μ^2 and E(X̄^2)=σ^2/n+μ^2 (from Var(X̄)=E(X̄^2)-(E(X̄))^2). Expanding gives E(Σ(Xi-X̄)^2)=(n-1)σ^2, so E(Σ(Xi-X̄)^2/(n-1))=σ^2
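A quick simulation sketch of the same claim (illustrative numbers; ddof=1 gives the n-1 divisor):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma2, n, reps = 4.0, 10, 100_000
samples = rng.normal(loc=5.0, scale=np.sqrt(sigma2), size=(reps, n))
s2 = samples.var(axis=1, ddof=1)   # sum of squared deviations divided by n-1
print(s2.mean())                   # approximately sigma2 = 4.0
```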
β^ is the best unbiased estimator if the following criterion holds:
For any other unbiased estimator β0^: Var(β0^|X)≥Var(β^|X), i.e. Var(β0^|X)-Var(β^|X) is positive semidefinite
Gauss-Markov theorem:
In the LRM with regressor matrix X, the OLS estimator β^ is the best (minimum variance) linear unbiased estimator of β (BLUE)
β^ is multivariate normal when:
the errors are normal (assumption 5): β^=β+(X’X)^-1X’ε is then a linear combination of normal random variables. Then: β^|X~N(β,σ^2(X’X)^-1)
β^ is consistent if:
β^–>β in probability as n–>∞
β^ is asymptotically efficient if:
among all consistent estimators, the asymptotic variance of β^ is no larger than that of any other estimator. Limiting distribution (under homoskedasticity): √n(β^-β)–>N(0,σ^2(E(xixi’))^-1)
Test for checking H0: βk=β0 against H1: βk≠β0 (a single coefficient):
T=(βk^-β0)/s.e.(βk^) with critical value t_(1-α/2;n-k). Reject H0 if |T|>t_(1-α/2;n-k)
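A sketch of this t-test with numpy/scipy on simulated data (the coefficient index, beta0 and alpha below are illustrative choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, k = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.3, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
s2 = resid @ resid / (n - k)               # unbiased estimate of sigma^2
se = np.sqrt(s2 * np.diag(XtX_inv))        # standard errors of the coefficients

j, beta0, alpha = 1, 0.0, 0.05             # test H0: beta_1 = 0
T = (beta_hat[j] - beta0) / se[j]
t_crit = stats.t.ppf(1 - alpha / 2, df=n - k)
reject = abs(T) > t_crit
```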
If we want to test J linear restrictions on the true coefficient vector jointly (note: rank(R)=J); H0: Rβ-q=0 vs H1: Rβ-q≠0:
F=((Rβ^-q)’(Rs^2(X’X)^-1R’)^-1(Rβ^-q))/J and reject H0 if F>F_(1-α;J,n-k)
Or: F=((RSS_r-RSS)/J)/(RSS/(n-k)), with RSS_r the restricted residual sum of squares and RSS=ε^’ε^ the unrestricted residual sum of squares
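A sketch computing the F statistic both ways for J=2 exclusion restrictions (R, q and the data are illustrative); the two formulas agree numerically:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, k = 120, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, 0.0, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
rss = np.sum((y - X @ beta_hat) ** 2)      # unrestricted residual sum of squares
s2 = rss / (n - k)

# H0: beta_2 = beta_3 = 0, i.e. R beta = q with J = 2 restrictions
R = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
q = np.zeros(2)
J = R.shape[0]

# Wald form of the F statistic
d = R @ beta_hat - q
F_wald = d @ np.linalg.solve(R @ (s2 * XtX_inv) @ R.T, d) / J

# RSS form: the restricted model simply drops the last two columns
Xr = X[:, :2]
beta_r = np.linalg.solve(Xr.T @ Xr, Xr.T @ y)
rss_r = np.sum((y - Xr @ beta_r) ** 2)
F_rss = ((rss_r - rss) / J) / (rss / (n - k))

assert np.isclose(F_wald, F_rss)
reject = F_wald > stats.f.ppf(0.95, J, n - k)
```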
Asymptotic normality:
√n(β^-β)–>N(0,(EX1X1’)^-1(Eε1^2X1X1’)(EX1X1’)^-1), or N(0,Q^-1ΩQ^-1) with Q=EX1X1’ and Ω=Eε1^2X1X1’
Note: under homoskedasticity Eε1^2X1X1’=E(E(ε1^2|X1)X1X1’)=σ^2EX1X1’, so the variance collapses to σ^2(EX1X1’)^-1
Asymptotic Confidence Intervals:
CI_(1-α,n)^j=[β^_(n,j)±z_(1-α/2)√([V^_n]_jj/n)]
In case of homoskedastic errors:
CI_(1-α,n)^j=[β^_(n,j)±z_(1-α/2)√(s_n^2[(X’X)^-1]_jj)]
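A sketch of both interval formulas on simulated heteroskedastic data: the sandwich variance Q^-1ΩQ^-1 is estimated as (X’X)^-1 X’ diag(ε^^2) X (X’X)^-1, and the homoskedastic version uses s^2(X’X)^-1 (all data-generating choices below are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
# heteroskedastic errors: variance depends on the second regressor
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n) * (1 + 0.5 * np.abs(X[:, 1]))

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
z = stats.norm.ppf(0.975)                                 # z_{1-alpha/2} for alpha = 0.05

# Robust (sandwich) variance of beta_hat
V_robust = XtX_inv @ ((X * resid[:, None] ** 2).T @ X) @ XtX_inv
ci_robust = np.column_stack([beta_hat - z * np.sqrt(np.diag(V_robust)),
                             beta_hat + z * np.sqrt(np.diag(V_robust))])

# Homoskedastic version: s^2 (X'X)^-1
s2 = resid @ resid / (n - X.shape[1])
V_homo = s2 * XtX_inv
ci_homo = np.column_stack([beta_hat - z * np.sqrt(np.diag(V_homo)),
                           beta_hat + z * np.sqrt(np.diag(V_homo))])
```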
Power of a test:
A test has good power if the probability of rejecting H0 when it is false is high. H0 is rejected when the t-statistic falls far into the left or right tail of the t-distribution, and this is more likely in three situations (a small simulation follows this list):
1: If the effect size of βk is large, rejection is more likely: a large βk gives a large t-value
2: When the sample size is large, the standard error of βk^ is smaller, giving a larger t-value
3: If α is larger, the critical value is smaller (the rejection region in the tails is wider), so H0 is rejected more often
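A small simulation sketch of the first two points: the rejection frequency of the t-test for H0: β1=0 rises with the true effect size (the effect sizes, sample size and α below are arbitrary illustration values):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, k, reps, alpha = 50, 2, 2000, 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - k)

for beta1 in [0.0, 0.1, 0.3, 0.5]:          # true effect size (0.0 means H0 is true)
    rejections = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        X = np.column_stack([np.ones(n), x])
        y = beta1 * x + rng.normal(size=n)
        XtX_inv = np.linalg.inv(X.T @ X)
        b = XtX_inv @ X.T @ y
        e = y - X @ b
        se = np.sqrt((e @ e / (n - k)) * XtX_inv[1, 1])
        rejections += abs(b[1] / se) > t_crit
    print(beta1, rejections / reps)         # empirical rejection rate (power)
```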