Chapter 5 - Brooks Flashcards
(31 cards)
what is the purpose of diagnostic tests?
To check whether the assumptions of the CLRM hold, so that the model is a valid tool for inference
what is the foundation of this chapter?
The two approaches used to derive all the tests we work with:
1) LM approach
2) Wald approach
There is a third approach, LR (likelihood ratio).
Distinguish LM and Wald test
The LM test has a chi-squared distributed statistic.
The Wald test has an F-distributed statistic.
elaborate on LM test
its statistic follows a chi-squared distribution with degrees of freedom equal to the number of restrictions, m, placed on the restricted regression: LM ~ chi-squared(m).
elaborate on the Wald test
F-dist, with F(m, T-k) degrees of freedom, where m is the number of restrictions, T the sample size, and k the number of parameters in the unrestricted regression.
what is the relationship between LM and Wald
Asymptotically, they are equivalent. As the sample size grows, the "T-k" degrees of freedom grow towards infinity.
This is because of the behavior of the chi-squared variable. By definition, it is a sum of squared standard normal variables. Because they are squared, the expectation grows with the length of the sum: the more degrees of freedom there are, the larger the mean value will be. Recall that the mean is k while the variance is 2k.
The denominator of the F-statistic is a chi-squared variable divided by its degrees of freedom, T-k. As the sample size grows, this ratio converges to 1, so the denominator effectively drops out.
The outcome is that, asymptotically, m times an F(m, T-k) variable behaves as a chi-squared(m) variable.
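To see this numerically, here is a minimal sketch (assuming Python with scipy; the values of m and k are arbitrary) comparing the 5% critical value of m * F(m, T-k) with the chi-squared(m) critical value as T grows:

```python
# As T grows, the 5% critical value of m * F(m, T - k)
# approaches the 5% critical value of chi-squared(m).
from scipy import stats

m, k = 3, 5  # number of restrictions, number of regressors (arbitrary)
chi2_crit = stats.chi2.ppf(0.95, df=m)

for T in (20, 50, 200, 1000, 100000):
    f_crit = stats.f.ppf(0.95, dfn=m, dfd=T - k)
    print(f"T={T:>6}: m * F crit = {m * f_crit:.4f} vs chi2 crit = {chi2_crit:.4f}")
```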
elaborate on the first assumption
E[u_t] = 0
The expected value of the errors is 0.
The only thing that guarantees this is the inclusion of an intercept term. If we do not include one, we cannot guarantee that the average error is 0.
In fact, we could get negative R^2 values if we do not include an intercept.
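To illustrate, a minimal numpy sketch (the data-generating process is invented for illustration) where forcing the regression through the origin yields residuals with a non-zero mean and a negative R^2:

```python
# y has a true intercept of 5, but we force the regression through the origin.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
y = 5 - x + rng.normal(0, 0.1, 100)   # true intercept is 5

b = (x @ y) / (x @ x)                 # OLS slope with no intercept term
resid = y - b * x
print("mean residual:", resid.mean())                 # far from 0
ss_res = resid @ resid
ss_tot = ((y - y.mean()) ** 2).sum()
print("R^2 without intercept:", 1 - ss_res / ss_tot)  # negative
```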
elaborate on the second assumption
The second assumption is that the variance of the errors is constant: var(u_t) = sigma^2.
This is commonly referred to as the assumption of homoskedasticity.
what do we say if the residuals are not constant in variance?
The errors are heteroskedastic
how do we detect heteroskedasticity?
we consider 2 primary tests.
1) Goldfeld Quandt
2) White’s test
elaborate on GQ
We split the sample into two subsamples.
The regression model is estimated on each subsample, and the residual VARIANCE is found for each.
We do not need the regression on the full sample.
The null hypothesis is that the variance is constant, i.e. sigma^2_1 = sigma^2_2.
The test statistic is the ratio of the two residual variances (conventionally the larger over the smaller), which is distributed F(T1-k, T2-k) under the null.
Very simple test to conduct, but it is sensitive to the point of break.
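A minimal sketch of the GQ test (assuming statsmodels; the simulated data are invented, with error standard deviation growing in x):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.1, 10, 200))     # sorted by the suspect variable
y = 1 + 2 * x + rng.normal(0, 0.2 * x)     # heteroskedastic errors

X = sm.add_constant(x)
fval, pval, _ = het_goldfeldquandt(y, X, drop=0.2)  # drop middle 20% of sample
print(f"GQ F statistic = {fval:.2f}, p-value = {pval:.4f}")
```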
how can goldfeld quandt be improved?
increase the size of the mid-region dropout. This will increase the power of the test.
Recall that the power of a test is the probability of correctly rejecting the null hypothesis when it is false (one minus the probability of a type 2 error, which is failing to reject a false null). Making the mid-region dropout larger separates the two subsamples more cleanly, so it becomes more likely that we detect heteroskedasticity when it is actually present.
alternative to the GQ test
White’s test
elaborate on white’s test
it has the advantage over GQ in the sense that it makes no assumption about the SHAPE of the heteroskedasticity.
This test relies on trying to figure out whether there exists systematic variation in the squared residuals.
We run a regular regression, obtain the residuals, square them, and fit a NEW (auxiliary) regression of the squared residuals on the original variables, their squares, and their cross products.
This is based on how the variance of the residuals reduces to E[u_t^2]: var(u_t) = E[u_t^2] - (E[u_t])^2, and the (E[u_t])^2 part is 0 under the validity of the first assumption.
Now we have 2 approaches for the remaining part.
We can choose a framework: chi-squared or F-dist.
If we use the F-dist, we need two regressions.
We use the auxiliary regression for both, meaning we regress the squared residuals of the original regression in each case. The unrestricted one has all the regular terms, squared terms, and cross terms; the restricted one has only a constant. The null hypothesis is that all slope parameters are 0, which means there is no systematic variation that explains the variance.
The LM approach uses the R^2 of the auxiliary regression: T * R^2 follows a chi-squared distribution with degrees of freedom equal to the number of regressors in the auxiliary regression (excluding the constant).
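A minimal sketch of White's test (assuming statsmodels; same style of invented heteroskedastic data as above). het_white runs the auxiliary regression internally and reports both the LM (chi-squared) and the F version of the statistic:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(0)
x = rng.uniform(0.1, 10, 200)
y = 1 + 2 * x + rng.normal(0, 0.2 * x)     # heteroskedastic errors

X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid
lm, lm_pval, f, f_pval = het_white(resid, X)
print(f"LM = {lm:.2f} (p = {lm_pval:.4f}), F = {f:.2f} (p = {f_pval:.4f})")
```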
what happens if we use CLRM on a model where assumption 2 is violated?
1) The estimator will remain unbiased. Think of what, for instance, increasing variance looks like on a plot: the points scatter more widely, but not systematically above or below the line.
2) The estimator is no longer "best" linear: it no longer has the lowest variance among linear unbiased estimators.
3) The standard errors can be wrong. This is because the standard errors depend on the variance of the errors; with wrong standard errors, any inference we draw (t-tests, F-tests) may be misleading.
how can we deal with heteroskedasticity?
One could try transforming the variables into logs, which compresses the scale of the data and often mitigates heteroskedasticity.
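A minimal sketch (assuming statsmodels; the multiplicative-error data are invented for illustration) comparing White's test p-values for a levels specification and a log specification of the same data:

```python
# The levels regression shows heteroskedasticity; the log regression does not.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 500)
y = np.exp(1 + 0.5 * np.log(x) + rng.normal(0, 0.3, 500))  # multiplicative errors

X_lev, X_log = sm.add_constant(x), sm.add_constant(np.log(x))
res_lev = sm.OLS(y, X_lev).fit()
res_log = sm.OLS(np.log(y), X_log).fit()

for name, res, X in [("levels", res_lev, X_lev), ("logs", res_log, X_log)]:
    lm, pval, _, _ = het_white(res.resid, X)
    print(f"{name}: White LM p-value = {pval:.4f}")
```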
elaborate on detecting the third assumption
The third assumption is that the errors are uncorrelated with one another: cov(u_i, u_j) = 0 for i != j. Its violation is called autocorrelation.
We use the Durbin-Watson test to detect it.
elaborate on the Durbin Watson test
the null hypothesis is that the correlation between a residual and its lag-1 residual is 0. The statistic satisfies DW approx. 2(1 - rho_hat), so values near 2 indicate no autocorrelation, values near 0 positive autocorrelation, and values near 4 negative autocorrelation. If the value we observe is extreme (far from 2), we reject the null and say that there is likely autocorrelation.
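A minimal sketch (assuming statsmodels; the AR(1) errors with rho = 0.7 are invented for illustration) computing the DW statistic on residuals from a regression with autocorrelated errors:

```python
# DW approx. 2(1 - rho_hat), so with rho = 0.7 expect a value well below 2.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
T = 200
x = rng.uniform(0, 10, T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + rng.normal(0, 1)   # AR(1) errors
y = 1 + 2 * x + u

resid = sm.OLS(y, sm.add_constant(x)).fit().resid
print("DW statistic:", round(durbin_watson(resid), 3))
```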
downside of DW
it only checks for lag-1 (first-order) autocorrelation
how can we check for higher-order (lag-r) autocorrelation?
use Breusch Godfrey
elaborate on Breusch Godfrey
We run the regular regression to find the residuals.
Then we create a new regression where the dependent variable is the residuals we just found, and as explanatory variables we use the original ones AND the lagged residuals up to some order r of our choice.
The R^2 of the new regression, when multiplied by (T - r), follows a chi-squared distribution with r degrees of freedom.
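A minimal sketch (assuming statsmodels; same invented AR(1)-error setup as above). acorr_breusch_godfrey runs the auxiliary regression internally for a chosen lag order r and returns the LM statistic, based on the auxiliary regression's R^2, with its p-value:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(0)
T = 200
x = rng.uniform(0, 10, T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + rng.normal(0, 1)   # AR(1) errors
y = 1 + 2 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
lm, lm_pval, f, f_pval = acorr_breusch_godfrey(res, nlags=4)  # r = 4
print(f"BG LM = {lm:.2f}, p-value = {lm_pval:.6f}")
```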
what happens if assumption 4 is violated
The estimator will not even be consistent. (Assumption 4 is that the regressors are non-stochastic, i.e. uncorrelated with the error term; when a regressor is correlated with the error, OLS is biased even asymptotically.)
how do we test for normality of residuals
BJ test (Bera-Jarque)
elaborate on BJ test
based on checking whether the skewness and kurtosis of the residuals match those of a normal distribution (0 and 3, respectively). The test statistic is W = T * (b1^2/6 + (b2 - 3)^2/24), where b1 is the skewness and b2 the kurtosis; under the null of normality it asymptotically follows a chi-squared distribution with 2 degrees of freedom.
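A minimal sketch (assuming statsmodels; fat-tailed t-distributed errors invented for illustration) computing the BJ statistic on regression residuals:

```python
# jarque_bera returns the BJ statistic and p-value together with the
# residual skewness and kurtosis the statistic is built from.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import jarque_bera

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 1 + 2 * x + rng.standard_t(df=3, size=500)   # non-normal errors

resid = sm.OLS(y, sm.add_constant(x)).fit().resid
bj, bj_pval, skew, kurt = jarque_bera(resid)
print(f"BJ = {bj:.2f}, p = {bj_pval:.6f}, skew = {skew:.2f}, kurtosis = {kurt:.2f}")
```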