Chapter 4 - Brooks Flashcards
(26 cards)
in regard to finance and econometrics, name a reason why we would consider it beneficial to extend the bivariate CLRM to more variables
No-arbitrage theory opens up the possibility of multiple variables affecting the return.
what is the difference in interpretation of the coefficients in the multivariate CLRM vs bivariate
Now they are called “partial regression coefficients”, because each one is only a part of the ultimate explanation of the variability in the dependent variable.
when we say that we have k variables, what does it include?
All variables, including the intercept term
how do we derive the OLS estimator?
The classic way is to minimize the sum of squared errors. The sum of squared errors is a convex function, so we know that if we differentiate it and solve for 0, we get the error-minimizing result.
However, MLE and method of moments also work and yield the same answer.
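The least-squares route above can be sketched directly via the normal equations, beta = (X′X)⁻¹X′y. This is a minimal illustration on synthetic data (the dataset, seed, and coefficient values are invented for the example; numpy only):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
# Design matrix: intercept column plus two regressors
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])
true_beta = np.array([1.0, 2.0, -0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=T)

# Normal equations: solve (X'X) beta_hat = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to [1.0, 2.0, -0.5]
```

Solving the linear system is preferred over explicitly inverting X′X, since it is numerically more stable.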
elaborate deeply on method of moments
Create a system of equations, one equation per unknown, where each unknown is a parameter of the distribution we are working with.
Each equation is found by relating the theoretical k’th moment with the sample k’th moment.
Solve the system.
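The three steps above can be sketched for a normal distribution, where the two unknowns (mean and variance) are recovered from the first two moments. A minimal sketch, assuming simulated N(3, 4) data (the parameter values and sample size are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=3.0, scale=2.0, size=10_000)

# Theoretical moments of N(mu, sigma^2):
#   E[X]   = mu
#   E[X^2] = mu^2 + sigma^2
# Sample moments:
m1 = data.mean()
m2 = (data ** 2).mean()

# Solve the two-equation system for the two unknowns
mu_hat = m1
sigma2_hat = m2 - m1 ** 2
print(mu_hat, sigma2_hat)  # close to 3.0 and 4.0
```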
what can we do to create a broader testing framework than the t-ratio?
Due to the downside of the t-ratio testing only a single parameter at a time, we want a test that can test multiple hypotheses at once.
This is nice, because it allows for a broader class of restrictions, testing for things like “this and this and this” as a combination.
The outcome is the F-test.
elaborate on the F-test
Requires 2 regressions.
1) Unrestricted
2) Restricted
The restricted regression requires that we impose the hypothesis we want to test. In essence, this makes the model more restrictive.
We find the residual sum of squares from each regression. Then we do this:
test statistic = (RRSS - URSS)/URSS × (T - k)/m
This is F-distributed with (m, T - k) degrees of freedom.
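The two-regression recipe above can be sketched for the restriction b2 = b3 = 0. A minimal sketch on synthetic data (the dataset, seed, and coefficients are invented for the example; the `rss` helper is ours, not a library function):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 120
x2, x3 = rng.normal(size=T), rng.normal(size=T)
y = 1.0 + 0.8 * x2 + 0.0 * x3 + rng.normal(size=T)

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return resid @ resid

ones = np.ones(T)
X_unrestricted = np.column_stack([ones, x2, x3])  # k = 3 parameters
X_restricted = ones[:, None]                      # H0: b2 = b3 = 0, so m = 2

URSS = rss(X_unrestricted, y)
RRSS = rss(X_restricted, y)
k, m = 3, 2
F = (RRSS - URSS) / URSS * (T - k) / m
print(F)  # compare against the F(m, T-k) critical value
```

Since x2 genuinely matters here, the statistic comes out large and the restriction would be rejected.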
What is “k” in the F-test?
The number of explanatory variables in the unrestricted regression, including the intercept.
what is “m” in F-test?
The number of restrictions for the restricted regression.
What is “T” in the F-test?
number of data points
why does the F-test work?
The numerator, RRSS - URSS, represents a difference. If the restrictions we added are insignificant, this difference is close to 0.
If the difference is large, meaning that the restrictions have an impact, we get a value that says “this value is unlikely to be observed if the restrictions were valid”. Thus, large values indicate that the parameters we placed restrictions on are actually important, and that the restrictions should be rejected.
A restriction that makes sense is to set all slope coefficients to 0. This quickly tests whether the regression has any explanatory power at all.
So, one assumes that the variable “RRSS - URSS” is a chi-squared variable. RRSS has “T - k + m” degrees of freedom (the restricted regression estimates m fewer parameters), while URSS has “T - k” degrees of freedom. Thus the difference has “m” degrees of freedom, so this chi-squared variable gets “m” degrees of freedom.
The URSS still has “T - k” degrees of freedom.
What is the F-statistic actually?
An F-distributed random variable is defined as the ratio of two independent chi-squared variables, each divided by its degrees of freedom:
X = (X_1/df1)/(X_2/df2)
There is little else exciting about the F-statistic other than its use in ANOVA-style tests.
discuss the “issue” with the F-test
It is not an issue, it is just something to be aware of. It is a one-sided test, so we only care whether the statistic exceeds a certain limit.
what relationships cannot be tested with either t-ratio or F-test?
Non-linear restrictions, like b1 × b2 = 4, etc.
what is size of the test?
Alpha, the probability of committing a Type I error: rejecting the null hypothesis even though it is true.
what is the issue surrounding the size of the test?
If we run enough tests, we will eventually (by sheer luck) find significant relationships. It is not that they are significant, it is that they appear significant.
The issue is best seen when we use a large number of regressors. Say we have 20 regressors, and the size of the test is 5%. Then we should expect one of the regressors to appear significant just because of how the data happen to be distributed.
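The arithmetic behind this: with 20 independent tests, each at size 5%, the chance of at least one spurious “significant” result is the complement of all 20 tests correctly accepting:

```python
# Probability of at least one false positive across n independent tests
alpha, n_tests = 0.05, 20
p_at_least_one = 1 - (1 - alpha) ** n_tests
print(round(p_at_least_one, 3))  # 0.642
```

So roughly a two-in-three chance of at least one false discovery, even when no regressor truly matters.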
how to solve the issue of data snooping?
Use an out-of-sample test set to get a measure of the performance.
This may be difficult in cases where we are limited in data.
elaborate on dummy variables
Used for qualitative attributes.
Typically they only take binary values.
Be aware of the dummy variable trap.
elaborate on the traps with dummy variables
There are several.
One is trying to model attributes that are not interval-scaled as integer-valued. This creates a badly specified model: for instance, using a single integer variable to represent “location” imposes an ordering that does not exist. One should instead use binary variables.
The other is “the trap” proper: making the (X′X) matrix non-invertible, so the normal equations cannot be solved. This happens when the columns of the design matrix are linearly dependent.
The idea to avoid the trap is to not include an exhaustive set of dummies.
We NEED each categorical group to have one omitted (reference) level if we want to include the intercept.
If we remove the intercept, we can include all dummies, but then we might lose some information.
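Both traps above can be shown in a few lines. A minimal sketch (the “location” data is invented for the example; `drop_first=True` is pandas' way of omitting the reference level):

```python
import pandas as pd

locations = pd.Series(["north", "south", "east", "south", "north"])

# Wrong: integer-coding a nominal attribute imposes a fake ordering
wrong = locations.astype("category").cat.codes

# Right: one binary dummy per category, dropping one reference level
# so the dummy columns are not linearly dependent with the intercept
dummies = pd.get_dummies(locations, drop_first=True)
print(dummies.columns.tolist())  # ['north', 'south'] — 'east' is the reference
```

Including all three dummies plus an intercept would make the columns sum to the intercept column exactly, which is the linear dependence the trap refers to.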
what is the point about R^2?
Goodness of fit. R^2 is a TYPE of goodness-of-fit statistic.
The point is to get an understanding of how well the model performs.
Specifically, R^2 answers “how well is the model able to explain deviations from the mean level”.
elaborate on R^2
R^2 gives a number for how much variance the model is able to explain. We therefore relate the “total sum of squares” to the “residual sum of squares” (the unexplained sum of squares) and the “explained sum of squares”.
The “total sum of squares”, TSS, is the sum of squared differences between the data points and the constant mean level.
The ESS, explained sum of squares, is the sum of squared differences between the constant mean level and the SRF line that we have fitted.
The RSS represents the remaining sum of squares. The smaller this value is, the better.
The ratio ESS/TSS represents how much of the total variation has been captured by the model.
R^2 = ESS/TSS
We can also write it as:
R^2 = (TSS - RSS)/TSS
R^2 = TSS/TSS - RSS/TSS
R^2 = 1 - RSS/TSS
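The decomposition above can be verified numerically: with an intercept in the model, ESS/TSS and 1 - RSS/TSS give the same number. A minimal sketch on synthetic data (dataset and coefficients invented for the example):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=200)
y = 2.0 + 1.5 * x + rng.normal(size=200)

# Fit y on an intercept and x by OLS
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta

TSS = np.sum((y - y.mean()) ** 2)       # deviation from the mean level
RSS = np.sum((y - y_hat) ** 2)          # unexplained part
ESS = np.sum((y_hat - y.mean()) ** 2)   # explained part

r2 = 1 - RSS / TSS
print(r2)  # equals ESS/TSS up to floating-point error
```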
elaborate on the issues with R^2
1) it favours more regressors, as there is no penalty for adding more regressors
2) R^2 depends heavily on the dependent variable. Therefore, if the model is re-parameterised, it makes no sense to use R^2 for comparison.
3) it works poorly for time-series models, where values close to 1 are common and uninformative
elaborate on R^2 as dependent on the dependent variable
We must be careful if we re-parameterise.
If the re-parameterisation changes the dependent variable itself, so that the y-values are placed, relatively speaking, at different positions compared to each other, then we cannot use R^2 to compare fits. For instance, if we log-transform the dependent variable, we cannot use R^2 to compare against the untransformed model.
If we divide the dependent variable by another variable, we cannot use it either.
A pure shift, or rescaling by a constant, however, leaves R^2 unchanged.
Elaborate on overcoming the issue of “more regressors is better”
Use adjusted R^2.
It adds a penalty for more regressors.
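The usual formula is adjusted R^2 = 1 - (1 - R^2)(T - 1)/(T - k). A minimal sketch (the example R^2 values are invented to show the penalty at work):

```python
def adjusted_r2(r2, T, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (T - 1) / (T - k).
    T = number of observations, k = number of parameters incl. intercept."""
    return 1 - (1 - r2) * (T - 1) / (T - k)

# Adding a near-useless regressor raises plain R^2 slightly (0.700 -> 0.701)
# but lowers the adjusted version:
print(adjusted_r2(0.700, T=100, k=3))
print(adjusted_r2(0.701, T=100, k=4))
```

So under adjusted R^2, an extra regressor only “pays off” if it improves the fit by more than the penalty.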