Regression Flashcards

(46 cards)

1
Q

What can we say regression is all about?

A

Trying to explain movement in one variable as a result of movement in other variables

2
Q

What is the difference between the explanatory variables (the X's) and the explained variable Y?

A

Y is assumed to be stochastic, following a specific probability distribution.

The explanatory variables are assumed to be fixed in repeated samples, i.e. non-stochastic.

3
Q

Why do we assume X to be fixed in repeated samples?

A

To remove randomness from that part of the model. All randomness is allocated to the stochastic Y variable.

4
Q

Given a one-variable case where some financial theory suggests that increases in some X variable will lead to changes in Y, what is the first sensible thing to do?

A

Plot a scatter plot to see whether the pattern is linear.

5
Q

What do we need to know about the equation

y = a x + b

A

It is an exact function. It is not realistic to see real applications where the relationship is that exact.

6
Q

How do we add an error term to the exact line y = a x + b?

A

We write it per sample point:

y_t = a x_t + b + u_t

Or

y_t = a + b x_t + u_t

is more conventional.

u_t captures the difference between the exact line and the specific data point.
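
A minimal simulation sketch of this per-point equation (the parameter values and error distribution below are arbitrary assumptions, not from the cards):

import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 1.0, 0.5             # assumed "true" population parameters
x = rng.uniform(0, 10, size=100)   # regressor values, treated as fixed
u = rng.normal(0, 1, size=100)     # random disturbance term u_t
y = alpha + beta * x + u           # y_t = alpha + beta * x_t + u_t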

7
Q

Elaborate on the reasons to include a random disturbance term

A

1) The number of actual determinants (explanatory variables) is usually too large to be quantified perfectly, typically due to unobservability etc.

2) Some errors cannot be modeled

8
Q

How do we determine the correct line (linear regression model)?

A

Minimizing the sum of the (squared) vertical distances between each point and the line.

Measuring the distances vertically is appropriate because of the assumption of non-stochastic explanatory variables.

9
Q

Most common method to generate the model line?

A

OLS (ordinary least squares)

10
Q

Two methods other than OLS that can be used to fit a linear regression model?

A

Method of moments

Maximum likelihood

11
Q

How do we find a residual?

A

It is the difference between the actual value y_t and the fitted (predicted) value of y_t.

12
Q

What is the role of the residual in linear regression?

A

In OLS, we minimize the sum of the squared residuals.

13
Q

Broadly speaking, how do we find the functions for the OLS parameters in the single-variable case?

A

Build a loss function, differentiate it with respect to the parameters, set the derivatives equal to 0, and solve for the parameters. This works if the loss function is convex; if it is, doing this minimizes the loss function and gives us the set of parameters with the smallest residual sum of squares.

L = ∑(u_t)^2 = ∑(y_t - pred(y_t))^2 = ∑(y_t - alpha - beta x_t)^2
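
As a sketch, solving the two first-order conditions dL/d(alpha) = 0 and dL/d(beta) = 0 gives the standard closed-form estimators; here they are computed on synthetic data (the data-generating values are made up):

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=50)   # synthetic sample

# Closed-form solutions of the first-order conditions:
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

The expression for alpha_hat is also exactly why the fitted line passes through the mean point (see the next card).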

14
Q

With OLS, what can we say about the MEAN of the data points?

A

The fitted OLS line will pass through the point of means, (mean(x), mean(y)).

15
Q

What do we need to understand about the intercept term?

A

It is determined by what our data set looks like. Typically, our data points will not cover the entire possible domain, as this is infeasible. Therefore we will usually have edges in the data set, and if we try to go beyond these edges on either side, the results are essentially unknown, since the model has not been fitted on those regions.

As a result, it is useful to also report the “valid range” of our model.

16
Q

What is the difference between SRF and PRF?

A

The sample regression function does not include the error term, but the population regression function does.

The SRF is the estimated line, while PRF is the “true” function.

17
Q

Is this linear?

A

It is linear in parameters, but not linear in variables. It can be converted to linear form, and then OLS can be used to fit the line.

18
Q

If theory suggests that Y is inversely related to X, can we use OLS?

A

Yes. We define a new variable z = 1/x and use it to fit the model. Then we just need to transform observed values of x into z before using the model to predict.
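
A sketch of the transformation on synthetic data (the data-generating values are made up):

import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=50)
y = 2.0 + 3.0 / x + rng.normal(0, 0.1, size=50)   # y inversely related to x

z = 1.0 / x   # new regressor: y_t = alpha + beta * z_t + u_t is linear in z
beta_hat = np.sum((z - z.mean()) * (y - y.mean())) / np.sum((z - z.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * z.mean()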

19
Q

Elaborate on why this is

A

If we need to estimate gamma, and the estimated value of gamma changes either through time or with changes in the explanatory variables, then the linear model will not be accurate. However, if we estimate gamma in a way that makes it behave as a constant, then we can use it with OLS.

20
Q

Do we need to be specific about how the residuals are computed?

A

Yes, because y_t depends on the residual.

21
Q

Do the assumptions of the classical linear regression model apply to the estimated model or to the unobserved values?

A

The unobserved values. No assumptions are made about the values of the residuals obtained from the estimated model.

22
Q

Elaborate on the CLRM assumptions

A

1) E(u_t) = 0

2) var(u_t) = sigma^2 < infinity

3) cov(u_i, u_j) = 0 for i ≠ j

4) cov(x_t, u_t) = 0

5) u_t ~ N(0, sigma^2)

23
Q

Elaborate on the assumption of normality

A

It is necessary for hypothesis testing.

24
Q

What can we say about the properties of the OLS estimators if the assumptions hold?

A

If assumptions 1)-4) hold, the OLS estimators are BLUE:

Best Linear Unbiased Estimator

Best: the estimators have minimum variance (they are the most efficient).

Linear: the estimators are linear functions of the data.

Unbiased: on average, the estimators equal the true population parameters.

Estimator: they are estimates of the true population parameters.

25

Q

Are the OLS estimators consistent?

A

Yes.
26

Q

What is consistency?

A

The probability that the difference between the true parameter value and the estimated parameter value is larger than some constant goes to 0 as the number of data points approaches infinity. Thus, consistency is about converging to the true population values. Unbiasedness is just about the average; consistency says something about convergence. Consistency is an asymptotic property.
27

Q

Elaborate on the B in BLUE

A

Best: it refers to the fact that the estimator is the most efficient estimator, i.e. the one with minimum variance among linear unbiased estimators.
28

Q

Elaborate on the role of the sample in linear regression

A

Every linear regression is subject to the specific sample that is used: if the sample changes, the results change. Therefore, it is useful to understand how reliable our parameters are. The parameters are subject to sampling variability, and we want to quantify it. We use the standard error of the parameters for this. The standard error is the standard deviation of the estimator: it tells us how much we can expect the estimator to vary between different samples. It is found by computing the variance of the OLS parameters (Var(alpha), Var(beta)) and taking the square root. Small standard errors are desirable.
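
A sketch of the computation in the single-regressor case, using the standard textbook formulas (synthetic data, arbitrary values):

import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=50)

# OLS fit
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# Standard errors of the parameters
T = len(x)
u_hat = y - (alpha_hat + beta_hat * x)   # residuals
s2 = np.sum(u_hat ** 2) / (T - 2)        # s^2: variance of the regression
se_beta = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))
se_alpha = np.sqrt(s2 * np.sum(x ** 2) / (T * np.sum((x - x.mean()) ** 2)))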
29

Q

What is crucial to understand regarding standard errors?

A

They tell us nothing about how good the estimates are. They only tell us how much we can expect the estimated values to change across different samples. It is a measure of precision that does not relate to the accuracy of the model itself.
30

Q

What is the "standard error of the regression"?

A

The sample standard deviation of the regression residuals. It tells us how close the predictions will be to the actual data. However, again, it does not necessarily say whether the estimators themselves are accurate.
31

Q

What can we say about standard error estimates?

A

1) They drop with larger sample size.

2) They increase with greater variance in the residuals, because SE(alpha) and SE(beta) depend on s^2, the variance of the regression.
32

Q

When hypothesis testing on the OLS parameters, what distribution do we use?

A

We would like to use the normal (or standard normal) distribution. However, since the standard errors of the parameters are not known, as they cannot be observed, we need to use estimates of them. This requires us to use the Student's t distribution instead.
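
A sketch of the resulting test (all numbers below are hypothetical): under H0 the statistic (beta_hat - beta*) / SE(beta_hat) follows a t distribution with T - k degrees of freedom.

from scipy import stats

T, k = 50, 2                     # hypothetical sample size and parameter count
beta_hat, se_beta = 0.48, 0.05   # hypothetical estimate and its standard error
t_stat = (beta_hat - 0.0) / se_beta              # test statistic for H0: beta = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=T - k)  # two-sided p-value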
33

Q

Elaborate on the constant term

A

When representing the data, we treat it as a variable as well. However, this variable just takes the value 1 for every observation.
34

Q

What is k?

A

k is the number of parameters that need to be estimated. It is therefore equal to the number of variables, including the constant.
35

Q

Elaborate on the F-test framework

A

It allows testing multiple hypotheses at once. Two regressions are required:

1) Restricted

2) Unrestricted

From these regressions, we get the statistic:

F = (RRSS - URSS)/URSS x (T - k)/m

The idea is that if the residual sum of squares of the restricted regression is not much larger than that of the unrestricted regression, then the restrictions are supported by the data; if it is much larger, they are not. This allows us to place constraints on the parameters, which can be set to anything. For instance, we can set some of them equal to 0.
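
A sketch of the computation (the RSS values and dimensions below are hypothetical):

from scipy import stats

RRSS, URSS = 120.0, 100.0    # hypothetical restricted/unrestricted RSS
T, k, m = 100, 4, 2          # observations, parameters, restrictions
F = (RRSS - URSS) / URSS * (T - k) / m
p_value = stats.f.sf(F, dfn=m, dfd=T - k)   # small p-value: reject restrictions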
36

Q

How do we add a constraint to a regression?

A

Say we have the restriction b_1 + b_2 = 1. Then we can substitute b_1 = 1 - b_2 in place of b_1 in the regression, and gather all the b_i terms on one side. This can mean that we need to move a variable to the other side, e.g. y - x_4 = ..., in which case we create new variables such as P = y - x_4.
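
As a concrete worked example (the two-regressor model here is hypothetical): take y_t = a + b_1 x_1t + b_2 x_2t + u_t with the restriction b_1 + b_2 = 1. Substituting b_1 = 1 - b_2 gives

y_t = a + (1 - b_2) x_1t + b_2 x_2t + u_t

y_t - x_1t = a + b_2 (x_2t - x_1t) + u_t

so defining P_t = y_t - x_1t and Q_t = x_2t - x_1t, the restricted regression is simply P_t = a + b_2 Q_t + u_t.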
37

Q

Elaborate on dummy variables

A

A dummy variable takes on either 0 or 1. Dummies are used to turn qualitative variables into quantitative ones.
38

Q

What is an intercept dummy?

A

A dummy variable that works by changing the intercept, shifting the regression line up or down.
39

Q

Elaborate on the dummy variable trap

A

If we use dummies to encode a categorical variable that has more than 2 categories, we fall into the trap if the sum of the dummy variables is always 1: the dummies are then perfectly collinear with the constant column of 1s, and the regression cannot be estimated sensibly. We need to allow the possibility of the sum being 0, which we get by dropping one category.
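
A sketch of avoiding the trap with a hypothetical three-category variable, assuming pandas is available:

import pandas as pd

season = pd.Series(["summer", "winter", "spring", "summer", "spring"])

# drop_first=True keeps only 2 dummies for the 3 categories, so the dummy
# columns can sum to 0 (for the dropped reference category) and are not
# perfectly collinear with the constant column of 1s
dummies = pd.get_dummies(season, drop_first=True)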
40

Q

Elaborate on goodness of fit

A

We want a way to say something about how well the model is able to explain deviations of the explained variable about its mean.

Consider simply taking the mean of the data set and calling it a regression model, then measuring its residual sum of squares. Then fit the actual model and do the same. We can think of the added precision of the model as explaining variance about the mean.

Note that this is not about how well the model generalizes; it is about how well the model explains the variation in the sample.

The most common measure is R^2, given by:

R^2 = ESS/TSS = (TSS - RSS)/TSS = 1 - RSS/TSS
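
A sketch of the computation on synthetic data (arbitrary values); it also illustrates the TSS = ESS + RSS decomposition used in the next card:

import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=50)

beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()
y_pred = alpha_hat + beta_hat * x

TSS = np.sum((y - y.mean()) ** 2)        # total sum of squares
RSS = np.sum((y - y_pred) ** 2)          # residual sum of squares
ESS = np.sum((y_pred - y.mean()) ** 2)   # explained sum of squares
r2 = 1 - RSS / TSS                       # equals ESS / TSS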
41

Q

What actually is ESS?

A

The explained sum of squares. It is one part of TSS:

TSS = ∑(pred(y_t) - mean(y))^2 + ∑(u_t)^2

ESS therefore tells us how much of the deviation of the sample points about the mean is captured by the fitted values.
42

Q

R^2 = 0 indicates that...?

A

The model has not explained any of the variance about its mean. ESS = 0.
43

Q

Problems with R^2?

A

1) If the best-fitting line is the mean, we get R^2 = 0, even though that line may be very good.

2) It is not sensible to compare R^2 values of models that have even slightly different dependent variables.

3) R^2 never falls when adding more independent variables.
44

Q

Can we improve on R^2?

A

Use the adjusted R^2, which adds a penalty for additional regressors.
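
The usual adjustment, with T observations and k parameters, is

adjusted R^2 = 1 - [(T - 1)/(T - k)] (1 - R^2)

so adding a regressor only raises the adjusted R^2 if it improves the fit by more than the penalty.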
45

Q

Elaborate on hedonic pricing models

A

The core idea is that one can represent a total value as the sum of the values of its individual parts. This makes it possible to model a price by simply adding up the contributions of its properties. Such models are typically used for real estate.
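
An illustrative specification (the characteristics chosen here are hypothetical):

price_t = b_0 + b_1 (number of rooms)_t + b_2 (floor area)_t + b_3 (garage dummy)_t + u_t

Each estimated coefficient can then be read as the implicit price of that characteristic.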