# 2nd Midterm Class Notes Flashcards

1
Q

Interaction Terms

A

If the effect of X1 on Y depends on the value of X2, you should include the interaction of X1 and X2 as an explanatory variable.
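
A minimal sketch of what an interaction term does, using simulated data. The coefficients, sample size, and variable names are made up for illustration and are not from the class notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Hypothetical true model: the effect of x1 on y is 1.0 + 0.5 * x2
y = 2.0 + 1.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2 + rng.normal(scale=0.1, size=n)

# Design matrix: intercept, X1, X2, and the interaction X1*X2
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

def effect_of_x1(x2_value):
    """Estimated effect of a one-unit increase in X1 at a given X2: B1 + B3 * X2."""
    return b[1] + b[3] * x2_value

print(effect_of_x1(0.0), effect_of_x1(2.0))  # the effect of X1 changes with X2
```

Here the coefficient on X1 alone is only the effect of X1 when X2 = 0, which is why it has to be interpreted with care.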

2
Q

Some notes on interaction terms

A
1. ) You should never drop Xk as an explanatory variable, even if it is insignificant, if your model includes interaction terms involving Xk.
2. ) Be careful interpreting Bk when your model includes interaction terms involving Xk.
3
Q

In regards to non-linear terms…

A
1. ) You should never drop Xk as an explanatory variable, even if it is insignificant, if your model includes higher-order terms involving Xk.
2. ) Be careful in interpreting Bk when your model includes higher-order terms involving Xk.
- For example, if the model includes B2X2 + B3X2^2, a one-unit increase in X2 would not increase Y by B2. Rather, the marginal effect of X2 on Y is B2 + 2B3X2, which depends on the value of X2. (This marginal effect can be found by taking the derivative of Y with respect to X2.)
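
The marginal effect under a squared term can be checked numerically. This is a small sketch with hypothetical coefficients, assuming a model of the form Y = B0 + B2*X2 + B3*X2^2 with everything else held fixed.

```python
# Hypothetical coefficients for Y = B0 + B2*X2 + B3*X2**2 (other terms held fixed)
B0, B2, B3 = 1.0, 3.0, 0.5

def y(x2):
    return B0 + B2 * x2 + B3 * x2 ** 2

def marginal_effect(x2):
    """Derivative of Y with respect to X2: B2 + 2*B3*X2."""
    return B2 + 2 * B3 * x2

x2 = 4.0
h = 1e-6
# Central finite difference as a numerical check on the derivative
numeric_slope = (y(x2 + h) - y(x2 - h)) / (2 * h)
# Exact change in Y from a full one-unit step: B2 + B3 + 2*B3*X2
one_unit_change = y(x2 + 1) - y(x2)

print(marginal_effect(x2), numeric_slope, one_unit_change)
```

The derivative is the instantaneous slope at X2; the exact change over a full one-unit step differs from it by B3, and neither one equals B2 alone.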
4
Q

If a categorical variable includes C categories, you can…

A

include C-1 dummy variables in your model
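
A short sketch of building C-1 dummies by hand. The `region` variable and its categories are hypothetical.

```python
import numpy as np

# Hypothetical categorical variable with C = 3 categories
region = np.array(["north", "south", "west", "south", "north", "west"])
categories = np.unique(region)   # sorted: north, south, west
baseline = categories[0]         # omitted reference category

# One 0/1 column for every category except the baseline: C - 1 columns
dummies = np.column_stack(
    [(region == c).astype(float) for c in categories[1:]]
)
print(baseline, dummies.shape)
```

The omitted category becomes the reference group: each dummy coefficient is read relative to it. Including all C dummies together with an intercept would make the columns perfectly collinear.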

5
Q

Measurement Error of Xk

A
1. ) What is the nature of the problem?
a. ) Observed/measured Xk = True Xk + e
2. ) What are the consequences of the problem?
a. ) Bk will be biased toward zero: the poorer the measurement of Xk, the less information we have to find a relation between Y and Xk, and the flatter our regression line.
3. ) How is the problem diagnosed?
a. ) Typically via a theoretical understanding of the process by which Xk is measured and thinking about whether that measurement is precise. Often, Xk is included as a proxy measure for something that is harder to measure.
4. ) What remedies for the problem are available?
a. ) Get better measurement of Xk.
b. ) Find an instrumental variable for Xk.
c. ) Be aware of the bias.
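
The bias toward zero can be seen in a small simulation. This is only an illustrative sketch; the true slope of 2.0 and the noise levels are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
x_true = rng.normal(size=n)
y = 1.0 + 2.0 * x_true + rng.normal(size=n)
# Observed Xk = true Xk + e, with measurement error as variable as Xk itself
x_obs = x_true + rng.normal(size=n)

def slope(x, y):
    """OLS slope from a regression of y on an intercept and x."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

slope_true = slope(x_true, y)  # close to the true slope
slope_obs = slope(x_obs, y)    # pulled toward zero
print(slope_true, slope_obs)
```

With these settings half of the variance in the observed Xk is noise, so the estimated slope shrinks to roughly half of the true one.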
6
Q

What is the nature of the problem with imperfect multicollinearity?

A

Xk is highly predicted by other variables in the model (i.e., Rk^2, the R^2 from regressing Xk on the other explanatory variables, is high)

7
Q

How is the problem of multicollinearity diagnosed?

A
• As a simple check for potential multicollinearity problems, first test the correlation of your X variables.
• Or, as a second test, if both of the two correlated variables lack significance in the regression, test their joint significance using an F-test. If the F-test shows that the two variables are jointly significant even though neither is individually significant, you probably have a multicollinearity problem.
• The most sophisticated and best way to test for this is to compute the “Variance Inflation Factor” VIF
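
A rough sketch of the VIF computation, assuming the usual definition VIF_k = 1 / (1 - Rk^2), where Rk^2 comes from regressing Xk on the other explanatory variables. The data and the helper name `vif` are hypothetical.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (X holds the regressors, without an intercept column)."""
    n, k = X.shape
    out = []
    for j in range(k):
        target = X[:, j]
        # Regress column j on an intercept plus all the other columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        r2 = 1 - np.sum((target - fitted) ** 2) / np.sum((target - target.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(2)
x1 = rng.normal(size=1000)
x2 = x1 + rng.normal(scale=0.1, size=1000)  # nearly a copy of x1
x3 = rng.normal(size=1000)                  # unrelated to the others
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

A VIF near 1 means Xk is barely predicted by the other variables. Cutoffs such as 5 or 10 are common rules of thumb rather than formal tests.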
8
Q

What are the remedies for multicollinearity?

A
• Only include one of the two correlated variables in your model.
• Get more observations because multicollinearity is less of a problem in large samples
9
Q

Non-linear models are a failure of…?

A

classical assumption 1

10
Q

What is the nature of the non-linear model problem?

A

The relationship between Xk and Y is non-linear.

11
Q

What are the consequences of the non-linear model problem?

A

One could improve fit of Y-hat to Y by including non-linear terms (e.g., square of Xk, ln(Xk), etc.) or by transforming the dependent variable (e.g., ln(Y)). Failure to do so could produce heteroskedastic errors

12
Q

How is the non-linear model problem diagnosed?

A
1. ) Theory
2. ) Scatterplots of Xk versus Y. Look for relationships that are non-linear
3. ) Scatterplots of the error terms versus Y. Look for relationships that are non-linear.
13
Q

What remedies are there for non-linear models?

A

Include non-linear terms or transform the dependent variable.
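
As an illustration of the first remedy, the sketch below fits a straight line and then adds a squared term to simulated data. The curved relationship is an assumption chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500
x = rng.uniform(0, 10, size=n)
# Hypothetical curved relationship between X and Y
y = 1.0 + 0.5 * x + 0.3 * x ** 2 + rng.normal(size=n)

def r_squared(X, y):
    fitted = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

const = np.ones(n)
r2_linear = r_squared(np.column_stack([const, x]), y)
r2_quadratic = r_squared(np.column_stack([const, x, x ** 2]), y)
print(r2_linear, r2_quadratic)
```

R^2 can never fall when a term is added, so the more telling check is whether the residuals from the straight-line fit still show a curved pattern against X.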

14
Q

When explanatory variables are correlated with the error term this is a failure of…

A

classical assumption 3

15
Q

What is the nature of the problem with explanatory variables that are correlated with the error term?

A

Xk is correlated with e.

16
Q

What are the consequences of explanatory variables that are correlated with the error term?

A

Bk will be biased by the omission of Z, a variable that affects Y and is correlated with Xk (i.e., “omitted variable bias”)

17
Q

How is the problem of explanatory variables that are correlated with the error term diagnosed?

A

Theory

18
Q

What remedies for explanatory variables that are correlated with the error term are available?

A
• Include Z in the regression
• Use instrumental variables

19
Q

Instrumental variables are used in what two cases?

A

a. ) Xk is measured with error.

b. ) An unobserved (and thus omitted) variable affects both Xk and Y, and thus biases Bk

20
Q

An instrumental variable (Q) has the following two properties

A

a. ) Q is correlated with Xk

b. ) Q has no effect on Y other than through its effect on Xk. That is, Q has no direct effect on Y.

21
Q

How is an instrumental variable used? (“Two-stage least squares”; included only in case we read papers using this technique.)

A

Step 1: Regress Xk on Q and all of the other independent X variables used to predict Y. Compute Xhatk, the predicted value of Xk.
Step 2: Regress Y on Xhatk and other X variables

• The resulting estimate of Bk is unbiased
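
The two steps can be sketched on simulated data. The data-generating process, the instrument `q`, and the unobserved variable `z` are all hypothetical, and there are no other X variables in this toy model.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
q = rng.normal(size=n)   # instrument: affects x, has no direct effect on y
z = rng.normal(size=n)   # unobserved variable that affects both x and y
x = 1.0 * q + z + rng.normal(size=n)
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)
# Naive OLS of y on x: biased because z is omitted
b_ols = ols(np.column_stack([const, x]), y)[1]

# Step 1: regress x on the instrument and compute the predicted values
Z = np.column_stack([const, q])
x_hat = Z @ ols(Z, x)
# Step 2: regress y on the predicted x
b_2sls = ols(np.column_stack([const, x_hat]), y)[1]
print(b_ols, b_2sls)
```

The second-stage coefficient lands near the true value of 2.0, while naive OLS overshoots. The standard errors from running step 2 by hand are not the correct 2SLS standard errors, so a dedicated routine is normally used for inference.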
22
Q

What is the nature of the problem when individual observation error terms are correlated with one another: serial correlation (failure of classical assumption 4)?

A

The error term for period t is statistically dependent on the error term in a prior period

23
Q

What are the consequences of individual observation error terms correlated with one another: serial correlation (failure of classical assumption 4)?

A

Coefficients are unbiased but not efficient, i.e., some other alternative might produce estimates closer to the true value of the betas.
- Estimated standard errors are biased

24
Q

How is the problem of individual observation error terms correlated with one another: serial correlation (failure of classical assumption 4) diagnosed?

A

Theory: Consider whether the outcomes in one time period are likely to be related to the outcomes in prior time periods.

Empirical Test: Compute the Durbin-Watson d Statistic.

25
Q

What are the uses of Durbin-Watson d Statistic?

A

To diagnose whether individual observation error terms are correlated with one another: serial correlation (failure of classical assumption 4).

26
Q

What is the range of the Durbin-Watson d Statistic?

A

d ranges from 0 to 4.

27
Q

How does one interpret the Durbin-Watson d Statistic?

A

If d = 2, there is no serial correlation evident in the data. If d is less than 2, there is positive serial correlation. If d is greater than 2, there is negative serial correlation.
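
A minimal sketch of computing d from a series of residuals, using the standard formula d = sum((e_t - e_(t-1))^2) / sum(e_t^2). The simulated error series and the value 0.8 are assumptions for illustration.

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson d: sum of squared first differences over sum of squares."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(4)
u = rng.normal(size=2000)   # errors with no serial correlation

# Errors with positive first-order serial correlation (coefficient 0.8)
e_pos = np.zeros(2000)
for t in range(1, 2000):
    e_pos[t] = 0.8 * e_pos[t - 1] + u[t]

d_white = durbin_watson(u)      # near 2
d_pos = durbin_watson(e_pos)    # well below 2
print(d_white, d_pos)
```

In large samples d is roughly 2 * (1 - r), where r is the correlation between e_t and e_(t-1), which is why the range runs from 0 to 4.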

28
Q

What remedies are there for individual observation error terms correlated with one another: serial correlation (failure of classical assumption 4)?

A

Method 1: For first-order serial correlation (e_t = ρ·e_(t-1) + u_t), use “Generalized Least Squares”

Cochrane-Orcutt Technique:
Step 1: Run an OLS regression of Y on the X variables and compute the error terms. Regress e_t on e_(t-1) to estimate ρ.
Step 2: Transform the dependent and independent variables as shown in Equation 9.2, run the regression, and get the estimated B coefficients.

Method 2: Correct the standard errors for serial correlation using Newey-West standard errors.
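
A sketch of one pass of Method 1 on simulated data. Equation 9.2 is not reproduced in these notes, so the transformation used here is an assumption: the standard quasi-difference Y_t - ρ·Y_(t-1) and X_t - ρ·X_(t-1). All data and coefficient values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
x = rng.normal(size=n)
u = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + u[t]   # first-order serial correlation, rho = 0.7
y = 1.0 + 2.0 * x + e

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Step 1: OLS, then regress e_t on e_(t-1) (no intercept) to estimate rho
X = np.column_stack([np.ones(n), x])
resid = y - X @ ols(X, y)
rho_hat = np.sum(resid[1:] * resid[:-1]) / np.sum(resid[:-1] ** 2)

# Step 2: quasi-difference the variables (assumed form) and rerun OLS
y_star = y[1:] - rho_hat * y[:-1]
x_star = x[1:] - rho_hat * x[:-1]
# The constant column becomes (1 - rho_hat), so its coefficient is B0 itself
X_star = np.column_stack([np.full(n - 1, 1.0 - rho_hat), x_star])
b0_hat, b1_hat = ols(X_star, y_star)
print(rho_hat, b0_hat, b1_hat)
```

The full technique typically repeats these two steps until the estimate of ρ stops changing; only a single pass is shown here.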

29
Q

What is the nature of the problem of individual observation error terms correlated with one another: spatial or intra-class correlation (failure of classical assumption 4)?

A

The error terms (and thus the Y values, the outcomes) of observations in the same area, class, cluster, or group are statistically dependent on one another.

30
Q

What are the consequences of individual observation error terms correlated with one another: spatial or intra-class correlation (failure of classical assumption 4)?

A

Coefficients are unbiased (assuming no “fixed effects” - which we will discuss next week) but not efficient, i.e., some other alternative might produce estimates closer to the true values of the betas.
- Estimated standard errors are biased.

31
Q

How is the problem of individual observation error terms correlated with one another: spatial or intra-class correlation (failure of classical assumption 4) diagnosed?

A

Theory. Consider whether observations are grouped in meaningful ways (e.g., neighborhoods) such that the group has commonalities in their error terms.

32
Q

What remedies are there for the problem of individual observation error terms correlated with one another: spatial or intra-class correlation (failure of classical assumption 4)?

A

Use “clustered robust standard errors”

May also want to have separate intercepts for each area/class/group. Separate intercepts can be estimated via “random effects” or “fixed effects” specifications. Only “fixed effects” are discussed in this course.
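
A rough sketch of what clustered robust standard errors do, using the basic sandwich formula with no small-sample correction (statistical packages usually apply one, so their numbers will differ slightly). The group structure and data are simulated assumptions, and the helper name `cluster_robust_se` is hypothetical.

```python
import numpy as np

def cluster_robust_se(X, y, groups):
    """OLS coefficients with cluster-robust (sandwich) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        idx = groups == g
        score = X[idx].T @ resid[idx]   # sum of x_i * e_i within the group
        meat += np.outer(score, score)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(6)
n_groups, per_group = 200, 20
groups = np.repeat(np.arange(n_groups), per_group)
x = rng.normal(size=n_groups)[groups]       # regressor varies only by group
shock = rng.normal(size=n_groups)[groups]   # error component shared within a group
y = 1.0 + 2.0 * x + shock + rng.normal(size=n_groups * per_group)
X = np.column_stack([np.ones(len(y)), x])

beta, se_cluster = cluster_robust_se(X, y, groups)

# Conventional OLS standard errors, which ignore the grouping
resid = y - X @ beta
sigma2 = resid @ resid / (len(y) - X.shape[1])
se_ols = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
print(se_ols[1], se_cluster[1])
```

In this setup the conventional standard error is far too small, because it treats the 4,000 observations as independent when they are really 200 groups with shared errors.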

33
Q

What is the nature of the problem with heteroskedasticity (failure of classical assumption 5)?

A

The error term does not have constant variance; the variance may (or may not) depend on the value of Xk. This violates the assumption that there is equal variance in the distribution of Y’s, and thus in the residuals. It could result from the variance in the error terms being related to an explanatory factor or to the outcome.

34
Q

What are the consequences of the problem with heteroskedasticity (failure of classical assumption 5)?

A
• Coefficients are unbiased, but inefficient.
• Estimated standard errors of the coefficients are biased.

35
Q

How is the problem with heteroskedasticity (failure of classical assumption 5) diagnosed?

A

Scatterplots of Y on Xk or error term on Xk.

36
Q

How is the problem with heteroskedasticity (failure of classical assumption 5) diagnosed with the White Test?

A
• Run the regression, obtain the residuals and square these residuals (e^2)
• Run a regression of e^2 on each independent variable, each independent variable squared, and the interactions of each independent variable with each of the other independent variables
• Compute NR^2, where N is the number of observations (the sample size) and R^2 is the R^2 from the regression in step 2 above.
• See if NR^2 is above the critical value found in Table B-8. If so, reject the null hypothesis of homoskedastic errors.
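
The steps of the White test can be sketched as follows, taking N as the number of observations. The data are simulated with an error variance that grows with x1 (an assumption for illustration), and the helper name `white_nr2` is hypothetical.

```python
import numpy as np

def white_nr2(X, resid):
    """Return (N * R^2, degrees of freedom) for the White auxiliary regression.

    X holds the independent variables, without an intercept column.
    """
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, j] for j in range(k)]                   # each variable
    cols += [X[:, j] ** 2 for j in range(k)]              # each variable squared
    cols += [X[:, i] * X[:, j]                            # pairwise interactions
             for i in range(k) for j in range(i + 1, k)]
    A = np.column_stack(cols)
    e2 = resid ** 2
    fitted = A @ np.linalg.lstsq(A, e2, rcond=None)[0]
    r2 = 1 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2, A.shape[1] - 1

rng = np.random.default_rng(7)
n = 2000
x1 = rng.uniform(1, 5, size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x1, x2])
y = 1 + 2 * x1 - x2 + rng.normal(size=n) * x1   # error sd grows with x1

# Step 1: run the original regression and keep the residuals
D = np.column_stack([np.ones(n), X])
resid = y - D @ np.linalg.lstsq(D, y, rcond=None)[0]

# Steps 2 and 3: auxiliary regression of e^2, then N * R^2
stat, df = white_nr2(X, resid)
print(stat, df)
```

With two independent variables the auxiliary regression has 5 slope terms, so NR^2 is compared with a chi-square critical value with 5 degrees of freedom (11.07 at the 5% level).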
37
Q

What remedies are there for the problem with heteroskedasticity (failure of classical assumption 5)?

A

Redefine the variables: One method of handling this problem is to take the natural log of Y as the dependent variable.

Alternatively, suppose you are dealing with country or state data. You could define the outcome to be in per capita terms. For example, GDP may have heteroskedasticity present, while GDP per capita does not.

Use “weighted least squares” to fit the regression model instead of ordinary least squares. Weighted least squares gives greater weight to the observations with the smallest variance.
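
A minimal sketch of weighted least squares, assuming the error standard deviation of each observation is known and proportional to x. In practice the variance structure has to be modeled or estimated, and the data here are simulated.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5000
x = rng.uniform(1, 5, size=n)
sd = x                                   # assumed known: error sd proportional to x
y = 1.0 + 2.0 * x + rng.normal(size=n) * sd

X = np.column_stack([np.ones(n), x])
w = 1.0 / sd ** 2                        # weight = 1 / error variance

# WLS is OLS after scaling every row by the square root of its weight
sw = np.sqrt(w)
b_wls = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(b_ols, b_wls)
```

Both estimators are unbiased here; the point of the weighting is efficiency, since low-variance observations carry more information about the regression line.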