Exam 2 Quizzes Flashcards
(15 cards)
- Interpret the goals coefficient in the following regression.
library(fixest)  # provides feols()
estimate <- feols(
  bd_points_log ~ goals + assists + wc_win | club,  # club fixed effects
  data = df,
  cluster = "club")  # cluster standard errors by club
summary(estimate)
OLS estimation, Dep. Var.: bd_points_log
Observations: 189
Fixed-effects: club: 33
Standard-errors: Clustered (club)
         Estimate Std. Error   t value   Pr(>|t|)
goals    0.176069   0.034281  5.136076 1.3377e-05 ***
assists -0.063460   0.084675 -0.749447 4.5906e-01
wc_win   0.303101   0.109511  2.767761 9.3052e-03 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 0.93151 Adj. R2: 0.032904
Within R2: 0.075484
For each additional goal scored, Ballon d'Or points increase by about 18 percent within the same club, holding the other controls fixed. The "within the same club" part is necessary because the club fixed effects mean the estimate uses only variation within each club, not differences between clubs. Technically, because the dependent variable is logged, we would exponentiate the 0.18 coefficient and subtract 1 to get roughly 0.20 ($e^{0.18} - 1 \approx 0.20$), but the 18 percent figure is generally considered close enough.
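The gap between the quick reading and the exact one can be checked directly. A minimal sketch, using the unrounded goals coefficient from the output above (the variable names are illustrative):

```r
# Coefficient on goals, copied from the feols output above
b_goals <- 0.176069

# Quick reading: 100 * b percent per additional goal
approx_pct <- 100 * b_goals

# Exact reading for a logged outcome: 100 * (exp(b) - 1) percent
exact_pct <- 100 * (exp(b_goals) - 1)

round(c(approximate = approx_pct, exact = exact_pct), 1)
```

The two readings drift further apart as the coefficient grows, so the shortcut is safest for coefficients close to zero.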
What happens to the intercept in least squares dummy variable (LSDV) regression?
It is no longer the starting point for all observations. Instead, it is the baseline for the reference category: the group whose dummy variable R leaves out of the regression to avoid perfect collinearity. Each remaining dummy coefficient is that group's difference from the reference category.
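A small simulated example may make this concrete. The sketch below assumes made-up data with three clubs (A, B, C) whose true baselines are 1, 3, and 5; none of these names or numbers come from the course data.

```r
set.seed(1)
# Hypothetical toy data: three clubs with different baseline outcomes
toy <- data.frame(
  club = rep(c("A", "B", "C"), each = 20),
  x = rnorm(60)
)
baseline <- c(A = 1, B = 3, C = 5)
toy$y <- unname(baseline[toy$club]) + 0.5 * toy$x + rnorm(60, sd = 0.1)

# LSDV: factor(club) expands to dummies; R drops the first level ("A")
lsdv <- lm(y ~ x + factor(club), data = toy)
cf <- coef(lsdv)
cf
# (Intercept) is club A's baseline; factor(club)B and factor(club)C are
# differences from club A, not each club's own intercept
```

Club B's own baseline can be recovered as the intercept plus the factor(club)B coefficient.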
What does the bind_rows() command do in R, and why is it so useful?
It stacks observations on top of each other, matching columns by name. In contrast to rbind(), bind_rows() does not require the datasets to share the same set of columns: any column missing from one dataset is filled with NA.
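A minimal sketch of the behavior, assuming the dplyr package is installed; the two data frames are hypothetical and exist only for illustration.

```r
library(dplyr)

# Hypothetical data frames with different column orders and column sets
season1 <- data.frame(player = "A", goals = 10)
season2 <- data.frame(goals = 7, player = "B", assists = 3)

# bind_rows() matches columns by name and fills the gap with NA
stacked <- bind_rows(season1, season2)
stacked

# Base rbind() stops with an error when the column sets differ
rbind_result <- try(rbind(season1, season2), silent = TRUE)
inherits(rbind_result, "try-error")
```

This is why bind_rows() is convenient for stacking yearly or per-source files whose columns do not line up exactly.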
Complete Pooling
Complete pooling: It happens when the model has no fixed effects or random intercepts, although it can still include adjustments to the standard errors (such as clustering). There is only one intercept in the model, and it is shared by every observation.
No Pooling
- No pooling: it happens in fixed effects models, most visibly through least-squares dummy variable (LSDV) regression with different estimates for each group.
Partial Pooling
- Partial pooling: It happens in random intercept or random slope models. Assuming the group effects are drawn from a normal distribution, the random effects model shrinks each group/club's estimate toward the overall/grand mean, especially when groups have few observations or are outliers. By doing so, random effect estimates from multilevel models share information about variance between groups, avoid overfitting, and often yield more stable within-group estimates than modeling each group's intercept independently, as in fixed effects.
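The three pooling strategies can be compared side by side on simulated data. The sketch below is a simplified illustration, not what lmer() does internally: the variance components tau2 and sigma2 are assumed to be known rather than estimated, the plain average of all observations stands in for the grand mean, and all club names and numbers are hypothetical.

```r
set.seed(42)
# Hypothetical data: 4 clubs with very different numbers of observations
n_j  <- c(2, 5, 20, 50)
club <- rep(c("A", "B", "C", "D"), times = n_j)
true <- c(A = 2.0, B = 1.0, C = 1.6, D = 1.4)
y    <- unname(true[club]) + rnorm(length(club), sd = 1)

# Complete pooling: a single intercept (the grand mean) for everyone
complete <- mean(y)

# No pooling: each club gets its own mean
no_pool <- sapply(split(y, club), mean)

# Partial pooling with ASSUMED variance components:
# tau2 = between-club variance, sigma2 = residual variance
tau2   <- 0.04
sigma2 <- 1
w <- tau2 / (tau2 + sigma2 / n_j)   # weight placed on the club's own mean
partial <- complete + w * (no_pool - complete)

round(rbind(n = n_j, weight = w, no_pool = no_pool, partial = partial), 2)
```

Clubs with few observations get a small weight, so their estimates are pulled strongly toward the grand mean, while large clubs keep most of their own mean.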
Interpret the goals coefficient in the following regression.
model1 <- lmerTest::lmer(
bd_points_log ~ goals + assists + wc_win + (1 | club),
data = df)
summary(model1)
Linear mixed model fit by REML. t-tests use Satterthwaite’s method [
lmerModLmerTest]
Formula: bd_points_log ~ goals + assists + wc_win + (1 | club)
REML criterion at convergence: 550.6
Scaled residuals:
Min 1Q Median 3Q Max
-1.5707 -0.5142 -0.2275 0.3032 4.5095
Random effects:
Groups Name Variance Std.Dev.
club (Intercept) 0.04128 0.2032
Residual 1.00180 1.0009
Number of obs: 189, groups: club, 33
Fixed effects:
            Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  1.46846    0.11278  51.73985  13.021  < 2e-16 ***
goals        0.15776    0.04782 184.52401   3.299  0.00116 **
assists     -0.03443    0.09951 182.88801  -0.346  0.72977
wc_win       0.31372    0.18040 183.94723   1.739  0.08370 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) goals asssts
goals -0.424
assists -0.125 -0.338
wc_win -0.338 0.013 -0.013
For each additional goal scored, Ballon d'Or points increase by about 16 percent for the same club, holding the other controls fixed. The "for the same club" part is necessary because the random intercept conditions the estimate on club, so it describes change within a club rather than differences between clubs. Technically, because the dependent variable is logged, we would exponentiate the 0.158 coefficient and subtract 1 to get roughly 0.17 ($e^{0.158} - 1 \approx 0.17$), but the 16 percent figure is generally considered close enough.
Which causal inference assumption does common support refer to? How would you show if something has common support?
Common support refers to the positivity assumption. You can show common support through overlapping density plots in which one line is treatment and the other line is control. If the two lines are overlapping, then there is common support.
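A minimal base R sketch of such a plot, using simulated data with a single confounder z. The data, coefficient values, and variable names are all assumptions made for illustration.

```r
set.seed(7)
# Hypothetical data: one confounder z that drives treatment uptake
n <- 500
z <- rnorm(n)
treat <- rbinom(n, 1, plogis(0.8 * z))

# Estimated propensity scores from a logistic regression
ps <- fitted(glm(treat ~ z, family = binomial))

# Overlapping densities: treated (solid) vs. control (dashed)
d1 <- density(ps[treat == 1], from = 0, to = 1)
d0 <- density(ps[treat == 0], from = 0, to = 1)
plot(d1, main = "Propensity score overlap", xlab = "Propensity score",
     ylim = range(0, d1$y, d0$y))
lines(d0, lty = 2)
legend("topright", legend = c("Treated", "Control"), lty = c(1, 2))

# Numeric check: range of scores observed in both groups
overlap <- c(lower = max(min(ps[treat == 1]), min(ps[treat == 0])),
             upper = min(max(ps[treat == 1]), max(ps[treat == 0])))
overlap
```

Regions where only one curve has mass are regions without common support; units there have no comparable counterparts in the other group.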
Which causal inference assumption does balance refer to? How would you show if something has balance?
Balance refers to the ignorability assumption. You can show balance through a balance table, which depicts the difference between the treatment and control distributions of each confounding $Z$ variable. If the p-values for the differences between these distributions are below .05, or if the absolute standardized mean differences are above 0.1, then the groups are poorly balanced.
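A balance table along these lines can be built by hand in base R. The sketch below uses simulated data in which z1 affects treatment and z2 does not; the smd() helper is a hypothetical function written for this example, not part of any package.

```r
set.seed(11)
# Hypothetical data: confounder z1 affects treatment, z2 does not
n  <- 400
z1 <- rnorm(n)
z2 <- rnorm(n)
treat <- rbinom(n, 1, plogis(1.0 * z1))

# Standardized mean difference: difference in means over the pooled SD
smd <- function(z, treat) {
  m1 <- mean(z[treat == 1])
  m0 <- mean(z[treat == 0])
  s  <- sqrt((var(z[treat == 1]) + var(z[treat == 0])) / 2)
  (m1 - m0) / s
}

balance <- data.frame(
  covariate = c("z1", "z2"),
  smd       = c(smd(z1, treat), smd(z2, treat)),
  p_value   = c(t.test(z1 ~ treat)$p.value, t.test(z2 ~ treat)$p.value)
)
balance$imbalanced <- abs(balance$smd) > 0.1   # the 0.1 rule of thumb
balance
```

The same table computed again after matching or weighting shows whether the adjustment actually improved balance.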
Should we use propensity scores for matching or weighting? Please provide a one-word answer
Weighting
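As an illustration of what weighting with propensity scores looks like, here is a minimal inverse probability weighting sketch on simulated data. The true effect of 2, the single confounder z, and all variable names are assumptions made for this example.

```r
set.seed(3)
# Hypothetical data: z confounds treatment and outcome; true effect = 2
n <- 5000
z <- rnorm(n)
treat <- rbinom(n, 1, plogis(z))
y <- 2 * treat + 1.5 * z + rnorm(n)

# Step 1: propensity scores from a logistic regression
ps <- fitted(glm(treat ~ z, family = binomial))

# Step 2: inverse probability weights targeting the average treatment effect
w <- ifelse(treat == 1, 1 / ps, 1 / (1 - ps))

# Step 3: weighted outcome regression vs. the naive comparison
naive <- unname(coef(lm(y ~ treat))["treat"])
ipw   <- unname(coef(lm(y ~ treat, weights = w))["treat"])
c(naive = naive, ipw = ipw)
```

The default standard errors from the weighted lm() ignore that the weights were estimated, so this sketch is only meant to show the point estimate.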
What is the basic intuition behind the matching frontier? Explain in words and draw the graph(s).
The idea is that there is a trade-off between balance and sample size. As you decrease the sample size by pruning poorly matched units, you increase balance, but only to a point. Beyond that point, there isn't much utility from a balance/causal inference perspective in dropping more units. From an inference perspective, the idea is to estimate the effects for all points along the frontier to get a sense of the external validity of the estimates. This way, we can see whether the estimates are sensitive to which samples are picked.
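The shape of the graph can be sketched with a simple simulation. Note the assumptions: the greedy pruning rule below is an illustrative stand-in, not the published matching frontier algorithm, and the data and variable names are made up.

```r
set.seed(21)
# Hypothetical data: controls differ from treated units on one covariate z
z_treated <- rnorm(100, mean = 1)
z_control <- rnorm(300, mean = 0)

# Greedy pruning (illustrative only): drop the controls farthest from the
# treated mean first
ord <- order(abs(z_control - mean(z_treated)), decreasing = TRUE)
n_pruned <- seq(0, 250, by = 10)
imbalance <- sapply(n_pruned, function(k) {
  kept <- if (k == 0) z_control else z_control[-ord[seq_len(k)]]
  abs(mean(z_treated) - mean(kept))
})

# Frontier-style plot: imbalance against the number of units dropped
plot(n_pruned, imbalance, type = "b",
     xlab = "Control units pruned",
     ylab = "Absolute mean difference in z")
```

Estimating the treatment effect at each point along the x-axis, rather than at a single chosen sample, is what shows how much the estimate depends on the pruning decision.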
When estimating any difference-in-differences using regression, what is the coefficient of interest? Provide an example to show your knowledge.
The coefficient of interest is $\beta_3$, the coefficient on $\text{Treatment} \times \text{Time}$, such as in the following regression:
$$Y = \beta_0 + \beta_1 (\text{Treatment}) + \beta_2 (\text{Time}) + \beta_3 (\text{Treatment} \times \text{Time}) + \epsilon$$
Take Card and Krueger (1994). In this study, the treatment is New Jersey (where the minimum wage law got passed), and we are interested in the post-treatment period. Thus, in Card and Krueger (1994), the main coefficient of interest is $\beta_3$, the coefficient on $\text{NJ} \times \text{Post-treatment period}$, in the following regression:
$$\text{Employment} = \beta_0 + \beta_1 (\text{NJ}) + \beta_2 (\text{Post-treatment period}) + \beta_3 (\text{NJ} \times \text{Post-treatment period}) + \epsilon$$
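To see why the interaction coefficient is the difference-in-differences, it helps to run the regression on simulated data and compare it with the four group means. The data below are hypothetical (true effect set to 3) and are not the Card and Krueger data.

```r
set.seed(5)
# Hypothetical 2x2 data: true difference-in-differences effect = 3
n <- 2000
treatment <- rbinom(n, 1, 0.5)   # 1 = treated group
time      <- rbinom(n, 1, 0.5)   # 1 = post-treatment period
y <- 10 + 2 * treatment + 1 * time + 3 * treatment * time + rnorm(n)

# The interaction term is the coefficient of interest
did <- lm(y ~ treatment * time)
b3  <- unname(coef(did)["treatment:time"])

# The same number from the four group means:
# (treated post - treated pre) - (control post - control pre)
m  <- tapply(y, list(treatment, time), mean)
dd <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])

c(regression = b3, group_means = dd)
```

With only the two dummies and their interaction, the regression is saturated, so the two numbers agree up to floating-point error.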
What does staggered adoption mean in the context of difference-in-differences? Be specific and provide an example
Staggered adoption simply means that the exposure to the treatment takes place for different units at different time periods. In Motolinia (2021), not all legislative districts in Mexico adopted the lifting of the ban on reelection at the same time.
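In data terms, staggered adoption means the treatment dummy switches on in different years for different units. A minimal sketch with a hypothetical three-district panel (the districts, years, and adoption dates are invented, not taken from Motolinia's data):

```r
# Hypothetical panel: 3 districts observed 2015-2020
panel <- expand.grid(district = c("A", "B", "C"), year = 2015:2020,
                     stringsAsFactors = FALSE)

# Each district adopts in a different year; C is never treated
adoption_year <- c(A = 2016, B = 2018, C = Inf)
panel$treated <- as.integer(panel$year >= adoption_year[panel$district])

# Rows = districts, columns = years, cells = treatment status
treat_table <- xtabs(treated ~ district + year, data = panel)
treat_table
```

The staircase pattern of ones in the table is the signature of staggered adoption; in a classic two-period design every treated unit would switch on in the same column.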
Explain the difference-in-differences strategy in Helms (2024) in detail.
Helms (2024) uses the end of the Multi-Fibre Agreement (MFA) in 2005 as an exogenous shock/treatment. He then interacts the post-treatment time dummy variable with district-level employment in the textile sector in 2004. The idea is that districts with high levels of textile employment will see more riots in the post-treatment period.
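The general shape of this design, a post-period dummy interacted with a continuous pre-period exposure measure, can be sketched on simulated data. This is not Helms's actual specification or data: the variable names (textile_2004, post, riots), the effect size of 2, and the panel dimensions are all assumptions for illustration.

```r
set.seed(9)
# Hypothetical district-year panel; all names and numbers are illustrative
d <- expand.grid(district = 1:300, year = 2000:2009)
exposure <- runif(300)                    # stand-in for 2004 textile employment
d$textile_2004 <- exposure[d$district]
d$post <- as.integer(d$year >= 2005)      # periods after the MFA ended
d$riots <- 1 + 0.5 * d$textile_2004 + 0.2 * d$post +
  2 * d$textile_2004 * d$post + rnorm(nrow(d))

# The interaction is the coefficient of interest: how much more the outcome
# rises after 2005 in districts with higher pre-period exposure
fit <- lm(riots ~ textile_2004 * post, data = d)
b_interaction <- unname(coef(fit)["textile_2004:post"])
b_interaction
```

An applied version would typically add district and year fixed effects and cluster the standard errors by district, which this sketch leaves out for brevity.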