CHAPTER 10 Controlling for Confounders Flashcards

(84 cards)

1
Q

What is the main purpose of controlling for confounders?

A

To mitigate bias arising from confounders.

2
Q

What is the most common method to control for a confounder?

A

Including it in a regression.

3
Q

True or False: Controlling for confounders eliminates all bias in a study.

A

False.

4
Q

What should we typically control for, confounders or mechanisms?

A

Confounders.

5
Q

What does controlling involve in statistical analysis?

A

Finding the correlation between two variables while holding other variables constant.

6
Q

In the context of U.S. Congress, which party is more likely to vote conservatively?

A

Republicans.

7
Q

What does a higher ACU score indicate?

A

A more conservative voting record.

8
Q

What was the average ACU score for Republicans in 1997 according to the data?

A

83.

9
Q

What was the average ACU score for Democrats in 1997 according to the data?

A

19.

10
Q

How much more conservatively do Republicans vote compared to Democrats on average?

A

64 ACU points.

11
Q

What is a potential confounder that affects both party membership and voting records?

A

Personal ideology.

12
Q

What survey was administered to congressional candidates to measure personal ideology?

A

National Political Awareness Test (NPAT).

13
Q

What does controlling for personal ideology mean in this context?

A

Comparing voting records of Democrats and Republicans with similar NPAT scores.

14
Q

What does Table 10.2 illustrate about the difference in voting records after controlling for ideology?

A

The difference shrinks substantially compared to the unadjusted difference.

15
Q

What is the unit of analysis in the regression model discussed?

A

An individual representative.

16
Q

What does the coefficient β1 in the regression represent?

A

The correlation between ACU score and being a Republican, controlling for personal ideology.

17
Q

What was the estimated value of β1 from the regression on the data?

A

24.

18
Q

Why might the estimate of the causal effect of party discipline still be questionable?

A

Due to the presence of other confounders beyond personal ideology.

19
Q

What is a heterogeneous treatment effect?

A

When the effect of a treatment varies across different units of observation.

20
Q

Why is it important to consider heterogeneous treatment effects when controlling for confounders?

A

It can change the subset of units for which we estimate the average effect.

21
Q

What are ATE and LATE in the context of treatment effects?

A

ATE is average treatment effect; LATE is local average treatment effect.

22
Q

What does ATE stand for?

A

Average Treatment Effect

23
Q

What does LATE stand for?

A

Local Average Treatment Effect

24
Q

True or False: LATE and ATE are always the same.

A

False

25
What are the key ingredients in any regression for causal inference?
* Dependent variable
* Treatment variable
* Control variables
26
What is a dependent variable?
The outcome you are trying to understand
27
What is a treatment variable?
The feature of the world whose effect on the dependent variable you are trying to estimate
28
What are control variables?
Potential confounders included in the regression to reduce bias
29
In the regression equation, what do the parameters α, β, and γ represent?
* α: intercept
* β: effect of the treatment
* γ: effect of the control variable
30
What does the error term ε represent in a regression?
Idiosyncratic factors reflecting differences from predicted outcomes
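The parameters in the two cards above fit together in a single regression equation. As a sketch, writing Y for the dependent variable, T for the treatment, and X for one control variable (the subscript i for the unit of observation is an assumed notation, not taken from the cards):

```latex
Y_i = \alpha + \beta T_i + \gamma X_i + \varepsilon_i
```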
31
What does BLACEF stand for?
Best Linear Approximation to the Conditional Expectation Function
32
True or False: OLS regression provides the best linear approximation without knowing the data-generating process.
True
33
What happens if there are no baseline differences across values of T after controlling for X?
BLACEF corresponds to the average effect of T on Y
34
What does the omitted variable bias formula quantify?
The bias associated with failing to include a confounder in regression
35
What is the formula for omitted variable bias?
$\beta^{S} - \beta = \pi \cdot \gamma$
36
What does π represent in the omitted variable bias formula?
The correlation between T and X
37
What does γ represent in the omitted variable bias formula?
The effect of the control variable on the outcome
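The omitted variable bias formula in the cards above can be checked numerically. The following is a minimal sketch on synthetic data, not data from the chapter. It assumes that the S-labeled β is the coefficient on T from a regression that omits X, that the plain β is the coefficient on T when X is included, and that π is the slope from regressing X on T. The helper `ols` and all numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Synthetic data: X is a confounder related to both T and Y.
T = rng.normal(size=n)
X = 0.5 * T + rng.normal(size=n)
Y = 1.0 + 2.0 * T + 3.0 * X + rng.normal(size=n)

def ols(y, *cols):
    """Return OLS coefficients: intercept first, then one per column."""
    A = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

beta_short = ols(Y, T)[1]            # coefficient on T, omitting X
_, beta_long, gamma = ols(Y, T, X)   # coefficients on T and X, controlling for X
pi = ols(X, T)[1]                    # slope from regressing X on T

bias = beta_short - beta_long
print(bias, pi * gamma)              # the two quantities agree up to rounding
```

Because X is positively related to both T and Y in this setup, the bias is positive, which matches the sign rule in the cards that follow.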
38
If an unobserved confounder is positively related to both T and Y, what is the sign of the bias?
Positive bias
39
What can cause an under-estimate of the effect of T?
A confounder that is positively related to T but negatively related to Y (or vice versa), which makes the bias negative
40
How does controlling for a variable (X) affect the relationship between T and Y?
It changes the estimated relationship if X is correlated with T and has an independent relationship with Y
41
What is a potential confounder when regressing income on height?
Gender
42
What does running separate regressions for men and women allow us to see?
The correlation between income and height separately for each gender
43
What happens to the slope when separate regressions for men and women are run?
The slope is greater for men than for women
44
What is the purpose of running a regression of income on both height and gender?
To obtain a summary estimate of the correlation between income and height, controlling for gender
45
What is the relationship between the slopes of two lines when controlling for gender?
The two lines are constrained to share one slope, which is a weighted average of the slopes from the separate regressions for men and women
46
What does the intercept of the regression line for women represent?
Predicted income for women who are 5 feet tall
47
What is the predicted income for women who are 5 feet tall?
The predicted income for women who are 5 feet tall is represented by the intercept of the regression line for women. ## Footnote This intercept is a key parameter in the regression model.
48
What is the predicted difference in income between men and women of the same height?
The predicted difference in income between men and women of the same height is represented by the vertical gap between the two parallel regression lines, that is, the coefficient on the gender variable. ## Footnote The shared slope, by contrast, indicates how income varies with height when controlling for gender.
49
What is the average relationship between height and income, controlling for gender?
The average relationship between height and income, controlling for gender, is approximately 8.1. ## Footnote This value is derived after controlling for the confounding effect of gender.
50
What was the previous estimate for the slope of the relationship between height and income before controlling for gender?
The previous estimate for the slope was 14.8. ## Footnote This estimate was corrected to 8.1 after accounting for the confounding influence of gender.
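The height, income, and gender cards above can be illustrated with a small simulation. This is a sketch on synthetic data only: it does not reproduce the 14.8 or 8.1 estimates, and every number, variable name, and the helper `ols` are hypothetical. It shows the two ideas from the cards: omitting gender inflates the height slope, and with a gender dummy the two fitted lines share one slope and differ by a constant gap.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Synthetic data: men are taller on average and earn more at any given height.
male = rng.integers(0, 2, size=n)
height = 64 + 5 * male + rng.normal(0, 3, size=n)
income = 20 + 1.0 * height + 10 * male + rng.normal(0, 5, size=n)

def ols(y, *cols):
    """Return OLS coefficients: intercept first, then one per column."""
    A = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Regression of income on height alone (gender omitted).
slope_no_control = ols(income, height)[1]

# Regression of income on height and a gender dummy.
intercept, slope_control, gap = ols(income, height, male)

print(slope_no_control)  # larger, because it absorbs the gender difference
print(slope_control)     # close to the true within-gender slope of 1.0
print(gap)               # predicted male-female difference at the same height
```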
51
What does controlling for a confounder do to the precision of estimates?
Controlling for a confounder can either improve or harm the precision of estimates. ## Footnote The effect on precision depends on the correlation of the control variable with the outcome and treatment.
52
What is p-hacking?
P-hacking is the practice of trying control variables until achieving a statistically significant estimate. ## Footnote This practice is discouraged as it can lead to misleading results.
53
What is the NPAT score in the context of the congressional politics example?
The NPAT score is a continuous measure of personal political ideology, the confounder that the analysis controls for. ## Footnote This score helps to control for ideology in the regression of ACU score on party affiliation.
54
What does the regression of ACU Rating on NPAT Conservativeness score aim to achieve?
The regression aims to control for personal ideology and estimate the effect of party affiliation on ACU Rating. ## Footnote It results in a continuous measure of the relationship between ideology and party.
55
What does the gap between the two regression lines represent in the context of ACU ratings?
The gap between the two regression lines represents the difference in predicted ACU Rating between Republicans and Democrats for a given NPAT score. ## Footnote This allows for a nuanced understanding of party differences across ideological lines.
56
What are the conditions for controlling to yield an unbiased estimate of a causal effect?
To yield an unbiased estimate, all confounders must be controlled for, and there must be no reverse causality. ## Footnote This highlights the challenges in achieving true causal inference in observational studies.
57
What is reverse causation?
Reverse causation occurs when the outcome affects the treatment, complicating causal interpretations. ## Footnote It emphasizes the difficulty in establishing clear cause-and-effect relationships.
58
What was the main finding of the study on social media usage and subjective well-being?
The experimental estimate of Facebook usage's effect on subjective well-being was about one-third the size of the estimates from the simple correlation. ## Footnote This suggests that controlling for confounders still leads to an over-estimate of the true effect.
59
What is the significance of a regression table?
A regression table summarizes the results of regression analyses, including coefficients, standard errors, and statistical significance. ## Footnote Understanding regression tables is crucial for interpreting the results of statistical analyses.
60
What does the first column of a regression table typically contain?
The first column of a regression table typically contains labels for the variables involved in the regression. ## Footnote This helps in identifying which variables are included in the analysis.
61
How is statistical significance indicated in a regression table?
Statistical significance is indicated by stars next to the coefficient estimates in the regression table. ## Footnote This helps to quickly identify which results are statistically reliable.
62
What is the estimated coefficient on Republican before controlling for NPAT score?
64.32 ## Footnote This is the unadjusted difference in ACU rating between Republicans and Democrats, before considering NPAT categories or scores.
63
What happens to the coefficient on Republican when controlling for NPAT categories?
It drops to 23.74 ## Footnote This indicates a substantial reduction in the estimated difference.
64
What is the coefficient on Republican when controlling for the continuous NPAT Conservativeness score?
24.28 ## Footnote This value comes from using a continuous measure of ideology in place of the NPAT categories.
65
What does the r-squared statistic represent in regression analysis?
The proportion of variation in the dependent variable that is predicted by the explanatory variables ## Footnote A higher r-squared indicates a better fit of the model to the data.
66
How many observations were included in the regression analysis?
349 ## Footnote This number reflects the congresspeople who completed the NPAT survey in 1997.
67
What is the coefficient estimate for the NPAT category 81-100 in relation to ACU rating?
59.77 ** ## Footnote This coefficient indicates a significant positive relationship.
68
True or False: A high r-squared statistic alone guarantees understanding of causal relationships.
False ## Footnote A high r-squared does not imply that all confounders have been controlled for or that the model is correctly specified.
69
What is a confounder in the context of regression analysis?
A variable that affects both the treatment and the outcome ## Footnote It can introduce bias if not controlled for.
70
What is the challenge when a variable is both a confounder and a mechanism?
It complicates the decision on whether to control for it ## Footnote Controlling for it removes confounding bias but also blocks the part of the treatment's effect that runs through it.
71
What is the local average treatment effect (LATE)?
The average treatment effect for a specific subset of the population ## Footnote This concept helps in understanding causal effects in targeted groups.
72
Fill in the blank: Controlling is a way to account for _______ and obtain better estimates.
[confounders] ## Footnote Controlling helps reduce bias in estimating causal relationships.
73
What is the purpose of matching in statistical analysis?
To control for confounding variables by pairing treated and untreated units with similar characteristics ## Footnote This technique allows for comparison while accounting for observable differences.
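A minimal sketch of the matching idea described above, using exact matching on a single discrete characteristic. The units, labels, and outcomes are made up for illustration. Weighting each group by its number of treated units is one common choice and is an assumption here, not something the card specifies.

```python
from collections import defaultdict

# Hypothetical units: (treated, characteristic, outcome)
units = [
    (1, "a", 10.0), (0, "a", 7.0),
    (1, "a", 12.0), (0, "b", 3.0),
    (1, "b", 6.0),  (0, "b", 5.0),
]

def mean(values):
    return sum(values) / len(values)

# Group outcomes by characteristic, split into untreated (0) and treated (1).
groups = defaultdict(lambda: {0: [], 1: []})
for treated, x, y in units:
    groups[x][treated].append(y)

# Within each group that has both kinds of unit, compare treated to untreated.
diffs, weights = {}, {}
for x, g in groups.items():
    if g[0] and g[1]:
        diffs[x] = mean(g[1]) - mean(g[0])
        weights[x] = len(g[1])  # weight by number of treated units

matched_estimate = sum(diffs[x] * weights[x] for x in diffs) / sum(weights.values())

# Naive comparison that ignores the characteristic.
naive_estimate = (mean([y for t, _, y in units if t == 1])
                  - mean([y for t, _, y in units if t == 0]))

print(matched_estimate, naive_estimate)
```

In this toy data the treated units are concentrated in group "a", where outcomes are higher regardless of treatment, so the naive comparison is larger than the matched one.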
74
What is omitted variables bias?
The bias resulting from failing to control for some confounder ## Footnote This can lead to incorrect estimates of causal effects.
75
What is a dummy variable?
A variable indicating whether a unit has a particular characteristic (1 for yes, 0 for no) ## Footnote These are often used in regression models to represent categorical data.
76
What is the treatment variable in regression analysis?
The variable representing the feature whose effect on the dependent variable is being estimated ## Footnote It is crucial for understanding causal relationships.
77
What is a dependent or outcome variable?
The variable in the data that corresponds to the feature being explained or predicted ## Footnote It is the primary focus of the analysis.
78
What should researchers be cautious about when controlling for variables?
Unobservable confounders, reverse causation, and confounders that are also mechanisms ## Footnote These factors can still lead to biased estimates despite controlling for other variables.
79
What is the main goal of controlling variables in regression?
To generate more credible estimates by comparing treated and untreated units with similar characteristics ## Footnote This helps in approximating causal effects more accurately.
80
What is the main focus of the study by Allcott et al. (2020)?
The Welfare Effects of Social Media ## Footnote Published in the American Economic Review, the study investigates how social media impacts subjective well-being.
81
What trend was observed in survey response rates in later elections?
Response rates declined considerably in subsequent elections ## Footnote This decline is why data from the late 1990s is being presented.
82
How many NPAT categories are there, and which one is omitted from the regression?
There are five NPAT categories, and the 1st–20th percentile category is omitted ## Footnote One category must be left out of the analysis to serve as the baseline.
83
Why can’t both Democrat and Republican variables be included in the regression?
Every member is either one or the other ## Footnote This prevents separate identification of their effects.
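The identification problem in the card above can be seen directly in the design matrix. This sketch uses a hypothetical six-member chamber. With an intercept, a Republican dummy, and a Democrat dummy, the two dummies always sum to the intercept column, so the matrix has three columns but only rank two.

```python
import numpy as np

# Hypothetical party indicators for six members.
republican = np.array([1, 0, 1, 1, 0, 0])
democrat = 1 - republican            # every member is one or the other
intercept = np.ones_like(republican)

both = np.column_stack([intercept, republican, democrat])
one = np.column_stack([intercept, republican])

rank_both = np.linalg.matrix_rank(both)
rank_one = np.linalg.matrix_rank(one)

print(rank_both, both.shape[1])  # rank is below the number of columns
print(rank_one, one.shape[1])    # rank equals the number of columns
```

Dropping one of the two dummies restores full column rank, which is why only one party variable appears in the regression.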
84
What does including a Republican variable in the regression imply?
The coefficient is interpreted as the difference associated with being a Republican rather than a Democrat ## Footnote Democrats serve as the omitted baseline category.