Exam 1 Answers Flashcards
(22 cards)
Draw a fork DAG. Label the variables and explain the essence of a fork DAG with an example.
Example:
* smoking (X) → lung cancer (Y), but not the other way around (no reverse causality)
* genes (Z) → smoking (X), but not a two-way relationship (which would violate the acyclic property of a DAG)
* genes (Z) → lung cancer (Y): it is a confounding relationship
The fork represents common-cause confounding: genes (Z) cause both smoking (X) and lung cancer (Y).
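A minimal simulation sketch of the fork, assuming linear relationships and made-up coefficients (the value `true_effect = 0.5`, the confounder strength `2.0`, and the helper names are illustrative assumptions, not part of the exam answer). It shows that ignoring the common cause biases the estimated slope, while adjusting for it recovers the assumed effect.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(v, z):
    """Residuals of v after regressing it on z (with an intercept)."""
    n = len(v)
    mv, mz = sum(v) / n, sum(z) / n
    b = slope(z, v)
    return [vi - mv - b * (zi - mz) for vi, zi in zip(v, z)]

rng = random.Random(0)
n = 20_000
true_effect = 0.5  # assumed causal effect of X on Y (hypothetical value)

z = [rng.gauss(0, 1) for _ in range(n)]             # genes (common cause)
x = [zi + rng.gauss(0, 1) for zi in z]              # smoking, caused by genes
y = [true_effect * xi + 2.0 * zi + rng.gauss(0, 1)  # lung cancer, caused by both
     for xi, zi in zip(x, z)]

naive = slope(x, y)  # ignores the fork, so it absorbs the genes -> cancer path
# Adjusting for z: regress the z-residuals of y on the z-residuals of x.
adjusted = slope(residualize(x, z), residualize(y, z))

print(f"naive slope:    {naive:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

Under these assumed coefficients the naive slope is pulled well above `true_effect`, because part of the association runs through the common cause rather than through smoking.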
What is a mediator? Explain and draw the DAG
The mediator is M in the DAG X → M → Y. Mediation is the process by which a third
variable (i.e., the mediator) transmits the effect of an independent variable on a dependent
variable.
To better understand mediators, let's consider the effect of attending an elite high school
(X) on going to an elite university (Y). A key reason/mechanism why attending an elite
high school usually enables a student to go to an elite university is that elite high
schools have better teachers. We denote better teachers with M, not Z. We do so
because better teachers represent a pathway mechanism/mediator, not a confounder,
through which students can get into an elite university. In short, the three variables of
interest are:
* X: Attending an elite high school (main independent variable/treatment/exposure)
* Y: Going to an elite university (dependent variable/outcome)
* M: Better teachers in the elite high school (mediator)
For M to be a mediator, we need to meet two essential characteristics:
1. Elite high school (X) → elite university (Y). On average, students who attend elite
high schools go to elite universities more frequently. This is very uncontroversial.
2. Better teachers (M) must be on the path from elite high school (X) to elite
university (Y): that is, X → M → Y. The fact that better teachers lead to better
student academic outcomes is also very uncontroversial.
We also need to avoid reverse causality, yielding a third essential characteristic:
3. Elite university (Y) ↛ elite high school (X). Students can't get into university
without first going to high school, so such a path is impossible. More technically,
the no reverse causality assumption in mediation analysis with three variables is
called "sequential ignorability".
What is a direct effect, and how does it relate to mediation?
A direct effect refers to the part of the effect of X on Y that runs through every pathway other than the mediator M. In the example from question 2, it would be every other reason besides good teachers why going to an elite high school helps with getting into an elite university.
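As a hedged illustration, the split between direct and mediated effects can be written with the textbook linear decomposition. This assumes linear relationships, no interaction between X and M, and no unmeasured confounding; the coefficients $a$, $b$, $c'$, the intercepts, and the error terms $u_i$, $e_i$ are notation introduced here for illustration only.

```latex
\begin{align}
M_i &= \alpha_M + a X_i + u_i \\
Y_i &= \alpha_Y + c' X_i + b M_i + e_i \\
\text{total effect of } X \text{ on } Y
  &= \underbrace{c'}_{\text{direct effect}}
   + \underbrace{a b}_{\text{indirect effect through } M}
\end{align}
```

In the elite high school example, $ab$ would capture the better-teachers channel and $c'$ everything else.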
In the language of DAGs, what does it mean to close all relevant backdoor paths?
It means to close only the backdoor paths associated with confounders, not colliders or mediators (i.e., assuming an interest in some form of an average treatment effect as the estimand).
What is a collider? Draw the DAG and explain.
Colliders (C) are variables that, if adjusted for, can introduce a spurious relationship
between X and Y (X → C ← Y).
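A small simulation sketch of collider bias, under the assumption that X and Y are truly independent and that the collider is simply their sum (both choices, and the cutoff of 1.0, are illustrative). Conditioning on the collider, here by keeping only one stratum of it, makes the two independent variables look negatively related.

```python
import random

def corr(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
n = 20_000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]  # generated independently of x
c = [xi + yi for xi, yi in zip(x, y)]    # collider: caused by both x and y

corr_all = corr(x, y)  # full sample: close to zero

# "Adjust" for the collider by restricting to the stratum with a high value of c.
kept = [(xi, yi) for xi, yi, ci in zip(x, y, c) if ci > 1.0]
corr_within_stratum = corr([p[0] for p in kept], [p[1] for p in kept])

print(f"correlation, full sample:      {corr_all:.2f}")
print(f"correlation, collider stratum: {corr_within_stratum:.2f}")
```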
What is selection bias and how do you indicate it in a DAG?
Selection bias arises when the data available in your sample do not represent the population about which you are making your inference. You indicate selection bias with S.
Draw a DAG with selection bias on the dependent variable. What can we learn in such instances? Explain with an example.
In our smoking example, sample selection bias on the dependent variable (Y) entails
having a sample of either people mostly with lung cancer or mostly without lung cancer.
In either case, it is difficult to make any within-sample inference with respect to causality or prediction, because there is not enough variation in the outcome across people. For external validity, we can't make inferences to a larger population if we don't have (1) data representative of that population; or (2) variables in our sample to adjust for the sample-population differences.
Now, you may be wondering: even if we don't have variation in the dependent variable, people with lung cancer (Y), can't we just figure out if the independent variable, smoking (X), changes Y in some way? Technically, we can. As Brady and Collier (2010) highlight, selection on the dependent variable (Y) is core to the multiple case study approach. To that end, qualitative researchers often select some cases where Y is stronger, others where Y is weaker, and analyze the mechanisms/reasons explaining Y under each scenario. As some of you may have noticed, the multiple qualitative case study approach resembles mediation analysis and front door adjustment. The catch is that the qualitative approach is not as inferentially robust if we wish to generalize or transport results beyond the sample. As Bennett (2006) and Brady and Collier (2010) argue, it can be valuable to learn more about specific cases, even if external validity is not the goal. However, mediation analysis is a better option if we wish to generalize or transport results, because we can rely on the central limit theorem and the law of large numbers. By contrast, the set of potential outcomes is limited with the smaller samples in qualitative analysis.
You are presented with a regression in which the author uses female literacy as the independent variable (X) and overall literacy (Y) as the dependent variable. Does this seem like a valid setup? Why or why not?
No, it is not a valid setup. Female literacy is part and parcel of overall literacy.
Accordingly, there is no point in running a regression in which the independent variable
already partly explains the dependent variable by design.
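The "by design" point can be made explicit with an accounting identity. This is a sketch that assumes literacy rates are reported for two groups only; the female population share $s$ and the male literacy rate $X^{\text{male}}$ are symbols introduced here for illustration.

```latex
\begin{equation}
Y = s \, X + (1 - s) \, X^{\text{male}}, \qquad 0 < s < 1
\end{equation}
```

Because X enters Y mechanically with weight $s$, a positive coefficient on X reflects the definition of Y rather than a causal relationship.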
What is positivity in the context of internal validity? Provide an example to show that you understand it. It may also be helpful to draw something.
Whether the different manifestations of the independent variable (X) overlap across subgroups/strata of the treatment and control, taking into account selection bias (S) that can result in under- or over-coverage.
What is positivity in the context of external validity? Provide an example to show that you understand it. It may also be helpful to draw something.
Whether the different manifestations of the independent variable (X) and effect modifiers (M) overlap across the sample and population, taking into account
selection bias (S) that can result in under- or over-coverage.
What are generalizability and transportability?
Generalizability is when the sample is embedded within the population of interest, and transportability is when the sample corresponds to another population of interest.
What is an Intent to Treat (ITT) effect?
The effect of assigning the treatment, even if people did not comply with their treatment assignment. The ITT is often a very conservative estimand.
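A simulation sketch of why the ITT tends to be conservative, assuming one-sided noncompliance (units assigned to control never take the treatment). The compliance rate of 0.6 and the effect of 2.0 are hypothetical values chosen for illustration.

```python
import random

rng = random.Random(2)
n = 20_000
effect_if_treated = 2.0  # assumed effect of actually taking the treatment
compliance_rate = 0.6    # assumed share of assigned units that take it

assigned, outcome = [], []
for _ in range(n):
    a = rng.random() < 0.5                        # random assignment
    takes = a and rng.random() < compliance_rate  # only assigned units can comply
    y_i = rng.gauss(0, 1) + (effect_if_treated if takes else 0.0)
    assigned.append(a)
    outcome.append(y_i)

def mean(values):
    return sum(values) / len(values)

# ITT: compare outcomes by assignment, regardless of who actually took the treatment.
itt = (mean([o for o, a in zip(outcome, assigned) if a])
       - mean([o for o, a in zip(outcome, assigned) if not a]))

print(f"ITT estimate: {itt:.2f} (effect among those who took it: {effect_if_treated})")
```

Under these assumptions the ITT is roughly the compliance rate times the effect of taking the treatment, so noncompliance dilutes it toward zero.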
What are the treatment (X), instrument (I), dependent variable (Y), and confounder (Z) in Column (2) of Table 4 of Acemoglu, Johnson, and Robinson (2001)? Also, draw the DAG.
- X: average protection against expropriation risk (quality of institutions)
- I: settler mortality in colonial era
- Y: log GDP per capita
- Z: latitude
What is the exclusion restriction in Acemoglu, Johnson, and Robinson (2001)?
Settler mortality at the time of colonization (I) must not be directly related to current
GDP per capita (Y); it may affect Y only through the quality of institutions (X). If settler mortality at the time of colonization were related to current GDP through any other channel, it would violate the exclusion restriction assumption.
Explain two-stage least squares in instrumental variables. Show the math.
The idea is to use the exogenous variation in the first stage between the instrument (I)
and the treatment (X) to overcome the endogenous/reverse causal relationship between
the treatment (X) and outcome (Y). We attain the exogenous variation in the first
stage by putting X as the dependent variable:
$$X_i = \alpha + \beta_1 I_i + \varepsilon_i \tag{1}$$
where $\alpha$ is an intercept and $\varepsilon_i$ is an error term. Note how the independent variable (X)
serves as the dependent variable in the first stage. Now that we have the first stage
figured out, let's specify the second stage:
$$Y_i = \gamma + \delta \hat{X}_i + \varepsilon_i \tag{2}$$
Above, the hat on top of X indicates that we are not using all of X; we are only using the
predicted variation from the first stage. You may also note that we have used $\gamma$ (gamma)
for the intercept and $\delta$ (delta) for the coefficient to prevent confusion with the first stage.
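The two stages can be sketched on simulated data. Everything in the data-generating process below is an assumption for illustration: the unobserved confounder `u`, the first-stage strength `0.8`, and the true second-stage coefficient `delta_true = 1.0`. The helper `ols` is a plain single-regressor least-squares fit, not a library routine.

```python
import random

def ols(x, y):
    """Return (intercept, slope) from regressing y on a single regressor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((a - mx) * (c - my) for a, c in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
    return my - b * mx, b

rng = random.Random(3)
n = 50_000
delta_true = 1.0  # assumed true effect of the treatment on the outcome

instrument = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder of x and y
x = [0.8 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(instrument, u)]
y = [delta_true * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# First stage: regress the treatment on the instrument.
alpha_hat, beta1_hat = ols(instrument, x)
x_hat = [alpha_hat + beta1_hat * zi for zi in instrument]

# Second stage: regress the outcome on the predicted treatment.
gamma_hat, delta_hat = ols(x_hat, y)

# Naive OLS of y on x, for comparison; it is biased by the confounder u.
_, naive_slope = ols(x, y)

print(f"first-stage slope: {beta1_hat:.2f}")
print(f"2SLS estimate:     {delta_hat:.2f}")
print(f"naive OLS slope:   {naive_slope:.2f}")
```

Running the two stages by hand like this gives the 2SLS point estimate, but the standard errors from a manual second stage are not valid; dedicated IV routines correct them.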
What should you look for in the first stage of instrumental variables estimation?
* F-statistic above 10 or 11 (sufficient for full points)
* High R² and/or high correlation (helpful for external validity, but only worth half points)
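A sketch of how both diagnostics can be computed for a first stage with a single instrument, where the F-statistic equals R²/(1 − R²) × (n − 2) under homoskedastic errors. The function name `first_stage_diagnostics` and the simulated data are illustrative assumptions.

```python
import random

def first_stage_diagnostics(instrument, x):
    """Return (F-statistic, R^2) for regressing x on one instrument plus an intercept."""
    n = len(x)
    mi, mx = sum(instrument) / n, sum(x) / n
    sxy = sum((a - mi) * (b - mx) for a, b in zip(instrument, x))
    sii = sum((a - mi) ** 2 for a in instrument)
    sxx = sum((b - mx) ** 2 for b in x)
    r2 = sxy ** 2 / (sii * sxx)        # squared correlation = R^2 with one regressor
    f_stat = r2 / (1 - r2) * (n - 2)   # equals the squared t-statistic on the instrument
    return f_stat, r2

rng = random.Random(4)
n = 1_000
instrument = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + rng.gauss(0, 1.4) for zi in instrument]  # assumed first stage

f_stat, r2 = first_stage_diagnostics(instrument, x)
print(f"first-stage F = {f_stat:.1f}, R^2 = {r2:.3f}")
print("passes the F > 10 rule of thumb" if f_stat > 10 else "instrument looks weak")
```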
Draw the two potential DAG(s) for standard natural experiments and explain your rationale.
The experiment DAG is possible because Z is irrelevant if X is randomly assigned, even if the researcher doesn't control the assignment in a natural experiment. The fork DAG is possible because the as-if random assignment is not always perfect in natural experiments, so it is often necessary to control for Z.
Explain in detail at least one limitation of standard natural experiments.
- Nature doesn't assign every treatment as-if randomly:
  a. Randomized experiments have a similar problem:
     * Not everything can be randomly assigned for ethics or feasibility reasons
  b. Many major social science phenomena are not assigned as-if randomly:
     * Democracy/autocracy (political science)
     * Social capital (sociology)
     * GDP (economics)
- Overclaiming:
  a. Everyone wants to say that their treatment is as-if randomly assigned
     * It makes life easy when we can ignore the confounding (Z) variables, but
       that is frequently not the case
  b. Standard natural experiments are hard to verify
     * They necessitate great qualitative knowledge of treatment assignment
       causal-process observations (CPOs). If we can't show that the treatment
       assignment is as good as random through qualitative knowledge of the case,
       it is really not a natural experiment.
Balance tables are the main method that we have to show that observations are assigned as-if randomly, but balance tables are only useful if (i) we have access to all of the potential Z variables; and (ii) they are measured correctly. Often, we can't meet those two criteria.
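A minimal sketch of a balance check using standardized mean differences, one common way to fill in a balance table. The covariates `age` and `income`, their distributions, and the 0.1 rule-of-thumb threshold are assumptions for illustration. The check can only cover covariates that were actually observed and measured correctly, which is the limitation noted above.

```python
import random
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd

rng = random.Random(5)
n = 20_000
rows = []
for _ in range(n):
    treated = rng.random() < 0.5  # as-if random assignment (assumed)
    age = rng.gauss(40, 12)       # hypothetical observed covariate
    income = rng.gauss(50, 15)    # hypothetical observed covariate
    rows.append((treated, age, income))

balance = {}
for name, idx in (("age", 1), ("income", 2)):
    t = [r[idx] for r in rows if r[0]]
    c = [r[idx] for r in rows if not r[0]]
    balance[name] = standardized_mean_difference(t, c)

for name, smd in balance.items():
    flag = "balanced" if abs(smd) < 0.1 else "imbalanced"
    print(f"{name:>6}: SMD = {smd:+.3f} ({flag})")
```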
What is the main distinction between a standard natural experiment and a field experiment?
The researcher controls the randomization device in a field experiment, whereas nature or happenstance performs the randomization in standard natural experiments.
What is ignorability in the context of internal validity?
No unmeasured confounders.
What is ignorability in the context of external validity?
No unaccounted-for selection biases that generate differences between the sample and population in terms of effect modifiers.
What are strata?
Strata are subgroups that divide the population for meaningful analysis. Technically, strata are
non-overlapping as well.
(Note to grader: I didn't stress the non-overlapping part in my lecture notes, so you can give full points for the first part)