RDD Flashcards
(33 cards)
What is the key feature of the sharp RDD design?
The conditional probability of receiving treatment jumps directly from 0 to 1 at the threshold.
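In standard notation (not from the original card): with score $X_i$, cutoff $c$, and treatment indicator $D_i$,

$$
D_i = \mathbf{1}\{X_i \ge c\}, \qquad \tau_{\text{SRD}} = \lim_{x \downarrow c} E[Y_i \mid X_i = x] - \lim_{x \uparrow c} E[Y_i \mid X_i = x].
$$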
What about an RDD with a time index?
To define an RDD we need a score (running variable), a cutoff, and a treatment. It is usually a cross-sectional design since there is no time variation.
There should be no time index t in a pure RDD, only a unit index i.
How should we think about boundary RDDs?
For this to work we need two forcing variables, one for longitude and one for latitude. We then have two dimensions, which complicates things: we cannot just look at the shortest distance to the border. In a spatial design like this we also have a problem with multiple treatments. It should be clear in an RDD what the treatment is, not a bundle of things that change at once, since then we do not know what is causing the difference.
In a multi-dimensional boundary design we have multiple forcing variables (longitude/latitude). We therefore need to control for both of them and all the interactions between them.
There are some good, clear border-discontinuity designs, for example comparing minimum wages between two states. Then it is very clear how far away we are from the border, measured either as a one-dimensional distance or using longitude/latitude, which is more flexible. The KEY in using this kind of design is that it should be clear what the border actually is.
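A minimal sketch of what "control for both forcing variables and all their interactions" could look like on simulated data; all names, the border location, and the data-generating process are hypothetical:

```python
# Sketch: boundary RDD with two forcing variables (latitude/longitude),
# centered so the border runs along lat == 0. Hypothetical data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({"lat": rng.uniform(-1, 1, n), "lon": rng.uniform(-1, 1, n)})
df["treated"] = (df["lat"] >= 0).astype(int)
df["y"] = df["treated"] + 0.5 * df["lat"] + 0.3 * df["lon"] + rng.normal(size=n)

# Control for both forcing variables and all their interactions,
# allowing a separate surface on each side of the border.
model = smf.ols(
    "y ~ treated + lat + lon + lat:lon"
    " + treated:lat + treated:lon + treated:lat:lon",
    data=df,
).fit(cov_type="HC1")
print(model.params["treated"])  # estimated jump at the border
```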
Can we define control and treatment groups before the cutoff in an RDD?
This is the other way around from what we should do: the cutoff should define the treatment status. A paper that does this should be criticised a lot, since it is not a standard thing!
What is the key identification assumption we should worry about in an RDD?
An RDD is all about smoothness (sorting violates smoothness, and so on). Smoothness is the KEY assumption in an RDD. We should not combine it with assumptions like parallel trends.
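Formally (standard notation, not from the original card), smoothness means that the conditional expectations of both potential outcomes are continuous in the score at the cutoff:

$$
x \mapsto E[Y_i(0) \mid X_i = x] \quad \text{and} \quad x \mapsto E[Y_i(1) \mid X_i = x] \quad \text{are continuous at } x = c.
$$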
Can we use fixed effects in an RDD?
Using fixed effects in an RDD is non-standard and a warning sign: we should not mix different kinds of identification strategies. It is highly problematic if we have heterogeneous treatment effects, since the estimates will then change with the inclusion of fixed effects.
Studying BW-graphs, what should we look out for?
Estimates and confidence intervals that vary a lot across bandwidths.
What is the most important thing to look at when reading a research report?
The most important thing is to look at the design: does it really fit the empirical framework?
How about using multiple cutoffs and pooling them together in an RDD?
This is a non-standard RDD approach. It would be more acceptable to estimate a separate RDD for each cutoff; it is NOT OK to pool everything together. Pooling different cutoffs creates multiple forcing variables.
By normalizing the running variable so the different cutoffs collapse into a single one, we create bunching in the forcing variable. When we have discreteness we do not have smoothness. This is perhaps not a problem if we have very many values, but if we have few, we have a violation of smoothness and then need to extrapolate.
Briefly explain RDD with local-randomization estimation.
We have not discussed local-randomization estimation, but the idea is to think about the design as an RCT very close to the threshold. We then estimate the means in the treatment and control groups and compare them. This is credible because we are so close to the threshold.
For this to be valid, the regression function has to be constant on both sides of the threshold within the window: the lines should be completely flat.
A big problem here is the standard errors, since we need to use randomization inference.
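A minimal sketch of this estimator, assuming numeric arrays for the score and outcome; the window half-width w and the number of permutations are hypothetical choices:

```python
# Sketch: local randomization as a difference in means inside a small
# window around the cutoff, with randomization (permutation) inference.
import numpy as np

def local_rand_rdd(x, y, c=0.0, w=0.5, n_perm=5000, seed=0):
    rng = np.random.default_rng(seed)
    s = np.abs(x - c) <= w                     # keep only the window
    yw, treated = y[s], x[s] >= c
    observed = yw[treated].mean() - yw[~treated].mean()
    perm = np.empty(n_perm)
    for b in range(n_perm):                    # reshuffle treatment labels
        t = rng.permutation(treated)
        perm[b] = yw[t].mean() - yw[~t].mean()
    p_value = np.mean(np.abs(perm) >= abs(observed))
    return observed, p_value
```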
How should we present BW choice in our RDD paper according to Per?
In an RDD we have too many choices. A researcher who can choose a BW will of course choose one that confirms his preferred estimates. We should therefore let the data speak instead: show all the estimates for different BWs (including the optimal one) in a graph, as in the sketch below. This shows whether it matters that we chose the optimal BW or not.
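A sketch of such a graph on simulated data; the data-generating process, the bandwidth grid, and the marked "optimal" bandwidth are all hypothetical:

```python
# Sketch: re-estimate the RD jump on a grid of bandwidths and plot the
# estimates with 95% confidence intervals.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 3000)
y = 0.8 * (x >= 0) + 0.4 * x + rng.normal(size=3000)

def rd_jump(x, y, c=0.0, bw=1.0):
    """Local linear RD estimate within +/- bw (uniform kernel)."""
    s = np.abs(x - c) <= bw
    xs, d = x[s] - c, (x[s] >= c).astype(float)
    X = sm.add_constant(np.column_stack([d, xs, d * xs]))
    fit = sm.OLS(y[s], X).fit(cov_type="HC1")
    return fit.params[1], fit.bse[1]           # jump and its SE

bws = np.linspace(0.2, 2.0, 15)
est, se = map(np.array, zip(*(rd_jump(x, y, bw=b) for b in bws)))
plt.errorbar(bws, est, yerr=1.96 * se, fmt="o")
plt.axvline(1.0, ls="--")                      # placeholder for the optimal BW
plt.xlabel("bandwidth"); plt.ylabel("RD estimate")
plt.show()
```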
What is a good way of presenting covariate balance in an RDD paper?
Canay and Kamat suggest that we compare the distributions (CDFs) of predetermined covariates for the observations just to the right and just to the left of the threshold.
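A sketch of the idea, assuming arrays for the score x and a predetermined covariate z. Canay and Kamat use a permutation test; the two-sample Kolmogorov-Smirnov test here is only a simple stand-in, and q is a hypothetical choice:

```python
# Sketch: compare the empirical CDF of covariate z for the q closest
# observations on each side of the cutoff.
import numpy as np
from scipy.stats import ks_2samp

def cdf_balance(x, z, c=0.0, q=50):
    left = z[x < c][np.argsort(c - x[x < c])][:q]      # q closest from the left
    right = z[x >= c][np.argsort(x[x >= c] - c)][:q]   # q closest from the right
    return ks_2samp(left, right)                        # compares the two CDFs
```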
What is a McCrary test?
Testing for bunching in the forcing variable = testing the continuity of the density of the running variable. We test this by comparing the values of an estimated density to the right and to the left of the cutoff. See rddensity.
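A crude sketch of the underlying idea, not the actual McCrary estimator: under continuity of the density, an observation in a small symmetric window around the cutoff should fall on either side with roughly equal probability. The half-width h is hypothetical; for the real test use rddensity:

```python
# Sketch: binomial check for a density jump at the cutoff.
import numpy as np
from scipy.stats import binomtest

def density_check(x, c=0.0, h=0.25):
    n_left = int(np.sum((x >= c - h) & (x < c)))
    n_right = int(np.sum((x >= c) & (x <= c + h)))
    return binomtest(n_right, n_left + n_right, p=0.5)  # H0: no density jump
```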
What is the difference between window and bandwidth in an RDD?
- Window = what we are looking at to see the cutoff etc. We pick the window size so the figure looks good. It is thus totally irrelevant for estimating the effect; it is just for visualization.
- Bandwidth = how many data points we use around the threshold when estimating the effect. This is not what we show.
What should we do according to Goldsmith-Pinkham if we can’t shrink our BW, that is, if we have a discrete running variable?
In this case we do not have that many points right at the cutoff, so we will definitely have a bias. We should use another RD command called RDHonest.
What can we do if we have bunching according to Goldsmith-Pinkham?
Gerard, Rokkanen and Rothe (2020) propose a partial identification approach to allow for the possibility of bunching. This is a check on the robustness of our results: how sensitive are they to manipulation?
Their approach hinges on the assumption that people only manipulate their score in one direction. Those manipulators will mask our treatment effect. The paper then derives sharp bounds on the effect for those who could be affected by the treatment, and we identify the share of masking individuals. This is a nice thing to add in addition to the McCrary test.
What is a Regression Kink Design?
When we cross the threshold, the relationship between the outcome variable and the running variable changes slope rather than jumps. In public finance, for example, there are many policies that create linear shifts in incentives rather than discontinuous jumps.
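In the sharp kink case the estimand is the change in the slope of the outcome divided by the change in the slope of the policy rule $b(x)$ at the cutoff (standard notation, e.g. Card, Lee, Pei and Weber):

$$
\tau_{\text{RKD}} = \frac{\lim_{x \downarrow c} \frac{d}{dx} E[Y_i \mid X_i = x] - \lim_{x \uparrow c} \frac{d}{dx} E[Y_i \mid X_i = x]}{\lim_{x \downarrow c} b'(x) - \lim_{x \uparrow c} b'(x)}
$$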
What is the biggest threat to an RDD?
The most important threat to any RD design is the possibility that units are able to strategically and precisely change their score to be assigned to their preferred treatment condition (Lee, 2008; McCrary, 2008), which might induce a discontinuous change in their observable and/or unobservable characteristics at or near the cutoff and thus confound causal conclusions.
What is a big draw back from the RDD?
An important limitation of RD designs is that they have low external validity. In the absence of additional assumptions, it is not possible to learn about treatment effects away from the cutoff—that is, to extrapolate the effect of the treatment.
Why is local polynomial estimation better than a global approach?
Local polynomial methods are preferable to global polynomial methods because they avoid several of the methodological problems created by the use of global polynomials such as erratic behavior near boundary points (Runge’s phenomenon), counterintuitive weighting, overfitting, and general lack of robustness.
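A minimal sketch of local polynomial (here local linear) estimation with a triangular kernel; the bandwidth h is a hypothetical fixed choice, whereas in practice packages such as rdrobust select it in a data-driven way:

```python
# Sketch: local linear RD estimate with triangular kernel weights,
# which put more weight on observations near the cutoff.
import numpy as np
import statsmodels.api as sm

def local_linear_rd(x, y, c=0.0, h=1.0):
    s = np.abs(x - c) <= h
    xs, ys = x[s] - c, y[s]
    w = 1 - np.abs(xs) / h                      # triangular kernel weights
    d = (xs >= 0).astype(float)
    X = sm.add_constant(np.column_stack([d, xs, d * xs]))
    fit = sm.WLS(ys, X, weights=w).fit(cov_type="HC1")
    return fit.params[1]                        # jump at the cutoff
```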
What is important when doing placebo cut off-tests?
Importantly, in order to avoid treatment effect contamination, this validation method should be implemented for units below and above the cutoff separately.
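A sketch of how that separation could look, reusing local_linear_rd from the sketch above; the placebo grids and bandwidth are hypothetical:

```python
# Sketch: placebo cutoffs estimated separately on each side of the real
# cutoff, so the true treatment effect never contaminates the estimates.
import numpy as np

def placebo_cutoffs(x, y, c=0.0, h=0.5):
    jumps = {}
    for cp in np.linspace(c - 2.0, c - 0.5, 4):   # fake cutoffs, left side only
        s = x < c                                  # never cross the real cutoff
        jumps[cp] = local_linear_rd(x[s], y[s], c=cp, h=h)
    for cp in np.linspace(c + 0.5, c + 2.0, 4):   # fake cutoffs, right side only
        s = x >= c
        jumps[cp] = local_linear_rd(x[s], y[s], c=cp, h=h)
    return jumps  # all placebo jumps should be close to zero
```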
What is the RDD checklist?
- Is there actually a jump?
- [ ] Show/plot the outcome against the running variable without regression lines and cutoff line, to show that there is a clear jump.
- No bunching in the running variable
- [ ] Show density graph
- [ ] Run a McCrary density test
- Potential outcome is continuous
- [ ] Show RDD graph with bin-averages
- [ ] Placebo checks at other thresholds
- Pre-determined covariates (characteristics) are balanced around the threshold
- [ ] Continuity checks (graphical or formally) for other covariates
- BW sensitivity checks
- [ ] Show graphically how estimates and CIs change with different BWs. Mark explicitly which BW is the optimal one.
- Polynomial sensitivity checks
- [ ] Show that estimates are stable across polynomial orders (e.g. local linear vs. local quadratic).
- Presentation sensitivity
- [ ] Show main graphs with other bin-suggestions and show that there is still a clear jump.
- What happens if we lose observations or clusters?
- Is it possible to generalize the LATE?
- No other policy changes at the threshold?
- Are there other behavioral effects at the threshold that affect the result? E.g. barely passing the threshold might make you worse off.
- No controls other than the forcing variable?
- Discussion of the bias/variance trade-off around the threshold.
- Not mixing identifications! No fixed effects in the model etc.
What is the best practice in an RDD design?
It is now understood that best practices for estimation and inference in RD designs should be based on tuning parameter choices that are objective, principled, and data-driven, thereby removing researchers’ ability to engage in arbitrary specification searching, and thus leading to credible and replicable empirical conclusions.
What should we think of regarding power in an RDD design?
We need many observations around the threshold