What are the two key assumptions required for an instrument to be valid?

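1) Instrument relevance: the instrument Z is correlated with the endogenous variable X, i.e. Cov(Z, X) ≠ 0

2) Instrument exogeneity: Z is uncorrelated with the error term, i.e. Cov(Z, u) = 0, so Z affects Y only through X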


Why do we use IV regression?

We are worried that X is endogenous (correlated with the error term), so OLS is biased. Two common causes:

1) reverse causality (Y also affects X)

2) omitted variable bias, where an unobserved variable impacts both X and Y (for example, cognitive ability affecting both test scores and income later in life)
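
To make the concern concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and not part of the notes: the variable names, the coefficients, the true effect of 2.0, and the use of a single instrument. It compares the OLS slope with the simple one-instrument IV estimate Cov(Z, Y) / Cov(Z, X) when an unobserved "ability" variable drives both X and Y.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_beta = 2.0  # assumed causal effect of x on y (illustrative)

ability = rng.normal(size=n)  # unobserved confounder
z = rng.normal(size=n)        # instrument: moves x, independent of ability
x = 1.0 * z + 1.0 * ability + rng.normal(size=n)
y = true_beta * x + 3.0 * ability + rng.normal(size=n)

# OLS slope of y on x: biased because ability sits in the error term
ols_beta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# IV estimate with one instrument: Cov(z, y) / Cov(z, x)
iv_beta = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"OLS: {ols_beta:.3f}, IV: {iv_beta:.3f}, truth: {true_beta}")
```

In this setup OLS overstates the effect because it attributes part of ability's influence on Y to X, while the IV estimate should land close to the assumed true value.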


What is the reduced-form regression?

Regress the outcome Yi directly on the instrument Zi: Yi = π0 + π1·Zi + vi. The slope π1 captures the effect of Z on Y that runs through X.
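
A short sketch of the reduced-form regression on simulated data. The data-generating process, the coefficient values, and the variable names are assumptions made up for this example. With a first-stage coefficient of 0.5 and a causal effect of 2.0, the reduced-form slope should be near 0.5 × 2.0 = 1.0.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

z = rng.normal(size=n)                  # instrument
u = rng.normal(size=n)                  # unobserved confounder
x = 0.5 * z + u + rng.normal(size=n)    # x depends on z and on u
y = 2.0 * x + u + rng.normal(size=n)    # y depends on x and on u

# Reduced form: regress y on z with an intercept.
# np.polyfit with degree 1 returns [slope, intercept].
reduced_form_slope, reduced_form_intercept = np.polyfit(z, y, 1)

print(f"reduced-form slope: {reduced_form_slope:.3f}")
```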


What is the first stage regression?

Regress the endogenous variable Xi on the instrument Zi: Xi = γ0 + γ1·Zi + ei. A nonzero slope γ1 is what instrument relevance requires.
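
A sketch of the first stage on simulated data, again with an assumed data-generating process and made-up coefficients. It also shows one way the pieces fit together: with a single instrument and a single endogenous variable, dividing the slope of Y on Z by the first-stage slope gives the IV estimate (often called the Wald or indirect least squares estimator). That ratio is not stated in the notes above and is included here as a standard result.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

z = rng.normal(size=n)                  # instrument
u = rng.normal(size=n)                  # unobserved confounder
x = 0.5 * z + u + rng.normal(size=n)    # assumed first-stage coefficient: 0.5
y = 2.0 * x + u + rng.normal(size=n)    # assumed causal effect: 2.0

# First stage: regress x on z (slope measures instrument relevance)
first_stage_slope, _ = np.polyfit(z, x, 1)

# Slope of y on z, then the ratio of the two slopes
y_on_z_slope, _ = np.polyfit(z, y, 1)
iv_estimate = y_on_z_slope / first_stage_slope

print(f"first-stage slope: {first_stage_slope:.3f}")
print(f"IV estimate: {iv_estimate:.3f}")
```

A first-stage slope close to zero would signal a weak instrument, in which case the ratio becomes unstable.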