# 11 Binary dependent variables Flashcards

linear probability model

A linear regression model with a binary dependent variable; the fitted value is interpreted as Pr(Y = 1 | X)

Disadvantages of the linear probability model

- Predicted probability can be above 1 or below 0!

- Error terms are heteroskedastic
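Both disadvantages can be seen in a minimal sketch. The toy arrays `x` and `y` and the helper `lpm_error_variance` are hypothetical, made up for illustration only. The model is fitted by OLS with NumPy.

```python
import numpy as np

# Hypothetical toy data: one regressor x and a binary outcome y.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

# Linear probability model: OLS of y on a constant and x.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Fitted values are read as Pr(Y = 1 | X), but nothing keeps them in [0, 1].
fitted = X @ beta
print("smallest fitted value:", fitted.min())
print("largest fitted value:", fitted.max())


def lpm_error_variance(p):
    """Conditional error variance in the LPM: Var(u | X) = p(1 - p),
    where p = Pr(Y = 1 | X). It changes with X, hence heteroskedasticity."""
    return p * (1.0 - p)


print("variance at p = 0.5:", lpm_error_variance(0.5))
print("variance at p = 0.9:", lpm_error_variance(0.9))
```

With this toy data, the fitted values at the smallest and largest `x` fall outside the unit interval, and the error variance differs across values of `p`.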

nonlinear probability models

Pr(Y = 1 | X) = G(Z)

with Z = β0 + β1X1i + ··· + βkXki

and 0≤G(Z)≤1

Probit: G(Z) = Φ(Z)

Using the cumulative standard normal distribution function Φ(Z)

Logit: G(Z) = 1 / (1 + e^{-Z})

Using the cumulative standard logistic distribution function

Remember:

F(z) = Pr(Z ≤ z)
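The two link functions can be sketched with the Python standard library alone. The function names `probit_cdf` and `logit_cdf` are illustrative choices, not part of the cards.

```python
import math


def probit_cdf(z):
    """Standard normal CDF, Phi(z), written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logit_cdf(z):
    """Standard logistic CDF, 1 / (1 + e^{-z}).

    The two branches are algebraically identical; splitting on the sign
    of z avoids overflow in math.exp for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Both functions map any real Z into the interval (0, 1).
for z in (-3.0, 0.0, 3.0):
    print(z, probit_cdf(z), logit_cdf(z))
```

Both functions return 0.5 at Z = 0 and are increasing in Z, so a positive coefficient raises Pr(Y = 1) in either model.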

the method used to estimate probit and logit models

Maximum Likelihood Estimation (MLE)

The models are nonlinear in the coefficients, so they can’t be estimated by OLS.

likelihood function

The likelihood function is the joint probability distribution of the data, treated as a function of the unknown coefficients.
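For a binary outcome, the likelihood can be written out explicitly. The expression below is a sketch that assumes n independent observations (an assumption the cards do not state), with G being Φ for probit or the logistic function for logit:

```latex
\mathcal{L}(\beta_0, \dots, \beta_k)
  = \prod_{i=1}^{n} G(Z_i)^{Y_i} \, \bigl[ 1 - G(Z_i) \bigr]^{1 - Y_i},
\qquad
Z_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki}
```

Each observation contributes G(Z_i) when Y_i = 1 and 1 - G(Z_i) when Y_i = 0.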

maximum likelihood estimator (MLE)

The maximum likelihood estimator (MLE) is the set of coefficient values that maximizes the likelihood function.

MLEs are the parameter values “most likely” to have produced the data.
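As an illustration, the logit MLE can be computed numerically. The sketch below uses Newton-Raphson, which is one common way to maximize the log-likelihood and not necessarily the method any particular software uses. The toy data and the names `logit_log_likelihood` and `fit_logit` are hypothetical.

```python
import numpy as np


def logit_log_likelihood(beta, X, y):
    """Logit log-likelihood: sum of y*log G(z) + (1 - y)*log(1 - G(z))."""
    z = X @ beta
    # log G(z) = -log(1 + e^{-z}) and log(1 - G(z)) = -log(1 + e^{z}).
    return float(np.sum(-y * np.logaddexp(0.0, -z) - (1 - y) * np.logaddexp(0.0, z)))


def fit_logit(X, y, n_iter=25):
    """Maximize the logit log-likelihood with Newton-Raphson steps."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))        # Pr(Y = 1 | X) at current beta
        gradient = X.T @ (y - p)                      # score vector
        hessian = -(X * (p * (1 - p))[:, None]).T @ X # second derivatives
        beta = beta - np.linalg.solve(hessian, gradient)
    return beta


# Hypothetical toy data; the outcomes overlap in x, so a finite MLE exists.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])

beta_hat = fit_logit(X, y)
print("estimated coefficients:", beta_hat)
print("log-likelihood at MLE:", logit_log_likelihood(beta_hat, X, y))
print("log-likelihood at zero:", logit_log_likelihood(np.zeros(2), X, y))
```

At the maximum the score vector is (numerically) zero, and the log-likelihood is higher than at any other coefficient values, such as all zeros.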

If Yi is binary, then E(Yi | Xi) =

If Yi is binary, then E(Yi | Xi) = Pr(Yi = 1 | Xi)
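The result follows directly from the definition of an expected value, since Y_i takes only the values 0 and 1:

```latex
E(Y_i \mid X_i)
  = 1 \cdot \Pr(Y_i = 1 \mid X_i) + 0 \cdot \Pr(Y_i = 0 \mid X_i)
  = \Pr(Y_i = 1 \mid X_i)
```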