Week 5 Flashcards

1
Q

Linear Probability Model (LPM)

A

P(yi = 1) = pi = E(yi) = xi’β
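
A minimal sketch of estimating an LPM by OLS on made-up data (all data and variable names below are illustrative, not from the course):

```python
# Linear Probability Model: regress a binary outcome on x by OLS.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = (x + rng.normal(size=n) > 0).astype(float)    # binary outcome
X = np.column_stack([np.ones(n), x])              # intercept + regressor

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS estimate of beta
p_hat = X @ beta_hat                              # fitted "probabilities"
print(beta_hat, p_hat.min(), p_hat.max())         # fitted values may leave [0,1]
```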

2
Q

What are the problems with LPM?

A

1) The error term is not normally distributed (it is discrete, taking only two values)
2) Heteroskedasticity: Var(εi) = pi(1-pi) depends on i (see the check below)
3) OLS is unbiased and consistent, but inefficient
4) OLS ignores the restriction 0<=pi<=1, so fitted values can lie outside [0,1]
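
A small numeric check of points 1) and 2), assuming εi = yi - pi so that the error takes only the two values 1-pi and -pi:

```python
# For a binary y with P(y=1)=p, the LPM error e = y - p takes only the
# two values 1-p and -p, so it is not normal, and Var(e) = p(1-p)
# changes with p, i.e. heteroskedasticity.
import numpy as np

for p in [0.1, 0.5, 0.9]:
    values = np.array([1 - p, -p])     # possible error values
    probs = np.array([p, 1 - p])       # their probabilities
    var = np.sum(probs * values**2)    # E[e^2], since E[e] = 0
    print(f"p={p}: Var(e)={var:.2f}  vs p(1-p)={p * (1 - p):.2f}")
```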

3
Q

What is the transformation from linear to non-linear model for p?

A

P(yi = 1) = F(xi’β), where F is a cumulative distribution function (so that 0<=pi<=1)

4
Q

Logit model

A

εi ~ LOG(0,1)

F(xi’β) = 1 / (1 + exp(-xi’β))
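
A short sketch of the logistic link as a function (names are illustrative):

```python
# Logit link: maps the linear index x'beta into (0, 1).
import numpy as np

def logit_prob(xb):
    """P(y=1) under the logit model for a linear index xb = x'beta."""
    return 1.0 / (1.0 + np.exp(-xb))

print(logit_prob(np.array([-2.0, 0.0, 2.0])))    # ~[0.12, 0.50, 0.88]
```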

5
Q

Probit model

A

εi ~ N(0,1)

F(xi’β) = Φ(xi’β)
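
The same sketch for the probit link, assuming SciPy’s standard normal CDF is available:

```python
# Probit link: standard normal CDF of the linear index.
import numpy as np
from scipy.stats import norm

def probit_prob(xb):
    """P(y=1) under the probit model for a linear index xb = x'beta."""
    return norm.cdf(xb)

print(probit_prob(np.array([-2.0, 0.0, 2.0])))   # ~[0.02, 0.50, 0.98]
```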

6
Q

What is the difference between the logit and probit model?

A

β(logit) ≈ 1.8β(probit)

σ²(logit) = π²/3 (vs. σ²(probit) = 1), so σ(logit) = π/√3 ≈ 1.8
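
A quick numeric check of the rescaling, comparing the probit CDF with a logit CDF whose index is scaled by π/√3 (sketch only):

```python
# The standard logistic distribution has variance pi^2/3; scaling a
# probit index by pi/sqrt(3) ~ 1.81 gives a curve close to the logit CDF.
import numpy as np
from scipy.stats import norm

print(np.pi / np.sqrt(3))                        # ~1.81
xb = np.linspace(-2, 2, 9)
logit = 1 / (1 + np.exp(-xb * np.pi / np.sqrt(3)))
print(np.max(np.abs(logit - norm.cdf(xb))))      # small gap (~0.02)
```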

7
Q

How can the parameters of logit/probit models be estimated?

A

ML: yi~Bernoulli(pi)

f(yi) = 1 - pi if yi = 0
f(yi) = pi if yi = 1

L(β) = f(y1, ..., yn) = Π f(yi) = Π pi^(yi) (1-pi)^(1-yi)
ℓ(β) = Σ [yi log(pi) + (1-yi) log(1-pi)]
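
A direct transcription of ℓ(β) into code for a generic link F, on toy data (all names and values are illustrative):

```python
# Bernoulli log-likelihood l(beta) for a binary-choice model with link F.
import numpy as np

def log_likelihood(beta, X, y, F):
    """sum_i [ y_i*log(p_i) + (1-y_i)*log(1-p_i) ],  p_i = F(x_i' beta)."""
    p = F(X @ beta)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

F_logit = lambda z: 1 / (1 + np.exp(-z))        # logit link as an example
X = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 2.0]])
y = np.array([0.0, 1.0, 1.0])
print(log_likelihood(np.array([0.1, 0.5]), X, y, F_logit))
```
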
8
Q

What is the ML estimate’s distribution and properties?

A

β^ ~ N(β, V^)

V^ = (Σpi(1-pi)xixi’)^(-1)

Consistent and asymptotically normal
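
The V^ above matches the logit information matrix; a sketch of evaluating it at a (pretend) estimate to obtain standard errors:

```python
# V_hat = ( sum_i p_i(1-p_i) x_i x_i' )^(-1), evaluated at beta_hat (logit case).
import numpy as np

def logit_vcov(beta_hat, X):
    p = 1 / (1 + np.exp(-(X @ beta_hat)))
    W = p * (1 - p)                      # weights p_i(1 - p_i)
    info = (X * W[:, None]).T @ X        # sum_i w_i x_i x_i'
    return np.linalg.inv(info)

X = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 2.0], [1.0, 1.0]])
beta_hat = np.array([0.2, 0.8])          # pretend ML estimate
print(np.sqrt(np.diag(logit_vcov(beta_hat, X))))   # standard errors
```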

9
Q

What is identification?

A

A vector of parameters is identified by a given data set, model, and estimation method if, for that data set, the estimation method provides a unique way to estimate the parameters in the model.

Unless there is collinearity, parameters in the linear model are always identified

Identification issues commonly arise in non-linear models; if a parameter is not identified, it is impossible to pin down an estimate of its true value, so restrictions must be imposed

10
Q

What are the 2 identification issues?

A

1) The variance σ^2 of the latent variable’s error term is unidentified (fixed by the normalization σ = 1)

2) The threshold T is unidentified (normalized to 0)

11
Q

What is the marginal effect in the logit, probit, and LPM models?

A

LPM: βj
Logit: P(yi=1)P(yi=0)βj = pi(1-pi)βj
Probit: φ(xi’β)βj (φ = standard normal density)
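
A sketch evaluating the three marginal effects at one observation (coefficients and data are made up):

```python
# Marginal effect of regressor j at one observation, per the card:
# LPM: b_j;  logit: p(1-p) b_j;  probit: phi(x'b) b_j.
import numpy as np
from scipy.stats import norm

x = np.array([1.0, 0.5, -1.0])     # made-up observation (with intercept)
beta = np.array([0.2, 0.8, -0.4])  # made-up coefficients
j = 1                              # effect of the second regressor

xb = x @ beta
p = 1 / (1 + np.exp(-xb))          # logit probability at x
print(beta[j],                     # LPM
      p * (1 - p) * beta[j],       # logit
      norm.pdf(xb) * beta[j])      # probit
```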

12
Q

Odds Ratio

A

P(yi=1) / P(yi=0)

Describes the importance of x by expressing the probability of outcome 1 relative to the probability of outcome 0

13
Q

Odds Ratio - Logit

A

P(yi=1)/P(yi=0) = exp(xi’β)

xi’β = 0 -> two outcomes equally likely
xi’β > 0 -> outcome 1 more likely
xi’β < 0 -> outcome 0 more likely
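
A quick check that the logit odds equal exp(xi’β) (illustrative numbers):

```python
# Under the logit model the odds P(y=1)/P(y=0) equal exp(x'beta).
import numpy as np

x = np.array([1.0, 0.5])           # made-up observation
beta = np.array([0.3, 1.2])        # made-up coefficients
xb = x @ beta
p = 1 / (1 + np.exp(-xb))
print(p / (1 - p), np.exp(xb))     # identical by construction
```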

14
Q

If all variables are demeaned, what does the sign of β denote?

A

The choice preferred on average: with demeaned regressors, xi’β evaluated at the sample means equals the intercept, so its sign indicates which outcome is more likely for the average observation

15
Q

Diagnostics of logit/probit models

A

Residuals: ei = yi - F(xi’β^) = yi - pi^

Standardized residuals: (yi - pi^) / sqrt(pi^(1-pi^))

McFadden R^2: 1 - ℓ(β^)/ℓ(β0^), with 0<=R^2<=1 (β0^ is the model with only an intercept)
A larger value means the regressors explain more of the outcome
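
A sketch computing the standardized residuals and McFadden’s R^2, assuming the intercept-only model fits pi = ȳ (fitted values below are made up):

```python
# Standardized residuals and McFadden R^2 for a fitted binary-choice model.
import numpy as np

def diagnostics(y, p_hat):
    """y: 0/1 outcomes; p_hat: fitted probabilities from the full model."""
    std_resid = (y - p_hat) / np.sqrt(p_hat * (1 - p_hat))
    ll = np.sum(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))
    p0 = y.mean()                        # intercept-only model fits p_i = ybar
    ll0 = np.sum(y * np.log(p0) + (1 - y) * np.log(1 - p0))
    return std_resid, 1 - ll / ll0       # (residuals, McFadden R^2)

y = np.array([0, 1, 1, 0, 1], dtype=float)
p_hat = np.array([0.2, 0.7, 0.9, 0.4, 0.6])   # pretend fitted values
print(diagnostics(y, p_hat))
```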

16
Q

Hit Rate

A

h = p00 + p11, the fraction of observations for which the prediction equals the realization (p11: correctly predicted ones, p00: correctly predicted zeros)
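
A sketch of the hit rate, assuming the usual 0.5 cutoff for turning fitted probabilities into predictions (toy values):

```python
# Hit rate: share of observations whose predicted outcome matches the
# realization, using (p_hat > 0.5) as the predicted outcome.
import numpy as np

y = np.array([0, 1, 1, 0, 1], dtype=float)
p_hat = np.array([0.2, 0.7, 0.4, 0.4, 0.6])   # pretend fitted probabilities
y_pred = (p_hat > 0.5).astype(float)

p11 = np.mean((y == 1) & (y_pred == 1))       # correctly predicted ones
p00 = np.mean((y == 0) & (y_pred == 0))       # correctly predicted zeros
print(p00 + p11)                              # hit rate h (0.8 here)
```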

17
Q

How can we test whether the model’s predictions are better than “random”?

A

q = p1^2 + (1-p1)^2, where p1 is the sample fraction of ones (the expected hit rate of a “random” prediction)

z = (h - q) / sqrt(q(1-q)/n) ~ N(0,1)

A significantly large statistic => predictions are better than random
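
A sketch of the test statistic as given above, with illustrative values for h, p1, and n:

```python
# z-test of the model's hit rate h against the expected hit rate q of
# "random" predictions based on the sample fraction of ones, p1.
import numpy as np

def hitrate_test(h, p1, n):
    q = p1**2 + (1 - p1)**2                    # benchmark hit rate
    return (h - q) / np.sqrt(q * (1 - q) / n)  # compare with N(0,1)

print(hitrate_test(h=0.80, p1=0.55, n=200))    # large positive z => better than random
```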