Generalized linear models Flashcards
(8 cards)
Link and response functions
η = x′β = g(μ) (link function g)
μ = h(η) (response function h = g⁻¹)
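A minimal sketch of the pair for the logit case, where g is the logit and h is the logistic function. The covariate vector and coefficients are made-up values, used only to show that h undoes g.

```python
import math

def logit_link(mu):
    """g(mu): maps a mean in (0, 1) to the linear predictor eta."""
    return math.log(mu / (1.0 - mu))

def logistic_response(eta):
    """h(eta) = g^{-1}(eta): maps the linear predictor back to the mean."""
    return 1.0 / (1.0 + math.exp(-eta))

x = [1.0, 0.5]       # hypothetical covariate vector (intercept first)
beta = [-0.2, 0.8]   # hypothetical coefficients

eta = sum(xj * bj for xj, bj in zip(x, beta))  # linear predictor eta = x'beta
mu = logistic_response(eta)                    # mean mu = h(eta)
roundtrip = logit_link(mu)                     # g(h(eta)) recovers eta
```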
Grouping data process and pros
When several observations share the same covariate pattern, I collapse them into G groups, each with n_i observations, and I model E(ȳ_i) instead of E(y_i). Pros:
- I can assess goodness of fit;
- I can estimate overdispersion.
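The grouping step can be sketched as follows. The rows are invented individual-level data, and the variable names (`rows`, `groups`) are illustrative only.

```python
from collections import defaultdict

# Hypothetical individual-level data: (covariate pattern, binary response).
rows = [
    ((1, 0), 1), ((1, 0), 0), ((1, 0), 1),
    ((1, 1), 1), ((1, 1), 1),
    ((0, 1), 0), ((0, 1), 0), ((0, 1), 1), ((0, 1), 0),
]

# Collect the responses that share the same covariate pattern.
by_pattern = defaultdict(list)
for pattern, y in rows:
    by_pattern[pattern].append(y)

# One record per group: pattern, group size n_i, group mean ybar_i.
groups = [
    (pattern, len(ys), sum(ys) / len(ys))
    for pattern, ys in sorted(by_pattern.items())
]

for pattern, n_i, ybar_i in groups:
    print(pattern, n_i, round(ybar_i, 3))
```

The model is then fitted to the G group means (here G = 3) rather than to the 9 individual responses.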
Overdispersion estimation and resulting coefficients
When I can group the data: Var(ȳ_i) = φ × theoretical variance, with φ > 1, because of unobserved heterogeneity or positive correlation between responses (e.g. coming from clusters).
- Pearson estimate: φ̂_P = χ² / (G - p)
- Deviance estimate: φ̂_D = D / (G - p)
This gives a quasi-likelihood approach in which β̂ ~ N(β, φ̂ F⁻¹(β̂)) approximately, so standard errors and significance change. AIC and other likelihood-based quantities cannot be computed, because there is no genuine likelihood.
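A rough sketch of both estimates for grouped binomial data. The counts, the fitted probabilities `pi_hat`, the number of parameters `p` and the model-based standard error `se_model` are all assumed values, standing in for the output of a fitted model.

```python
import math

# Hypothetical grouped binomial data: y_i successes out of n_i trials in G groups.
n = [20, 25, 30, 25, 20, 30]
y = [3, 9, 10, 16, 12, 26]
# Hypothetical fitted probabilities, assumed to come from a model with p = 2 parameters.
pi_hat = [0.18, 0.28, 0.41, 0.55, 0.68, 0.79]
p = 2
G = len(n)

def xlogy(a, b):
    """a * log(a / b), with the convention 0 * log(0) = 0."""
    return 0.0 if a == 0 else a * math.log(a / b)

# Pearson statistic: squared residuals scaled by the binomial variance.
pearson = sum(
    (yi - ni * pi) ** 2 / (ni * pi * (1.0 - pi))
    for yi, ni, pi in zip(y, n, pi_hat)
)

# Binomial deviance: twice the log-likelihood gap to the saturated model.
deviance = 2.0 * sum(
    xlogy(yi, ni * pi) + xlogy(ni - yi, ni * (1.0 - pi))
    for yi, ni, pi in zip(y, n, pi_hat)
)

phi_pearson = pearson / (G - p)
phi_deviance = deviance / (G - p)

# Quasi-likelihood: the covariance is multiplied by phi, so standard errors
# are multiplied by sqrt(phi).
se_model = 0.30  # hypothetical model-based standard error
se_quasi = se_model * math.sqrt(phi_pearson)
```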
Maximum likelihood estimation requirements
- Conditional independence: y_i independent of y_j given X, for all i ≠ j
- Invertibility of F(β) for all β, equivalent to X having full column rank, rank(X) = p
Fisher scoring algorithm and results
β̂^(t+1) = β̂^(t) + F⁻¹(β̂^(t)) s(β̂^(t))
β̂ ~ N(β, F⁻¹(β̂)) approximately, with Var(β̂_j) = [F⁻¹(β̂)]_jj
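The update can be sketched for a logit model with an intercept and one covariate. The data are invented, and the 2×2 inverse is written out by hand only to keep the example free of dependencies.

```python
import math

# Hypothetical data: one covariate x and a binary response y (not separable).
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 1, 0, 1]

def score_and_information(b0, b1):
    """Score s(beta) and expected information F(beta) for the logit model."""
    s0 = s1 = f00 = f01 = f11 = 0.0
    for xi, yi in zip(x, y):
        mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = mu * (1.0 - mu)          # Var(y_i) under the model
        s0 += yi - mu
        s1 += (yi - mu) * xi
        f00 += w
        f01 += w * xi
        f11 += w * xi * xi
    return (s0, s1), (f00, f01, f11)

def inverse_2x2(f00, f01, f11):
    """Inverse of the symmetric matrix [[f00, f01], [f01, f11]]."""
    det = f00 * f11 - f01 * f01
    return f11 / det, -f01 / det, f00 / det

b0, b1 = 0.0, 0.0                     # starting value
for _ in range(25):
    (s0, s1), F = score_and_information(b0, b1)
    i00, i01, i11 = inverse_2x2(*F)
    step0 = i00 * s0 + i01 * s1       # F^{-1} s, first component
    step1 = i01 * s0 + i11 * s1       # F^{-1} s, second component
    b0, b1 = b0 + step0, b1 + step1
    if abs(step0) + abs(step1) < 1e-10:
        break

# Asymptotic standard errors from the diagonal of F^{-1} at the estimate.
(s0, s1), F = score_and_information(b0, b1)
i00, i01, i11 = inverse_2x2(*F)
se_b0, se_b1 = math.sqrt(i00), math.sqrt(i11)
```

At convergence the score is numerically zero, which is a quick check that the iteration stopped at the maximum.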
Observed and expected information matrices
H(β) = - ∂²l(β) / ∂β ∂β′ (observed)
F(β) = E(H(β)) = Cov(s(β)) = E(s(β) s(β)′) (expected)
- F(β) = H(β) for the binary logit model (and for any canonical link)
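A small numerical check of F(β) = H(β) for a one-parameter logit model without intercept. The data and the evaluation point `beta` are arbitrary, and the observed information is approximated by a central finite difference rather than derived analytically.

```python
import math

x = [-1.0, -0.5, 0.5, 1.0, 2.0]   # hypothetical covariate values
y = [0, 1, 0, 1, 1]               # hypothetical binary responses

def loglik(beta):
    """Bernoulli log-likelihood with logit link and eta_i = x_i * beta."""
    return sum(
        yi * xi * beta - math.log(1.0 + math.exp(xi * beta))
        for xi, yi in zip(x, y)
    )

def observed_information(beta, h=1e-4):
    """H(beta) = -l''(beta), approximated by a central finite difference."""
    return -(loglik(beta + h) - 2.0 * loglik(beta) + loglik(beta - h)) / h ** 2

def expected_information(beta):
    """F(beta) = sum_i x_i^2 * mu_i * (1 - mu_i)."""
    total = 0.0
    for xi in x:
        mu = 1.0 / (1.0 + math.exp(-xi * beta))
        total += xi * xi * mu * (1.0 - mu)
    return total

beta = 0.3
H = observed_information(beta)
F = expected_information(beta)
```

The two values agree up to finite-difference error because the second derivative of the logit log-likelihood does not involve y, so taking the expectation changes nothing.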
Hypothesis testing
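H0: Cβ = d, with C an r×p matrix, β a p×1 vector and d an r×1 vector
- Likelihood ratio (needs both fits): LR = -2[l(β̂_H0) - l(β̂)] ~ χ²_r approximately
- Wald (needs only the unrestricted fit): W = (Cβ̂ - d)′ [C F⁻¹(β̂) C′]⁻¹ (Cβ̂ - d) ~ χ²_r approximately
- Score (needs only the restricted fit): u = s(β̂_H0)′ F⁻¹(β̂_H0) s(β̂_H0) ~ χ²_r approximately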
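A sketch of the three statistics in the simplest setting I could think of: a Poisson model with a single parameter, log mean β, so that p = r = 1, C = 1 and everything has a closed form. The counts and the null value `d` are made up.

```python
import math

y = [2, 4, 3, 5, 1, 4, 6, 3]   # hypothetical counts, y_i ~ Poisson(exp(beta))
n = len(y)
total = sum(y)

def loglik(beta):
    """Log-likelihood up to the constant -sum(log(y_i!)), which cancels in LR."""
    return total * beta - n * math.exp(beta)

def score(beta):
    return total - n * math.exp(beta)

def fisher(beta):
    return n * math.exp(beta)

beta_hat = math.log(total / n)   # unrestricted MLE: log of the sample mean
d = math.log(2.5)                # H0: beta = d, i.e. mean equal to 2.5

# Likelihood ratio: uses the log-likelihood at both the restricted and free fit.
lr = -2.0 * (loglik(d) - loglik(beta_hat))
# Wald: uses only the unrestricted estimate.
wald = (beta_hat - d) ** 2 * fisher(beta_hat)
# Score: uses only quantities evaluated under H0.
score_stat = score(d) ** 2 / fisher(d)

critical = 3.841   # 0.95 quantile of the chi-square distribution with 1 df
reject = {name: stat > critical
          for name, stat in [("LR", lr), ("Wald", wald), ("score", score_stat)]}
```

The three statistics differ in finite samples but share the same asymptotic χ²_r reference distribution.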
Model quality and selection
Comparing the fitted model to the saturated one:
- Pearson statistic χ², approximately χ²_(G-p) distributed
- Deviance: D = -2 Σ [l_i(μ̂_i) - l_i(ȳ_i)], approximately χ²_(G-p) distributed
Comparing models (not available under quasi-likelihood):
- AIC = -2 l(β̂) + 2p (smaller is better)
- Nagelkerke's R²
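A minimal sketch of an AIC comparison between two Poisson models whose MLEs are sample means: a common mean (p = 1) against one mean per group (p = 2). The counts are hypothetical and the helper names are illustrative.

```python
import math

# Hypothetical counts observed in two groups.
group_a = [2, 3, 1, 4, 2]
group_b = [5, 7, 6, 4, 8]

def poisson_loglik(ys, mean):
    """Full Poisson log-likelihood of ys at a common mean."""
    return sum(yi * math.log(mean) - mean - math.lgamma(yi + 1) for yi in ys)

def aic(loglik, p):
    return -2.0 * loglik + 2.0 * p

all_y = group_a + group_b

# Model 1: one common mean (p = 1); the MLE is the overall sample mean.
l1 = poisson_loglik(all_y, sum(all_y) / len(all_y))

# Model 2: a separate mean per group (p = 2); the MLEs are the group means.
l2 = (poisson_loglik(group_a, sum(group_a) / len(group_a))
      + poisson_loglik(group_b, sum(group_b) / len(group_b)))

aic1, aic2 = aic(l1, 1), aic(l2, 2)
preferred = "separate means" if aic2 < aic1 else "common mean"
```

The larger model always has the higher log-likelihood; AIC only prefers it when the gain exceeds the penalty of 2 per extra parameter.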