Models for Count Data II Flashcards

Question

On what scale are the coefficients from the negative binomial regression?

Answer 1

Log scale. As with Poisson regression, they can be exponentiated to give incidence rate ratios (IRRs) and their 95% CIs.
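
A minimal sketch of the exponentiation step, written in Python rather than Stata. The coefficient and standard error are made-up illustrative values, and `irr_with_ci` is a hypothetical helper name. The interval is obtained by exponentiating the Wald limits computed on the log scale, which is assumed here to be the intended approach.

```python
import math

def irr_with_ci(coef, se, z=1.96):
    """Exponentiate a log-scale coefficient and its Wald 95% CI limits."""
    irr = math.exp(coef)
    lower = math.exp(coef - z * se)
    upper = math.exp(coef + z * se)
    return irr, lower, upper

# Hypothetical values, not taken from any fitted model.
irr, lower, upper = irr_with_ci(coef=-0.12, se=0.05)
print(f"IRR = {irr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An IRR below 1 corresponds to a negative coefficient (lower expected count), and an IRR above 1 to a positive coefficient.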

Answer 2

IRRs are more interpretable than coefficients as IRRs are on the scale of the count variable

Answer 3

The log of alpha (the dispersion parameter). It only needs interpreting if predictors are used to model the dispersion rather than assuming it is constant.

Answer 4

Using Stata to exponentiate the coefficients

Answer 5

No - neither estimated coefficients nor SEs will be the same

Answer 6

Zeroes being unobservable or 'impossible', e.g.:
- Number of days in hospital for hospitalised stroke patients
- Number of appointments with a psychotherapist

Answer 7

Essentially in the same way as ordinary regression models: there is zero-truncated Poisson and zero-truncated negative binomial regression. In either of these two models, predicted values will have a minimum value of 1. Otherwise, the interpretation of coefficients is the same as in ordinary Poisson or negative binomial regression.
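
To illustrate why predictions from a zero-truncated model cannot fall below 1, the following sketch evaluates a zero-truncated Poisson distribution, i.e. an ordinary Poisson distribution conditioned on the count being at least 1. The function names and the value of `mu` are illustrative assumptions.

```python
import math

def zt_poisson_pmf(y, mu):
    """P(Y = y | Y >= 1) when Y is Poisson with mean mu before truncation."""
    if y < 1:
        return 0.0  # zeroes are impossible by construction
    poisson = math.exp(-mu) * mu**y / math.factorial(y)
    return poisson / (1.0 - math.exp(-mu))

def zt_poisson_mean(mu):
    """Mean of the zero-truncated distribution; always greater than 1."""
    return mu / (1.0 - math.exp(-mu))

mu = 0.5  # illustrative value only
total = sum(zt_poisson_pmf(y, mu) for y in range(1, 60))
print(f"probabilities over y >= 1 sum to {total:.6f}")
print(f"untruncated mean {mu}, truncated mean {zt_poisson_mean(mu):.3f}")
```

Even when the untruncated mean is far below 1, the truncated mean stays above 1 because all probability mass sits on counts of 1 or more.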

Answer 8

No - zeroes may be absent from the data 'by chance', even though they are possible

Answer 9

Knowledge about how the data were collected, rather than simply noticing that there are no zeroes

Answer 10

- Zero-inflation: where zeroes can come about in two different ways
- Hurdle models: where the zeroes and the non-zero counts are caused by separate processes

Answer 11

Structural zeroes vs sampling zeroes

Answer 12

- Structural zeroes: people who do not smoke joints
- Sampling zeroes: cannabis users who happened not to smoke last week

Answer 13

All zeroes are assumed to be structural, and there are no sampling zeroes. The distribution of non-zero counts is zero-truncated, and the zeroes are governed by a totally different process.
E.g., number of appointments with a psychotherapist after GP referral:
- Structural zeroes: some patients never go to see a therapist
- Zero-truncated counts: those who go to see a therapist have at least one appointment

Answer 14

One distribution governs zeroes and another governs the counts (zero-truncated Poisson counts)
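
A minimal sketch of this two-part structure for a Poisson hurdle model. It assumes a single probability `p_zero` for the zero part and a zero-truncated Poisson with parameter `mu` for the positive counts. The names and numeric values are illustrative, not taken from any fitted model.

```python
import math

def hurdle_poisson_pmf(y, p_zero, mu):
    """Poisson hurdle model: p_zero governs zeroes, a zero-truncated Poisson the rest."""
    if y == 0:
        return p_zero  # every zero comes from the zero part
    truncated = math.exp(-mu) * mu**y / (math.factorial(y) * (1.0 - math.exp(-mu)))
    return (1.0 - p_zero) * truncated

p_zero, mu = 0.4, 2.5  # illustrative values only
total = sum(hurdle_poisson_pmf(y, p_zero, mu) for y in range(0, 80))
print(f"P(Y = 0) = {hurdle_poisson_pmf(0, p_zero, mu):.2f}")
print(f"probabilities sum to {total:.6f}")
```

Unlike in a zero-inflated model, the count distribution contributes nothing to the probability of a zero.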

Answer 15

Equidispersion:
- Zero-inflated Poisson
- Poisson hurdle model
Overdispersion:
- Zero-inflated negative binomial
- Negative binomial hurdle model

Answer 16

Often depends on knowledge/theory of how zeroes come about. Sometimes the zero-generating process is unknown; then a pragmatic decision might be made (e.g., based on model fit).

Answer 17

Large counts

Answer 18

P(Yi = 0) = πi + (1 - πi)e^(-μi)
P(Yi = y) = (1 - πi)μi^y e^(-μi) / y!, y ≥ 1
The first equation describes the probability of observing zero events. This probability is the sum of the probability of a structural zero (πi) and the probability of a sampling zero [(1 - πi)e^(-μi)].
The second equation describes the probability of observing 1, 2, 3 or more events.
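
The two equations can be checked numerically. The sketch below codes the zero-inflated Poisson probabilities directly and splits the probability of a zero into its structural and sampling parts. The values of `pi` and `mu` are illustrative assumptions.

```python
import math

def zip_pmf(y, pi, mu):
    """Zero-inflated Poisson probability of observing the count y."""
    poisson = math.exp(-mu) * mu**y / math.factorial(y)
    if y == 0:
        return pi + (1.0 - pi) * poisson  # structural zero + sampling zero
    return (1.0 - pi) * poisson

pi, mu = 0.3, 2.0  # illustrative values only
structural = pi
sampling = (1.0 - pi) * math.exp(-mu)
total = sum(zip_pmf(y, pi, mu) for y in range(0, 80))
print(f"P(structural zero) = {structural:.3f}")
print(f"P(sampling zero)   = {sampling:.3f}")
print(f"P(Y = 0)           = {zip_pmf(0, pi, mu):.3f}")
print(f"probabilities sum to {total:.6f}")
```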

Answer 19

Two - π and μ, each of which appears in both equations. But we model these parameters separately.
The probability π of having a structural zero is modelled via a logistic regression: logit(πi) = γ0 + γ1X1i + γ2X2i + ...
Here, the coefficients are labelled 'γ' to make clear they are not the same coefficients as those in the Poisson part of the model.
The mean μ of the counts that are not structural zeroes is modelled via a Poisson regression: log(μi) = β0 + β1X1i + β2X2i + ...
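
A small sketch of how the two linear predictors map to π and μ for one observation. The logistic-part coefficients are called `gamma` and the Poisson-part coefficients `beta` in the code. All coefficient and predictor values are invented for illustration, and the two parts are deliberately given different predictors.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def linear_predictor(coefs, x):
    """coefs[0] is the intercept; the remaining entries pair up with x."""
    return coefs[0] + sum(c * xi for c, xi in zip(coefs[1:], x))

def zip_parameters(gamma, x_zero, beta, x_count):
    """Return (pi, mu) for one observation of a zero-inflated Poisson model."""
    pi = inv_logit(linear_predictor(gamma, x_zero))   # logistic part
    mu = math.exp(linear_predictor(beta, x_count))    # Poisson part
    return pi, mu

gamma = [-1.0, 0.8]      # hypothetical logistic-part coefficients
beta = [0.5, -0.2, 0.1]  # hypothetical Poisson-part coefficients
pi, mu = zip_parameters(gamma, [1.0], beta, [1.0, 2.0])
print(f"pi = {pi:.3f}, mu = {mu:.3f}")
print(f"overall expected count (1 - pi) * mu = {(1.0 - pi) * mu:.3f}")
```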

Answer 20

No - you can choose to have different predictors in each part of the model, or to use some predictors in both model parts, and other predictors only in one of them

Answer 21

Using a logistic regression

Answer 22

More zeroes - smaller count of outcome variable (corresponds to a negative coefficient in the Poisson part)

Answer 23

They are both dependent on one another - changing something in the zero-inflation part will change the estimates in the Poisson part and vice versa

Answer 24

In essentially the same way, except that the counts are assumed to follow a negative binomial distribution

Answer 25

As in an ordinary Poisson regression, but being mindful that our estimates are conditional on how we adjust for zero-inflation (e.g., if we change predictors in the ZI part, the coefficient estimates in the Poisson part will also change). For example: "Our model estimates that a 10 percentage point difference in the proportion of lower class citizens is associated with fewer police operations by a factor of 0.89, adjusting for zero-inflation, where zeroes are predicted by lower10, vendors, and population."

Answer 26

The logistic part predicts zeroes. Thus:
- An OR > 1 indicates that a predictor is associated with more zeroes (fewer police operations)
- An OR < 1 indicates that a predictor is associated with fewer zeroes (more police operations)
Interpretation needs to consider how we model the non-zero counts, i.e., the estimates from the logistic part are adjusted for the Poisson part.

Answer 27

By maximum likelihood

Answer 28

Nested models of the same type (e.g., negative binomial, ZIP) can be compared using likelihood ratio tests (LRTs). But:
- Models without zero-inflation are not nested within zero-inflated models
- Although the Poisson model is nested within the negative binomial model (as it is a special case of negative binomial regression), the ordinary LRT cannot be applied

Answer 29

Yes - it is nested within a negative binomial regression with the same predictors, because if the dispersion α = 0, then the two models are identical. But the ordinary LRT would give misleading results (the p-value would be too large and we would reject H0 too rarely).

Answer 30

α cannot be negative - so the test value 0 is "on the boundary of the parameter space." We therefore need to use a special test called the boundary LRT.

Answer 31

H0: α = 0
Or: "There is no overdispersion"
Or: "The NegBin model is no better than the Poisson model"
H1: α > 0
Or: "There is overdispersion"
Or: "The NegBin model is better than the Poisson model"
A small p-value indicates evidence in favour of overdispersion, i.e., evidence against the Poisson model.

Answer 32

In the output of a NegBin model, labelled as the test for α = 0. It is displayed by default (it compares the negative binomial regression with a Poisson regression model with the same predictors).

Answer 33

It uses the same test statistic as the ordinary LRT, but calculates the p-value in a different way
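
A hedged sketch of the difference. It assumes the usual boundary correction, in which the reference distribution is a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 degree of freedom, so the p-value is half the ordinary one. The log-likelihood values are hypothetical and the function names are illustrative.

```python
import math

def chi2_1df_sf(x):
    """Upper-tail probability of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def boundary_lrt(ll_poisson, ll_negbin):
    """LRT of alpha = 0; returns (statistic, ordinary p, boundary p)."""
    stat = max(0.0, 2.0 * (ll_negbin - ll_poisson))
    p_ordinary = chi2_1df_sf(stat)
    # Assumed 50:50 mixture: halve the chi-square(1) tail when stat > 0.
    p_boundary = 0.5 * p_ordinary if stat > 0 else 1.0
    return stat, p_ordinary, p_boundary

# Hypothetical log likelihoods from models with the same predictors.
stat, p_ordinary, p_boundary = boundary_lrt(ll_poisson=-520.0, ll_negbin=-512.5)
print(f"LR statistic = {stat:.2f}")
print(f"ordinary p = {p_ordinary:.6f}, boundary p = {p_boundary:.6f}")
```

Under this assumption the ordinary p-value is always at least as large as the boundary p-value, which is why the ordinary test rejects H0 too rarely.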

Answer 34

No - sometimes additional predictors explain away the overdispersion

Answer 35

Models without zero-inflation are not nested in models with zero-inflation. We can use Akaike's Information Criterion (AIC) instead.

Answer 36

Can be used to compare nested and non-nested models. Is an information criterion, not a statistical test.

Answer 37

AIC = -2 × LL + 2 × k
Where:
- LL: log likelihood
- k: number of parameters
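
The formula translates directly into code. In this sketch the log likelihoods and parameter counts for the four models are hypothetical numbers chosen only to show the comparison, not results from real data.

```python
def aic(log_likelihood, k):
    """Akaike's Information Criterion: -2 * LL + 2 * k."""
    return -2.0 * log_likelihood + 2.0 * k

# Hypothetical (log likelihood, number of parameters) pairs, same data set.
models = {
    "Poisson": (-520.0, 4),
    "NegBin": (-512.5, 5),
    "ZIP": (-505.0, 7),
    "ZINB": (-503.8, 8),
}

aics = {name: aic(ll, k) for name, (ll, k) in models.items()}
for name, value in aics.items():
    print(f"{name}: AIC = {value:.1f}")

best = min(aics, key=aics.get)  # smallest AIC
print(f"smallest AIC: {best}")
```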

Answer 38

Weighs model fit (represented by the LL) against parsimony (represented by k)

Answer 39

A smaller AIC indicates a better model. The numeric value of the AIC has no meaning in itself; it is meaningful only when comparing the AICs of different models estimated on the same data (with the same sample size).

Answer 40

One (dispersion)

Answer 41

"adjusted for all other variables in the model, including both the Poisson and the logit parts"

Answer 42

Estimates from ZIP and ZINB models are similar but not identical; different assumptions lead to different results. SEs are larger in the ZINB model, leading to wider CIs.

Answer 43

- AIC has general applicability beyond count regression
- Can be used to compare nested or non-nested models
- AIC is one of several information indices: different information indices vary in the way they weight model fit and parsimony

Answer 44

- Excess zeroes can sometimes be accounted for by explanatory variables
- Poisson and negative binomial models are not nested in their zero-inflated counterparts
- Use AIC (or other information criteria) for comparison

(69 cards)