R commands Flashcards

Question 1

Q

Possible model fitting for binary grouped data

Answer

A

glm(cbind(successes, insuccesses)~x, family=binomial(link=*))
- * = logit canonical by default, probit, cloglog;
- Different links comparable with AIC only;
- summary(fit): βs give the effect on log-odds, have to exp(β) to get the effect on odds.

Question 2

Q

Possible model for binary ungrouped data

Answer

A

glm(cbind(y, n-y)~x, data=d, family=binomial(link=))
glm(y~x, data=d, family=binomial(link=))
- Plots make no sense, we would have two lines (one per option);
- Deviance test is unreliable;
- Same coefficients as grouped data;
- Much larger deviances, DFs and AICs;
- Do not compare AIC with grouped data’s;
- Different overdispersion than grouped and different β’s SE;
- Can use test=”F”in anova().

Question 3

Q

Overdispersion fitting

Answer

A

glm(y~x, family=quasi____(link=*))
- No AIC;
- t values instead of Z;
- φ estimated with Pearson statistics (not deviance!);
- Can use test=”F” in anova test.

Question 4

Q

Goodness of fit for grouped data GLMMs

Answer

A

Only if enough n_i for each group! Results might differ between the two!
Deviance test, for total model (predictors > 1):
qchisq(0.05, summary(fit)$df.residual, lower.tail=F)
- If lower than ResDev, reject H0
pchisq(deviance(fit), summary(fit)$df.residual, lower.tail=F)
Pearson test:
pchisq(sum(residuals(fit, type=”pearson”)^2), summary(fit)$df.residual, lower.tail=F)

Question 5

Q

Goodness of fit for ungrouped data

Answer

A

Hosmer-Lemeshow test:
(hl = hoslem(d$y==1, predict(fit, type=”response”))

Question 6

Q

Single predictor testing

Answer

A

anova(fit, test=*)
- * = “Chisq”, “LRT”, “Rao”;
- * = “F” only if quasilikelihood approach or Gamma regression;
- One test per predictor, added sequentially.

Question 7

Q

Model comparison for nested models

Answer

A

anova(fitmin, fitmax, test=*)
- Not for different link functions;
- *=”F” if overdispersion estimation or Gamma regression.

Question 8

Q

Model comparison for non-nested models

Answer

A

AIC(fit)
- NOT for grouped vs ungrouped comparisons because the scale of y is different;
- For different link functions too;

Question 9

Q

Predictions for each group

Answer

A

predict(fit, newdata=NULL, type=*)
- * = response gives the parameter, link gives exp(βs).

Question 10

Q

Possible model fitting for count data

Answer

A

glm(y~offset(log(delta))+x, data=d, family=poisson(link=*))
* = log default, identity, sqrt
- exp(Intercept) = average number of y;
- If we have an offset, exp(coef) are the effects on the rate of y, baseline count = delta * exp(β).

Question 11

Q

Offset testing for count data

Answer

A

H0: log(delta)=1
fit0 = glm(y~log(delta)+x, family=poisson)
C = rep(0, p)
C[2] = 1
d = 1
lht(fit0, C, rhd=d)

Question 12

Q

Change in reference category

Answer

A

new.x = relevel(d$x1, “newref”)
contrasts(d$x) = contr.treatment(#levels, base=#ofnewreflevel)
- Deviances stay the same;
- β’s significance might change.

Question 13

Q

Possible model fitting for nonnegative data

Answer

A

glm(y~x, data=d, family=Gamma(link=”log”)
- Overdispersion is always estimated via Pearson statistics;
- The statistics are now t (and not Z);
- If log is not specified, canonical one is used and signs are flipped;
- Can now use fit=”F” in anova();
- Never compare AICs with a logN model.

Question 14

Q

Unordered categories modelling

Answer

A

library(mnet)
multinom(y~x, d)
- Coefficients and std errors for every c-1 category and covariate;
- exp(coef) is the increase/decrease in odds of being in that category relative to the reference one;

Question 15

Q

Nested model comparison for categorical regression

Answer

A

dev = fitmin$dev - fitmax$dev
df = fitmax$edf - fitmin$edf
1-pchisq(dev, df)

Question 16

Q

Ordered non-sequential categories modelling

Answer

Study These Flashcards

A

polr(y~x, data=d)
- Parametrization gives coefficients η = -β;
- Thresholds for passing categories (latent variable) are “intercepts”;
- exp(β) < 1: we favor lower categories, effect on the odds;
- exp(β*10): effect on original scale of increasing x by 10.

Question 17

Q

Possible model fitting for Generalized Linear Mixed Models

Answer

Study These Flashcards

A

fit = glmer(y~x+(1|group), data=d)
fit = glmer(y~x+(1+x|group), data=d)
fit = glmer(y~x+(1+x||group), data=d) if we want uncorrelated gammas
- Fixed effects βs are on the linear predictor’s scale;
- Correlation is the rescaled Fisher matrix;

Question 18

Q

Model comparison for Generalized Linear Mixed Models

Answer

Study These Flashcards

A

AIC(fit)
- Even for GLMs and GLMMs.
anova(fit1, fit2, test=”Chisq”)
- Only between same type of models.

Question 19

Q

Possible model fitting for Generalized Additive Models

Answer

Study These Flashcards

A

fit = gam(y~s(x, bs=”tp”), data=d)
fit = gam(y~te(x1, x2), data=d) if we use bivariate, we use tensors
- Parametric coefficients (intercept) unless -1 in formula;
- edf is how many parameters I’m using to estimate the polynomial of non-linear effect
- plot(fit, shift=coef(fit)[1]) if we want to include the intercept too;
- Trust only highly data-dense zones!

R commands Flashcards

(19 cards)