W5: GLM 2 Flashcards

1
Q

What is the y variable for a Poisson distribution?

A

Discrete, whole-number counts (0, 1, 2, ...)
No negative values

2
Q

What type of distribution should be used for this RQ:
“Examining risk factors for the number of accidents someone gets into over a 12 month period”

A

Poisson

3
Q

What type of distribution should be used for this RQ:
“Evaluating whether an intervention reduced the number of times someone
missed their medication in the last month”

A

Poisson

4
Q

What type of distribution should be used for this RQ:
“Testing whether the total number of health care appointments over six months can be lowered by treating mental health”

A

Poisson

5
Q

How many parameters does the Poisson distribution have and what are they called?

A

1 parameter: lambda
Lambda is both the mean AND the variance
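
As a quick check of the mean-equals-variance property, the following base R sketch computes both directly from the Poisson probabilities. The value lambda = 3 and the cut-off of 100 are arbitrary choices for illustration, not values from the course.

```r
# Illustrative check that a Poisson distribution has mean = variance = lambda.
lambda <- 3                       # arbitrary example value
k <- 0:100                        # support truncated at 100; the tail beyond is negligible here
p <- dpois(k, lambda)             # P(Y = k) for each count k

pois_mean <- sum(k * p)                    # expected value
pois_var  <- sum((k - pois_mean)^2 * p)    # variance

print(c(mean = pois_mean, variance = pois_var))
```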

6
Q

What does the Poisson distribution look like when lambda gets higher (e.g. when lambda = 10)?

A

More like a normal distribution

7
Q

What assumption of linear regression is violated by both count (Poisson) and binary (logistic) outcomes?

A

Normality assumption

8
Q

Why should we use Poisson regression instead of linear regression for count outcomes?

A

A straight line is a bad fit for outcomes that can never go below 0 (it can predict negative counts)

9
Q

What is the link function for Poisson distribution and what does it do?

A

Natural log: eta = ln(lambda)
Maps lambda, which can never go below 0, onto the unbounded scale of eta
Removes the lower bound at 0 on the y axis (left side of the graph)
As lambda approaches 0, ln(lambda) approaches negative infinity

10
Q

After link transformation, what range does the outcome fall between for Poisson and logistic regression?

A

Negative infinity to positive infinity
i.e. a continuous, unbounded outcome that a linear model can be applied to

11
Q

What is the variance if lambda is 0?

A

0

12
Q

What does the inverse link function do for Poisson and logistic regression?

A

Poisson: y axis (left side of graph) falls back to the original count scale (between 0 and positive infinity)
Logistic: y axis falls back to probability scale (between 0 and 1)
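
A small base R sketch of the two inverse links may help here. The eta values are arbitrary illustrative numbers on the link scale; exp() is the inverse of the log link and plogis() is the inverse of the logit link.

```r
# Arbitrary linear-predictor values on the link scale (can be any real number).
eta <- c(-5, -1, 0, 1, 5)

counts <- exp(eta)      # inverse log link: always above 0, no upper bound
probs  <- plogis(eta)   # inverse logit link: 1 / (1 + exp(-eta)), always between 0 and 1

print(round(counts, 3))
print(round(probs, 3))

# Applying the link again recovers the original eta values.
back_from_counts <- log(counts)
back_from_probs  <- qlogis(probs)
```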

13
Q

What are the 3 assumptions of Poisson and logistic regression?

A
  1. Errors must be independent
  2. Assumes linear relationship on the link (natural log / logit) scale
  3. Requires large sample size (tests use z with no dfs, so a large sample is needed for parameter estimates to be approximately normally distributed)
14
Q

What argument must be added to glm() and testDistribution() for Poisson and logistic regression?

A

glm(y ~ x, data = d, family = poisson())
or family = binomial()
testDistribution(d$awards, distr = "poisson")
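
To see the family argument in use, here is a minimal base R sketch. The data frame is made up for illustration (the values and the binary `pass` variable are assumptions, not course data); only the variable names `math` and `num_awards` echo the cards.

```r
# Hypothetical toy data; values are invented purely for illustration.
d <- data.frame(
  math       = c(35, 40, 45, 50, 55, 60, 65, 70, 75),
  num_awards = c(0, 0, 1, 0, 1, 2, 2, 3, 5),   # count outcome
  pass       = c(0, 0, 1, 0, 1, 0, 1, 1, 1)    # binary outcome (hypothetical)
)

# Count outcome: Poisson family (log link by default).
m_pois <- glm(num_awards ~ math, data = d, family = poisson())

# Binary outcome: binomial family (logit link by default).
m_log <- glm(pass ~ math, data = d, family = binomial())

print(family(m_pois))
print(family(m_log))
```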

15
Q

How do you interpret the estimate for the predictor using Poisson regression?
glm ( num_awards ~ math)
Each 1 unit higher math score is associated with x high…

A

log awards

16
Q

Instead of interpreting Poisson regressions on log scale, what should we use instead and how do we get that?

A

Incidence rate ratios (IRRs), obtained by exponentiating the regression coefficients.

17
Q

What do IRRs indicate (Poisson)?

A

How many times higher the expected count of y will be for a 1 unit change in x
i.e. the ratio of the expected count at x + 1 to the expected count at x

18
Q

If IRR = 4 and base rate = 2, what will the expected outcome be after a 1 unit change in the predictor?

A

2 * 4 = 8 (4 times the base rate)

19
Q

What does it mean if IRR or OR = 1?
What would the coeff value be on link (log / log odds) scale?

A

IRR = 1: no change in the expected count (1 x base rate = base rate)
OR = 1: no change in the odds of the outcome (1 x base odds = base odds)
Coeff of 1 on IRR or OR scale = coeff of 0 on link scale, because exp(0) = 1

20
Q

What 3 things should you not exponentiate?

A

p-values, z values, standard errors

21
Q

What are 2 things you can exponentiate for Poisson regression?

A

regression coefficients: exp(coef(model))
confidence intervals: exp(confint(model))
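
The exponentiation step can be sketched in base R as follows. The data are invented for illustration, and the sketch uses confint.default() (Wald intervals) rather than profile-likelihood intervals, which is an assumption made to keep the example dependent on base R only.

```r
# Hypothetical toy data; values are invented purely for illustration.
d <- data.frame(
  math       = c(35, 40, 45, 50, 55, 60, 65, 70, 75),
  num_awards = c(0, 0, 1, 0, 1, 2, 2, 3, 5)
)
m <- glm(num_awards ~ math, data = d, family = poisson())

# Coefficients are on the log scale; exponentiating gives incidence rate ratios.
irr <- exp(coef(m))

# Exponentiate the interval endpoints too (Wald intervals here, as an assumption).
irr_ci <- exp(confint.default(m))

print(cbind(IRR = irr, irr_ci))
# p-values, z values, and standard errors are left on their original scale.
```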

22
Q

What argument do you have to add to visreg() when you want to plot Poisson or binary logistic regression on the original scale?

A

scale = "response"

23
Q

What is the y outcome for binary logistic regression?

A

0 or 1

24
Q

What type of regression do you use for this RQ:
What predicts whether someone will have major depression or not?

A

Binary logistic

25
Q

What type of regression do you use for this RQ:
Does one treatment have a higher probability of patients remitting from major depression than another treatment?

A

Binary logistic

26
Q

What type of regression do you use for this RQ:
What is the probability that a patient will be readmitted to the hospital within 30 days of discharge?

A

Binary logistic

27
Q

What type of regression do you use for this RQ:
What predicts whether an individual will live or die before age 60?

A

Binary logistic

28
Q

What type of regression do you use for this RQ:
If a bank gives a loan to someone, what is their probability of not being able to pay it back?

A

Binary logistic

29
Q

What type of regression do you use for this RQ:
Do older adults have a higher probability of using CAM than younger adults?

A

Binary logistic

30
Q

What is the link function for logistic regression and what does it do?

A

Logit function
Transforms mu (a probability that never goes below 0 or above 1) into eta on an unbounded scale
Unbounds on both the left and right side of the graph

31
Q

What distribution do logistic regressions follow and what are its parameters?

A

Bernoulli distribution
1 parameter (average probability that the event will occur i.e p or mu)

32
Q

How many parameters do both Poisson and Bernoulli distributions have?

A

1

33
Q

Separation is a problem that can arise from logistic regression. What does it mean?

A

When a predictor perfectly predicts (separates) the outcome
E.g 0% that D appears in people without PTSD
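
A toy illustration of separation in base R, using made-up data in which every x of 3 or below has y = 0 and every x of 4 or above has y = 1. The warning handler is only there to collect the messages glm() raises instead of printing them.

```r
# Hypothetical, perfectly separated data: x <= 3 always gives 0, x >= 4 always gives 1.
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)

msgs <- character(0)
m_sep <- withCallingHandlers(
  glm(y ~ x, family = binomial()),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))   # record the warning text
    invokeRestart("muffleWarning")          # keep fitting without printing it
  }
)

print(msgs)       # warnings raised while fitting
print(coef(m_sep)) # the slope is very large because the estimate keeps growing
```

With separation, the standard errors also become extremely large, which is why the cards suggest removing predictors or collapsing groups.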

34
Q

Under what 2 situations would the issue of separation most often occur?

A
  1. When the outcome is rare
  2. When there is a small sample size
35
Q

How do you resolve the issue of separation for logistic regression?

A

Remove predictors / collapse groups

36
Q

R stores variables with few levels (e.g. 0 or 1) as continuous.
What does the argument strict = FALSE in the egltable() function do?

A

Tells egltable() that variables with few levels should be treated as categorical variables

37
Q

What does a significant chi-square test from egltable() output indicate for x and y?

A

x and y are not independent

38
Q

How do you interpret the estimate for the predictor using logistic regression?
glm ( stress_high ~ SE)
Each 1 unit higher SE is associated with x high…

A

log odds of being in high stress

39
Q

Instead of interpreting logistic regressions on the logit scale, what should we use and how do we get it?

A

Odds ratios (ORs), obtained by exponentiating the regression coefficients

40
Q

What do ORs indicate (logistic)?

A

How many times higher the odds of the outcome occurring will be for a 1 unit change in the predictor

41
Q

If OR = 2 and base odds = 0.9, what are the odds when the predictor is 1 unit higher?

A

0.9 * 2 = 1.8 (the new odds of the outcome, i.e. 2 times the base odds)
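
The arithmetic on this card, plus the conversion from odds to probability, can be sketched in base R. The helper name `odds_to_prob` is made up for this example.

```r
base_odds  <- 0.9
odds_ratio <- 2

# A 1 unit increase multiplies the odds by the odds ratio.
new_odds <- base_odds * odds_ratio

# Hypothetical helper: probability = odds / (1 + odds).
odds_to_prob <- function(odds) odds / (1 + odds)

p_base <- odds_to_prob(base_odds)   # 0.9 / 1.9
p_new  <- odds_to_prob(new_odds)    # 1.8 / 2.8

print(c(new_odds = new_odds, p_base = p_base, p_new = p_new))
```

Note that doubling the odds does not double the probability, which is one reason the later cards move to the probability scale.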

42
Q

Instead of using ORs, what should we convert the log odds scale to?

A

Probability scale

43
Q

How do you determine 0 or 1 on visreg graph with y-axis (predicted probabilities) ranging from 0 to 1?

A

Above 0.5 = 1 / yes
Below 0.5 = 0 / no

44
Q

What function do you use to convert ORs to probabilities? Output as a table of probabilities.

A

predict(mlog, type = "response")
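
A minimal base R sketch of what type = "response" does, assuming a logistic model named `mlog` as on the card. The data values are invented for illustration; only the variable names `SE` and `stress_high` come from the cards.

```r
# Hypothetical toy data; values are invented purely for illustration.
d <- data.frame(
  SE          = c(1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 4.5),
  stress_high = c(1, 1, 1, 0, 1, 0, 1, 0, 0, 0)
)
mlog <- glm(stress_high ~ SE, data = d, family = binomial())

p_link <- predict(mlog, type = "link")       # predictions on the log odds scale
p_resp <- predict(mlog, type = "response")   # predictions on the probability scale

# type = "response" is the inverse logit applied to the link-scale predictions.
same <- isTRUE(all.equal(unname(p_resp), unname(plogis(p_link))))

# Predicted probabilities at chosen SE scores (1 and 4).
new_p <- predict(mlog, newdata = data.frame(SE = c(1, 4)), type = "response")
print(round(new_p, 3))
```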

45
Q

Different values of predictors have different probabilities. E.g SE score of 1 has much higher probability of being in high stress group than SE score of 4. What is this effect called?

A

Marginal effect
Instantaneous effect of change at 1 particular point on the x scale
AKA the slope of the tangent line / the derivative at that point

46
Q

What is the average marginal effect (AME) of probabilities?

A

What the average change in probability would be in outcome for 1 unit change in predictor

47
Q

How do you calculate AME?

A

Get the predicted probabilities at the original predictor values and again after adding a small constant (h) to the predictor, divide the difference (new minus original) by h, then take the mean across observations
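
The AME calculation can be sketched in base R as a finite difference. The data values and the step size h = 0.001 are assumptions chosen for illustration.

```r
# Hypothetical toy data; values are invented purely for illustration.
d <- data.frame(
  SE          = c(1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 4.5),
  stress_high = c(1, 1, 1, 0, 1, 0, 1, 0, 0, 0)
)
mlog <- glm(stress_high ~ SE, data = d, family = binomial())

h <- 0.001                 # small constant added to the predictor
d_h <- d
d_h$SE <- d_h$SE + h

p0 <- predict(mlog, newdata = d,   type = "response")  # original probabilities
p1 <- predict(mlog, newdata = d_h, type = "response")  # probabilities after adding h

# Marginal effect per observation, then averaged.
ame <- mean((p1 - p0) / h)
print(ame)

# For comparison: the analytic derivative of the logistic curve, b * p * (1 - p), averaged.
ame_analytic <- mean(coef(mlog)["SE"] * p0 * (1 - p0))
```

In this toy data higher SE goes with lower stress, so the AME is negative: on average, the probability of high stress drops as SE increases.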

48
Q

Do all predictors influence the outcome for multiple regression?

A

Yes

49
Q

What is the logit link function?
And what do they unbound?

A

g(mu) = ln(mu / (1 - mu))
* ln part unbounds the left side (the lower bound at 0): log odds go down to negative infinity
* mu / (1 - mu) (the odds) unbounds the right side (the upper bound at 1): odds go up to positive infinity
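
The two steps of the logit can be checked in base R. The function name `logit` and the mu values are chosen for illustration; qlogis() is the built-in equivalent.

```r
# Hand-written logit for illustration; base R's qlogis() computes the same thing.
logit <- function(mu) log(mu / (1 - mu))

mu <- c(0.001, 0.25, 0.5, 0.75, 0.999)   # arbitrary probabilities

odds     <- mu / (1 - mu)   # step 1: always above 0, grows without limit as mu nears 1
log_odds <- logit(mu)       # step 2: the log lets values go below 0 as mu nears 0

print(round(odds, 3))
print(round(log_odds, 3))
```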

50
Q

ORs higher than 1 = pos / neg relationship between your variables?

A

Positive relationship

51
Q

ORs lower than 1 = pos / neg relationship between your variables?

A

Negative relationship

52
Q

What are odds?

A

Probability of something happening / probability of something not happening.
E.g. rolling a 6 on a 6-sided die = (1/6) / (5/6) = 1/5 = 0.2

53
Q

When do you calculate AME?

A

For continuous predictors in binary/logistic regression when you see an instantaneous change in outcome at a specific x value
E.g when the probability of being in high stress suddenly goes down between SE score of 3 and 4

54
Q

What are deviance residuals?

A

Individual contribution of each observation to the overall model deviance.

55
Q

A negative deviance residual represents what?

A

For that observation, the observed outcome is lower than the model-predicted outcome

56
Q

A positive deviance residual represents what?

A

For that observation, the observed outcome is higher than the model-predicted outcome
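
A short base R sketch of deviance residuals for a Poisson model, using invented data. It checks two things from the last three cards: the squared residuals add up to the model deviance, and the sign of each residual matches whether the observed value is above or below the fitted value.

```r
# Hypothetical toy data; values are invented purely for illustration.
d <- data.frame(
  math       = c(35, 40, 45, 50, 55, 60, 65, 70, 75),
  num_awards = c(0, 0, 1, 0, 1, 2, 2, 3, 5)
)
m <- glm(num_awards ~ math, data = d, family = poisson())

r <- residuals(m, type = "deviance")   # one deviance residual per observation

# Each observation's contribution: squared residuals sum to the residual deviance.
total_from_resid <- sum(r^2)

# Negative residual: observed below fitted; positive: observed above fitted.
sign_matches <- all(sign(r) == sign(d$num_awards - fitted(m)))

print(round(cbind(observed = d$num_awards, fitted = fitted(m), dev_resid = r), 3))
```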