Binary outcome Flashcards

1
Q

what is the purpose of logistic regression

A
  • to classify samples
  • obese vs not obese
  • true vs false
2
Q

What’s the difference between a simple vs complicated model in logistic regression?

A
  • Simple model: predicts the binary outcome from a single predictor variable (PV), e.g., weight predicts obese vs not obese
  • Complicated model: uses more than one PV, e.g., weight + genotype + age predict obese vs not obese
3
Q

for logistic regression does the PV also need to be binary? What about linear regression?

A
  • No; both continuous and discrete PVs can be used to predict a binary outcome.
  • The same goes for linear regression; the only real difference is that its outcome is continuous rather than binary.
  • WHICH MODEL IS USED IS DETERMINED ONLY BY THE OUTCOME VARIABLE
4
Q

how do we know if each variable is usefully contributing to the model?

A
  • If the variable’s coefficient is significantly different from 0, it is contributing usefully to the model
  • Use the Wald test
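
A minimal sketch of what the Wald test does, assuming a coefficient estimate and its standard error are already available (the numbers below are made up for illustration):

```python
# Minimal sketch of a Wald test for one logistic regression coefficient.
# beta and se_beta are made-up numbers for illustration.
from scipy.stats import norm

beta = 0.85     # hypothetical estimated coefficient (a log odds ratio)
se_beta = 0.30  # hypothetical standard error of that coefficient

z = beta / se_beta                     # Wald statistic
p_value = 2 * (1 - norm.cdf(abs(z)))   # two-sided p value from the standard normal

# If p_value < 0.05, the coefficient differs significantly from 0,
# so the variable is usefully contributing to the model.
print(f"z = {z:.2f}, p = {p_value:.4f}")
```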
5
Q

In linear regression we have the concept of the residual; why does logistic regression not have this?

A
  • Because the outcomes are only 0 or 1, on the log-odds scale these observed points sit at negative and positive infinity, so the distance from each point to the line (the residual) cannot be calculated. Maximum likelihood is therefore used to fit the model instead of least squares.
6
Q

what does logistic regression use to calculate the fit of a model

A
  • Maximum likelihood (curve)
7
Q

how do we find the curve with the maximum likelihood

A
  • First pick a candidate probability curve that estimates the probability of the outcome for different values of weight.
  • Use this curve to work out the likelihood of observing obese vs not obese for each mouse’s weight.
  • Multiply all those likelihoods together = the likelihood of the data GIVEN this curve.
  • Repeat for lots of different candidate curves; each gives a total likelihood.
  • The curve with the maximum likelihood is selected.
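
A small sketch of this procedure, using made-up mouse weights and obesity labels and two hypothetical candidate curves; real fitting software searches over many candidates automatically:

```python
# Sketch of comparing two candidate logistic curves by their likelihood.
# Weights and obese/not-obese labels are made-up illustration data.
import numpy as np

weights = np.array([18.0, 21.0, 24.0, 27.0, 30.0, 33.0])
obese   = np.array([0,    0,    0,    1,    1,    1])   # 1 = obese, 0 = not obese

def likelihood(c, b):
    """Total likelihood of the data given the curve p = 1 / (1 + exp(-(c + b*weight)))."""
    p = 1.0 / (1.0 + np.exp(-(c + b * weights)))   # predicted P(obese) for each mouse
    per_mouse = np.where(obese == 1, p, 1 - p)     # likelihood of each observed label
    return per_mouse.prod()                        # multiply them all together

# Two hypothetical candidate curves; the fitting algorithm would search over
# many (c, b) pairs and keep the pair with the largest total likelihood.
print(likelihood(c=-10.0, b=0.4))
print(likelihood(c=-15.0, b=0.6))
```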
8
Q

Why is it inappropriate to use linear regression when you have a binary outcome?

A

because the model will predict not only 0 and 1 outcomes but also values in between, e.g., 0.6.

This produces large residuals, which is bad because the residuals are what is used to do the fitting; large residuals will bias the result.

9
Q

what is the equation of the logistic curve (S-shaped; sigmoidal)

A

p = 1 / (1 + e^(−(c + bx)))

(equivalently, p = e^(c + bx) / (1 + e^(c + bx)))

10
Q

odds?

A

can use the prediction from the logistic regression equation to compute the odds

odds = probability of the event happening divided by the probability of the event not happening

This is the same thing as Euler’s number raised to the power of the systematic component:

odds = e^(c + bX)

11
Q

Log odds or logit

A

simply the natural log of the odds

Taking the log odds transforms the equation into a linear one and gets rid of Euler’s number:

logit = log(odds) = c + bX

Log odds vary between negative infinity and infinity as the probability moves from 0 to 1, and they are linearly related to the independent variable.

12
Q

Imagine we have the logit but want the odds. How do we calculate the odds?

A

odds = e^(logit)

13
Q

imagine we have the odds and want the probability of being a case. How do we calculate this?

A

probability = odds / (1 + odds)
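
A short sketch pulling cards 10–13 together: converting a probability to odds and a logit, then back again (the starting probability is just an example value):

```python
# Sketch of the conversions between probability, odds, and logit (log odds).
import math

p = 0.8                      # example probability of the event

odds  = p / (1 - p)          # odds = P(event) / P(no event)        -> 4.0
logit = math.log(odds)       # logit = natural log of the odds      -> ~1.386

# Going back the other way:
odds_back = math.exp(logit)              # odds = e^(logit)
p_back    = odds_back / (1 + odds_back)  # probability = odds / (1 + odds)

print(odds, logit, odds_back, p_back)
```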

14
Q

what does the logit tell us

A

the linear impact of a PV on the DV

Moving from a score of 55% to 56% on the PV increases the logit by some fixed amount x.

The increase is the same amount if we look at the difference between a score of 64% and 65%.

15
Q

if we are looking at the odds, is the change from 55% to 56% equal to the change from 64% to 65%?

A

No, because the relationship between the PV and the odds is not linear

16
Q

if we are looking at the probability, is the change from 55% to 56% equal to the change from 64% to 65%?

A

No, because the relationship between the PV and the probability is not linear

17
Q

Odds ratio

A

calculated by dividing the odds at point B by the odds at point A.

E.g., odds at 55% attendance / odds at 54% attendance. The odds ratio between successive values remains constant.

Gives an indication of the treatment effect: it tells us the relative (multiplicative) increase in the odds as you increase the IV by 1 unit.

Example: 13 minutes of adherence = odds of 0.2551

Odds ratio = 1.2190

Therefore 14 minutes of adherence = 0.2551 × 1.2190 = 0.3110

You can then apply this to different contexts. Imagine you fit a logistic regression and get an odds ratio of 0.1 per minute of adherence for getting the disease: for every extra minute adhered to treatment, the odds of getting the disease are multiplied by 0.1 (i.e., reduced by 90%).
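
A quick check of the worked example above, showing that multiplying the odds by the odds ratio gives the odds one unit higher on the IV:

```python
# Re-checking the adherence example: odds at 13 minutes times the odds ratio
# gives the odds at 14 minutes.
odds_13 = 0.2551
odds_ratio = 1.2190           # multiplicative change in the odds per extra minute

odds_14 = odds_13 * odds_ratio
print(f"{odds_14:.4f}")       # 0.3110, as on the card

# Because the odds ratio is constant, each further minute multiplies the odds
# by the same factor, e.g. odds at 15 minutes = odds_14 * odds_ratio.
```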

18
Q

LEC: risk

A

number of n with event/total population

19
Q

Relative risk

A

risk in group of interest (n with event/ total n in group A)
/
risk in reference group (n with event/total n in group B)

20
Q

Risk difference

A

risk in group of interest - risk in reference

21
Q

odds

A

number of n who have event / number of n who don’t have an event

22
Q

Odds ratio

A

odds in group of interest / odds in reference group

(n with event / n without event; treatment group) / (n with event / n without event; reference group)
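
A sketch that computes all of the quantities from cards 18–22 (risk, relative risk, risk difference, odds, odds ratio) from a single 2×2 table of made-up counts:

```python
# Risk, relative risk, risk difference, odds and odds ratio from a 2x2 table.
# The counts below are made up for illustration.
a, b = 30, 70   # treatment group: a = with event, b = without event
c, d = 20, 80   # reference group: c = with event, d = without event

risk_treat = a / (a + b)            # risk in group of interest
risk_ref   = c / (c + d)            # risk in reference group

relative_risk   = risk_treat / risk_ref
risk_difference = risk_treat - risk_ref

odds_treat = a / b                  # odds = with event / without event
odds_ref   = c / d
odds_ratio = odds_treat / odds_ref

print(relative_risk, risk_difference, odds_ratio)
# -> RR = 1.5, RD = 0.10, OR ≈ 1.71
# Note the OR exceeds the RR here because the event is not rare (see card 24).
```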

23
Q

interpret an RR or odds ratio of:
= 1
> 1
< 1

A
  • = 1: no association between exposure and outcome
  • > 1: risk/odds of the outcome is greater in the exposed group
  • < 1: risk/odds of the outcome is smaller in the exposed group
24
Q

what is the relationship between the RR and OR if the event is rare vs frequent

A

If the outcome is rare, the RR and OR will be similar; if the outcome is frequent, they will not be.
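
A tiny numeric illustration of this, using made-up 2×2 counts: with a rare event the OR is close to the RR, with a common event it is not:

```python
# The OR approximates the RR when the event is rare, but not when it is common.
def rr_and_or(a, b, c, d):
    """a/b = events/non-events in the exposed group; c/d = events/non-events in the reference group."""
    rr  = (a / (a + b)) / (c / (c + d))
    or_ = (a / b) / (c / d)
    return rr, or_

print(rr_and_or(2, 998, 1, 999))      # rare event:     RR = 2.00, OR ≈ 2.00
print(rr_and_or(400, 600, 200, 800))  # frequent event: RR = 2.00, OR ≈ 2.67
```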

25
Q

with binary outcomes, what test can we use to test for differences in events/non-events between the intervention and control?

A

 Chi-squared test

26
Q

if we have small numbers, what correction do we apply to the chi-squared test?

A

Yates’s correction for continuity

27
Q

how does fisher’s exact test work?

A

Creates contingency tables for all the possible cell values that keep the row and column totals the same as the observed table.

Then determines the probability of observing each table if the null were true (i.e., by chance).

The sum of the probabilities of the tables that are as extreme as or more extreme than the observed table = the p value.
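
A sketch of running both tests on a 2×2 table with scipy (the counts are made up; scipy.stats provides chi2_contingency, whose correction argument applies Yates’s continuity correction, and fisher_exact):

```python
# Chi-squared and Fisher's exact tests on a 2x2 table of made-up counts.
from scipy.stats import chi2_contingency, fisher_exact

table = [[30, 70],   # intervention: events, non-events
         [20, 80]]   # control:      events, non-events

# Chi-squared test; correction=True applies Yates's continuity correction.
chi2, p_chi2, dof, expected = chi2_contingency(table, correction=True)

# Fisher's exact test, preferred when any cell count is small.
odds_ratio, p_fisher = fisher_exact(table)

print(p_chi2, p_fisher, odds_ratio)
```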

28
Q

with binary outcomes, what do we use to test for differences in events/non-events between the intervention and control if the numbers are small (fewer than 5 events in any cell)?

A

 Fisher’s exact test

29
Q

what is typically used to test the difference in adverse events between the intervention and control group?

A

fisher’s exact test

30
Q

In Stata, what output from the chi-squared test tells us about the difference between the groups in being a case or not?

A
  • Risk difference
  • Relative risk/risk ratio
  • Odds ratio
  • Chi squared result
  • P value
31
Q

what are the assumptions that need to be met prior to conducting logistic regression?

A
  • Variables in the model are not assumed to be normally distributed
  • Outcomes are independent – whether or not person 1 is a case has no effect on whether person 2 is
32
Q

What represents the treatment effect in logistic regression?

A
  • Odds ratio
  • Log odds ratio
33
Q

Interpret the odds ratio of 4.03

A

The odds of having the event are greater in the treatment group by a factor of 4.03

34
Q

why might we get a different odds ratio when using chi-squared vs logistic regression?

A

because chi-squared does not adjust for baseline covariates while logistic regression does

35
Q

if the odds ratio from the chi-squared test and from logistic regression is the same, what does that mean?

A

the variables adjusted for had no effect on the outcome

36
Q

what is the coefficient of the model in logistic regression?

A

The log(odds ratio)

the model coefficient represents the change in the log-odds of the outcome variable associated with a one-unit change in the predictor variable, holding all other predictor variables constant.

The log-odds is the natural logarithm of the odds, which is the probability of an event occurring divided by the probability of the event not occurring.

The log-odds can take on any value from negative infinity to positive infinity, with positive values indicating higher odds of the event occurring and negative values indicating lower odds.

So, when the coefficient of the model is the log-odds ratio, it tells us how the odds of the outcome variable change with a one-unit increase in the predictor variable. A positive coefficient means that the odds of the outcome variable increase as the predictor variable increases, while a negative coefficient means that the odds of the outcome variable decrease as the predictor variable increases.

37
Q

How do we get the odds ratio and the CI in logistic regression?

A

exponentiate the model coefficient and the limits of its 95% CI to get the odds ratio and its CI.
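
A sketch of doing this with statsmodels, assuming a pandas DataFrame df with hypothetical columns outcome, treatment and baseline_weight; the key step is exponentiating the coefficient and its CI limits:

```python
# Sketch: logistic regression, then exponentiate the coefficient and its CI
# to get the odds ratio and its CI. Variable names and df are hypothetical.
import numpy as np
import statsmodels.formula.api as smf

# df is assumed to be a pandas DataFrame with a binary 'outcome' column,
# a 'treatment' indicator, and a 'baseline_weight' covariate.
model = smf.logit("outcome ~ treatment + baseline_weight", data=df).fit()

log_or = model.params["treatment"]          # coefficient = log(odds ratio)
ci_log = model.conf_int().loc["treatment"]  # 95% CI on the log-odds scale

odds_ratio = np.exp(log_or)                 # exponentiate to get the odds ratio
ci_or = np.exp(ci_log)                      # and its 95% CI

print(odds_ratio, ci_or.values)
```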

38
Q

what is the coefficient in log-binomial regression?

A

log(risk ratio)

39
Q

how do we get the relative risk and its CI in log-binomial regression?

A

exponentiate the coefficient and the limits of its 95% CI
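
A matching sketch for the log-binomial model, again assuming the hypothetical DataFrame df from the logistic example; in statsmodels this is a binomial-family GLM with a log link (depending on the version the link class may be spelled links.Log or links.log):

```python
# Sketch: log-binomial regression (binomial family, log link), whose
# exponentiated coefficient gives the risk ratio. Names and df are hypothetical.
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

model = smf.glm(
    "outcome ~ treatment + baseline_weight",
    data=df,
    family=sm.families.Binomial(link=sm.families.links.Log()),
).fit()

log_rr = model.params["treatment"]          # coefficient = log(risk ratio)
ci_log = model.conf_int().loc["treatment"]

risk_ratio = np.exp(log_rr)                 # exponentiate to get the risk ratio
ci_rr = np.exp(ci_log)

print(risk_ratio, ci_rr.values)
```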

40
Q

Interpret the odds ratio of 4.03

A

The odds of having the event are greater in the treatment group by a factor of 4.03

41
Q

interpret: chi squared test

Relative risk/Risk ratio [95% CI]: 1.40 [1.20 to 1.64]

interpret this

A

The risk of losing ≥5% of initial weight by 12 months is 40% higher in the intervention (support) group than in the advice group.

42
Q

interpret: Chi squared test,

Odds ratio [95% CI]: 1.59 [1.29 to 1.96]

A

The odds of losing at least 5% of initial weight by 12 months are higher in the intervention (support) group than in the advice group by a factor of 1.6.

43
Q

interpret: logistic regression

Odds Ratio [95% CI]: 1.60 [1.29 to 1.97]; P value <0.001

A

A significant adjusted odds ratio in favour of the support arm was found, indicating that participants had 1.60 times the odds of losing at least 5% of initial weight in the Support group compared to the Advice group.

44
Q

interpret: log-binomial regression

Risk Ratio [95% CI]: 1.41 [1.21 to 1.64]; P value <0.001

A

A significant adjusted risk ratio in favour of the support arm was found, indicating that participants had a 41% increase in the risk of losing at least 5% of initial weight in the Support group compared to the Advice group, after adjusting for gender and baseline weight.

45
Q

binary outcome, what are the unadjusted tests and adjusted tests we use?

A

unadjusted:
  • chi-squared test
  • Fisher’s exact test

adjusted:
  • logistic regression
  • log-binomial regression

46
Q

binary outcome, what values are used to determine treatment effect

A

odds ratio (logistic regression)

relative risk/risk ratio (log-binomial regression)