# Logistic Regression, Probability, and Odds Flashcards

1
Q

What is regression?

A

A mathematical equation for a straight line that is fitted to data as a way of describing the relationship between two or more variables.

2
Q

What are the main purposes of regression?

A

Prediction (to estimate risk).

Control for confounding.

3
Q

What does the regression line do?

A

Estimates the average value of the variable on the vertical scale (Y) for each value on the horizontal scale (X).

4
Q

What happens in simple regression?

A

For each one-unit change in X, Y changes by a fixed amount (the slope).

5
Q

What is general linear regression?

A

Refers to conventional linear regression models for a continuous response variable given continuous and/or categorical predictors.

6
Q

What happens with transformation in a GLM?

A

Without transformation, the coefficient estimates a rate difference.

With a log transformation, the exponentiated coefficient estimates a rate ratio.

7
Q

What is the model for simple regression?

A

Yx = E(D | X = x) = α + βx

Y = α + βX + ε (prediction)

Yi = β0 + β1Xi + εi (population)
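
As a rough illustration of fitting the line, the sketch below estimates α and β by ordinary least squares. The data points and the `fit_line` helper are made up for this example, and least squares is an assumed fitting method that the card itself does not name.

```python
# Hypothetical data, invented for illustration: the points lie exactly on Y = 1 + 2X.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Return (alpha, beta) for Y = alpha + beta * X using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    alpha = mean_y - beta * mean_x
    return alpha, beta


alpha, beta = fit_line(xs, ys)
```

With these made-up points the fit recovers an intercept of 1 and a slope of 2.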

8
Q

What is α (alpha)?

A

The value of Y when X = 0.
Also written β0.
The y-intercept.

9
Q

What does the slope coefficient do?

A

Estimates the increase in risk per one-unit increase in X.

Also written β1.

10
Q

How are mean and probability related?

A

For a binary (0/1) outcome, the mean is an estimate of the probability of the outcome.

11
Q

Why will linear regression not work for a binary outcome?

A

Probability is confined to a scale of 0 to 1, but predictions from a linear regression are not held within those limits.

The linear model also assumes that risk changes linearly with X.
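
A small sketch of the first problem, assuming a hypothetical linear risk model with α = 0.1 and β = 0.2 (both values invented for illustration): some predicted "risks" fall below 0 or above 1.

```python
def linear_risk(alpha, beta, x):
    """Predicted 'risk' from a straight-line model; nothing keeps it inside 0-1."""
    return alpha + beta * x


# Hypothetical coefficients, not taken from any real data set.
alpha, beta = 0.1, 0.2

predictions = {x: linear_risk(alpha, beta, x) for x in (-1, 0, 2, 5)}

# X values whose predicted risk is not a valid probability.
out_of_range = [x for x, risk in predictions.items() if risk < 0 or risk > 1]
```

Here the predictions at X = -1 and X = 5 are impossible as probabilities, while those at X = 0 and X = 2 are fine.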

12
Q

Why will logistic regression work?

A

Used for binary outcomes; it keeps predicted probabilities on the 0-1 scale by modeling the odds.

This is because the S-shaped probability curve becomes linear after taking the natural log of the odds.

13
Q

What is the model form of logistic regression?

A

Logit (Y) = ln (Y/(1-Y))

OR

Logit [P (Y)] = ln ((P(Y))/(1-P(Y)))

OR

ln(Px/(1-Px)) = ln[odds(D | X = x)] = α + βx
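
A minimal sketch of the logit and its inverse, using only the standard library. The function names `logit` and `inverse_logit` are chosen for this example.

```python
import math


def logit(p):
    """Log odds of a probability p, where 0 < p < 1."""
    return math.log(p / (1 - p))


def inverse_logit(log_odds):
    """Map any log odds value back to a probability between 0 and 1.

    Note: very large negative inputs (below about -709) would overflow math.exp.
    """
    return 1 / (1 + math.exp(-log_odds))


# A probability of 0.5 means even odds, so the log odds are 0.
even_odds = logit(0.5)

# The round trip returns the starting probability (up to floating-point error).
round_trip = inverse_logit(logit(0.8))
```

Whatever value α + βx takes, `inverse_logit` returns a number strictly between 0 and 1, which is why the model stays on the probability scale.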

14
Q

What is alpha?

A

Log odds of outcome when X = 0

The y-intercept.

15
Q

What is beta?

A

Slope

Log odds RATIO comparing two exposure groups that differ by one unit on the scale of X.

16
Q

In simple logistic regression what do 1 and 0 represent?

A

1 = exposed, 0 = not exposed.

17
Q

Simple logistic regression, what would 1 (exposure) yield?

A

Alpha + beta = log odds of disease when exposed

e ^(a+b) is the odds of disease with exposure.

18
Q

Simple logistic regression, what would 0 (no exposure) yield?

A

Alpha ONLY. Log odds of the outcome when not exposed.

e^(a) is the odds of disease without exposure.

19
Q

How would you get ODDS from log odds?

A

Exponentiate both sides:

e^(a+b) when exposed
e^(a) when unexposed

e is a constant
e ≈ 2.718
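
A short sketch of the exponentiation step, assuming hypothetical coefficients a = -2.0 and b = 0.7 (invented for illustration). The `odds_to_probability` helper is an addition for this example, in case the probability is wanted as well.

```python
import math

# Hypothetical fitted coefficients, not from any real data set.
a = -2.0  # log odds of disease when unexposed
b = 0.7   # log odds ratio, exposed versus unexposed

# Exponentiating the log odds gives the odds.
odds_unexposed = math.exp(a)
odds_exposed = math.exp(a + b)


def odds_to_probability(odds):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1 + odds)


risk_unexposed = odds_to_probability(odds_unexposed)
risk_exposed = odds_to_probability(odds_exposed)
```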

20
Q

How would we find odds ratio?

A

Odds exposed vs. unexposed:
e^(a+b) / e^(a)

Therefore:
e^(a+b-a)

Therefore:

e^b

Odds ratio is e^b
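
One way to check this numerically, using a made-up 2x2 table (all counts are hypothetical). With a single binary exposure, the fitted a and b of a simple logistic regression reproduce the sample log odds, so the sketch computes them straight from the counts rather than calling a model-fitting library.

```python
import math

# Hypothetical 2x2 table, counts invented for illustration.
exposed_cases, exposed_noncases = 30, 70
unexposed_cases, unexposed_noncases = 10, 90

odds_exposed = exposed_cases / exposed_noncases
odds_unexposed = unexposed_cases / unexposed_noncases

# a is the log odds in the unexposed; a + b is the log odds in the exposed.
a = math.log(odds_unexposed)
b = math.log(odds_exposed) - a

# e^b should match the cross-product odds ratio from the table.
odds_ratio_from_b = math.exp(b)
odds_ratio_from_table = (exposed_cases * unexposed_noncases) / (
    exposed_noncases * unexposed_cases
)
```

Both routes give the same odds ratio, about 3.86 for these counts.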

21
Q

What happens when a categorical predictor has more than two categories?

A

Create dummy variables with a reference group.

22
Q

How do you choose reference group?

A

Generally the one with the largest sample size, so estimates will be reasonably precise and stable.

23
Q

What is model for this?

A

logit(p) = a + b1X1 + b2X2, where X1 and X2 are 0/1 dummy variables for groups 1 and 2

24
Q

What is alpha in this model?

A

Log odds of disease for reference group.

25
Q

What is b1?

A

Log odds ratio of disease for group 1 versus the reference group.

26
Q

What is b2?

A

Log odds ratio of disease for group 2 versus the reference group.
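
A sketch of dummy coding with a reference group, assuming the model is logit(p) = a + b1X1 + b2X2 with X1 and X2 as 0/1 indicators. The category names, the coefficient values, and the helpers `dummy_code` and `log_odds` are all hypothetical.

```python
# Hypothetical three-level predictor; "never" is taken as the reference group.
LEVELS = ["never", "former", "current"]
REFERENCE = "never"


def dummy_code(value, levels=LEVELS, reference=REFERENCE):
    """Return one 0/1 indicator per non-reference level, in the order of `levels`."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels if level != reference]


def log_odds(value, a, bs):
    """Log odds for a group: a plus the coefficient of whichever dummy equals 1."""
    indicators = dummy_code(value)
    return a + sum(b * x for b, x in zip(bs, indicators))


# Hypothetical coefficients: a, then b1 (former) and b2 (current).
a = -2.0
bs = [0.4, 1.1]

reference_log_odds = log_odds("never", a, bs)    # a only
current_log_odds = log_odds("current", a, bs)    # a + b2
```

The reference group gets all zeros, so its log odds are a alone, and each b is the difference in log odds between its group and the reference.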

27
Q

What is model for continuous predictor?

A

Logit(p) = α + βx

28
Q

What is alpha?

A

Log odds when X = 0

29
Q

What is beta?

A

Log OR with 1 unit change in X.

30
Q

How would you calculate the odds when X equals a different number?

A

Multiply that number by BETA and add ALPHA to get the log odds, then exponentiate to get the odds.
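
A sketch of that calculation for a continuous predictor, assuming hypothetical coefficients alpha = -3.0 and beta = 0.05 (invented for illustration). The sum alpha + beta * x is a log odds, so it is exponentiated to reach the odds.

```python
import math


def odds_at(x, alpha, beta):
    """Odds of the outcome at a given X: e^(alpha + beta * x)."""
    log_odds = alpha + beta * x
    return math.exp(log_odds)


# Hypothetical coefficients, not from any real data set.
alpha, beta = -3.0, 0.05

odds_at_40 = odds_at(40, alpha, beta)
odds_at_50 = odds_at(50, alpha, beta)

# Odds ratio for a 10-unit difference in X equals e^(10 * beta).
odds_ratio_10_units = odds_at_50 / odds_at_40
```

The same idea extends to any difference c in X: the odds ratio is e^(c * beta).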

31
Q

What is the hypothesis in the Wald test?

A

Test of exposed versus unexposed.

Null is that OR = 1 (that e^beta = 1); no difference between the two.

H0: beta = 0
Ha: beta does not equal 0
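
A rough sketch of the test statistic, assuming the usual form z = beta / SE(beta) compared against a standard normal distribution. The estimate, its standard error, and the `wald_test` helper are hypothetical.

```python
import math


def wald_test(beta_hat, standard_error):
    """Return (z, two-sided p-value) for H0: beta = 0."""
    z = beta_hat / standard_error
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical estimate and standard error, invented for illustration.
z, p_value = wald_test(beta_hat=0.7, standard_error=0.25)
```

With these made-up numbers z is 2.8 and the p-value is roughly 0.005, so the null of OR = 1 would be rejected at the 0.05 level.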

32
Q

What are advantages to logistic regression?

A

Obtain estimates and CI for OR (crude and adjusted)

Can handle multiple independent variables.
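
A sketch of how a confidence interval for the OR can be obtained, assuming the common Wald-type interval e^(beta ± 1.96 × SE). The estimate, its standard error, and the `odds_ratio_ci` helper are hypothetical.

```python
import math


def odds_ratio_ci(beta_hat, standard_error, z_crit=1.96):
    """Return (OR, lower, upper): the interval is built on the log scale, then exponentiated."""
    lower = math.exp(beta_hat - z_crit * standard_error)
    upper = math.exp(beta_hat + z_crit * standard_error)
    return math.exp(beta_hat), lower, upper


# Hypothetical estimate and standard error, invented for illustration.
odds_ratio, ci_lower, ci_upper = odds_ratio_ci(beta_hat=0.7, standard_error=0.25)
```

An interval that excludes 1, as this made-up one does, agrees with a Wald test that rejects beta = 0.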

33
Q

What are disadvantages to logistic regression?

A

Implicit assumptions difficult to check.

Model selection difficult.