Logistic regression Flashcards

(24 cards)

1
Q

Why is logistic regression different from the other lectures?

A

Previous lectures examined predicting variance in a continuous, normally distributed dependent variable, i.e. linear regression.

This is very common, but it is not the only type of outcome we are interested in. Some things cannot be measured in that way, particularly in the medical world:
Alive vs. dead
Addicted vs. non-addicted
Relapse vs. non-relapse

This use of absolute categories is extremely common in clinical psychology

2
Q

Difference between logistic and linear regression

A

Linear regression predicts a continuous outcome, while logistic regression predicts a categorical outcome (by estimating the probability of belonging to a category).

3
Q

Logistic regression

A

Logistic regression is a data analysis technique that models the relationship between one or more predictor variables and a categorical outcome. It then uses this relationship to predict the outcome from the predictors. The prediction has a finite number of possible outcomes, like yes or no.
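
As a rough illustration of how a predictor is turned into a probability, the sketch below passes a linear combination of a predictor through the logistic function. The function name and the coefficient values are hypothetical, chosen only for illustration and not taken from the lecture.

```python
import math

def predicted_probability(intercept, slope, x):
    """Probability that the outcome is 1 for predictor value x."""
    log_odds = intercept + slope * x          # linear predictor on the log-odds scale
    return 1.0 / (1.0 + math.exp(-log_odds))  # logistic function maps it into (0, 1)

# Hypothetical coefficients, for illustration only
for score in (0, 10, 20):
    print(score, round(predicted_probability(-2.0, 0.2, score), 3))
```

Whatever the predictor value, the result stays between 0 and 1, which is what makes it usable as a probability for a yes/no outcome.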

4
Q

Categorical outcomes

A

Analysed with logistic regression based methods

5
Q

Example of categorical vs continuous outcomes in depression

A

Categorical - depressed vs not depressed

Continuous - scores on the BDI (Beck Depression Inventory)

6
Q

Advantages of continuous outcomes

A

Inferences can be made with fewer data points: continuous outcomes need fewer participants for adequate power
Continuous data often provide more statistical power than binary data, making it easier to detect meaningful differences between groups.

Higher sensitivity

More variety in analysis options - can look at people who are a bit, halfway or severely depressed, i.e. the whole continuum

Information on variability of a construct within a population

Give a better understanding of the variable in question.

Nonsensical distinctions avoided - dichotomising can create distinctions that make no sense at all: with a cut-off of 5, somebody with a score of 5 is treated the same as someone with a score of 9 but differently from someone with a score of 4

7
Q

Why should we use binary outcomes?

A

Increased scores on a scale do not always mean depressed/addicted etc. It could be that nobody in the sample is depressed and their scores have just gone up a bit

Has clinical relevance if we use diagnostic criteria to give formal diagnoses. We can take people who meet the diagnostic threshold and measure again after an intervention to see if they still meet the threshold

8
Q

Summary of categorical outcomes

A

Categorical data has its limitations and lacks sensitivity

BUT it allows us to make decisions in relation to clinical outcomes, or whatever we decide is a relevant effect (doesn’t have to be a clinical diagnosis)

9
Q

Logistic regression summary

A

We use logistic regression to explore what variables are associated with an outcome

This gives us model fit statistics (similar to a linear regression)

Regression coefficients for individual predictors (similar to a linear regression)

Odds ratios

10
Q

Odds ratios

A

These express the change in the odds of the outcome (DV = 1) associated with a one-unit increase in an IV. (OR - 1) x 100 gives the % change in the odds

11
Q

What does logistic regression allow us to do that linear does not?

A

To make absolute conclusions about concepts

e.g. relapse - with a continuous outcome we could report increased frequency of drug use, but to make absolute conclusions we need to be able to say whether a predictor is specifically associated with relapse itself.

12
Q

What does logistic regression do?

A

Predicts membership of a group

It is called “binary” logistic regression as that refers to a dichotomous outcome - TWO POSSIBLE OUTCOMES e.g. Relapse = 1, non-relapse = 0
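
A minimal sketch of what "predicting membership of a group" can look like in practice: each participant's predicted probability is turned into a 0/1 group by comparing it with a cut-off. The 0.5 threshold and the example probabilities are assumptions made for illustration; the cards do not state which cut-off is used.

```python
def classify(probabilities, threshold=0.5):
    """Assign each predicted probability to group 1 or group 0."""
    # threshold=0.5 is an assumed, commonly used default cut-off
    return [1 if p >= threshold else 0 for p in probabilities]

predicted = [0.10, 0.48, 0.50, 0.93]   # made-up predicted probabilities
print(classify(predicted))             # 1 = relapse, 0 = non-relapse
```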

13
Q

Log-likelihood

A

How likely the model is to predict that someone is in the correct group

Looks at the discrepancy between what was observed and what the model predicted

Each participant has an observed value for the outcome (0 or 1) and a predicted probability (ranging from 0, certainly will not happen, to 1, certainly will happen). The log of the probability the model gave to each participant's observed outcome is summed across all participants; values closer to 0 mean a smaller discrepancy. Its counterpart in linear regression is the residual sum of squares (how far each observation is from its prediction).
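
The summing described above can be sketched as follows, using the standard log-likelihood for a binary outcome. The function name and the data are made up for illustration, and predicted probabilities are assumed to lie strictly between 0 and 1.

```python
import math

def log_likelihood(observed, predicted):
    """Sum the log-probability the model gave to what actually happened."""
    total = 0.0
    for y, p in zip(observed, predicted):
        # If y == 1 this adds log(p); if y == 0 it adds log(1 - p)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

observed = [1, 0, 1, 0]              # made-up outcomes
good = [0.9, 0.1, 0.8, 0.2]          # predictions close to the observed values
poor = [0.6, 0.5, 0.4, 0.5]          # predictions far from the observed values
print(log_likelihood(observed, good))
print(log_likelihood(observed, poor))
```

The log-likelihood is always negative; the model whose predictions sit closer to the observed outcomes gets the value nearer to 0.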

14
Q

What does logistic regression compare its results to?

A

A baseline (null) model with no predictors

15
Q

What variation of R² is used for logistic regression?

A

McFadden’s R² - a measure of how well the model fits the data compared to the null/baseline model. It compares the log-likelihood of the full model to that of the null model.

A higher McFadden’s R² = a better fit.
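
For reference, McFadden's R² is 1 minus the ratio of the full model's log-likelihood to the null model's log-likelihood. A minimal sketch, with made-up log-likelihood values:

```python
def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo-R²: 1 - (log-likelihood of full model / log-likelihood of null model)."""
    return 1.0 - ll_model / ll_null

# Hypothetical log-likelihoods, for illustration only
print(round(mcfadden_r2(-80.0, -100.0), 2))   # full model improves on the null model
print(mcfadden_r2(-100.0, -100.0))            # no improvement over the null model
```

A model that does no better than the null model scores 0, and the value rises as the full model's log-likelihood moves closer to 0.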

16
Q

A confusion matrix

A

A table cross-tabulating each participant's observed group against the group the model predicted, showing how many participants the model classifies correctly and incorrectly

We use the content of the confusion matrix to tell us the % of cases that were correctly classified
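
A small sketch of how the % correctly classified can be read off a confusion matrix. The function names and the data are made up for illustration; the four cells are labelled TP, TN, FP and FN (true/false positives and negatives).

```python
def confusion_matrix(observed, predicted):
    """Count the four combinations of observed and predicted group."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for y, y_hat in zip(observed, predicted):
        if y == 1 and y_hat == 1:
            counts["TP"] += 1      # observed 1, predicted 1
        elif y == 0 and y_hat == 0:
            counts["TN"] += 1      # observed 0, predicted 0
        elif y == 0 and y_hat == 1:
            counts["FP"] += 1      # observed 0, predicted 1
        else:
            counts["FN"] += 1      # observed 1, predicted 0
    return counts

def percent_correct(counts):
    """Percentage of cases on the diagonal (TP + TN) out of all cases."""
    return 100.0 * (counts["TP"] + counts["TN"]) / sum(counts.values())

observed = [1, 1, 1, 0, 0, 0, 0, 1]    # made-up observed groups
predicted = [1, 0, 1, 0, 0, 1, 0, 1]   # made-up predicted groups
matrix = confusion_matrix(observed, predicted)
print(matrix, percent_correct(matrix))
```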

17
Q

What stats are used to report predictors in a logistic regression?

A

The regression coefficient (b), its SE and its p value

This gives you the direction of an association and the variability in this association

A positive coefficient means high scores are associated with the group labelled 1; a negative coefficient means high scores are associated with the group labelled 0

Exp(B), which is an odds ratio

It's called Exp(B) because it's an exponentiated regression coefficient
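
To make the link between b and Exp(B) concrete, the sketch below exponentiates a coefficient and its interval limits. The Wald-type interval (b ± 1.96 × SE) is an assumption; statistical software may compute the interval differently. The coefficient and SE values are hypothetical.

```python
import math

def odds_ratio_with_ci(b, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) from a coefficient and its SE."""
    # z=1.96 gives an approximate 95% Wald interval (an assumed method)
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Hypothetical coefficient and standard error, for illustration only
odds_ratio, lower, upper = odds_ratio_with_ci(b=0.40, se=0.15)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

A coefficient of 0 gives an odds ratio of exactly 1, and an interval that excludes 1 corresponds to a coefficient whose interval excludes 0.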

18
Q

Odds ratios and their meaning

A

OR of 1 = no change in odds of event

OR of .5 = 50% decrease in odds of event

OR of 1.5 = 50% increase in odds of event

OR of 4.7 = 370% increase in odds of event
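
The percentages above all follow the same arithmetic, (OR - 1) x 100. A minimal sketch (the function name is made up for illustration):

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio into a % change in the odds of the event."""
    return (odds_ratio - 1.0) * 100.0

for or_value in (1.0, 0.5, 1.5, 4.7):
    print(or_value, round(percent_change_in_odds(or_value)))
```

Negative results are decreases and positive results are increases in the odds per one-unit increase in the predictor.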

19
Q

Logistic regression assumptions

A

DV is categorical with two levels only (hence binary 0/1)

Neither DV “event” should be rare
E.g. 2 people getting a first, 548 not getting a first
Rare events make a problem called “separation” more likely, where you get “perfect” predictors. The best-guess (baseline) model is also already extremely accurate, so it is hard to beat with predictors

IVs continuous (ratio/interval) or categorical.

No multicollinearity - can assess with VIF (variance inflation factor)
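
As a rough sketch of what VIF measures, the example below handles the simplest case of exactly two predictors, where VIF = 1 / (1 - r²) and r is the correlation between them. With more predictors, each one is instead regressed on all the others, which statistical software does for you. The function names and data are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def vif_two_predictors(x1, x2):
    """VIF for the two-predictor case only: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Made-up predictors: the first pair is uncorrelated, the second nearly identical
print(vif_two_predictors([1, 2, 3, 4], [1, -1, -1, 1]))
print(vif_two_predictors([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1]))
```

A VIF of 1 means the predictors share no variance; the value grows rapidly as the predictors become more strongly correlated.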

20
Q

Logistic regression model fit statistics reported

A

χ²(df) = , p = , pseudo-R² (McFadden) =
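
The model chi-squared in logistic regression output is typically a likelihood-ratio statistic: twice the difference between the log-likelihoods of the full and null models, with df equal to the number of predictors added. The sketch below assumes that definition and, to stay within the standard library, only computes the p value for the df = 1 case (one predictor). The log-likelihood values are hypothetical.

```python
import math

def likelihood_ratio_chi2(ll_model, ll_null):
    """Chi-squared = 2 * (log-likelihood of full model - log-likelihood of null model)."""
    return 2.0 * (ll_model - ll_null)

def p_value_df1(chi2):
    """Upper-tail p value of a chi-squared statistic, valid for df = 1 only."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical log-likelihoods, for illustration only
chi2 = likelihood_ratio_chi2(ll_model=-98.0, ll_null=-100.0)
print(chi2, round(p_value_df1(chi2), 4))
```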

21
Q

Where are the odds ratios on the output?

22
Q

How to write up a logistic regression model

A

A logistic regression model was conducted to examine whether individuals were depressed or not. The following predictors were added to the model…

The model significantly predicted … correctly identifying % of cases
χ²() = , p = , McFadden’s R² =

23
Q

How to write up predictors

A

… was significantly positively/negatively associated with … b = , SE = , p = , OR = , 95% CI [ , ]

24
Q

Bayes factor