Logistic regression Flashcards
(24 cards)
Why is logistic regression different to other lectures?
Previous lectures examined predicting variance in a continuous, normally distributed dependent variable, i.e. linear regression.
This is very common, but it is not the only type of outcome we are interested in; some things cannot be measured that way, particularly in the medical world:
Alive vs. dead
Addicted vs. non-addicted
Relapse vs. non-relapse
This use of absolute categories is extremely common in clinical psychology
Difference between logistic and linear regression
Linear regression predicts a continuous outcome, while logistic regression predicts a categorical outcome (usually a probability).
Logistic regression
Logistic regression is a data analysis technique that uses mathematics to find the relationship between one or more predictors and a categorical outcome. It then uses this relationship to predict the outcome from the predictors. The prediction usually has a finite number of possible values, like yes or no.
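Under the hood, logistic regression passes a linear combination of the predictors through the logistic (sigmoid) function to get a probability. A minimal Python sketch, with made-up coefficients purely for illustration:

```python
import math

def sigmoid(z):
    # Logistic function: maps any real number onto a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted model: log-odds = b0 + b1 * x (coefficients invented)
b0, b1 = -2.0, 0.5
x = 6.0                       # a predictor value
p = sigmoid(b0 + b1 * x)      # predicted probability that the outcome is 1
# Classify at the usual 0.5 cut-off
predicted_group = 1 if p >= 0.5 else 0
```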
Categorical outcomes
Analysed with logistic regression based methods
Example of categorical vs continuous outcomes in depression
Categorical - depressed vs not depressed
Continuous - scores on BDI
Advantages of continuous outcomes
Inferences can be made with fewer data points - with a continuous outcome, fewer participants are needed for power
Continuous data often provides more statistical power compared to binary data, making it easier to detect meaningful differences between groups.
Higher sensitivity
More variety in analysis options - can look at people who are halfway, a bit depressed, severely depressed etc. - the whole continuum
Information on variability of a construct within a population
Give a better understanding of the variable in question.
Nonsensical distinctions avoided - binary outcomes can create distinctions that make no sense at all: with a cut-off, somebody with a score of 5 could be treated the same way as someone with a score of 9 but differently from someone with a score of 4 - an issue with dichotomising things
Why should we use binary outcomes
Increased scores on a scale do not always mean someone is depressed/addicted etc. The whole sample could be non-depressed; their scores could just go up a bit.
Binary outcomes have clinical relevance if we use diagnostic criteria to give formal diagnoses: identify people who meet the diagnostic threshold, then measure again after an intervention to see if they still meet it.
Summary of categorical outcomes
Categorical data has its limitations and lacks sensitivity
BUT it allows us to make decisions in relation to clinical outcomes, or whatever we decide is a relevant effect (doesn’t have to be a clinical diagnosis)
Logistic regression summary
We use logistic regression to explore what variables are associated with an outcome
This gives us model fit statistics (similar to a linear regression)
Regression coefficients for individual predictors (similar to a linear regression)
Odds ratios
Odds ratios
These describe the % change in the odds of the outcome attributable to a unit change in an IV
What does logistic regression allow us to do that linear does not?
To make absolute conclusions about concepts
e.g. relapse - we could report an increased frequency of drug use, but to make absolute conclusions we need to be able to say whether it specifically predicts relapse.
What does logistic regression do
Predicts membership of a group
It is called “binary” logistic regression because the outcome is dichotomous - TWO POSSIBLE OUTCOMES, e.g. relapse = 1, non-relapse = 0
Log-likelihood
How likely is a model to predict that someone is in the correct group
Looks at the discrepancy between what was observed and what the model predicted
Each participant has an observed value for the outcome (0/1) and a predicted value (ranging from 0 = certainly will not happen to 1 = certainly will happen). The discrepancies between observed and predicted values are summed across all participants. Its counterpart in linear regression is the sum of squared errors (how far each observation is from the prediction).
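The summed discrepancy described above can be sketched directly; the observed outcomes and predicted probabilities here are invented for illustration:

```python
import math

def log_likelihood(observed, predicted):
    # Sum of y*ln(p) + (1 - y)*ln(1 - p) across participants; always <= 0,
    # and closer to 0 the better the predictions match the observed 0/1 outcomes
    return sum(y * math.log(p) + (1 - y) * math.log(1 - p)
               for y, p in zip(observed, predicted))

observed  = [1, 0, 1, 0]           # actual group membership
predicted = [0.9, 0.2, 0.8, 0.1]   # model's predicted probabilities
ll = log_likelihood(observed, predicted)
```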
What does the logistic regression compare results to
A baseline model
What variation of R2 is used for logistic regression
McFadden’s R2 - a measure of how well the model fits the data compared to the null/baseline model. It compares the log-likelihood of the full model to that of the null model.
A higher McFadden = a better fit.
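As a sketch (log-likelihood values invented), McFadden’s R2 is one minus the ratio of the full model’s log-likelihood to the null model’s:

```python
def mcfadden_r2(ll_full, ll_null):
    # 1 - LL(full) / LL(null); both log-likelihoods are negative,
    # and a full model closer to 0 gives a higher (better) R2
    return 1 - ll_full / ll_null

r2 = mcfadden_r2(-80.0, -120.0)   # hypothetical values -> 1 - 80/120
```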
A confusion matrix
A table cross-tabulating observed group membership (0/1) against the group membership the model predicts
Expressed as a %
We will use the contents of the confusion matrix to tell us the % of cases that were correctly classified
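A minimal sketch of building those counts and the % correctly classified from 0/1 outcomes and predicted probabilities (data invented):

```python
def confusion_counts(observed, predicted, threshold=0.5):
    # Cross-tabulate observed 0/1 outcomes against predictions classified
    # at the threshold: true/false positives and true/false negatives
    tp = fp = tn = fn = 0
    for y, p in zip(observed, predicted):
        yhat = 1 if p >= threshold else 0
        if y == 1 and yhat == 1:
            tp += 1
        elif y == 0 and yhat == 1:
            fp += 1
        elif y == 0 and yhat == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

observed  = [1, 1, 0, 0, 1]
predicted = [0.8, 0.3, 0.2, 0.6, 0.9]
tp, fp, tn, fn = confusion_counts(observed, predicted)
pct_correct = 100 * (tp + tn) / len(observed)   # % of cases correctly classified
```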
What stats are used to report predictors in a logistic regression
The regression coefficient (b) and its SE and p value
This gives you the direction of an association and the variability in this association
A positive coefficient means high scores are associated with the group labelled 1; a negative coefficient means high scores are associated with the group labelled 0
Exp(B), which is an odds ratio
It’s called Exp(B) because it is an exponentiated regression coefficient
Odds ratios and their meaning
OR of 1 = no change in the odds of the event
OR of .5 = 50% decrease in the odds of the event
OR of 1.5 = 50% increase in the odds of the event
OR of 4.7 = 370% increase in the odds of the event
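The conversion is just (OR - 1) × 100, and the OR itself is the exponentiated coefficient. A sketch with an invented coefficient:

```python
import math

def or_to_percent(odds_ratio):
    # % change in the odds of the event per unit change in the IV
    return (odds_ratio - 1) * 100

b = 1.548                            # hypothetical regression coefficient
odds_ratio = math.exp(b)             # Exp(B), roughly 4.7 here
change = or_to_percent(odds_ratio)   # roughly a 370% increase in the odds
```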
Logistic regression assumptions
DV is categorical with two levels only (hence binary 0/1)
Neither of the DV “events” should be rare
E.g. 2 people getting a first, 548 not getting a first
Rare events cause a problem called “separation”, where you can get “perfect” predictors - and the best-guess baseline model (always predict the common outcome) is already extremely accurate and hard to beat with predictors
IVs continuous (ratio/interval) or categorical.
No multicollinearity - can assess with VIF
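The VIF check can be sketched by hand: regress each IV on all the others and take 1/(1 - R²). A sketch using NumPy least squares (data invented; VIF near 1 means no collinearity, values above roughly 5-10 are a concern):

```python
import numpy as np

def vif(X):
    # Variance inflation factor for each predictor column of X:
    # regress each IV on the others, then VIF = 1 / (1 - R^2)
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]                                            # this IV as "outcome"
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef = np.linalg.lstsq(A, y, rcond=None)[0]
        ss_res = ((y - A @ coef) ** 2).sum()
        ss_tot = ((y - y.mean()) ** 2).sum()
        r2 = 1 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r2))
    return vifs

# Two uncorrelated IVs -> both VIFs should be ~1
vals = vif([[1, 0], [0, 1], [1, 1], [0, 0]])
```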
Logistic regression model fit statistics reported
Chi Squared value (df) = , p= , pseudo R2 (McFadden)
Where are the odds ratios on the output
exp(Est. )
How to write up a logistic regression model
A logistic regression model was conducted to examine whether individuals were depressed or not. The following predictors were added to the model…
The model significantly predicted … correctly identifying % of cases
X2() = , p = , McFadden’s R2=
How to write up predictors
Was significantly positively/negatively associated with .. B = , SE= , p= , OR = , 95% CI to
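The OR and its 95% CI come straight from the coefficient and its SE: exponentiate b and b ± 1.96 × SE (a Wald interval). A sketch with invented numbers:

```python
import math

b, se = 0.405, 0.150          # hypothetical coefficient and its standard error

odds_ratio = math.exp(b)              # Exp(B), the odds ratio (~1.50)
ci_lower = math.exp(b - 1.96 * se)    # ~1.12
ci_upper = math.exp(b + 1.96 * se)    # ~2.01
# Write-up: OR = 1.50, 95% CI 1.12 to 2.01 (values rounded)
```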
Bayes factor