Linear and logistic regression Flashcards

1
Q

What is regression?

A

It is a statistical method for estimating the numerical relationship between one dependent variable and one or more independent variables.

It can be used to address a wide variety of research questions involving a dependency relationship between two or more variables.

2
Q

What are dependent and independent variables?

A

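Independent variables (predictors / controls): the variables used to explain or predict the outcome.

Dependent variables (outcomes / measures): the variables whose values we want to explain or predict.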

3
Q

Regression models

A

The type of model is determined by the dependent variable and the research question

Most common types
- linear regression
- logistic regression

Others
- Cox regression
- ordered logistic regression
- multinomial regression
- Poisson regression

4
Q

Data types

A

Data can be broadly categorised into two groups

  1. Categorical (qualitative)
    - nominal
    - binary
    - ordinal
  2. Numeric (quantitative)
    - discrete
    - continuous
5
Q

What is linear regression?

A

It summarises the observed data using the equation of the line that best fits the data, describing the dependency relationship between the dependent variable (y) and the independent variable (x).

Interpretation
- when linear regression models are presented in publications, the statistic usually quoted is the beta coefficient

The beta coefficient measures the average change (increase or decrease) in the dependent variable for each one-unit increase in the independent variable.
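
As a rough illustration of how the beta coefficient is obtained and read, here is a minimal sketch of a least-squares fit for one predictor. The data and variable meanings (hours of exercise, resting heart rate) are invented for the example and are not from the cards.

```python
# Hypothetical data: x = hours of exercise per week, y = resting heart rate (bpm)
x = [0, 1, 2, 3, 4]
y = [80, 79, 75, 74, 72]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope: how x and y vary together, divided by how much x varies
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
beta = s_xy / s_xx

# The fitted line passes through the point (mean_x, mean_y)
intercept = mean_y - beta * mean_x

print(f"y = {intercept:.1f} + ({beta:.1f}) * x")
```

With this made-up data, beta is about -2.1, which would be read as: each extra hour of exercise per week is associated with a resting heart rate about 2.1 bpm lower on average.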

6
Q

Assumptions of linear regression

A
  • independence (the observations are independent)
  • linearity (the relationship between (y) and (x) is linear)
  • normality (the residuals are normally distributed)
  • homoscedasticity (the residuals have constant variance)
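
For a single predictor, these four assumptions can be written compactly as one model statement. The symbols are notation assumed here rather than taken from the cards: \beta_0 is the intercept, \beta_1 the slope, \varepsilon_i the error for observation i, and \sigma^2 the error variance.

```latex
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad
\varepsilon_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)
```

The straight-line mean corresponds to linearity, "i.i.d." to independence, the normal distribution to normality, and the single shared \sigma^2 to homoscedasticity. In practice the assumptions are checked on the residuals, which estimate the unobserved errors.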
7
Q

What are residuals?

A

They are the differences between the observed values of the dependent variable and the values predicted by the model.

A well-fitting model has small residuals.
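
A small sketch of the calculation, using invented data and an assumed fitted line y = 80.2 - 2.1x (both hypothetical). It follows the common convention residual = observed - predicted.

```python
# Hypothetical observations
x = [0, 1, 2, 3, 4]
y = [80, 79, 75, 74, 72]

# Assumed coefficients of an already-fitted line (illustrative values)
intercept = 80.2
beta = -2.1

# Predicted value for each observation, then residual = observed - predicted
predicted = [intercept + beta * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Sum of squared residuals: one overall measure of how far the line is from the data
sum_squared_residuals = sum(r ** 2 for r in residuals)

for xi, yi, pi, ri in zip(x, y, predicted, residuals):
    print(f"x={xi}  observed={yi}  predicted={pi:.1f}  residual={ri:+.1f}")
print(f"sum of squared residuals = {sum_squared_residuals:.2f}")
```

Least-squares regression chooses the line that makes the sum of squared residuals as small as possible, which is why small residuals indicate a good fit.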

8
Q

Multivariable regression

A

Sometimes we are interested in including more than one predictor variable in the same regression model.

This is usually done to control for a variable that is related to both the exposure and the outcome of interest (a 'confounder').
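
The effect of adjusting for a confounder can be sketched with a toy data set. Everything below is invented for illustration: z plays the confounder, x the exposure and y the outcome, and y is built to depend on z only. The two-predictor coefficients come from the standard least-squares normal equations.

```python
# Hypothetical data: z = confounder, x = exposure, y = outcome.
# By construction y = 10 + 3*z, so x has no effect of its own,
# but x tends to rise with z.
z = [0, 0, 1, 1, 2, 2]
x = [0, 1, 1, 2, 2, 3]
y = [10, 10, 13, 13, 16, 16]

def centred_cross_product(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))

s_xx = centred_cross_product(x, x)
s_zz = centred_cross_product(z, z)
s_xz = centred_cross_product(x, z)
s_xy = centred_cross_product(x, y)
s_zy = centred_cross_product(z, y)

# Unadjusted (crude) model: y ~ x
crude_beta_x = s_xy / s_xx

# Adjusted model: y ~ x + z, solved from the two-predictor normal equations
det = s_xx * s_zz - s_xz ** 2
adjusted_beta_x = (s_zz * s_xy - s_xz * s_zy) / det
adjusted_beta_z = (s_xx * s_zy - s_xz * s_xy) / det

print(f"crude beta for x:    {crude_beta_x:.2f}")
print(f"adjusted beta for x: {adjusted_beta_x:.2f}")
print(f"adjusted beta for z: {adjusted_beta_z:.2f}")
```

In this constructed example the crude model gives x a clearly positive coefficient, while the adjusted model gives x a coefficient of zero and attributes the whole effect to z. Real data are rarely this clean, but the pattern is the reason for adjusting.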

9
Q

What is logistic regression?

A

Logistic regression is used when the outcome is binary (e.g. disease / disease-free).

We want the regression model to estimate the probability p of the outcome occurring (e.g. the probability of having disease X).

The odds ratio is usually reported instead of the beta coefficient.
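
The link between the odds ratio and the beta coefficient can be sketched with a hypothetical 2x2 table (the counts are invented). With a single binary predictor, the maximum-likelihood logistic fit reproduces the observed group proportions, so the coefficients can be written down directly from the table without an iterative fitting routine. That shortcut is specific to this simple case.

```python
import math

# Hypothetical 2x2 table (counts are made up for illustration)
exposed_diseased, exposed_healthy = 30, 70
unexposed_diseased, unexposed_healthy = 10, 90

odds_exposed = exposed_diseased / exposed_healthy
odds_unexposed = unexposed_diseased / unexposed_healthy
odds_ratio = odds_exposed / odds_unexposed

# Logistic model: log(p / (1 - p)) = intercept + beta * exposed, exposed in {0, 1}
intercept = math.log(odds_unexposed)  # log-odds of disease when unexposed
beta = math.log(odds_ratio)           # change in log-odds when exposed

def predicted_probability(exposed):
    """Convert the model's log-odds back to a probability."""
    log_odds = intercept + beta * exposed
    return 1 / (1 + math.exp(-log_odds))

p_exposed = predicted_probability(1)
p_unexposed = predicted_probability(0)

print(f"odds ratio = {odds_ratio:.2f}, beta = {beta:.2f}")
print(f"p(disease | exposed) = {p_exposed:.2f}")
print(f"p(disease | unexposed) = {p_unexposed:.2f}")
```

The beta coefficient is on the log-odds scale, so exponentiating it gives the odds ratio. An odds ratio above 1 means the odds of the outcome are higher in the exposed group, and below 1 means they are lower.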
