Lecture 9 Flashcards

1
Q

Associations between two categorical variables

A

Contingency tables are often used to record and analyse the relationship between two or more categorical variables.

Recall that “categorical” means the data can be separated into mutually exclusive categories. Categorical data are usually summarized as counts.

Examples: smoker/non-smoker, treated/control, disease/disease free.

The simplest case of an association between two binary variables can be summarized in a 2 by 2 table.

Examine membership in each individual subgroup. This is illustrated with the simplest kind of categorical variable, the binary categorical variable (2 distinct groups rather than multiple distinct groups), for which it is easy to draw up a 2 by 2 table.

2
Q

2 by 2 table for

A

(binary) categorical variables

3
Q

2 by 2 table terminology and features

A
factor = categorical variable 
level = one of the categories 

no one is counted twice (each individual is represented once)

Look at the image, then try to draw one from scratch.

4
Q

a =

A

frequency of samples in level 1 of factor 1 and level 1 of factor 2

5
Q

b =

A

frequency of samples in level 1 of factor 1 and level 2 of factor 2

6
Q

c =

A

frequency of samples in level 2 of factor 1 and level 1 of factor 2

7
Q

d =

A

frequency of samples in level 2 of factor 1 and level 2 of factor 2
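
The four cell definitions (a, b, c, d) can be collected into one small structure. This is a minimal sketch in Python, assuming factor 1 is the row variable and factor 2 is the column variable. The class name `TwoByTwo` and the counts are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for a 2 by 2 table (assumed: rows = factor 1, columns = factor 2)."""
    a: int  # level 1 of factor 1, level 1 of factor 2
    b: int  # level 1 of factor 1, level 2 of factor 2
    c: int  # level 2 of factor 1, level 1 of factor 2
    d: int  # level 2 of factor 1, level 2 of factor 2

    def row_totals(self):
        return (self.a + self.b, self.c + self.d)

    def column_totals(self):
        return (self.a + self.c, self.b + self.d)

    def total(self):
        # Each individual is counted exactly once, so the cells sum to n.
        return self.a + self.b + self.c + self.d


# Hypothetical counts, not taken from the lecture.
table = TwoByTwo(a=20, b=80, c=10, d=90)
print(table.row_totals())     # (100, 100)
print(table.column_totals())  # (30, 170)
print(table.total())          # 200
```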

8
Q

relative risk define

A

ratio of two probabilities/proportions

9
Q

risk difference define

A

difference between two probabilities

10
Q

odds ratio define

A

ratio of two odds

11
Q

RR features

A
Relative risk (RR) gives the risk of an outcome relative to “exposure”.
It is calculated as the ratio of the risk of an outcome in an exposed group to the risk in an unexposed group.

It shows how much bigger one estimate of risk is compared to the other.

12
Q

RR equation

A

It is calculated as the ratio of the risk of an outcome in the exposed group to the risk in the unexposed group:
RR = (a/(a+b)) / (c/(c+d))
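
The RR formula can be checked with a short calculation. This is a sketch in Python, assuming the exposed group is the top row (a, b) and the outcome is the first column (a, c). The function name and the counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Relative risk for a 2 by 2 table (rows: exposed/unexposed, columns: outcome yes/no)."""
    risk_exposed = a / (a + b)      # risk in the exposed group
    risk_unexposed = c / (c + d)    # risk in the unexposed group
    return risk_exposed / risk_unexposed


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
rr = relative_risk(a=20, b=80, c=10, d=90)
print(round(rr, 2))  # 2.0
```

With these made-up counts, the exposed group is 2 times as likely to have the outcome as the unexposed group.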

13
Q

RR interpretation

A

_______ were _____ times more likely to be _____ during (time period of study) than _____

Alternatively, express the number as a percentage (e.g. RR = 1.5 means 50% more likely).

14
Q

RR=1

A

there is no association between outcome and exposure

15
Q

RR conventional layout

A

conventionally the group ‘exposed’ to our exposure of interest goes on the top and ‘unexposed’ on the bottom

16
Q

RR less than one

A

the exposure is a protective factor: the risk is lower in the group that is the numerator

17
Q

RR greater than one

A

increased risk for the numerator group

18
Q

RR and two by two tables

A

A simplified measure from a two by two table cannot control for confounding.

Be aware of possible confounding: a simple table cannot adjust for it, so a more sophisticated statistical method is required.

19
Q

Risk difference is …

A

The risk difference (RD) is given by the difference in the risk for the two groups: a/(a + b) − c/(c + d)

Also called attributable risk, because the raw difference is the additional amount of risk in the exposed group that is attributable to the exposure.

Work out the raw difference between the risks of the 2 groups.

20
Q

RD equation

A

a/(a + b) − c/(c + d)
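
The RD formula can be checked the same way. This is a sketch in Python, assuming the exposed group is the top row (a, b) and the outcome is the first column. The function name and the counts are hypothetical.

```python
def risk_difference(a, b, c, d):
    """Risk in the exposed group minus risk in the unexposed group."""
    return a / (a + b) - c / (c + d)


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
rd = risk_difference(a=20, b=80, c=10, d=90)
print(round(rd, 3))  # 0.1

# Expressed per 100 people in each group.
print(f"{rd * 100:.0f} more outcomes per 100 exposed than per 100 unexposed")
```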

21
Q

RD interpretation

A

In every (number) _____ there will be _____ more/fewer injuries than in (same number) backs.

This gives a per-person estimate, a different way of expressing how the outcome differs between the groups.

22
Q

Odds ratio is …

A

The odds ratio (OR) compares the odds of an outcome for two groups

Odds = number who satisfy a condition / number who do not, whereas risk = number who satisfy the condition / total.

For example, the odds are the number of successes over the number of failures.

The OR compares the exposed and unexposed groups with respect to the outcome: it is the ratio of the odds of the outcome for the exposed group to that for the unexposed group.

23
Q

Difference between RR and OR

A

Odds = number who satisfy a condition / number who do not, whereas risk = number who satisfy the condition / total.

The distinct difference between RR and OR is the denominator.

24
Q

odds in exposed =

A

a/b

25
Q

odds in unexposed =

A

c/d

26
Q

odds ratio equation

A

(a/b) / (c/d) = ad/bc
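
The odds ratio can be computed either as a ratio of the two odds or as the cross-product ad/bc. This is a sketch in Python that checks the two agree. The helper name `odds` and the counts are hypothetical.

```python
import math


def odds(with_outcome, without_outcome):
    """Odds = number with the outcome / number without it."""
    return with_outcome / without_outcome


# Hypothetical counts (rows: exposed/unexposed, columns: outcome yes/no).
a, b, c, d = 20, 80, 10, 90

odds_exposed = odds(a, b)        # a/b
odds_unexposed = odds(c, d)      # c/d
odds_ratio = odds_exposed / odds_unexposed
cross_product = (a * d) / (b * c)

print(round(odds_ratio, 2))                     # 2.25
print(math.isclose(odds_ratio, cross_product))  # True
```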

27
Q

Symmetry in terms of OR

A

Note that because it does not use the row or column totals, the odds ratio is symmetric with respect to the rows and columns of the table. This means there is no mathematical distinction between exposure and outcome variables.

Symmetry: the ratio of the odds of the outcome in the exposed group to the odds of the outcome in the unexposed group is exactly the same number as the ratio of the odds of exposure in those with the outcome to the odds of exposure in those without it. Flipping the odds ratio to be with respect to exposure rather than outcome asks a different question but gives the same number, because the calculation does not take the row or column totals into account.

In a case-control study the design compares the odds of exposure in cases to the odds of exposure in controls.

This gives the same odds ratio as the ratio of the odds of disease in the exposed to the odds of disease in the unexposed.
This symmetry property makes the odds ratio particularly useful for quantifying associations between binary variables where there is no “direction”, e.g. alcohol consumption (Yes/No) and smoking (Yes/No), or fever and diarrhoea.
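
The symmetry can be checked numerically by swapping the roles of rows and columns, which exchanges b and c. This is a sketch in Python with hypothetical counts and function names. It also shows, for contrast, that the relative risk does not share this property.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds of column 1 vs column 2 in row 1, divided by the same odds in row 2."""
    return (a / b) / (c / d)


def relative_risk(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts (rows: exposed/unexposed, columns: outcome yes/no).
a, b, c, d = 20, 80, 10, 90

# Odds of the outcome in the exposed vs the unexposed.
or_outcome = odds_ratio(a, b, c, d)
# Transposed table: odds of exposure in those with the outcome vs those without.
or_exposure = odds_ratio(a, c, b, d)
print(math.isclose(or_outcome, or_exposure))  # True

# The relative risk changes when rows and columns are swapped.
print(math.isclose(relative_risk(a, b, c, d), relative_risk(a, c, b, d)))  # False
```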

28
Q

Case control studies - OR vs RR and why

A

Case-control studies should use the OR to analyse outcome vs exposure, because the RR is easily manipulated by changing the number of controls.

The relative risk can be made to vary by varying the number of controls selected, so it is not an estimate of anything useful in a case-control study. In a study where the investigators themselves choose the number of controls, the RR is useless, so use the OR. The odds ratio is not affected by the total number of controls.
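
A small numerical sketch of this point, in Python. It assumes cases are the first column (a, c) and controls the second column (b, d), and that sampling k times as many controls multiplies b and d by the same factor. The counts and function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)


a, c = 20, 10   # cases: exposed, unexposed (fixed)
b, d = 80, 90   # controls: exposed, unexposed (scaled by k below)

results = []
for k in (1, 2, 10):
    rr = relative_risk(a, k * b, c, k * d)
    or_ = odds_ratio(a, k * b, c, k * d)
    results.append((k, rr, or_))
    print(f"k={k}: RR={rr:.3f}, OR={or_:.3f}")
```

The printed RR changes with k while the OR stays at 2.25, because the factor k cancels in ad/bc.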

29
Q

Rare disease

A

For rare diseases the number of people with the disease is small, so a and c are small relative to b and d.

Therefore a/b ≈ a/(a+b) and c/d ≈ c/(c+d).
Then relative risk = (a/(a+b)) / (c/(c+d)) ≈ (a/b) / (c/d), i.e. RR ≈ OR.
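
The approximation can be illustrated by comparing a common outcome with a rare one. This is a sketch in Python. Both tables use made-up counts, and the function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)


# Hypothetical tables: the outcome is common in the first and rare in the second.
common = dict(a=200, b=800, c=100, d=900)
rare = dict(a=2, b=998, c=1, d=999)

for label, t in (("common", common), ("rare", rare)):
    print(label, round(relative_risk(**t), 3), round(odds_ratio(**t), 3))
# common 2.0 2.25
# rare 2.0 2.002
```

When a and c are small relative to b and d, the OR (2.002) is very close to the RR (2.0). When the outcome is common, the two differ noticeably (2.25 vs 2.0).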