lecture 9 Flashcards
(29 cards)
Associations between two categorical variables
Contingency tables are often used to record and analyse the relationship between two or more categorical variables.
Recall: “categorical” means that the data can be separated into mutually exclusive categories. Such data are usually summarized as counts.
Examples: smoker/non-smoker, treated/control, disease/disease free.
The simplest case of an association between two binary variables can be summarized in a 2 by 2 table.
Examine membership in each individual subgroup. This is illustrated with the simplest kind of categorical variable, the binary categorical variable (2 distinct groups rather than multiple distinct groups), which makes it easy to draw up a 2 by 2 table.
2 by 2 table for
(binary) categorical variables
2 by 2 table terminology and features
factor = categorical variable
level = one of the categories
no one is counted twice (each individual is represented once)
Look at the image, then try to draw one from scratch
a =
frequency of samples in level 1 of factor 1 and level 1 of factor 2
b =
frequency of samples in level 1 of factor 1 and level 2 of factor 2
c =
frequency of samples in level 2 of factor 1 and level 1 of factor 2
d =
frequency of samples in level 2 of factor 1 and level 2 of factor 2
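The four cell definitions above can be sketched as a small tally. This is only an illustration: the records and the level names ("exposed", "outcome", and so on) are hypothetical, and treating factor 1 as the exposure and factor 2 as the outcome is an assumption chosen to match the RR formula used later in these cards.

```python
from collections import Counter

# Hypothetical records: one (factor 1 level, factor 2 level) pair per individual.
# Each individual appears exactly once, so no one is counted twice.
records = [
    ("exposed", "outcome"),
    ("exposed", "outcome"),
    ("exposed", "no outcome"),
    ("unexposed", "outcome"),
    ("unexposed", "no outcome"),
    ("unexposed", "no outcome"),
]

counts = Counter(records)

# Cell frequencies of the 2 by 2 table.
a = counts[("exposed", "outcome")]         # level 1 of factor 1, level 1 of factor 2
b = counts[("exposed", "no outcome")]      # level 1 of factor 1, level 2 of factor 2
c = counts[("unexposed", "outcome")]       # level 2 of factor 1, level 1 of factor 2
d = counts[("unexposed", "no outcome")]    # level 2 of factor 1, level 2 of factor 2

print(f"{'':<10}{'outcome':>10}{'no outcome':>12}")
print(f"{'exposed':<10}{a:>10}{b:>12}")
print(f"{'unexposed':<10}{c:>10}{d:>12}")
```

Because the categories are mutually exclusive, the four cells should always sum to the number of individuals.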
relative risk define
ratio of two probabilities/proportions
risk difference define
difference between two probabilities
odds ratio define
ratio of two odds
RR features
Relative risk (RR) gives the risk of an outcome relative to “exposure”. It is calculated as the ratio of the risk of an outcome for an exposed and an unexposed group
look at how much bigger one estimate of risk is compared to the other
RR equation
It is calculated as the ratio of the risk of an outcome for an exposed and an unexposed group:
RR = (a/(a+b) ) / (c/(c+d))
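As a quick check of the RR equation, here is a minimal sketch with made-up counts. The function name `relative_risk` and the numbers are hypothetical; the layout assumes a and b are the exposed group (with and without the outcome) and c and d are the unexposed group.

```python
def relative_risk(a, b, c, d):
    """RR = (a / (a + b)) / (c / (c + d)) for a 2 by 2 table."""
    risk_exposed = a / (a + b)      # risk of the outcome in the exposed group
    risk_unexposed = c / (c + d)    # risk of the outcome in the unexposed group
    return risk_exposed / risk_unexposed

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
rr = relative_risk(20, 80, 10, 90)
print(round(rr, 2))  # approximately 2.0
```

With these counts the exposed group would be described as about 2 times as likely to have the outcome as the unexposed group.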
RR interpretation
_______ were _____ times more likely to be _____ during (time period of study) than _____
alternatively turn the number into a percentage
RR=1
there is no association between outcome and exposure
RR conventional layout
conventionally the group ‘exposed’ to our exposure of interest goes on the top and ‘unexposed’ on the bottom
RR less than one
protective factor for group that is the numerator
RR greater than one
increased risk for numerator group
RR and two by two tables
RR from a two by two table is a simplified measure, which means that we cannot control for confounding
Be aware of confounding: a simple table cannot control for it, so a more sophisticated statistical method is required
Risk difference is …
The risk difference (RD) is given by the difference in the risk for the two groups: a/(a + b) − c/(c + d)
also called attributable risk, because the raw difference represents the additional amount of risk in the exposed group that is attributable to the exposure
work out raw difference between the risks of 2 groups
RD equation
a/(a + b) − c/(c + d)
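The RD equation can be worked through in the same way. This is a sketch under assumed, made-up counts; the function name `risk_difference` is hypothetical, and the cells follow the same a, b, c, d layout as the RR equation.

```python
def risk_difference(a, b, c, d):
    """RD = a / (a + b) - c / (c + d) for a 2 by 2 table."""
    return a / (a + b) - c / (c + d)

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
rd = risk_difference(20, 80, 10, 90)
print(round(rd, 2))        # approximately 0.1

# Expressed per 100 people, which is the form used in the interpretation card.
per_100 = rd * 100
print(round(per_100, 1))   # approximately 10.0
```

Here the result would read as roughly 10 more cases of the outcome in every 100 exposed people than in 100 unexposed people.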
RD interpretation
in every (number) there will be _____ more/fewer injuries than in (same number) backs
gives a per-person estimate, a different way of expressing the relationship between the exposure and the outcome
Odds ratio is …
The odds ratio (OR) compares the odds of an outcome for two groups
odds = those who satisfy a condition over those who do not, whereas risk = those who have the outcome / total
for example it is number of successes over number of failures
exposed vs unexposed group with respect to the outcome
Ratio of the odds of the outcome for the exposed group to that for the unexposed group
Difference between RR and OR
odds = those who satisfy a condition over those who do not, whereas risk = those who have the outcome / total
the distinct difference between RR and OR is the denominator
odds in exposed =
a/b
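To make the denominator difference concrete, here is a small sketch comparing odds with risk on the same made-up table. The function names and counts are hypothetical. The odds in the unexposed group (c/d) and the resulting OR = (a/b) / (c/d) are not written out on these cards; they follow from the definition of the OR as a ratio of two odds.

```python
def odds_ratio(a, b, c, d):
    """OR = (a / b) / (c / d): odds in the exposed over odds in the unexposed."""
    odds_exposed = a / b        # have the outcome / do not have the outcome
    odds_unexposed = c / d
    return odds_exposed / odds_unexposed

def relative_risk(a, b, c, d):
    """RR = (a / (a + b)) / (c / (c + d)): the denominator is the group total."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
a, b, c, d = 20, 80, 10, 90

print(round(a / b, 2))                      # odds in exposed, approximately 0.25
print(round(a / (a + b), 2))                # risk in exposed, approximately 0.2
print(round(odds_ratio(a, b, c, d), 2))     # approximately 2.25
print(round(relative_risk(a, b, c, d), 2))  # approximately 2.0
```

The same table gives an OR of about 2.25 and an RR of about 2.0, so the two measures are not interchangeable even though both equal 1 when there is no association.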