Epi/Biostats Flashcards

1
Q

Name 4 ways of dealing with confounding?

A
  1. randomization
  2. stratification
  3. matching
  4. regression modelling
2
Q

Define Per-Protocol Analysis (PP) and Intention-to-Treat Analysis (ITT)

A

Per-Protocol Analysis (PP): a strategy of analysis in which only patients who complete the entire study are counted towards the results.

Intention-to-Treat Analysis (ITT): groups are analyzed exactly as they existed upon randomization (i.e. using data from all patients, including those who did not complete the study).
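
A minimal sketch of how the two analysis sets differ, assuming a made-up list of trial records (arm, completed, event). The data and the helper name `event_risk` are hypothetical and only illustrate which patients each approach counts.

```python
# Hypothetical trial records: (randomized arm, completed protocol?, had outcome event?)
patients = [
    ("treatment", True, False),
    ("treatment", True, False),
    ("treatment", False, True),   # dropped out, still counted under ITT
    ("treatment", True, True),
    ("control", True, True),
    ("control", True, False),
    ("control", False, True),     # dropped out, still counted under ITT
    ("control", True, True),
]


def event_risk(records, arm, completers_only):
    """Proportion with the event in one arm, optionally restricted to completers."""
    selected = [r for r in records
                if r[0] == arm and (r[1] or not completers_only)]
    return sum(1 for r in selected if r[2]) / len(selected)


# ITT: everyone analyzed in the arm they were randomized to
itt_treatment = event_risk(patients, "treatment", completers_only=False)
itt_control = event_risk(patients, "control", completers_only=False)

# PP: only those who completed the study
pp_treatment = event_risk(patients, "treatment", completers_only=True)
pp_control = event_risk(patients, "control", completers_only=True)

print(itt_treatment, itt_control, pp_treatment, pp_control)
```

In this toy data the dropouts all had the event, so excluding them (PP) makes both arms look better than ITT does.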

3
Q

What kind of study design is this? A study that examines the distribution of BMI by age in Ontario at a particular point in time.

A

cross-sectional

4
Q

What’s the difference between Pearson and Spearman correlation?

A

Different types of correlation suit different levels of measurement. Pearson is for continuous, approximately Normal data and measures linear association; Spearman is for ordinal or non-Normal data and measures monotonic association. The Spearman correlation is the Pearson correlation calculated on the ranks of the data rather than on the raw values.

Pearson is most appropriate for measurements taken from an interval scale, while Spearman is more appropriate for measurements taken from an ordinal scale. Examples of interval scales include “temperature in Fahrenheit” and “length in inches”, in which the individual units (1 deg F, 1 in) are meaningful. Things like “satisfaction scores” tend to be of the ordinal type: while it is clear that “5 happiness” is happier than “3 happiness”, it is not clear whether you could give a meaningful interpretation of “1 unit of happiness”. When many ordinal measurements are added up, the resulting score is really neither ordinal nor interval and is difficult to interpret.
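
A short sketch of the relationship between the two coefficients, using only the standard library. The function names and the toy data are illustrative assumptions; the point is that Spearman is Pearson applied to ranks, so a monotonic but non-linear relationship scores higher on Spearman.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotonic but not linear (y = x**3)

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```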

5
Q

What statistical test will you use to compare the mean values of an outcome variable between two groups (e.g. difference in average BP between men and women)

A

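Two-sample (independent samples) t-test. A two-sample Z-test is appropriate only when the population variances are known or both samples are large.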
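
As a rough illustration, the test statistic for comparing two independent means can be computed by hand. This sketch assumes Welch's unequal-variance form of the t-test (a swapped-in variant, not necessarily the one the card intends), and the blood pressure values are invented.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n)
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical systolic BP readings (mmHg)
men = [128, 135, 122, 140, 131, 126]
women = [118, 125, 121, 130, 115, 123]

t_stat, df = welch_t(men, women)
print(t_stat, df)
```

The statistic is then compared against the t distribution with `df` degrees of freedom to obtain a p-value.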

6
Q

What statistical test will you use to test the correspondence between a theoretical frequency distribution and an observed frequency distribution (e.g. if one sample of 20 patients is 30% hypertensive and another comparison group of 25 patients is 60% hypertensive)

A

A chi-squared test determines whether the difference between the observed and expected frequencies is larger than would be expected by chance alone
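
The example in the question can be worked through as a 2x2 table: 6 of 20 hypertensive in one group and 15 of 25 in the other. This is a sketch of Pearson's chi-squared statistic without continuity correction; the function name is an assumption, and the p-value shortcut shown only holds for 1 degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Upper-tail probability of chi-squared with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: group 1 (n=20, 30% hypertensive), group 2 (n=25, 60% hypertensive)
# Columns: hypertensive, not hypertensive
observed = [[6, 14], [15, 10]]
chi2, p = chi_squared_2x2(observed)
print(round(chi2, 3), round(p, 3))
```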

7
Q

Define a Secondary Attack Rate and show how to calculate it?

A

• the proportion of susceptible contacts who develop disease as a result of exposure to the primary case(s) within one incubation period
• a measure of infectiousness, which reflects the ease of disease transmission
• = [(total # of cases - initial # of cases) / (# of susceptible individuals - initial # of cases)] * 100%
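
A worked sketch of the formula above, with invented outbreak numbers. The function and parameter names are illustrative only.

```python
def secondary_attack_rate(total_cases, primary_cases, susceptibles):
    """Secondary attack rate as a percentage."""
    secondary_cases = total_cases - primary_cases
    contacts_at_risk = susceptibles - primary_cases
    return 100 * secondary_cases / contacts_at_risk


# Hypothetical outbreak: 52 susceptible people, 2 primary cases, 12 cases in total
sar = secondary_attack_rate(total_cases=12, primary_cases=2, susceptibles=52)
print(sar)
```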

8
Q

Investigators intend to study the causes of a rare cancer using a case-control study.

They plan to recruit cases from a national cancer registry, including those diagnosed within the last 3 years.

To avoid bias associated with interviewing proxies about case exposure histories, they decide to exclude all deceased cases, about 40% of the cases in the registry.

Discuss the advantages and disadvantages of this approach.

A

● Advantages: Avoids differential recall between proxies and cases. Both may be affected by recall bias, but proxies may be unaware of exposure histories during certain time periods (e.g. childhood, young adulthood). Reduces ethical issues that might occur if partners of the deceased needed to be contacted.

● Disadvantages: Likely excludes the most severe disease, meaning the included cases are less representative of cases in the population. Exposure may affect the risk of severe disease differently than less severe disease. Loses study power.

9
Q

To evaluate a new back school, patients with lower back pain were randomly allocated to either the new school or to conventional occupational therapy. After 3 months they were questioned about their back pain, and observed lifting a weight by independent monitors.

What kind of study design is this?

A

Randomized controlled trial

10
Q

To investigate the relationship between certain solvents and cancer, all employees at a factory were questioned about their exposure to an industrial solvent, and the amount and length of exposure measured. These subjects were regularly monitored, and after 10 years a copy of the death certificate for all those who had died was obtained.

What kind of study design is this?

A

Cohort study

11
Q

You have received a request to develop a surveillance program focused on COVID-19. List and briefly describe six attributes of a surveillance system as they would pertain to COVID-19.

A
  1. Simplicity: the flow of information is simple (few providers, few systems, same IT structure, easy operation)
  2. Flexibility: the system is able to adapt to changing information needs or operating conditions – new providers of information, new data requirements, new case definition etc
  3. Data quality: complete and valid
  4. Acceptability: willingness of persons and organizations to participate in the surveillance system
  5. Sensitivity: proportion of cases detected by the system & ability to detect outbreaks and monitor trends
  6. Predictive value positive: proportion of cases reported that actually have the disease/event of interest
  7. Timeliness: speed between steps
  8. Stability: reliable (able to collect info without downtime) and available (accessible to users when they need to know)
  9. Representativeness: accurately reflects the occurrence of the health event by person, place, and time
12
Q

Identify 5 mechanisms by which residual confounding can occur after attempts to control for confounding have been made in both the study design and analysis

A
  1. Randomisation - too small of a sample or errors in randomising
  2. Restriction & matching - not tight enough e.g. large age range where comparison groups end up with different age structures
  3. Not all confounders were accounted for in the analysis because data on them was not collected
  4. There were misclassification errors of confounders
  5. Categorisation of confounders was not tight enough e.g. too large of age bands
13
Q

What do you understand by active and passive surveillance? Provide an example of each.

A

Active Surveillance

  • Outreach such as visits or phone calls by the public health/surveillance authority to detect unreported cases
  • e.g. an infection control nurse goes to the ward and reviews temperature charts to see if any patient has a nosocomial infection

Passive Surveillance

  • A surveillance system where the public health/surveillance authority depends on others to submit standardized forms or other means of reporting cases
  • e.g. ward staff notify infection control when new cases of nosocomial infections are discovered
14
Q

Define standardization.

When do you use Direct Standardization and Indirect Standardization?

A

Definition: A statistical method used to calculate summary rates of health outcomes that are adjusted to take into account confounders (e.g. age). The standardized rates allow for a less distorted comparison between 2+ populations, showing how overall rates of disease/mortality would compare if the 2+ populations hypothetically had the same distribution of the confounder (e.g. same age distribution)

  • Direct Standardization: Used when age-specific rates of disease/mortality are known for the populations being compared.
  • Indirect Standardization: Used when it is difficult to obtain reliable estimates of age-specific rates due to small number of observations (therefore unstable rates, or random error)
15
Q

List 3 ways to reduce interviewer bias

A
  1. Use standardized questionnaires consisting of closed-end, easy to understand questions with appropriate response options.
  2. Train all interviewers to adhere to the question and answer format strictly, with the same degree of questioning for both cases and controls.
  3. Obtain data or verify data by examining pre-existing records (e.g., medical records or employment records) or assessing biomarkers.
16
Q

List 5 ways to reduce loss to follow up

A
  1. Enrolling motivated subjects
  2. Using subjects who are easy to track
  3. Making questionnaires/follow-up processes as easy to complete as possible
  4. Maintaining the interest of participants and making them feel that the study is important
  5. Providing incentives
17
Q

List 4 general ways to reduce recall bias in a case-control study

A
  1. Use a control group that has a different disease (that is unrelated to the disease under study).
  2. Use questionnaires that are carefully constructed in order to maximize accuracy and completeness. Ask specific questions.
  3. For socially sensitive questions, such as alcohol and drug use or sexual behaviors, use a self-administered questionnaire instead of an interviewer.
  4. If possible, assess past exposures from biomarkers or from pre-existing records.
18
Q

When do you use student’s T test?

A

Used to compare means when the data are approximately normally distributed, the population standard deviation is unknown, and the sample size n is small
e.g. One-sample t-test: compares the mean of a single group with a small sample size against a known mean
e.g. Independent-samples t-test: compares the means of two groups with small sample sizes
e.g. Paired-samples t-test: compares the means from the same group with a small sample size at different times (e.g. before and after)

19
Q

When do you use chi-squared test?

A

Pearson’s chi-squared test is used to analyze categorical data to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table

20
Q

List 3 assumptions of Cox Proportional Hazards analysis

A
  1. Independence of survival times between distinct individuals in the sample
  2. A multiplicative relationship between the predictors and the hazard
  3. A constant hazard ratio over time
21
Q

List 4 assumptions of linear regression modelling

A
  1. Linearity: The relationship between X and the mean of Y is linear.
  2. Homoscedasticity: The variance of the residuals is the same for any value of X.
  3. Independence: Observations are independent of each other.
  4. Normality: For any fixed value of X, Y is normally distributed.
22
Q

List 5 assumptions of logistic regression

A
  1. Assumption of independence of observations
  2. Assumption of the absence of multicollinearity
  3. Assumption of linearity of independent variables and log odds
  4. Assumption of large sample size
  5. Assumption of a binary outcome (for binary logistic regression)
23
Q

A recently published cohort study found that the relative risk of acute myocardial infarction in blood donors compared to non-donors was 0.14 (95% confidence interval 0.02 to 0.97; p=0.047).

a) With reference to cohort studies, define what is meant by the term ‘Person-time’
b) When reporting the results of epidemiological studies, why are confidence intervals preferred to p-values?
c) Interpret the values given above for the relative risk and the 95% confidence interval

A

a) A measure that combines (i.e. adds up) the time each person is followed while at risk of developing the disease of interest or dying. It is used as the denominator in the calculation of incidence/mortality rates when individual subjects are at risk for varying time periods
b) CIs are preferable to p-values because they convey both the size of the effect and the precision of the estimate: they give the range of values for the measure of occurrence (incidence/prevalence) or association (RR, OR) that is compatible with the data, whereas a p-value only indicates whether the result is statistically significant
c) RR - Blood donors had an 86% lower risk of MI than non-donors. An RR of <1 suggests a protective effect or negative association

95% CI - if the study were repeated 100 times on the same population, about 95 of the resulting 95% CIs would contain the true value. The interval excludes 1, so the result is statistically significant at the 5% level, but the upper limit (0.97) is close to 1. The wide CI suggests a small sample size (or few events) or a population with large variation
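
To show where such an interval comes from, here is a sketch that computes a relative risk and its 95% CI from 2x2 counts using the usual log-scale standard error. The counts are hypothetical and are not the blood donor study's data; few events in the exposed group produce a wide interval, as in the card.

```python
import math


def relative_risk_ci(a, b, c, d, z=1.96):
    """RR with a log-scale confidence interval.

    a, b = exposed with/without the outcome; c, d = unexposed with/without the outcome.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical cohort: 2 events among 200 exposed, 20 events among 200 unexposed
rr, lower, upper = relative_risk_ci(a=2, b=198, c=20, d=180)
print(round(rr, 2), round(lower, 3), round(upper, 3))
```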

24
Q

A) List 5 key criteria for population-based screening and provide a brief comment on how it applies to abdominal aortic aneurysm screening in Canada

A
  1. Condition is an important public health issue - AAA is a significant and preventable cause of morbidity and mortality in older males
  2. Screening test is acceptable - abdominal ultrasound is a non-invasive test that is widely available
  3. Natural history is known - the natural history of AAAs is understood including rate of growth according to size
  4. There is an agreed policy on who to treat - there are algorithms developed to determine relevant treatment and monitoring options according to size and rate of change of AAA
  5. There is a defined population - in Canada, males aged 65-80 are recommended a one-time abdominal ultrasound
25
Q

You are conducting a population health survey for your regional health authority

List FIVE strategies to maximise the response rate of your survey

A
  1. Keep survey as short as possible
  2. Clear layout and careful design of questionnaire
  3. Pilot the questionnaire first to identify any issues with usability or comprehension
  4. Use appropriately timed follow-up reminders to complete the survey
  5. Use simple language, short sentences etc.
  6. Personalised covering letter/email conveying the reasons for the survey and its value
  7. Clear statements of confidentiality
  8. Consider use of telephone and web based administration of survey – but may introduce biases
  9. Ensure that written materials are available in appropriate languages
  10. Offer help to specific groups (e.g. elderly, blind, poor literacy skills)
26
Q

What is small study bias with respect to meta-analyses?

A

The tendency for the smaller studies in a meta-analysis to show larger treatment effects than the larger studies

27
Q

Systematic Review
List and explain 5 criteria that you would use to evaluate the quality of a new systematic review on the health harms of e-cigarette use in adults.

A
  1. Did the review questions and inclusion criteria define the population, intervention, and outcome?
  2. Did the review use a comprehensive search strategy and provide appropriate details?
  3. Were the methods established prior to conducting the review?
  4. Did the review provide a list of included and excluded studies with justification and adequate details?
  5. Did the review authors use a satisfactory technique to assess the risk of bias in included studies?
28
Q

Qualitative Methods
Increasing rates of syphilis have been reported in many MSM populations in Canada. Describe 3 qualitative research data collection strategies and provide examples which might be used to try to identify reasons for increasing rates of unprotected sex in this group.

A
  1. group interviews - ex. focus group. Group of people are asked about their perceptions, opinions, beliefs, in an interactive group setting where participants are free to talk with other group members.
    Example - interview with a group of MSM individuals to discuss barriers to protection during sex
  2. individual interviews - ex. Key informant interview
    in depth 1:1 conversations
    Example: in depth discussions in 1:1 setting with single cases
  3. Ethnographic (ex. observation) - researchers interact with participants in real life settings to understand behaviour drivers.
    Example: researchers go to nightclubs and observe how individuals approach meeting new partners and availability of safer sex educational material and products (i.e. accessibility of condoms)
29
Q

Qualitative Methods
List three types of qualitative research data collection strategies and advantages and disadvantages of each one

A

1) group interviews - ex. focus group. A group of people are asked about their perceptions, opinions, and beliefs in an interactive group setting where participants are free to talk with other group members.
Advantage: individuals can interact with one another during the session and generate group-based insights
Disadvantage: Social desirability (individuals' expressed views may differ from their actual views due to the presence of others and pressure to conform)

2) individual interviews - ex. Key informant interview
Advantages:

Disadvantage:
One of the main criticisms is that the data collected cannot necessarily be generalised to the wider population.

3) Ethnographic (ex. observation) - researchers interact with participants in real life settings to understand behaviour drivers.
Advantage: Can help identify unexpected issues that the researchers did not know to ask about
Disadvantage: costly, time-intensive, and in certain circumstances not feasible for ethical or privacy reasons

30
Q

What are 6 steps of a systematic review

A

Framing the Question

Develop a study Protocol

Identifying the Relevant Work

Assessing the Quality of Evidence

Summarizing the Findings

Interpreting the Evidence

https://www.lib.uwo.ca/tutorials/how_to_perform_a_systematic_review/index.html

31
Q

You are a provincial health consultant who is in charge of a new screening program for a cancer. What are 5 adverse effects that can occur from screening?

A
  1. False Positives: individual tests positive without disease and experiences further invasive procedures or treatment
  2. False Negatives: individual tests negative for disease even though they have it, potentially resulting in delayed care or interventions
  3. Stigma: individuals who screen positive may experience stigma due to nature of condition
  4. Over-diagnosis: individuals screen positive for the disease or condition but would never have experienced symptoms or premature mortality from it
  5. Increased health inequities: participation and benefits from the screening program are not distributed in an equitable manner resulting in widening disparities.
32
Q

List three types of bias when evaluating a screening program that may result in the appearance of increased benefits

A

Lead time bias - individuals identified through a screening program are found earlier in the disease course than those diagnosed without screening, so survival measured from diagnosis appears longer even though they do not survive longer from disease onset

Length time bias - screening is more likely to detect individuals with a slowly progressing form of disease than those with more aggressive disease

Selection bias - individuals who participate in a screening program may be healthier, have better lifestyles, and be more likely to adhere to therapy.

33
Q

Define a cross sectional study design

A

An observational study where exposure and outcomes are assessed at a single point in time.

34
Q

List 3 strengths and 3 weaknesses of cross sectional study design

A

Strengths:

  • Inexpensive
  • Quick
  • Can assess multiple exposures and outcomes
  • Can calculate prevalence

Weaknesses:

  • Cannot assess time trends
  • Cannot assess causation
  • Difficult to assess rare outcomes
35
Q

What is the key difference between a cohort and case control study

A

In a case control study the outcome is known and the exposure status of cases and non-cases is investigated; in a cohort study the exposure is known and the outcomes in the exposed and non-exposed groups are investigated

36
Q

When can an odds ratio be considered equivalent to a relative risk

A

When the outcome is relatively rare in the study population (a common rule of thumb is <10%)

If the odds ratio is interpreted as a relative risk it will always overstate any effect size: the odds ratio is smaller than the relative risk for odds ratios of less than one, and bigger than the relative risk for odds ratios of greater than one
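
A small numerical sketch of the rare-outcome approximation, using two invented 2x2 tables with the same relative risk. The function name is illustrative.

```python
def rr_and_or(a, b, c, d):
    """Return (relative risk, odds ratio) for a 2x2 table.

    a, b = exposed with/without the outcome; c, d = unexposed with/without the outcome.
    """
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio


# Rare outcome: 2% vs 1% risk
rare = rr_and_or(a=20, b=980, c=10, d=990)
# Common outcome: 60% vs 30% risk
common = rr_and_or(a=600, b=400, c=300, d=700)

print(rare)    # OR close to RR
print(common)  # OR well above RR
```

Both tables have a relative risk of 2, but the odds ratio only stays close to 2 when the outcome is rare.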

37
Q

A recent study examined the association between avoidable mortality and neighbourhood marginalization. The study took neighbourhood marginalization (a continuous variable) and divided it into quintiles. What are 3 limitations and 3 benefits of this approach?

A

Limitations

  • Reduces the ability to compare results between studies in different settings, as quintiles may not represent the same exposure in different regions (analogous to comparing two studies on the association between poverty and health, when poverty was defined as an annual income of $22,000 in one and $47,000 in the other)
  • Increases the likelihood of false positives due to multiple testing
  • Assumes that risk is homogeneous within each category (i.e. within the lowest quintile, the bottom 5 percent might be very different from the next 15 percent)

Advantages

  • Facilitates communication to the lay public and decision makers
  • Can easily be used to divide groups into levels of risk, with a relative risk for each group
  • Facilitates interpretation of statistical interaction (effect measure modification) tests (interactions between continuous variables are difficult to interpret)
38
Q

What is the difference between direct and indirect standardization?

What data do you need to do direct standardization?

What data do you need to do indirect standardization?

A

Direct Standardization

  • Uses the age structure of a reference population and the known event rates by age or sex in two populations to create standardized rates
  • need the number of events and the number of individuals in the population for multiple age ranges in two separate populations

Indirect Standardization

  • Uses the known age specific rates of a reference population to calculate the expected overall number of cases in the population of interest
  • need age-specific reference rates, plus the observed number of cases and the number of individuals in each age group for 1 population
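
The two approaches can be sketched with three age bands. All rates and population counts below are invented, and the function names are assumptions; the indirect method is shown as a standardized mortality (or morbidity) ratio, observed over expected.

```python
def direct_standardized_rate(age_specific_rates, standard_population):
    """Weighted average of a population's own age-specific rates,
    using the reference (standard) population's age structure as weights."""
    total = sum(standard_population)
    weighted = sum(r * n for r, n in zip(age_specific_rates, standard_population))
    return weighted / total


def standardized_mortality_ratio(observed_events, study_population, reference_rates):
    """Observed events divided by the events expected if the reference
    age-specific rates applied to the study population's age structure."""
    expected = sum(n * r for n, r in zip(study_population, reference_rates))
    return observed_events / expected


# Direct: needs the study population's age-specific rates
standard_population = [5000, 3000, 2000]   # hypothetical reference age structure
rates_a = [0.001, 0.005, 0.020]            # hypothetical age-specific rates
adjusted_rate_a = direct_standardized_rate(rates_a, standard_population)

# Indirect: needs only the total observed events and the age structure
study_population = [1000, 500, 500]        # hypothetical counts per age band
reference_rates = [0.002, 0.004, 0.010]    # hypothetical reference rates
smr = standardized_mortality_ratio(12, study_population, reference_rates)

print(adjusted_rate_a, smr)
```

An SMR above 1 means more events were observed than the reference rates would predict for that age structure.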
39
Q

What are essential components of a surveillance system?

A
  • What is the population under surveillance?
  • What is the period of time of the data collection?
  • What data are collected and how are they collected?
  • What are the reporting sources of data for the system?
  • How are the system’s data managed (e.g., the transfer, entry, editing, storage, and back up of data)? Does the system comply with applicable standards for data formats and coding schemes? If not, why?
  • How are the system’s data analyzed and disseminated?
  • What policies and procedures are in place to ensure patient privacy, data confidentiality, and system security? What is the policy and procedure for releasing data? Do these procedures comply with applicable federal and state statutes and regulations? If not, why?
  • Does the system comply with an applicable records management program? For example, are the system’s records properly archived and/or disposed of?
40
Q

How do you calculate an attributable fraction (attributable risk percent)?

If a study finds an AR% of 30% between lung cancer and radon, how would you interpret this finding?

A

AR% = [(incidence in exposed - incidence in unexposed) / incidence in exposed] * 100

30% of lung cancer cases in individuals exposed to radon are attributable to the radon exposure

41
Q

How do you calculate the population attributable risk

If a study finds a PAR of 0.02 between lung cancer and radon, how would you interpret this finding?

A

PAR = incidence in total population - incidence in unexposed

For every 1000 individuals in this population 20 will develop lung cancer due to exposure to radon.

42
Q

How do you calculate the population attributable risk percent (population attributable fraction)

If a study finds a PAF of 10% between lung cancer and radon, how would you interpret this finding?

A

PAF = [(incidence in total population - incidence in unexposed) / incidence in total population] * 100

10% of the lung cancer cases in this population are attributable to radon exposure.

43
Q

Define Attributable Risk

A

The excess incidence of disease (number of cases) in the exposed population that can be attributed to the exposure

44
Q

You have completed a case - control study examining the association between exposure to radon and lung cancer. In order to calculate a population attributable fraction from this study what assumption must you make?

A

Must assume that the prevalence of exposure in controls is representative of the prevalence of exposure in the total population.
The outcome of interest should be rare in order for the OR to approximate the RR in the total population.

45
Q

Calculate a population attributable fraction from radon and lung cancer given an odds ratio of 3.5 of observing radon in cases vs controls and a prevalence of exposure to radon of 30% in the cases and 8.5% in the controls.

A

PAF = [(OR - 1) / OR] * prevalence of exposure in cases
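
Plugging the numbers from the question into this formula gives roughly 21%. The sketch below assumes the bracketed reading of the formula, (OR - 1) / OR multiplied by the exposure prevalence in cases; the function name is illustrative.

```python
def paf_from_case_control(odds_ratio, exposure_prevalence_in_cases):
    """PAF using the odds ratio and the proportion of cases that were exposed."""
    return (odds_ratio - 1) / odds_ratio * exposure_prevalence_in_cases


# Values given in the question: OR = 3.5, 30% of cases exposed to radon
paf = paf_from_case_control(odds_ratio=3.5, exposure_prevalence_in_cases=0.30)
print(round(paf * 100, 1))
```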

46
Q

What is a confounder?

A

A variable other than the main one you are studying (e.g. age) that is associated with both the exposure (e.g. activity level) and the outcome (e.g. heart disease), distorting the true relationship between the exposure and outcome

  • A confounder is not on the causal pathway
  • A confounder must be associated with the exposure and must also be a risk factor for, or affect the probability of recognizing, the outcome

Example

  • Association between gray hair and heart attack: crude RR = 2.0
  • Age is associated with both exposure and disease, and is not on the causal pathway (potential confounder)
  • After controlling for age, adjusted RR = 1.0; the earlier effect was due to the confounder (age) and not the exposure (gray hair)

47
Q

How do you control for confounders?

A

Study design

  • Randomization: a random selection method decides who is exposed and who is unexposed (e.g. random number generator)
  • Restriction: limit enrollment based on known confounders (e.g., do not include alcohol drinkers)
  • Matching: based on potential confounders (e.g., match cohort based on age)

Analysis

  • Stratification: by a particular variable (e.g. age, sex)
  • Standardization: such as standardizing for age
  • Multivariate analysis: control for multiple confounders using regression models
48
Q

What is an effect modifier?

A

Mediating factors

  • Intermediate process between exposure and outcome
  • Examples
    • Obesity → hypertension → heart disease
    • Salt intake → hypertension → stroke

Effect modification

  • A real relationship exists between the exposure and the outcome, and a third variable modifies the direction or magnitude of that effect
    • Synergistic: positive interaction
    • Antagonistic: negative interaction
  • Example
    • Cigarette smoking modifies the effect of radon exposure on the development of lung cancer
    • Individuals exposed to radon who smoke cigarettes have a much higher risk of lung cancer than individuals exposed to radon who do not smoke cigarettes

49
Q

Necessary, sufficient and

A

Necessary causes: without the cause the effect cannot occur; however, sometimes the cause occurs without the effect

  • Example: infectious diseases: pathogen → disease

Sufficient causes: the effect must always occur when the cause is present

  • Example: decapitation → death

Some causes are neither necessary nor sufficient

  • Example: gonorrhea → pelvic inflammatory disease

50
Q

Bradford Hill criteria

A

Criteria (SSPACCE-TB)

Strength - The larger the effect size, the more likely the association is causal

Specificity - A single risk factor consistently relates to a single effect

Plausibility - The effect must have biologic plausibility

Analogy - When a factor is suspected of causing an effect, other factors similar or analogous to the supposed cause should also be considered and identified as a possible cause, or otherwise eliminated from the investigation

Consistency - Reproducibility of the association in various populations and situations

Coherence - Any new data should not be in opposition to the current evidence

Experiment - Association demonstrated in experimental evidence

Temporality - Exposure must always precede the outcome

Biological gradient (Dose-Response) - A causal relationship is more likely if a dose-response curve can be demonstrated

51
Q

Absolute risk and relative risk

A

Absolute risk: incidence of a disease in a population (2x2 table: a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease)

  • Absolute risk in exposed (Ie) = a/(a+b)
  • Absolute risk in unexposed (Iu) = c/(c+d)

Relative Risk or Risk Ratio (RR)

  • Ratio of the absolute risk of a disease among the exposed group to the absolute risk of the disease among the non-exposed group
  • Quantifies how much more (harmful exposure) or less (protective exposure) likely exposed persons are to develop an outcome
  • Purpose: measure used to assess the possibility of a causal relationship
  • Calculation: incidence of outcome in exposed (Ie) divided by incidence of outcome in unexposed (Iu)
    = Ie / Iu = [a/(a+b)] / [c/(c+d)]
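
A quick sketch of the calculation from 2x2 counts, taking a and b as exposed with and without disease and c and d as unexposed with and without disease. The counts and function name are hypothetical.

```python
def risks_from_2x2(a, b, c, d):
    """Return (Ie, Iu, RR) from 2x2 cell counts."""
    risk_exposed = a / (a + b)      # Ie
    risk_unexposed = c / (c + d)    # Iu
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease
ie, iu, rr = risks_from_2x2(a=30, b=70, c=10, d=90)
print(ie, iu, round(rr, 2))
```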

52
Q

relative risk reduction

A

Relative Risk Reduction (RRR)

  • Relative decrease in the risk of an adverse event in the exposed group compared to the unexposed group
  • Calculation: [incidence of outcome in unexposed (Iu) – incidence of outcome in exposed (Ie)] divided by the incidence of outcome in unexposed (Iu)
    = (Iu - Ie) / Iu = 1 - RR
    = [c/(c+d) – a/(a+b)] / [c/(c+d)]

53
Q

Attributable risk

Risk difference

Absolute risk reduction are all the same thing. What is it?

A

Attributable Risk (AR) or Risk Difference (RD) or Absolute Risk Reduction (ARR)

  • The number or incidence of cases of a disease among exposed individuals that can be attributed to that exposure
  • Considers only those exposed
  • Purpose: measure of the potential for prevention of disease if the exposure could be eliminated
  • Calculation: incidence of outcome in exposed (Ie) – incidence of outcome in unexposed (Iu), i.e. risk in exposed minus risk in unexposed
    = Ie - Iu
    = [a/(a+b)] - [c/(c+d)]
  • When the exposure is protective (e.g. a treatment), the difference is usually reported as a positive ARR = Iu - Ie

54
Q

Attributable risk percent

Attributable risk fraction

A

Attributable Risk Percent (AR%) or Attributable Fraction (AF) or Attributable Risk in Exposed (ARe)

  • The percentage or proportion of cases of a disease among exposed individuals that can be attributed to that exposure
  • Considers only those exposed
  • Calculation: [incidence of outcome in exposed (Ie) – incidence of outcome in unexposed (Iu)] divided by the incidence of outcome in exposed (Ie)
    = (Ie - Iu) / Ie
    = [a/(a+b) – c/(c+d)] / [a/(a+b)]

55
Q

population attributable risk

A

Population Attributable Risk (PAR)

o The number or incidence of cases of disease in the population that can be attributed to an exposure o Considers both those exposed and unexposed in the entire population

o Purpose: to determine the number of cases of disease that would not occur in a population if the factor were eliminated or no one was exposed

o Calculation: incidence of disease in total population (IT) – incidence of disease in population unexposed (IU)

§ = IT – IU

§ = [(a+c) / (a+b+c+d)] – [c/(c+d)]

56
Q

population attributable risk percent

population attributable risk fraction

A

Population Attributable Risk % (PAR%) or Population Attributable Fraction (PAF)

o The percentage or proportion of cases of disease in the population that can be attributed to an exposure

o Considers both those exposed and unexposed in the entire population

o PAF is the proportional reduction in population disease or mortality that would occur if an exposure to a risk factor were reduced to an alternative ideal exposure scenario (e.g., no tobacco use)

o Many diseases are caused by multiple risk factors, and individual risk factors may interact in their impact on overall risk of disease. As a result, PAFs for individual risk factors often overlap and add up to more than 100 percent.

o Calculation: [incidence of disease in population (IT) – incidence of disease in population unexposed (IU)] divided by incidence of disease in population (IT)

§ = (IT – IU) / IT

o Alternate calculation (Pexposed = prevalence of the exposure in the population):

PAF = [Pexposed*(RRexposed -1)] / [Pexposed*(RRexposed -1) + 1]

57
Q

Calculations where no 2x2 table is given

Example: a cohort study found the absolute risk among smokers of getting bowel cancer is 20%, and the absolute risk among non-smokers of getting bowel cancer is 8%. 15% of Canadians are smokers.

o Calculate the 2x2 table

o Calculate relative risk, attributable risk, attributable risk fraction and population attributable risk percent

A

Example: a cohort study found the absolute risk among smokers of getting bowel cancer is 20%, and the absolute risk among non-smokers of getting bowel cancer is 8%. 15% of Canadians are smokers.

o Calculate the 2x2 table

§ Ie = 20% or 0.2, Iu = 8% or 0.08

§ Using 15% of Canadians are smokers: 0.15 X 1,000 = 150 smokers and therefore 850 non-smokers

§ a = 150 X 0.2 = 30

§ b = 150 – 30 = 120

§ c = 850 x 0.08 = 68

§ d = 850 – 68 = 782

§ IT = (30 + 68) / 1,000 = 98/1,000 = 0.098

o Relative risk
§ RR = Ie /Iu = 0.2/0.08 = 2.5
§ Interpretation: smokers are 2.5 times as likely to develop bowel cancer as non-smokers

o Attributable risk
§ AR = Ie – Iu = 0.2 – 0.08 = 0.12
§ Interpretation: people who smoke have 12 additional cases of bowel cancer per 100 people compared to non-smokers

o Attributable risk in the exposed
§ ARe = (Ie –Iu )/Ie = 0.12/0.2 =60%
§ Interpretation: among smokers, 60% of bowel cancers are attributable to smoking

o Population attributable fraction
§ PAF = [Pexposed*(RRexposed -1)] / [Pexposed*(RRexposed -1) + 1]
= [0.15 * (2.5-1)] / [0.15 * (2.5-1) + 1]
= 0.225/1.225 = 0.18
§ PAF = (IT – Iu) / IT = (0.098 – 0.08)/0.098 = 0.18
§ Interpretation: 18% of bowel cancers in the population can be attributed to smoking
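The same worked example can be reproduced in a few lines of Python as a check on the arithmetic. The population size of 1,000 is the working assumption used on the card; the variable names are illustrative only. Both PAF routes should give the same answer.

```python
p_exposed = 0.15     # prevalence of smoking in the population
ie, iu = 0.20, 0.08  # absolute risk in smokers and in non-smokers
n = 1000             # notional population size used to build the 2x2 table

exposed = p_exposed * n    # 150 smokers
unexposed = n - exposed    # 850 non-smokers
a = exposed * ie           # smokers with bowel cancer (30)
b = exposed - a            # smokers without (120)
c = unexposed * iu         # non-smokers with bowel cancer (68)
d = unexposed - c          # non-smokers without (782)
it = (a + c) / n           # incidence in the total population (0.098)

rr = ie / iu               # relative risk
ar = ie - iu               # attributable risk
ar_exposed = ar / ie       # attributable fraction in the exposed
paf_from_rr = p_exposed * (rr - 1) / (p_exposed * (rr - 1) + 1)
paf_from_incidence = (it - iu) / it

print(round(rr, 2), round(ar, 2), round(ar_exposed, 2))
print(round(paf_from_rr, 2), round(paf_from_incidence, 2))
```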

58
Q

odds ratio calculations

A

Odds ratio (OR)

o Ratio of the odds of an event in one group to the odds in another, where odds are the probability of the event occurring relative to it not occurring

o In case-control studies, the incidence of the disease in the exposed or unexposed is unknown (because the study starts by identifying cases)

o Rare disease assumption: the OR approximates the RR when the outcome is rare (in this case, cells a and c are small and contribute little to the denominators, so a/(a+b) ≈ a/b and c/(c+d) ≈ c/d)

o Calculation: Odds of disease in exposed / Odds of disease in unexposed

§ = (a/b) / (c/d)

§ = ad/bc
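A small Python sketch can illustrate the rare disease assumption. Both tables below are hypothetical and built so that the RR is 2; only the frequency of the outcome differs.

```python
def odds_ratio(a, b, c, d):
    """OR = (a/b) / (c/d) = ad/bc."""
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """RR = [a/(a+b)] / [c/(c+d)]."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical tables as (a, b, c, d)
rare = (20, 9980, 10, 9990)    # outcome in 0.2% of exposed, 0.1% of unexposed
common = (400, 600, 200, 800)  # outcome in 40% of exposed, 20% of unexposed

for label, table in (("rare", rare), ("common", common)):
    print(label, round(odds_ratio(*table), 2), round(relative_risk(*table), 2))
```

With the rare outcome the OR (about 2.00) is almost identical to the RR; with the common outcome the OR (about 2.67) overstates it.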

59
Q

number needed to treat

A

Number needed to treat (NNT)

o Number of patients who need to be treated to prevent one additional bad outcome

o Always round up to the nearest whole number (you cannot treat 0.8 of a person)

o Calculation = 1/ ARR
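A minimal Python sketch of the NNT calculation, including the round-up rule. The function name and the example risks are hypothetical; the small rounding step before `ceil` is there only to stop floating-point noise from pushing an exact answer up by one.

```python
import math


def number_needed_to_treat(risk_control, risk_treated):
    """NNT = 1 / ARR, rounded up to the next whole person."""
    arr = risk_control - risk_treated  # absolute risk reduction
    if arr <= 0:
        raise ValueError("treatment does not reduce risk, so NNT is undefined")
    # round() removes tiny floating-point error before rounding up
    return math.ceil(round(1 / arr, 9))


# Hypothetical trial: 8% of controls and 5% of treated patients have the outcome
print(number_needed_to_treat(0.08, 0.05))  # ARR = 0.03, 1/0.03 = 33.3, so 34
```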

60
Q

Difference between direct and indirect standardization

A

Both approaches compare populations to a standard reference population

o Direct standardization: used when stratum-specific numbers/rates for two or more populations are known (e.g., number of deaths in each age group for two populations)

o Indirect standardization: used when only overall numbers/rates for two or more populations are known (e.g. overall number of deaths for two populations) or when stratum-specific numbers in the study population are small, leading to unstable stratum-specific rates. The study population provides the weights while the standard population provides the rates

61
Q

Differences between direct and indirect standardization

A

Direct: result is age-adjusted rates; good for descriptive purposes and comparison across studies

Indirect: result is the Standardized Mortality Ratio (SMR); good for rare or small event rates in the study population

Both:

  • Give a single summary number for each study group
  • Number is based on a hypothetical circumstance, but circumstances are the same between groups, so comparison is made fairer
62
Q

Age specific standardized rates

Direct standardization example

A

Age-Standardized Mortality Rate (ASMR)

o Uses direct standardization and is expressed as an age-standardized rate (e.g., x number of deaths per 100,000 individuals)

  1. Select a standard reference population (usually Canada) and determine the number in each age group (using census data)
  2. Calculate age-specific death rates for each age group in each city

§ Deaths / population size * 100,000

§ Example: For Goosefoot, age 0-14: 1/897 * 100,000 = 111

  3. Calculate the number of deaths that would be expected in each age-group if the study populations had the same age structure as the standard reference population (Canada)

§ City age-specific mortality rate * reference population / 100,000

§ For Goosefoot, age 0 – 14: 111*5,607,345 / 100,000 = 6,251 (using the unrounded rate, 111.48)

  4. To get the overall ASMR, add up the number of expected deaths for each age-group for each city and then divide each by the total number of people in the standard population

§ Total expected deaths/size of reference population * 100,000

§ For Goosefoot: 231,021/33,476,690 * 100,000 = 690

* Interpretation: when the effect of age is removed (age-standardization) Goosefoot has a lower mortality rate than Weenigo (690 vs. 786 deaths per 100,000)
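The direct standardization steps can be sketched in Python as follows. The card only shows one Goosefoot stratum, so the three-stratum city and standard population below are hypothetical; the last line separately re-checks the card's own 0-14 figure.

```python
def direct_standardized_rate(deaths, population, standard_population, per=100_000):
    """Apply the study population's age-specific rates to the standard population."""
    expected_deaths = 0.0
    for age_group, standard_n in standard_population.items():
        age_specific_rate = deaths[age_group] / population[age_group]
        expected_deaths += age_specific_rate * standard_n
    return expected_deaths / sum(standard_population.values()) * per


# Hypothetical city and standard population (not the Goosefoot data)
deaths = {"0-14": 2, "15-64": 30, "65+": 100}
population = {"0-14": 1_000, "15-64": 5_000, "65+": 2_000}
standard = {"0-14": 200_000, "15-64": 650_000, "65+": 150_000}

crude_rate = sum(deaths.values()) / sum(population.values()) * 100_000
asmr = direct_standardized_rate(deaths, population, standard)
print(round(crude_rate), round(asmr))

# Card's Goosefoot 0-14 stratum: expected deaths in the standard population
print(round(1 / 897 * 5_607_345))
```

In this made-up city the crude rate (1,650 per 100,000) is higher than the standardized rate (1,180 per 100,000) because the city is older than the standard population.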

63
Q

Standardized Mortality Ratio (SMR)

A

Uses indirect standardization to calculate the ratio of the deaths observed in the study population to the number that would be expected if this population had the same age-specific death rates as the reference population

  1. Obtain the age-specific death rates for the reference population (Canada)

§ Deaths / population * 100,000 in each age group

§ E.g. For Canada age 0 – 14: 2,451/5,607,345 * 100,000 = 43.7

  2. Multiply these rates by the number of people in each study population in each age group to calculate the expected number of deaths. These show the number of deaths that would occur in the study population if each age stratum had the same death rate as in the reference population

§ Canadian age-specific death rate * population of that age group in the study population / 100,000

§ E.g. For Goosefoot, age 0-14: 43.7 * 897 / 100,000 = 0.39

  3. Add up the expected number of deaths for each age group

§ E.g. for Goosefoot: 0.39 + 19.1 + 139.4 = 158.89

  4. Calculate the ratio between observed deaths and expected deaths in the study population

§ Observed/ Expected deaths * 100

§ E.g. for Goosefoot: 132 / 158.9 * 100 = 83

* Interpretation: Goosefoot is relatively healthy compared to the Weenigo region. Note that SMRs can be compared to the reference population (e.g. Goosefoot has a lower mortality than Canada overall, which has a value of 100), but it can be misleading to compare between SMRs as each SMR reflects the age structure of that region.
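A sketch of the indirect standardization steps in Python. The reference rates and study population are hypothetical (the card does not list every stratum); the final line re-checks the card's last step using its own observed and expected totals.

```python
def standardized_mortality_ratio(observed_deaths, study_population, reference_rates_per_100k):
    """SMR = observed / expected * 100, where expected deaths come from
    applying reference age-specific rates to the study population."""
    expected_deaths = sum(
        reference_rates_per_100k[age_group] * n / 100_000
        for age_group, n in study_population.items()
    )
    return observed_deaths / expected_deaths * 100, expected_deaths


# Hypothetical reference rates (per 100,000) and study population
reference_rates = {"0-14": 50, "15-64": 300, "65+": 4_000}
study_population = {"0-14": 1_000, "15-64": 5_000, "65+": 2_000}

smr, expected = standardized_mortality_ratio(80, study_population, reference_rates)
print(round(expected, 1), round(smr))

# Card's Goosefoot figures: 132 observed deaths, 158.89 expected
print(round(132 / 158.89 * 100))
```

An SMR below 100 means fewer deaths were observed than the reference rates would predict for a population of this size and age structure.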

64
Q

Simpler Age-Standardized Mortality Rate (ASMR) example

A

Calculate the ASMR, standardized to the 1991 population

o In 1991, 61.6% of Canadians were under 40 years of age and 38.4% were age 40 or older

o Age-specific mortality rates

§ 2000

  • 0 to 39 years = 1,345 / 17,068,876 × 100,000 = 7.9 cancer deaths per 100,000 population
  • 40+ years = 61,325 / 13,616,854 × 100,000 = 450.4 cancer deaths per 100,000 population

§ 2011

  • 0 to 39 years = 1,004 / 17,191,850 × 100,000 = 5.8 cancer deaths per 100,000 population
  • 40+ years = 71,472 / 17,150,930 × 100,000 = 416.7 cancer deaths per 100,000 population

o Age-standardized mortality rate

§ 2000: (7.9 X 0.616) + (450.4 X 0.384) = 4.9 + 173.0 = 177.9 cancer deaths per 100,000 standard population

§ 2011: (5.8 X 0.616) + (416.7 X 0.384) = 3.6 + 160.0 = 163.6 cancer deaths per 100,000 standard population
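The two-stratum calculation above can be re-run in Python from the card's counts and 1991 weights. This sketch carries unrounded age-specific rates through to the end, so its results can differ from the card's figures in the last decimal place. The variable and function names are illustrative.

```python
# 1991 standard population weights from the card
weights = {"0-39": 0.616, "40+": 0.384}

# (cancer deaths, population) by age group, as given on the card
counts = {
    2000: {"0-39": (1_345, 17_068_876), "40+": (61_325, 13_616_854)},
    2011: {"0-39": (1_004, 17_191_850), "40+": (71_472, 17_150_930)},
}


def age_standardized_rate(strata, weights, per=100_000):
    """Weighted sum of age-specific rates, using standard population weights."""
    return sum(
        weights[age_group] * deaths / population * per
        for age_group, (deaths, population) in strata.items()
    )


asmr_2000 = age_standardized_rate(counts[2000], weights)
asmr_2011 = age_standardized_rate(counts[2011], weights)
print(round(asmr_2000, 1), round(asmr_2011, 1))
```

Either way the conclusion is the same: once both years are weighted to the 1991 age structure, cancer mortality is lower in 2011 than in 2000.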

65
Q

Sample size calculations

A

o Sample size determined by:

§ Hypotheses

§ Type I error rate

§ Power (1 – type II error rate)

§ A particular alternative value

§ An estimate of population variance
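To show how these inputs combine, here is a sketch for one common case: comparing two means with equal group sizes, using the standard normal-approximation formula n = 2(z_alpha/2 + z_beta)^2 * sigma^2 / delta^2 per group. That formula is an assumption brought in for illustration (the card does not give one), and the numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two means.

    delta = the alternative value (difference worth detecting)
    sigma = estimated population standard deviation
    alpha = type I error rate; power = 1 - type II error rate
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


# Hypothetical: detect a 5-unit difference when the SD is about 10
print(n_per_group(delta=5, sigma=10))              # 63 per group
print(n_per_group(delta=5, sigma=10, power=0.90))  # more power needs more people
```

Smaller differences, larger variance, a stricter type I error rate and higher power all push the required sample size up.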

66
Q

Quantitative sampling methods

A

Random sampling methods

o Simple random sample: each individual in the population has an equal chance of being selected

o Stratified sample: population first divided into strata, then simple random sampling is performed within each stratum

o Cluster sampling: each group has an equal chance of being selected; examine all units within the chosen cluster (done because it’s easier and simpler)

o Multi-stage sampling: each group has an equal chance of being selected, then each individual within the selected group has an equal chance of being selected
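The four random sampling methods can be contrasted with a small Python sketch. The population (30 people across three clinics), the helper names and the sample sizes are all hypothetical.

```python
import random
from collections import defaultdict


def group_by(units, key):
    """Split units into groups (strata or clusters) by a shared attribute."""
    groups = defaultdict(list)
    for unit in units:
        groups[unit[key]].append(unit)
    return dict(groups)


def simple_random_sample(units, n, rng):
    return rng.sample(units, n)


def stratified_sample(units, key, n_per_stratum, rng):
    # Simple random sample within every stratum
    picked = []
    for members in group_by(units, key).values():
        picked.extend(rng.sample(members, n_per_stratum))
    return picked


def cluster_sample(units, key, n_clusters, rng):
    # Randomly choose whole clusters, then take everyone in them
    groups = group_by(units, key)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [unit for name in chosen for unit in groups[name]]


def multistage_sample(units, key, n_clusters, n_per_cluster, rng):
    # Randomly choose clusters, then randomly sample within each chosen cluster
    groups = group_by(units, key)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [unit for name in chosen for unit in rng.sample(groups[name], n_per_cluster)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
people = [{"id": i, "clinic": "clinic_" + "ABC"[i % 3]} for i in range(30)]

srs = simple_random_sample(people, 6, rng)
strat = stratified_sample(people, "clinic", 2, rng)
clus = cluster_sample(people, "clinic", 1, rng)
multi = multistage_sample(people, "clinic", 2, 3, rng)
print(len(srs), len(strat), len(clus), len(multi))
```

The stratified sample is guaranteed to include every clinic, while the cluster and multi-stage samples include only the clinics that happened to be drawn.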

67
Q

Qualitative sampling methods

A

snowball sampling

purposive sampling

theoretical sampling

convenience or volunteer sampling

68
Q

T-test

A

Used to compare two means

o Paired T-test: comparing two normal populations using dependent samples

§ Self-pairing (e.g., before and after)

§ Matched (e.g., twins, siblings)

§ Used to control for extraneous sources of variation that might otherwise influence results of comparison

o Independent T-test: comparing normal populations using independent samples

§ Two separate populations

  • Placebo vs treatment
  • Non-exposed vs exposed
  • Control vs case
69
Q

ANOVA

A

ANOVA

o Analogous to t-test, but used to compare means across more than two populations

70
Q

Statistical tests

A

Statistical tests

  • T-test: compare two means
  • ANOVA: compare means across > 2 populations
  • Chi-square: compare observed counts with the expected counts in each cell given that the null hypothesis is true
  • Kappa statistic: measures agreement (reliability) between measurements of the same categorical variable
71
Q

Correlation

A

Correlation

o Quantification of the relationship between two random variables from -1 to +1

§ If > 0, variables are positively correlated (if X increases, so does Y)

§ If <0, variables are negatively correlated (if X increases, Y decreases)

§ If = 0, then the variables are not correlated (no linear relationship)

o Correlation does not imply causation

o Types

§ Pearson correlation: quantify association between two normal variables

§ Spearman correlation: quantify association between two ordinal variables

§ Chi-square test: test for association between two categorical variables

o Goodness of fit (R², the squared correlation): how much of the variance of Y is explained by X?
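To make the Pearson/Spearman distinction concrete, here is a dependency-free Python sketch. Spearman is computed as the Pearson correlation of the ranks (ties get the average rank). The data are hypothetical: Y rises with X in a curved, not straight-line, fashion.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # always increasing, but not linearly
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
print(round(pearson(x, y) ** 2, 3))  # share of variance in y explained linearly by x
```

Spearman is exactly 1 because the ordering is perfectly preserved, while Pearson is slightly below 1 because the relationship is not a straight line.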

72
Q

Hybrid designs

A

Hybrid designs

  • Nested case-control: a case-control study within a cohort; cases and controls are both drawn from the cohort
  • Case-crossover: cases act as their own controls; used for outcomes with rapid onset (e.g., MI, MVC)
  • Ecological: investigator did not assign the exposure; individual-level outcomes unknown
  • Ecological fallacy: drawing inferences at the individual level based on group-level data
  • Atomistic fallacy: drawing inferences at the group level based on individual-level data
73
Q

Nested case control

A

Nested Case-Control

o Cases and controls are selected from a well-defined cohort

o Usually all cases taken (or sampled randomly)

o Controls selected using incidence density sampling – sampled from those at risk of the outcome when each case occurs

o Pros: efficient, good for time-varying exposure

o Cons: less precise than cohort, sampling may introduce bias, can only estimate rate ratio

74
Q

case-case study

A

Case-case study

o Background

§ Instead of comparing cases to a strategically selected group of non-diseased controls, case-case designs compare the “cases of interest” to “comparison cases,” or people ill with a different disease

§ Example: comparing different subtypes of the same disease such as MRSA and MSSA or comparing salmonella case data to campylobacter case data to understand risk factor differences

o Advantages

§ Reduces selection bias: cases of interest and comparison cases are selected from the same population that gave rise to the cases of interest (e.g., patients with a positive Staph aureus culture)

§ Reduces recall bias: comparison cases are likely to have similar patient histories and timelines, from onset of illness to collection of information, such as case history interviews or laboratory results

§ Less costly: lower cost as data are already available from case investigations (obviates the need for additional collection of information on controls)

75
Q

Test-Negative study

A

Test-Negative study

o Variation of the case-control study, in which patients are enrolled in outpatient clinics (and/or hospitals) based on a clinical case definition such as influenza-like illness

o Patients are then tested for influenza virus, and vaccine effectiveness is estimated from the odds ratio comparing the odds of vaccination among patients testing positive for influenza versus those testing negative
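A minimal Python sketch of the test-negative calculation, using the conventional conversion VE = (1 - OR) x 100, which is assumed here because the card does not spell it out. The counts and the function name are hypothetical.

```python
def vaccine_effectiveness(pos_vaccinated, pos_unvaccinated, neg_vaccinated, neg_unvaccinated):
    """Return (odds ratio, VE %) from a test-negative design."""
    odds_in_positives = pos_vaccinated / pos_unvaccinated  # odds of vaccination, test-positive
    odds_in_negatives = neg_vaccinated / neg_unvaccinated  # odds of vaccination, test-negative
    odds_ratio = odds_in_positives / odds_in_negatives
    return odds_ratio, (1 - odds_ratio) * 100


# Hypothetical clinic data: 100 influenza-positive and 100 influenza-negative patients
odds_ratio, ve_percent = vaccine_effectiveness(20, 80, 50, 50)
print(odds_ratio, ve_percent)
```

Here vaccinated patients make up a much smaller share of the test-positive group, giving an OR of 0.25 and an estimated effectiveness of 75%.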

76
Q

Healthy worker effect

A

o Background: a form of selection bias. Workers generally have less disease than the general population because they are pre-selected to be healthier, and workers who become ill often retire or quit (self-select out of the workforce)

o Examples of how bias can cause the healthy worker effect

§ Selection bias at time of hiring, resulting in the entry into the workforce of healthier workers

§ Selection bias during employment, resulting in retention of healthier workers

§ Confounding bias introduced by the lack of information on the workers’ functional status (health status)

77
Q

Tri-council policy core principles

Research ethics

A

  • Respect for persons
  • Concern for welfare: privacy, treatment of human biological materials
  • Justice: treat people fairly and equitably

78
Q

Funnel plot

A
  • Designed to check for the existence of publication bias and commonly used in systematic reviews and meta-analyses
  • In the absence of publication bias, it assumes that studies with high precision will be plotted near the average, and studies with low precision will be spread evenly on both sides of the average, creating a roughly funnel-shaped distribution
  • Deviation from a funnel shape can indicate publication bias
79
Q

critical appraisal

Quantitative studies (ODDCHAIR)

A
  • Objectives: study hypothesis
  • Design: appropriateness of the design to study question
  • Definitions: subjects, intervention, exposure, outcomes
  • Collection: measurement tools, sources of bias
  • Handling: confidentiality, ethics approval
  • Analysis: appropriate statistical testing
  • Interpretation: results, internal and external validity, relevance
  • Reporting: publication bias, conflicts of interest
80
Q

critical appraisal for systematic studies

A

Systematic reviews – AMSTAR criteria

  • PICO: review questions and inclusion criteria include PICO format
  • A priori protocol: established review methods prior to conducting review
  • Included study rationale: selection of study designs for inclusion
  • Comprehensive search: details on comprehensive literature search strategy
  • Duplication: study selection and data extraction performed in duplicate
  • Study list: provide list of excluded studies
  • Study details: describe included studies in detail
  • Risk of bias: assess risk of bias in individual studies included
  • Funding sources: reporting source of funding of studies included
  • Meta-analysis methods: appropriate methods for statistical combination of results for meta-analysis
  • Heterogeneity: describe heterogeneity in results
  • Publication bias: investigation of publication bias and impact on results
  • COIs: report conflict of interest and funding
81
Q

CONSORT for RCT

A

Title and abstract: title clearly states that the trial is randomized; abstract includes trial design, methods, main results and conclusions

Introduction: literature review

Methods: trial design, eligibility, how and where data were collected

Results: statistical results reported

Discussion: limitations, internal and external validity

82
Q

How do you control for bias in a study?

A
  • Design: ensure study design is appropriate for addressing hypothesis
  • Procedures: establish procedures for identifying, enrolling, and following/retaining study subjects
  • Execution: establish procedures for data definitions, measurements, and collections
  • Analysis: use appropriate data analysis
  • Post hoc: quantify the potential magnitude of bias (bias analysis)
83
Q

Information Bias

A

• Sources

  • Respondents: inability to understand, recall, articulate, unwillingness to disclose
  • Data collectors: unclear or ambiguous questions, lack neutrality, inaccurate transcription
  • Data managers: inaccurate transcription, misreading, miscoding, programming errors
  • Data analysts: variable coding, programming errors
  • Imperfect measurement tools: tests with low sensitivity/specificity, poorly worded questionnaires

• Types

  • Measurement bias (misclassification)
  • Recall bias
  • Interviewer bias

84
Q

Selection Bias

A

Selection bias

• Definition: systematic error in a study resulting from the manner in which subjects are selected or from factors that influence ongoing participation

• Sources

  • During enrollment: if exposure affects initial participation of cases and controls differently
  • During follow-up: if outcome affects withdrawal of exposed and non-exposed subjects (attrition)
  • Database linkage study: there may be important differences in exposure and outcome status of subjects you were able to link between two databases and those you were not able to link

• Types

  • Sampling bias: can occur when some members of a population are systematically more likely to be selected in a sample than others

  • Attrition bias: can occur when significant losses to follow-up result in a sample systematically different from the original in terms of exposure frequency or outcome susceptibility
  • Volunteer bias: can occur because those who volunteer for studies tend to be systematically different from the general population
  • Nonresponse bias: can occur because of systematic differences between those who participate in studies and those who do not
  • Publication bias: can occur when the outcome of an experiment or research study influences the decision whether to publish or otherwise distribute it
  • Incidence-prevalence bias: results from using prevalent cases in a case-control study, because cases must have survived until study recruitment
85
Q

Qualitative methods sampling strategies

A

Sampling strategies

  • Typical: sampling usual cases of a phenomenon
  • Deviant: sampling most extreme cases
  • Critical case: sampling cases that are predicted to be particularly illuminating, based on theory or previous research
  • Maximum-variation: sampling a wide range of perspectives to capture the broadest set of experiences
  • Confirming-disconfirming: sampling cases whose perspectives are likely to confirm/challenge researcher’s understanding of the phenomenon
  • Theoretical: sampling cases whom the researchers predict would add new perspectives to those already represented in the sample
86
Q

critical appraisal of qualitative study

A

Quantitative criterion → qualitative (trustworthiness) counterpart

Internal validity → Credibility

Generalizability → Transferability

Reliability → Dependability

Objectivity → Confirmability

Data collection: observation (complete participant, participant as observer, observer as participant, complete observer) and interviews (individual, group, focus group, nominal group, Delphi)

87
Q

Types of qualitative studies

A

o Ethnography: observational or descriptive questions about values, beliefs, and practices of a cultural group

o Phenomenological: questions about meaning or essence of phenomena

o Grounded theory: process questions about how the experience has changed over time or about its stages and phases

88
Q

How do you account for confounders during design and analysis stage?

A

Analysis stage: stratification, multivariable regression analysis, standardization, instrumental variable estimation, inverse probability weighting

Design stage: randomization, restriction, matching, blinding

89
Q

What is the main characteristic of mixed methods research? List 3 reasons to use mixed methods.

A

Main characteristic: it allows the analysis of both qualitative and quantitative data

3 reasons to use mixed methods (source: https://www.scribbr.co.uk/research-methods/mixed-methods/):

  • Generalisability: Qualitative research usually has a smaller sample size, and thus is not generalisable. In mixed methods research, this comparative weakness is mitigated by the comparative strength of ‘large N’, externally valid quantitative research.
  • Contextualisation: Mixing methods allows you to put findings in context and add richer detail to your conclusions. Using qualitative data to illustrate quantitative findings can help ‘put meat on the bones’ of your analysis.
  • Credibility: Using different methods to collect data on the same subject can make your results more credible. If the qualitative and quantitative data converge, this strengthens the validity of your conclusions. This process is called triangulation.