2021 Epi/Biostats Flashcards

Question 1

Q

Name 4 ways of dealing with confounding?

Answer

A

1. randomization 2. stratification 3. matching 4. regression modelling

Question 2

Q

Define Per-Protocol Analysis (PP) and Intention-to-Treat Analysis (ITT)

Answer

A

Per-Protocol Analysis (PP): Strategy of analysis in which only patients who complete the entire study are counted towards the results.
Intention-to-Treat Analysis (ITT): When groups are analyzed exactly as they existed upon randomization (i.e. using data from all patients, including those who did not complete the study).

Question 3

Q

What kind of study design is this? A study that examines the distribution of BMI by age in Ontario at a particular point in time.

Answer

A

cross-sectional

Question 4

Q

What’s the difference between Pearson and Spearman correlation?

Answer

A

Different types of correlation are used for different levels of measurement. Pearson measures the strength of a linear relationship and is most appropriate for continuous (interval-scale), approximately Normal data; Spearman is more appropriate for ordinal or non-Normal data. The Spearman correlation is the Pearson correlation computed on the ranks of the data. Examples of interval scales include “temperature in Fahrenheit” and “length in inches”, in which the individual units (1 deg F, 1 in) are meaningful. Things like “satisfaction scores” tend to be of the ordinal type: while it is clear that “5 happiness” is happier than “3 happiness”, it is not clear whether you could give a meaningful interpretation of “1 unit of happiness”. When many ordinal measurements are added up, the resulting score is really neither ordinal nor interval, and is difficult to interpret.
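The relationship between the two coefficients can be sketched in a few lines of Python. This is only an illustration: the helper names (`pearson`, `average_ranks`, `spearman`) and the data are hypothetical, chosen so that y increases with x but not linearly.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """Rank the values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x squared, monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(round(r_pearson, 3), round(r_spearman, 3))
```

Because the relationship is perfectly monotonic, the Spearman coefficient is 1 while the Pearson coefficient falls slightly below 1.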

Question 5

Q

What statistical test will you use to compare the mean values of an outcome variable between two groups (e.g. difference in average BP between men and women)

Answer

A

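Two-sample Z-test when both samples are large (or the population variances are known); with small samples, use the independent two-sample t-test instead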

Question 6

Q

What statistical test will you use to test the correspondence between a theoretical frequency distribution and an observed frequency distribution (e.g. if one sample of 20 patients is 30% hypertensive and another comparison group of 25 patients is 60% hypertensive)

Answer

A

A chi-squared test, which determines whether the difference between the observed and expected frequencies is greater than would be expected by chance alone
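As a rough worked example, the figures in the question (6 of 20 hypertensive vs 15 of 25 hypertensive) can be put in a 2x2 table and the statistic computed by hand. The function name is hypothetical and no continuity correction is applied; with samples this small, a corrected or exact test may lead to a different conclusion.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Rows: the two groups; columns: hypertensive, not hypertensive.
observed_counts = [
    [6, 14],   # 30% of 20 patients
    [15, 10],  # 60% of 25 patients
]
chi2 = chi_squared_statistic(observed_counts)

# Critical value of the chi-squared distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_VALUE_DF1 = 3.841
exceeds_critical = chi2 > CRITICAL_VALUE_DF1
print(round(chi2, 2), exceeds_critical)
```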

Question 7

Q

Define a Secondary Attack Rate and show how to calculate it?

Answer

A

• the proportion of susceptible contacts who develop disease as a result of exposure to a primary case during the incubation period
• a measure of infectiousness, which reflects the ease of disease transmission
• = [(total # of cases - initial # of cases) / (# of susceptible individuals - initial # of cases)] * 100%
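A minimal sketch of the calculation, using the formula on this card. The function name and the outbreak numbers are hypothetical.

```python
def secondary_attack_rate(total_cases, initial_cases, susceptible):
    """Secondary attack rate as a percentage.

    (total cases - initial cases) / (susceptible individuals - initial cases) * 100
    """
    secondary_cases = total_cases - initial_cases
    contacts_at_risk = susceptible - initial_cases
    if contacts_at_risk <= 0:
        raise ValueError("no susceptible contacts remain after removing initial cases")
    return secondary_cases / contacts_at_risk * 100

# Hypothetical outbreak: 42 susceptible people, 2 initial cases, 12 cases in total.
sar = secondary_attack_rate(total_cases=12, initial_cases=2, susceptible=42)
print(f"Secondary attack rate: {sar:.1f}%")
```

Here 10 secondary cases arise among 40 susceptible contacts, giving 25%.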

Question 8

Q

Investigators intend to study the causes of a rare cancer using a case-control study.

They plan to recruit cases from a national cancer registry, including those diagnosed within the last 3 years.

To avoid bias associated with interviewing proxies about case exposure histories, they decide to exclude all deceased cases, about 40% of the cases in the registry.

Discuss the advantages and disadvantages of this approach.

Answer

A

● Advantages: Avoids differential recall between proxies and cases. Both may be affected by recall bias, but proxies may be unaware of exposure histories during certain time periods (e.g. childhood, young adulthood). Reduces ethical issues that might occur if partners of the deceased needed to be contacted.

● Disadvantages: Likely excludes the most severe disease, meaning cases are less representative of cases in the population. Exposure may affect the risk of severe disease differently than the risk of less severe disease. Loses study power.

Question 9

Q

To evaluate a new back school, patients with lower back pain were randomly allocated to either the new school or to conventional occupational therapy. After 3 months they were questioned about their back pain, and observed lifting a weight by independent monitors.

What kind of study design is this?

Answer

A

Randomized controlled trial

Question 10

Q

To investigate the relationship between certain solvents and cancer, all employees at a factory were questioned about their exposure to an industrial solvent, and the amount and length of exposure measured. These subjects were regularly monitored, and after 10 years a copy of the death certificate for all those who had died was obtained.

What kind of study design is this?

Answer

A

Cohort study

Question 11

Q

You have received a request to develop a surveillance program focused on COVID-19. List and briefly describe six attributes of a surveillance system as they would pertain to COVID-19.

Answer

A

Simplicity: the flow of information is simple (few providers, few systems, same IT structure, easy operation)
Flexibility: the system is able to adapt to changing information needs or operating conditions – new providers of information, new data requirements, new case definition etc
Data quality: complete and valid
Acceptability: willingness of persons and organizations to participate in the surveillance system
Sensitivity: proportion of cases detected by the system & ability to detect outbreaks and monitor trends
Positive predictive value: proportion of cases reported that actually have the disease/event of interest
Timeliness: speed between steps
Stability: reliable (able to collect info without downtime) and available (accessible to users when they need to know)
Representativeness: represents health information trends by person, place, and time

Question 12

Q

Identify 5 mechanisms by which residual confounding can occur after attempts to control for confounding have been made in both the study design and analysis

Answer

A

Randomisation - too small of a sample or errors in randomising
Restriction & matching - not tight enough e.g. large age range where comparison groups end up with different age structures
Not all confounders were accounted for in the analysis because data on them was not collected
There were misclassification errors of confounders
Categorisation of confounders was not tight enough e.g. too large of age bands

Question 13

Q

What do you understand by active and passive surveillance? Provide an example of each.

Answer

A

Active Surveillance

Outreach such as visits or phone calls by the public health/surveillance authority to detect unreported cases
e.g. an infection control nurse goes to the ward and reviews temperature charts to see if any patient has a nosocomial infection

Passive Surveillance

A surveillance system where the public health/surveillance authority depends on others to submit standardized forms or other means of reporting cases
e.g. ward staff notify infection control when new cases of nosocomial infections are discovered

Question 14

Q

Define standardization.

When do you use Direct Standardization and Indirect Standardization?

Answer

A

Definition: A statistical method used to calculate summary rates of health outcomes that are adjusted to take confounders (e.g. age) into account. The standardized rates allow for a less distorted comparison between 2+ populations, showing how overall rates of disease/mortality would compare if the 2+ populations hypothetically had the same distribution of the confounder (e.g. the same age distribution).

Direct Standardization: Used when age-specific rates of disease/mortality are known for the populations being compared.
Indirect Standardization: Used when it is difficult to obtain reliable estimates of age-specific rates due to small number of observations (therefore unstable rates, or random error)

Question 15

Q

List 3 ways to reduce interviewer bias

Answer

A

Use standardized questionnaires consisting of closed-end, easy to understand questions with appropriate response options.
Train all interviewers to adhere to the question and answer format strictly, with the same degree of questioning for both cases and controls.
Obtain data or verify data by examining pre-existing records (e.g., medical records or employment records) or assessing biomarkers.

Question 16

Q

List 5 ways to reduce loss to follow up

Answer

A

Enrolling motivated subjects
Using subjects who are easy to track
Making questionnaires/follow-up processes as easy to complete as possible
Maintaining the interest of participants and making them feel that the study is important
Providing incentives

Question 17

Q

List 4 general ways to reduce recall bias in a case-control study

Answer

A

Use a control group that has a different disease (that is unrelated to the disease under study).
Use questionnaires that are carefully constructed in order to maximize accuracy and completeness. Ask specific questions.
For socially sensitive questions, such as alcohol and drug use or sexual behaviors, use a self-administered questionnaire instead of an interviewer.
If possible, assess past exposures from biomarkers or from pre-existing records.

Question 18

Q

When do you use Student’s t-test?

Answer

A

Used to compare means from a normally distributed population when the sample size n is small (and the population standard deviation is unknown)
e.g. One-sample t-test: compares the mean of a single group with small sample size against a known mean
e.g. Independent-samples t-test: compares the means of two groups with small sample sizes
e.g. Paired-samples t-test: compares the means from the same group with small sample size at different times (e.g. before and after)
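As an illustration of the independent-samples case, the pooled-variance t statistic can be computed with the standard library alone. The function name and the blood pressure readings are hypothetical, and equal variances in the two groups are assumed.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Student's independent two-sample t statistic (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    degrees_of_freedom = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / degrees_of_freedom
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, degrees_of_freedom

# Hypothetical systolic blood pressure readings (mmHg) in two small groups.
men = [130.0, 128.0, 135.0, 140.0, 132.0]
women = [120.0, 125.0, 118.0, 122.0, 130.0]
t_stat, df = pooled_t_statistic(men, women)

# Two-sided critical value of the t distribution for df = 8 at alpha = 0.05.
CRITICAL_T_DF8 = 2.306
print(round(t_stat, 3), df, abs(t_stat) > CRITICAL_T_DF8)
```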

Question 19

Q

When do you use chi-squared test?

Answer

A

Pearson’s chi-squared test is used to analyze categorical data to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table

Question 20

Q

List 3 assumptions of Cox Proportional Hazards analysis

Answer

A

Independence of survival times between distinct individuals in the sample
A multiplicative relationship between the predictors and the hazard
A constant hazard ratio over time

Question 21

Q

List 4 assumptions of linear regression modelling

Answer

A

Linearity: The relationship between X and the mean of Y is linear.
Homoscedasticity: The variance of the residuals is the same for any value of X.
Independence: Observations are independent of each other.
Normality: For any fixed value of X, Y is normally distributed.

Question 22

Q

List 5 assumptions of logistic regression

Answer

A

Assumption of independence of observations
Assumption of the absence of multicollinearity
Assumption of linearity of independent variables and log odds
Assumption of large sample size

Question 23

Q

A recently published cohort study found that the relative risk of acute myocardial infarction in blood donors compared to non-donors was 0.14 (95% confidence interval 0.02 to 0.97; p=0.047).

a) With reference to cohort studies, define what is meant by the term ‘Person-time’
b) When reporting the results of epidemiological studies, why are confidence intervals preferred to p-values?
c) Interpret the values given above for the relative risk and the 95% confidence interval

Answer

A

a) A measure that combines (i.e. adds up) persons and their follow-up time. It is used as the denominator in the calculation of incidence/mortality rates when individual subjects are at risk of developing the disease(s) of interest, or of dying, for varying time periods
b) CIs are preferable to p-values because they provide a range of plausible effect sizes around the measure of impact (incidence/prevalence) or association (RR, OR), showing both the size of the effect and the precision of the estimate, whereas a p-value only indicates how compatible the data are with the null hypothesis
c) RR - Blood donors had an 86% lower risk of MI than non-donors. An RR of <1 suggests a protective effect or negative association

95% CI - if the study were repeated 100 times on the same population, about 95 of the resulting 95% CIs would contain the true value. The interval excludes 1, consistent with p<0.05, but it is wide (the true reduction in risk could be anywhere from 3% to 98%), which suggests a small sample size or a population with large variation
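How a relative risk and its 95% CI are obtained from cohort counts can be sketched as follows. The counts are hypothetical (the study's raw data are not given on this card), the function name is illustrative, and the interval uses the common log-based approximation.

```python
from math import exp, log, sqrt

def relative_risk_with_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Relative risk with an approximate 95% CI computed on the log scale."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR).
    se_log_rr = sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_unexposed - 1 / n_unexposed
    )
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical cohort: 10 events among 100 exposed, 20 events among 100 unexposed.
rr, ci_lower, ci_upper = relative_risk_with_ci(10, 100, 20, 100)
print(f"RR = {rr:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

In this made-up example the interval includes 1, so the apparent halving of risk would not be statistically significant at the 5% level.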

Question 24

Q

A) List 5 key criteria for population-based screening and provide a brief comment on how it applies to abdominal aortic aneurysm screening in Canada

Answer

A

Condition is an important public health issue - AAA is a significant and preventable cause of morbidity and mortality in older males
Screening test is acceptable - abdominal ultrasound is a non-invasive test that is widely available
Natural history is known - the natural history of AAAs is understood including rate of growth according to size
There is an agreed policy on who to treat - there are algorithms developed to determine relevant treatment and monitoring options according to size and rate of change of AAA
There is a defined population - in Canada, males aged 65-80 are recommended a one-time abdominal ultrasound

Question 25

Q

You are conducting a population health survey for your regional health authority

List FIVE strategies to maximise the response rate of your survey

Answer

A

Keep survey as short as possible
Clear layout and careful design of questionnaire
Pilot the questionnaire first to identify any issues with usability or comprehension
Use appropriately timed follow-up reminders to complete the survey
Use simple language, short sentences etc.
Personalised covering letter/email conveying the reasons for the survey and its value
Clear statements of confidentiality
Consider use of telephone and web based administration of survey – but may introduce biases
Ensure that written materials are available in appropriate languages
Offer help to specific groups (e.g. elderly, blind, poor literacy skills)

Question 26

Q

What is small study bias with respect to meta-analyses?

Answer

A

The tendency for the smaller studies in a meta-analysis to show larger treatment effects

Question 27

Q

Systematic Review
List and explain 5 criteria that you would use to evaluate the quality of a new systematic review on the health harms of e-cigarette use in adults.

Answer

A

Did the review questions and inclusion criteria define the population, intervention, comparator, and outcome?
Did the review use a comprehensive search strategy and provide appropriate details?
Were the methods established prior to conducting the review?
Did the review provide a list of included and excluded studies with justification and adequate details?
Did the review authors use a satisfactory technique to assess the risk of bias in included studies?

Question 28

Q

Qualitative Methods
Increasing rates of syphilis have been reported in many MSM populations in Canada. Describe 3 qualitative research data collection strategies and provide examples which might be used to try to identify reasons for increasing rates of unprotected sex in this group.

Answer

A

1) Group interviews - ex. focus group. A group of people are asked about their perceptions, opinions, and beliefs in an interactive group setting where participants are free to talk with other group members.
Example: interview with a group of MSM individuals to discuss barriers to protection during sex
2) Individual interviews - ex. key informant interview. In-depth 1:1 conversations.
Example: in-depth discussions in a 1:1 setting with single cases
3) Ethnographic (ex. observation) - researchers interact with participants in real-life settings to understand behaviour drivers.
Example: researchers go to nightclubs and observe how individuals approach meeting new partners and the availability of safer sex educational material and products (e.g. accessibility of condoms)

Question 29

Q

Qualitative Methods
List three types of qualitative research data collection strategies and advantages and disadvantages of each one

Answer

A

1) group interviews - ex. focus group.- group of people are asked about their perceptions, opinions, beliefs, in an interactive group setting where participants are free to talk with other group members.
Advantage: individuals can interact with one another during the session and generate group-based insights
Disadvantage: Social desirability (individuals' expressed views may differ from their actual views due to the presence of others and pressure to conform)

2) individual interviews - ex. Key informant interview
Advantages:

Disadvantage:
One of the main criticisms is that the data collected cannot necessarily be generalised to the wider population.

3) Ethnographic (ex. observation) - researchers interact with participants in real life settings to understand behaviour drivers.
Advantage: Can help identify unexpected issues that the researchers did not know to ask about
Disadvantage: costly, time-intensive, and in certain circumstances not feasible for ethical or privacy reasons

Question 30

Q

What are 6 steps of a systematic review

Answer

A

Framing the Question

Developing a Study Protocol

Identifying the Relevant Work

Assessing the Quality of Evidence

Summarizing the Findings

Interpreting the Evidence

https://www.lib.uwo.ca/tutorials/how_to_perform_a_systematic_review/index.html

Question 31

Q

You are a provincial health consultant who is in charge of a new screening program for a cancer. What are 5 adverse effects that can occur from screening?

Answer

A

False Positives: individual tests positive without disease and experiences further invasive procedures or treatment
False Negatives: individual tests negative for disease even though they have it, potentially resulting in delayed care or interventions
Stigma: individuals who screen positive may experience stigma due to nature of condition
Over-diagnosis: individuals screen positive for the disease or condition but would never have experienced symptoms or premature mortality from it
Increased health inequities: participation and benefits from the screening program are not distributed in an equitable manner resulting in widening disparities.

Question 32

Q

List three types of bias when evaluating a screening program that may result in the appearance of increased benefits

Answer

A

Lead time bias - individuals identified through a screening program are found earlier in the disease course than those diagnosed without screening but do not survive longer from disease onset

Length time bias - screening is more likely to detect individuals with a slowly progressing form of disease compared to more aggressive disease

Selection bias - individuals who participate in a screening program may be healthier, have better lifestyles, and be more likely to adhere to therapy.

Question 33

Q

Define a cross sectional study design

Answer

A

An observational study where exposure and outcomes are assessed at a single point in time.

Question 34

Q

List 3 strengths and 3 weaknesses of cross sectional study design

Answer

A

Strengths:

Inexpensive
Quick
Can assess multiple exposures and outcomes
Can calculate prevalence

Weaknesses:

Cannot assess time trends
Cannot assess causation
Difficult to assess rare outcomes

Question 35

Q

What is the key difference between a cohort and case control study

Answer

A

In a case-control study, the outcome is known and the exposure status of cases and non-cases is investigated; in a cohort study, the exposure is known and the outcomes in the exposed and non-exposed groups are investigated

Question 36

Q

When can an odds ratio be considered equivalent to a relative risk

Answer

A

When the outcome is relatively rare in the population under investigation (<20%)

If the odds ratio is interpreted as a relative risk it will always overstate any effect size: the odds ratio is smaller than the relative risk for odds ratios of less than one, and bigger than the relative risk for odds ratios of greater than one
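The effect of outcome frequency can be checked numerically. This is a sketch with hypothetical 2x2 counts (a, b = exposed with and without the outcome; c, d = unexposed with and without the outcome), and the function names are illustrative.

```python
def risk_ratio(a, b, c, d):
    """Relative risk: risk in exposed divided by risk in unexposed."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds ratio: odds of outcome in exposed divided by odds in unexposed."""
    return (a * d) / (b * c)

# Rare outcome: 1% of exposed and 0.5% of unexposed develop it.
rare = (10, 990, 5, 995)
rr_rare, or_rare = risk_ratio(*rare), odds_ratio(*rare)

# Common outcome: 40% of exposed and 20% of unexposed develop it.
common = (400, 600, 200, 800)
rr_common, or_common = risk_ratio(*common), odds_ratio(*common)

print(f"Rare outcome:   RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
print(f"Common outcome: RR = {rr_common:.2f}, OR = {or_common:.2f}")
```

Both tables have a relative risk of 2, but the odds ratio is close to 2 only when the outcome is rare; for the common outcome it rises to about 2.67.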

Question 37

Q

A recent study examined the association between avoidable mortality and neighbourhood marginalization. The study took neighbourhood marginalization (a continuous variable) and divided it into quintiles. What are 3 limitations and 3 benefits of this approach?

Answer

A

Limitations
- reduces ability to compare results between studies in different settings, as quintiles may not represent the same exposure in different regions (analogous to comparing two studies on the association between poverty and health, when poverty was defined as an annual income of $22,000 in one and $47,000 in the other)
- increases likelihood of false positives due to multiple testing
- assumes that risk is homogeneous within each category (i.e. within the lowest quintile, the bottom 5 percent might be very different from the next 15 percent)

Advantages
- Facilitates communication to the lay public and decision makers
- Can be easily used to divide groups into levels of risk, with a relative risk for each group
- Facilitates interpretation of statistical interaction (effect measure modification) tests (interactions between continuous variables are difficult to interpret)

Question 38

Q

What is the difference between direct and indirect standardization?

What data do you need to do direct standardization?

What data do you need to do indirect standardization?

Answer

A

Direct Standardization

Uses the age structure of a reference population and the known event rates by age or sex in two populations to create standardized rates
need the number of events and the number of individuals in the population for multiple age ranges in two separate populations

Indirect Standardization

Uses the known age specific rates of a reference population to calculate the expected overall number of cases in the population of interest
need the age-specific reference rates, plus the total observed number of cases and the number of individuals (by age group) in the 1 population of interest
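The two methods can be sketched side by side. All numbers and function names below are hypothetical, with two age bands (young, old). The ratio of observed to expected cases in the indirect method is the standardized mortality (or morbidity) ratio, SMR.

```python
def direct_standardized_rate(age_specific_rates, reference_population):
    """Apply a population's own age-specific rates to the reference age structure."""
    expected_events = sum(
        rate * size for rate, size in zip(age_specific_rates, reference_population)
    )
    return expected_events / sum(reference_population)

def indirect_expected_cases(reference_rates, study_population):
    """Apply the reference age-specific rates to the study population's age structure."""
    return sum(rate * size for rate, size in zip(reference_rates, study_population))

# Direct: age-specific rates are known for both populations being compared.
rates_a = [0.001, 0.010]           # population A: young, old
rates_b = [0.002, 0.008]           # population B: young, old
reference_pop = [60_000, 40_000]   # reference age structure
dsr_a = direct_standardized_rate(rates_a, reference_pop)
dsr_b = direct_standardized_rate(rates_b, reference_pop)

# Indirect: only the total observed cases are known for the study population.
reference_rates = [0.001, 0.010]
study_pop = [2_000, 3_000]
observed_cases = 40
expected_cases = indirect_expected_cases(reference_rates, study_pop)
smr = observed_cases / expected_cases

print(f"Directly standardized rates: A = {dsr_a:.4f}, B = {dsr_b:.4f}")
print(f"Expected cases = {expected_cases:.0f}, SMR = {smr:.2f}")
```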

Question 39

Q

What are essential components of a surveillance system?

Answer

A

– What is the population under surveillance?
– What is the period of time of the data collection?
– What data are collected and how are they collected?
– What are the reporting sources of data for the system?
– How are the system’s data managed (e.g., the transfer, entry, editing, storage, and back up of data)? Does the system comply with applicable standards for data formats and coding schemes? If not, why?
– How are the system’s data analyzed and disseminated?
– What policies and procedures are in place to ensure patient privacy, data confidentiality, and system security? What is the policy and procedure for releasing data? Do these procedures comply with applicable federal and state statutes and regulations? If not, why?
– Does the system comply with an applicable records management program? For example, are the system’s records properly archived and/or disposed of?

Question 40

Q

How do you calculate an attributable fraction (attributable risk percent)?

If a study finds an AR% of 30% between lung cancer and radon, how would you interpret this finding?

Answer

A

AR% = [(incidence in exposed - incidence in unexposed) / incidence in exposed] * 100

30% of lung cancer cases in individuals exposed to radon are attributable to the radon exposure
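A quick numerical check of the attributable risk percent, using hypothetical incidences chosen to reproduce the 30% in the question. The function name is illustrative.

```python
def attributable_risk_percent(incidence_exposed, incidence_unexposed):
    """Share of disease among the exposed that is attributable to the exposure."""
    return (incidence_exposed - incidence_unexposed) / incidence_exposed * 100

# Hypothetical incidences: 10 per 1000 in exposed, 7 per 1000 in unexposed.
ar_percent = attributable_risk_percent(0.010, 0.007)
print(f"AR% = {ar_percent:.0f}%")
```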

Question 41

Q

How do you calculate the population attributable risk?

If a study finds a PAR of 0.02 between lung cancer and radon, how would you interpret this finding?

Answer

A

PAR = incidence in total population - incidence in unexposed

For every 1000 individuals in this population 20 will develop lung cancer due to exposure to radon.

Question 42

Q

How do you calculate the population attributable risk percent (population attributable fraction)?

If a study finds a PAF of 10% between lung cancer and radon, how would you interpret this finding?

Answer

A

PAF = [(incidence in total population - incidence in unexposed) / incidence in total population] * 100

10% of the lung cancer cases in this population are attributable to radon exposure.
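The two population-level measures can be computed from the same pair of incidences. The values and function names below are hypothetical; they are not taken from any study.

```python
def population_attributable_risk(incidence_total, incidence_unexposed):
    """Excess incidence in the whole population due to the exposure."""
    return incidence_total - incidence_unexposed

def population_attributable_fraction(incidence_total, incidence_unexposed):
    """Percentage of all cases in the population attributable to the exposure."""
    return (incidence_total - incidence_unexposed) / incidence_total * 100

# Hypothetical incidences: 50 per 1000 overall, 30 per 1000 in the unexposed.
par = population_attributable_risk(0.05, 0.03)
paf = population_attributable_fraction(0.05, 0.03)
print(f"PAR = {par:.2f} ({par * 1000:.0f} excess cases per 1000 people)")
print(f"PAF = {paf:.0f}%")
```

With these numbers the PAR is 0.02 (20 excess cases per 1000 people) and 40% of all cases in the population are attributable to the exposure.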

Question 43

Q

Define Attributable Risk

Answer

A

The amount (incidence) of disease in the exposed population that can be attributed to the exposure, calculated as incidence in exposed - incidence in unexposed

Question 44

Q

You have completed a case-control study examining the association between exposure to radon and lung cancer. In order to calculate a population attributable fraction from this study, what assumptions must you make?

Answer

A

Must assume that the prevalence of exposure in controls is representative of the prevalence of exposure in the total population.
The outcome of interest should be rare in order for the OR to approximate the RR in the total population.

Question 45

Q

Calculate a population attributable fraction from radon and lung cancer given an odds ratio of 3.5 of observing radon in cases vs controls and a prevalence of exposure to radon of 30% in the cases and 8.5% in the controls.

Answer

A

PAF = [(OR - 1) / OR] * prevalence of exposure in cases = [(3.5 - 1) / 3.5] * 0.30 ≈ 0.21 (about 21%)
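The calculation can be checked in code. As an assumption beyond this card, the sketch also computes Levin's formula, which uses the prevalence of exposure in controls as a stand-in for the population prevalence. The function names are hypothetical, and the OR is treated as an approximation of the RR.

```python
def paf_from_case_prevalence(odds_ratio, prevalence_in_cases):
    """PAF = [(OR - 1) / OR] * prevalence of exposure among cases."""
    return (odds_ratio - 1) / odds_ratio * prevalence_in_cases

def paf_levin(odds_ratio, prevalence_in_population):
    """Levin's formula: Pe(OR - 1) / [1 + Pe(OR - 1)], with OR standing in for RR."""
    excess = prevalence_in_population * (odds_ratio - 1)
    return excess / (1 + excess)

OR = 3.5
paf_cases = paf_from_case_prevalence(OR, prevalence_in_cases=0.30)
paf_controls = paf_levin(OR, prevalence_in_population=0.085)

print(f"PAF from exposure prevalence in cases:    {paf_cases:.1%}")
print(f"PAF from exposure prevalence in controls: {paf_controls:.1%}")
```

The two estimates differ here (about 21% vs about 18%) because the prevalences and the odds ratio given in the question are not exactly consistent with one another; with mutually consistent inputs the two formulas agree.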
