Epidemiology & Biostats Flashcards

(106 cards)

1
Q

The specific morbidity rate is usually the number of:
a. Cases of a specific disease per 1,000,000 population
b. Deaths from a specific disease for a geographical area
c. Cases of a specific disease for a political area
d. Deaths from a specific disease per 100,000 population
e. Deaths from a specific disease per 100 cases of that disease

A

A. Morbidity relates to who gets sick from an illness. The denominator can be persons if the time period is specified, or person-years.

2
Q

For the following 2x2 table for a diagnostic test, write out expressions for each term:

            Disease +   Disease -
Test +          A           B
Test -          C           D

a. Specificity

b. PPV

c. Total population

d. NPV

e. Prevalence

f. Sensitivity

A

a. Specificity: d/(b+d)

b. PPV: a/(a+b)

c. Total population: a+b+c+d

d. NPV: d/(c+d)

e. Prevalence: (a+c)/(a+b+c+d)

f. Sensitivity: a/(a+c)
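
The expressions above can be checked numerically. The following is a minimal sketch, not part of the original card: the function name and the counts are hypothetical, chosen only for illustration, and the cell letters follow the table in the question (rows are the test result, columns are true disease status).

```python
def two_by_two_metrics(a: int, b: int, c: int, d: int) -> dict:
    """Summary measures for a diagnostic-test 2x2 table.

    a = test+/disease+, b = test+/disease-,
    c = test-/disease+, d = test-/disease-.
    """
    total = a + b + c + d
    return {
        "sensitivity": a / (a + c),   # true positives among the diseased
        "specificity": d / (b + d),   # true negatives among the non-diseased
        "ppv": a / (a + b),           # diseased among test positives
        "npv": d / (c + d),           # non-diseased among test negatives
        "prevalence": (a + c) / total,
        "total": total,
    }


# Hypothetical counts, for illustration only.
m = two_by_two_metrics(a=90, b=40, c=10, d=860)
for name, value in m.items():
    print(name, round(value, 3))
```

Note that sensitivity and specificity read down the columns, while PPV and NPV read across the rows.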

3
Q

The amount of health disorder existing in a population at one particular time, regardless of time of onset, is known as the:

a. Prevalence
b. Incidence
c. Morbidity rate
d. Mortality rate
e. Attack rate

A

a. Prevalence

4
Q

In a prospective (cohort) epidemiologic study, two types of cohorts are selected. One of these is the exposed and the other is ______; the measure of effect used in this study is ______
a. Cases; odds ratio
b. Susceptible; risk ratio
c. Affected; relative risk
d. Non-exposed; incidence rate ratio
e. Immune populations; odds ratio

A

D. Non-exposed; incidence rate ratio

Textbook definition of a cohort study – following exposed and non-exposed over time to see who develops the disease.

5
Q

The duration of a chronic disease process may complicate the epidemiologic study of its prevalence because of:
a. Loss of people or animals from the study by death from other causes
b. Changes in diagnostic techniques during the period of study
c. Changes in medical or veterinary care during the period of study
d. Decrease in interest level on the part of workers in the study
e. All of the above

A

E. All of the above

Pretty common sense, but in essence, if you want to know who is infected in a population, drop-out and changes in how a disease is diagnosed or treated can change your case definitions, your degree of ascertainment of cases, etc.

6
Q

When an epidemiologist is called to investigate a communicable disease emergency, the first thing he/she should try to determine is:
a. Possible sources of infection
b. Methods of transmission
c. Accuracy of the diagnosis
d. Methods of control
e. Extent of spread

A

C. Accuracy of the diagnosis

Making sure you know what disease you’re dealing with is the first step needed before asking any other questions.

7
Q

The occurrence in a community or region of cases of an illness in the human population clearly in excess of normal expectancy and derived from a common or propagated source is an:
a. Epidemic
b. Endemic
c. Pandemic
d. Epizootic
e. Anthropozoonosis

A

A. Epidemic

This is the textbook definition of an epidemic.

8
Q

Which of the following agent characteristics is most likely to be seen in a disease which occurs in epidemic proportions:
a. High infectivity
b. High pathogenicity
c. High virulence
d. Low antigenicity
e. Viability

A

A. High infectivity

For this question, I return to my definition of the R0, which is the average number of new cases of an infection caused by one typical infected individual, in a population consisting of susceptibles only. When R0>1, an epidemic occurs. R0 depends on the transmissibility or infectivity of the agent, the contact rate between hosts, and the time spent infectious. So if the infectivity of the agent increases, you’re more likely to have an epidemic.

9
Q

In a study of alcohol and oral cancer the relative risk is 2.0 for men and 2.0 for women but 4.0 for both sexes combined. This suggests that:

a. There is confounding by sex in these data
b. There is confounding by some unknown or unmeasured factor in these data
c. There is evidence of effect modification in these data
d. The results have been adjusted for age and sex
e. The results are due to bias

A

A. There is confounding by sex in these data

General rule of thumb is that when you stratify by your variable of interest, it will be a confounder if both stratified effect estimates are similar and more than 10-15% different from the crude estimate.
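
The rule of thumb can be sketched as a small screening function. This is an illustration under stated assumptions, not a formal method: the function name is hypothetical, the 10% tolerance is taken from the lower end of the range quoted above, and a simple average of the stratum estimates stands in for a properly adjusted (e.g., Mantel-Haenszel) estimate.

```python
def compare_crude_to_strata(crude, strata, tolerance=0.10):
    """Rough screen for confounding versus effect modification.

    `tolerance` is the relative difference treated as "different"
    (an assumption taken from the 10-15% rule of thumb).
    """
    lo, hi = min(strata), max(strata)
    strata_similar = (hi - lo) / lo <= tolerance
    if not strata_similar:
        return "effect modification: report stratum-specific estimates"
    # Simple average used as a stand-in for an adjusted estimate (assumption).
    pooled = sum(strata) / len(strata)
    if abs(crude - pooled) / pooled > tolerance:
        return "confounding: report the adjusted estimate"
    return "neither: the crude estimate is adequate"


# Alcohol and oral cancer: RR 2.0 in men, 2.0 in women, 4.0 crude.
print(compare_crude_to_strata(4.0, [2.0, 2.0]))
```

When the stratum-specific estimates differ from each other, the function stops at effect modification, because a single adjusted estimate would hide that difference.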

10
Q

A new treatment is developed that prevents death but does not produce recovery from disease. Which of the following will occur?

a. Prevalence will increase
b. Prevalence will decrease
c. Incidence will increase
d. Incidence will decrease
e. None of the above

A

A. Prevalence will increase

Think of prevalence as water in a bucket. Prevalence increases when water (i.e., people with the disease) is added to the bucket. Prevalence decreases when there is a hole in the bucket and water is leaving (i.e., people are recovering or dying and leaving the diseased population). If more cases keep arriving but there are no departures, the prevalence will increase.

11
Q

In a country where a disease is endemic:

a) The number of affected animals tends to stay more or less constant over time

b) There have been at least 2 outbreaks of that disease in the past 5 years

c) The disease has persisted in that population for a long time

d) The vaccine for that disease is probably not used in a widespread manner

A

C. The disease has persisted in that population for a long time

An endemic disease is one that is regularly found among particular populations or in a certain area, so the definition says nothing about case numbers or vaccination.

12
Q

As a dairy practitioner, you read with great interest a recent paper describing a clinical trial testing a new drug to treat mastitis. This drug, called Masticate™, is touted as costing half the price and being easier to administer than most other therapies for mastitis. The paper tested this drug against the current standard of care, and the authors found no difference in cure rates. You decide to try it in your herd. Six months later, you find that Masticate™ is actually less effective than the drug you used before – whereas before, your cure rate was around 80%, now your cure rate is closer to 65%. Which of the following is the LEAST likely possible explanation for the discrepancies between your experience and the findings reported in the paper?

a) The sample size in the original paper was small, and therefore the study was underpowered to detect the difference you found.

b) The authors of the paper defined a successful cure differently than you did.

c) The animals enrolled in the study were primiparous cows only; your herd has a mix of different aged cows, and the results may therefore not have been generalizable to your herd.

d) The authors were not blinded to the treatment and therefore could have scored the cows receiving Masticate™ more generously.

e) The batch of drugs you used was defective.

A

E. The batch of drugs you used was defective.

All of the other options are very reasonable explanations of why two “studies” (i.e., the published paper and your experimental trial) would have found different answers.

13
Q

A case-control study compared the amount of coffee drunk daily by patients with pancreatic cancer (cases) and patients with other GI conditions (controls). The study found a dose-response association between drinking coffee and pancreatic cancer that persisted when adjusting for cigarette smoking. What is the most likely explanation for the findings of the study? For bonus points, provide an explanation for why.

a) A true association – drinking coffee causes pancreatic cancer (yikes!)

b) Information bias

c) Selection bias

d) Confounding

A

C. Selection bias

i.e., who gets into or stays in the study. This is a historical example so you may have heard of it, but you can come up with a likely explanation. Because controls often had GI issues such as esophagitis, ulcers, etc., they self-limited coffee consumption. Their coffee consumption was lower than that of the general population, so it appeared that cases drank more coffee. The controls were not representative of the general population to which we would like to extrapolate our findings, so we have an issue of selection bias here. D. Confounding is also a possibility – we controlled for smoking but there could be other unmeasured confounders.

14
Q

In a country with a population of 6 million people, 60,000 deaths occurred during the previous year. These included 30,000 deaths from cholera in 100,000 people who were sick with cholera.

What was the cause-specific mortality rate from cholera during the previous year?

a. 5%

b. 10%

c. 50%

d. 5 per 1000

e. 10 per 1000

A

D. 5 per 1000

Cause-specific mortality rate per 1,000 population = # of deaths from that cause/# of people in the population x 1000 = 30,000 cholera deaths/6 million population x 1000 = 5 per 1000.

15
Q

In a country with a population of 6 million people, 60,000 deaths occurred during the previous year. These included 30,000 deaths from cholera in 100,000 people who were sick with cholera.

What was the case-fatality from cholera in the previous year?

a. 1%

b. 5%

c. 10%

d. 30%

e. 50%

A

D. 30%. Case-fatality rate = # dead from the disease / # with the disease = 30,000 / 100,000 = 30%.
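
The two cholera cards use the same counts but different denominators, which a short calculation makes explicit. This is a minimal sketch; the variable names are illustrative, and the numbers are the ones given in the question.

```python
population = 6_000_000
all_deaths = 60_000
cholera_cases = 100_000
cholera_deaths = 30_000

# Cause-specific mortality: deaths from the cause over the WHOLE population.
cause_specific_mortality_per_1000 = cholera_deaths / population * 1000

# Case fatality: deaths from the cause over people WITH the disease.
case_fatality = cholera_deaths / cholera_cases

# Crude mortality: all deaths over the whole population.
crude_mortality_per_1000 = all_deaths / population * 1000

print(f"cause-specific mortality: {cause_specific_mortality_per_1000:.1f} per 1000")
print(f"case fatality: {case_fatality:.0%}")
print(f"crude mortality: {crude_mortality_per_1000:.1f} per 1000")
```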

16
Q

Which one of the following frustrations would you most likely expect in preparing to carry out cohort studies on animal disease?
a. Costly and time consuming, and plagued by the continual changing of the cohort.
b. Difficult time in selecting a comparison group or control population upon which to test your hypothesis.
c. Cohort populations are unchanging, and that no new individuals are introduced into the study population.
d. Data collected retrospectively is often incomplete, and plagued by high degrees of institutional bias.
e. Unable to get accurate estimates of incidence or prevalence of the disease using the cohort study technique.

A

A. Cohort studies, especially prospective cohort studies, tend to be more expensive, and they take longer to conduct because you’re following forward in time. I would argue D is also an acceptable answer, as investigators using retrospective data have much less control over the cohort and less ability to be confident in the completeness of their data. Answers b and e tend to apply more to case-control studies.

17
Q

One of your clients has a feedlot containing 15,000 cattle, 10,000 of which are susceptible. In a current outbreak of disease, 3,000 became sick and 300 died. The case fatality rate was:
a. 10%
b. 25%
c. 2%
d. 30%
e. 3%

A

A. 10%. CFR = the proportion of animals that die from a specified disease among all individuals diagnosed with the disease over a certain period of time. Here, 300 deaths / 3,000 sick = 10%.

18
Q

What study plan would be best to determine the effectiveness of a new vaccine in preventing disease in humans?
a. Case-control study.
b. Cohort study.
c. Prevalence study.
d. Morbidity study.
e. Retrospective study.

A

B. For an observational study, you want to compare outcomes among those who were exposed (i.e., vaccinated) and those who were unexposed (non-vaccinated). For an experimental study, a randomized controlled trial would be better!

19
Q

A certain causal factor is thought to be associated with an extremely rare disease. What study plan would yield the best data with limited financial and human resources?
a. Prevalence study.
b. Case-control study.
c. Prevalence study.
d. Morbidity study.
e. Case evaluation study

A

B. Case-control studies are better for rare diseases, because you don’t have to wait for cases of a rare disease to accumulate.

20
Q

Match the following terms to their definitions:

A. A causal factor that is neither necessary nor sufficient, but increases the likelihood of disease, all other things being equal.

B. Any factor that must be present for the disease to occur.

C. Any factor or, more commonly a constellation of factors, that inevitably lead to the disease

i. Sufficient cause
ii. Necessary cause
iii. Contributing cause

A

A. Contributing cause (iii)
B. Necessary cause (ii)
C. Sufficient cause (i)

21
Q

The measure most sensitive to extremes is:
a. Mean
b. Median
c. Mode
d. Sample
e. Inferential

A

A. The mean is most susceptible to outliers. That is why, when we have non-normally distributed or skewed data, it is more appropriate to present the median when performing descriptive statistics.

22
Q

Generalizability is best assured by:
a. Representative nature
b. Randomness
c. Sample size
d. Precise manipulation
e. Statistical validity

A

A. Generalizability indicates how well your results are likely to apply to other populations. If your sample population is representative of other populations, then it is likely your results are generalizable.

23
Q

Which of the following is NOT associated with a retrospective study?
a. Adaptable to conditions of low prevalence.
b. Less expensive than prospective.
c. Requires fewer personnel.
d. Takes longer to conduct.
e. Provides less accurate incidence rate.

A

D. If your data are retrospective (i.e., already collected), then you take out the time factor of having to follow your population over time and accumulate cases with your desired outcome.

24
Q

An epidemic curve displays:
a. The population at risk versus the frequency of cases.
b. The frequency of cases versus the number of ill in the population.
c. The time of onset versus the population at risk.
d. The time of onset versus the frequency of incident cases.
e. The time of onset versus the number of individuals who are ill.

A

D. An “epidemic curve” shows the frequency of new cases over time based on the date of onset of disease.

25
Q

A decrease in the prevalence of a disease could be interpreted as a result of:
a. A reduction in the incidence.
b. A more rapid cure.
c. A shorter life span of affected individuals.
d. “a” and “c” above.
e. All of the above.

A

E. Again, if you consider the analogy of water in a bucket, with more water (diseased individuals) entering the bucket and some leaving through a hole in the bucket, the total amount of water in the bucket (prevalence) can decrease if the incidence is lower (less water coming in), or if more are leaving the bucket through the hole (either by dying or by recovering).

26
Q

In the hierarchy of scientific evidence, what types of study provide the highest levels of evidence?
a) Cohort studies
b) Cross sectional studies
c) Randomized clinical trials
d) Meta-analyses and systematic reviews

A

d) Meta-analyses and systematic reviews (SR/MAs). The conventional wisdom is that SR/MAs provide the highest level of evidence, because you are summarizing all of the available evidence. However, keep in mind that with a SR/MA, it is “garbage in, garbage out”, so a SR/MA is only as good as the studies that go into it.

27
Q

On a poultry farm, all of the birds are checked every day by their keepers for signs of disease and mortality. When farmers find a sick or dead bird, they alert veterinary services and can send the bird in for examination/autopsy and infectious disease testing. This is an example of:
a) Active surveillance
b) Passive surveillance
c) Targeted passive surveillance
d) Sentinel surveillance

A

b) Passive surveillance. Passive surveillance is when animals/people come to our attention because they are suspected of being cases, i.e., a system by which a health jurisdiction receives reports submitted from hospitals, clinics, public health units, or other sources. Since the lab is waiting for these dead birds to be sent to them by the farmer, this is passive surveillance.

28
Q

An investigator is performing a study to examine the effect of being overweight on experiencing CCL tears in dogs. The investigator also wants to control for age and spay/neuter status. What statistical technique does the investigator need to use to test her hypothesis?
a. Chi-square test
b. T-test
c. Linear regression
d. Logistic regression
e. Ordinal regression

A

D. Logistic regression. Since the outcome is categorical (injury or no injury) and multiple predictors are being examined, logistic regression is needed. A chi-square test could be used if we were only looking at the effect of being overweight on CCL tears, but since we want to adjust for other factors, we need a multivariable model.

29
Q

Two veterinarians want to investigate a new laboratory test that identifies streptococcal infections. Dr. Kidd uses the standard culture test, which has a sensitivity of 90% and a specificity of 96%. Dr. Lamb uses the new test, which is 96% sensitive and 96% specific. If 200 animals are tested with both tests, which of the following is correct?
a. Dr. Kidd will correctly identify more animals with streptococcal infection than Dr. Lamb
b. Dr. Kidd will correctly identify fewer animals with streptococcal infection than Dr. Lamb
c. Dr. Kidd will correctly identify more animals without streptococcal infection than Dr. Lamb
d. The prevalence of streptococcal infection is needed to determine which vet will correctly identify the larger number of animals with the disease

A

B. Sensitivity is how well a test can detect a case of the disease. Lower sensitivity means that the test will pick up, or correctly identify, fewer cases of the disease.

30
Q

In a colon cancer screening study, individuals 50 to 75 years old are screened with the Hemoccult test. In this test, a stool sample is tested for the presence of blood. If the Hemoccult test result is negative, no further testing is done. If the Hemoccult test result is positive, the individual will have a second stool sample tested with the Hemoccult II test. If this second sample also tests positive for blood, the individual will be referred for more extensive evaluation. What is this type of screening called, and what is the effect on net sensitivity and net specificity of this method of screening?
a. Serial testing; Net sensitivity and net specificity are both increased
b. Serial testing; Net sensitivity is decreased and net specificity is increased
c. Parallel testing; Net sensitivity remains the same and net specificity is increased
d. Parallel testing; Net sensitivity is increased and net specificity is decreased
e. Parallel testing; Net sensitivity is decreased and net specificity is increased

A

B. This is serial testing, which reduces sensitivity but increases specificity. This is because, in series testing, it is harder to be called positive (you need two positive tests to be said to have the disease), but it is easier to be called negative (only one negative result is needed to be called disease-free).

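The direction of these changes can be sketched with the usual net sensitivity and specificity formulas. This is a minimal illustration, assuming the two tests are conditionally independent given disease status; the function names and the 80%/90% test characteristics are hypothetical and are not Hemoccult values.

```python
def serial_net(se1, sp1, se2, sp2):
    """Serial testing: called positive only if BOTH tests are positive."""
    net_se = se1 * se2                  # must be detected twice
    net_sp = sp1 + (1 - sp1) * sp2      # false positives get a second chance
    return net_se, net_sp


def parallel_net(se1, sp1, se2, sp2):
    """Parallel testing: called positive if EITHER test is positive."""
    net_se = se1 + (1 - se1) * se2      # missed cases get a second chance
    net_sp = sp1 * sp2                  # must be negative twice
    return net_se, net_sp


# Hypothetical test characteristics, for illustration only.
se, sp = 0.80, 0.90
print("serial:  ", serial_net(se, sp, se, sp))
print("parallel:", parallel_net(se, sp, se, sp))
```

With these inputs, serial testing gives a net sensitivity below 0.80 and a net specificity above 0.90, while parallel testing does the opposite.
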
31
Q

A diagnostic test has been introduced that will detect a certain disease 1 year earlier than it is usually detected. Which of the following is most likely to happen to the disease within the 10 years after the test is introduced? (Assume that early detection has no effect on the natural history of the disease. Also assume that no changes in death certification practices occur during the 10 years.)
a. The period prevalence rate will decrease
b. The apparent 5-year survival will increase
c. The age-adjusted mortality rate will decrease
d. The age-adjusted mortality rate will increase
e. The incidence rate will decrease

A

B. This should be fairly intuitive. If detecting the disease earlier does not change the course of the disease, then it will just look like people are “surviving” longer (lead-time bias).

32
Q

Investigators enrolled 100 diabetics without eye disease in a cohort study. The results of the first three years are as follows:
Year 1: 0 cases of eye disease detected; 8 lost to follow-up
Year 2: 2 new cases of eye disease detected; 2 patients died; 10 lost to follow-up
Year 3: 3 new cases of eye disease detected; 2 more patients died; 13 more lost to follow-up
The person-time incidence rate is:
a. 5/100
b. 5/63
c. 5/235
d. 5/250

A

D. The person-time incidence rate is the number of new cases divided by the total person-time at risk. What’s important to remember is that someone who didn’t make it the full year (i.e., was lost to follow-up, was diagnosed, or died during the year) is often assumed to have contributed half a year of observed time. Ideally we would have much more precise records where we would know exactly when a person became ill or dropped out, but when we don’t, we make this middle-ground assumption. So here’s the breakdown for this particular study:
Year 1: 100 people at the start of the year. 8 lost to follow-up (contributed 8 x 0.5 = 4). Person-time: 92 + 4 = 96 person-years.
Year 2: 92 people at the start of the year. 2 diseased, 2 deaths, and 10 lost to follow-up, so 92 - 14 = 78 people contributed a full year and 14 contributed half a year. Person-time: 78 + 7 = 85 person-years.
Year 3: 78 people at the start of the year. 3 diseased, 2 deaths, and 13 lost to follow-up, so 78 - 18 = 60 people contributed a full year and 18 contributed half a year. Person-time: 60 + 9 = 69 person-years.
Total person-time: 96 + 85 + 69 = 250 person-years
Number of cases: 5
Incidence rate = 5/250 per person-year

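The year-by-year bookkeeping can be sketched as a short loop. This is an illustration of the half-year assumption described above, not a general person-time method; the function name is hypothetical, and anyone who becomes a case, dies, or is lost during a year is counted as an "exit" contributing half of that year.

```python
def person_years(start_at_risk, exits_per_year):
    """Total person-years, crediting half a year to each person who exits."""
    at_risk = start_at_risk
    total = 0.0
    for exits in exits_per_year:
        total += (at_risk - exits) + 0.5 * exits  # full years + half years
        at_risk -= exits                          # exits leave the cohort
    return total


# Exits per year = new cases + deaths + lost to follow-up (from the question).
exits = [0 + 0 + 8, 2 + 2 + 10, 3 + 2 + 13]
cases = 0 + 2 + 3

pt = person_years(100, exits)
print(f"incidence rate = {cases}/{pt:g} per person-year")
```
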
33
Q

Define the terms “sensitivity” and “specificity” as they relate to diagnostic test interpretation.

A

Sensitivity and specificity are conditional probabilities that describe the performance of a diagnostic test. Sensitivity is the probability that a positive test result will be obtained, given the condition that the individual tested has the disease. Specificity is the probability that a negative test result will be obtained, given the condition that the individual tested does not have the disease.

34
Q

Explain why it would, or would not, be appropriate to estimate “relative risk” from a typical cohort study.

A

Relative risk (i.e., risk ratio) is the appropriate measure of association used in cohort studies. Relative risk is calculated as: (risk of disease in exposed group)/(risk of disease in unexposed group). Risk is defined as the number of new cases that occur during a specific time period in a specific population (e.g., exposed or unexposed), divided by the number at risk in that population. Cohort studies follow exposed and unexposed individuals over time to determine who develops the outcome of interest. Because of this, incidence (and risk) of the outcome of interest can be estimated, making relative risk an appropriate measure of association for cohort studies.

35
Q

In general, what does it mean when there is a significant interaction between two predictor variables in a multivariable regression model?

A

In a multivariable regression model, a statistical interaction exists between two explanatory (i.e., predictor or independent) variables when the effect of each variable on the outcome depends on the value of the other variable. In other words, the effect of each variable on the outcome is not independent of the other variable. The combined effect of the two variables differs from the sum of the individual effects of each variable.

36
Q

In a case-control study to investigate atresia coli in cattle, it was found that Holstein-Friesian calves were significantly more likely to have atresia coli than all other breeds combined.
a) What statistical test was used to make this determination?
b) What is the null hypothesis for this particular test (you can use mathematical notation or English)?

A

a) Chi-square
b) No association between breed of cattle and atresia coli

37
Q

Women who had been exposed to a pesticide, DDE, were followed for 20 years. At the start of the study period, the women completed a questionnaire and had blood drawn, and the women were classified as having either low dose or high dose exposure. Of the 792 women who had high dose exposure to DDE, 430 were later diagnosed with breast cancer. Of the 3,525 women with low dose exposure to DDE, 1,079 were later diagnosed with breast cancer.
a) What type of study design is this?
b) What is the cumulative incidence of breast cancer for those exposed to high doses of DDE?
c) What measure of association is appropriate to calculate for this study design?
d) Please calculate that measure of association. Please show all calculations, including the 2-by-2 table.
e) Interpret the calculated measure of association.
f) Please calculate the attributable risk percent.
g) Interpret the attributable risk percent.

A

a) Cohort study
b) 430 / 792 = 54%
c) Relative risk (RR)
d)
Exposure   Present   Absent   Total   Risk
High         430       362      792   430/792 = 0.543
Low         1079      2446     3525   1079/3525 = 0.306
RR = 0.543/0.306 = 1.77
e) Women who had high dose exposure to DDE had 1.77 times the risk of developing breast cancer compared with women who had low dose exposure to DDE.
f) You will remember that the attributable risk percent (attributable fraction among the exposed) is calculated as (Risk exposed – Risk unexposed) / Risk exposed, which can be reduced to (RR – 1) / RR or (OR – 1) / OR. (1.77 – 1) / 1.77 = 43.5%
g) If high dose exposure to DDE were prevented in the group of women exposed to high dose DDE, we would prevent at most 43.5% of the breast cancer cases in that group.

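The arithmetic for parts b, d, and f can be reproduced directly from the counts in the question. This is a minimal sketch with illustrative variable names. Working from the unrounded risks, the ratio comes out near 1.77 and the attributable risk percent near 43.6%; rounding the risks to two decimals before dividing gives a slightly smaller ratio.

```python
high_cases, high_total = 430, 792      # high dose exposure
low_cases, low_total = 1079, 3525      # low dose exposure (reference group)

risk_high = high_cases / high_total    # cumulative incidence, high dose
risk_low = low_cases / low_total       # cumulative incidence, low dose

relative_risk = risk_high / risk_low

# Attributable risk percent among the exposed; equals (RR - 1) / RR.
ar_percent = (risk_high - risk_low) / risk_high * 100

print(f"risk (high dose) = {risk_high:.3f}")
print(f"risk (low dose)  = {risk_low:.3f}")
print(f"RR               = {relative_risk:.2f}")
print(f"AR%              = {ar_percent:.1f}%")
```
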
38
Q

What are three ways to control for confounding in epidemiologic studies?

A

Any three of the following:
Randomization
Restriction
Matching
Stratification
Multivariate analysis

39
Q

The following question is on a questionnaire designed to investigate the effect of coffee consumption on cardiovascular disease.
How much coffee do you drink?
i. 1 cup
ii. 2-3 cups
iii. 3-5 cups
iv. More than 5 cups
a) Describe two things that are wrong with this question and its available answers.

A

No timeframe (per day, per week, etc.)
Not all possible options for answers are listed (“none” is not an option, etc.)
Categories are not mutually exclusive
“Cup” not defined

40
Q

What is the ecologic fallacy?

A

Ascribing characteristics and associations demonstrated at the group level to individuals.

41
Q

Fifteen hundred adult males working for Lockheed Aircraft were first examined in 1951 and were classified by diagnosis criteria for coronary artery disease. Every 3 years they were examined for new cases of this disease; attack rates in different subgroups were computed annually. This is an example of a:
a) Cross-sectional study
b) Prospective cohort study
c) Retrospective cohort study
d) Ecologic study
e) Case-control study

A

b) Prospective cohort study

42
Q

Which of the following is not an advantage of a prospective cohort study?
a) Incidence rates can be calculated
b) Precise measurement of exposure is possible
c) Recall bias is minimized compared with a case-control study
d) Many disease outcomes can be studied simultaneously
e) It usually costs less than a case-control study

A

e) It usually costs less than a case-control study

43
Q

One hundred patients with infectious hepatitis and 100 matched neighborhood well controls were questioned regarding a history of eating raw clams or oysters within the preceding 3 months. What kind of study design is this?
a) Cross-sectional study
b) Prospective cohort study
c) Retrospective cohort study
d) Ecologic study
e) Case-control study

A

e) Case-control study

44
Q

All of the following are important criteria when making causal inferences except:
a) Replication of findings
b) Temporal relationship
c) Null hypothesis
d) Strength of association
e) Biologic plausibility

A

c) Null hypothesis

45
Q

Geographic variations were determined in the incidence of inflammatory bowel disease (IBD). Incidence of IBD was observed to be highest in areas with higher socioeconomic status, the lowest rates of enteric infection, and the highest rates of multiple sclerosis. This is an example of a:
a) Cross-sectional study
b) Prospective cohort study
c) Retrospective cohort study
d) Ecologic study
e) Case-control study

A

d) Ecologic study

46
Q

A case-control study is characterized by all of the following except:
a) Study participants are selected based on disease status
b) Assessment of past exposure may be biased
c) It is relatively inexpensive compared with most other epidemiologic study designs
d) Incidence rates may be computed directly
e) Definition of cases may be difficult

A

d) Incidence rates may be computed directly

47
Q

You are evaluating a new diagnostic test by comparing it to a gold standard.
a. What is the sensitivity of the test?
b. What is the specificity of the test?
c. What is the predictive value positive of the test?
d. What is the predictive value negative of the test?
e. What is the prevalence of disease in this example (based on the results of the gold standard test)?
f. What happens to the predictive value positive if the prevalence decreases?

A

a. Sensitivity = TP / Total with disease = 260 / 325 = 80%
b. Specificity = TN / Total without disease = 1640 / 1735 = 95%
c. PVP = TP / Total positives = 260 / 355 = 73%
d. PVN = TN / Total negatives = 1640 / 1705 = 96%
e. Prevalence = 325 / (325 + 1735) = 15.8%
f. If prevalence decreases then PVP decreases

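Part f can be illustrated by holding sensitivity and specificity fixed and letting prevalence fall. This is a minimal sketch: the function name is hypothetical, the sensitivity, specificity, and starting prevalence are taken from the counts in the answer, and the lower prevalences (5% and 1%) are illustrative assumptions.

```python
def predictive_value_positive(sensitivity, specificity, prevalence):
    """PVP from test characteristics and prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


se = 260 / 325     # sensitivity from the answer above
sp = 1640 / 1735   # specificity from the answer above

# Observed prevalence first, then two hypothetical lower prevalences.
for prev in (325 / 2060, 0.05, 0.01):
    pvp = predictive_value_positive(se, sp, prev)
    print(f"prevalence {prev:.3f} -> PVP {pvp:.3f}")
```

At the observed prevalence the function reproduces 260/355; as prevalence drops, false positives make up a growing share of all positives, so PVP falls.
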
48
Q

Epidemiologic models can be useful for all of the following except:
a. Predicting effectiveness of programs
b. Organizing and storing knowledge about a disease process
c. Predicting risk or consequences of disease
d. Identifying an individual’s risk factors for disease
e. Developing policy

A

d. Identifying an individual’s risk factors for disease

49
Q

To be effective, surveillance systems should incorporate all of the following except:
a. Generation of information for action
b. Disease eradication
c. Ongoing data collection
d. Systematic data collection
e. Timely information dissemination

A

b. Disease eradication

50
Q

Surveillance system design should aim at:
a. Maximizing the probability of true early detection
b. Incorporating as many sampling architectures as possible
c. Minimizing the probability of a false-positive alarm
d. All of the above
e. a and c only

A

e. a and c only

51
Q

Screening is important if treatment at the pre-symptomatic stage has a more favorable outcome than treatment initiated once the patient is symptomatic. T/F

A

True

52
Q

The lead time is defined as the period in the natural history of the disease in which treatment is more effective and/or less difficult to administer. T/F

A

False. The lead time is defined as the interval by which the time of diagnosis is advanced by early detection of disease through screening compared with the usual time of diagnosis. The critical point in the natural history of a disease is the point before which treatment is more effective and/or less difficult to administer.

53
Q

In order for a screening program to be effective, there does not need to be an accepted treatment for patients identified with the disease. T/F

A

False

54
Q

If α is our false-positive error rate, or the probability of making a Type I error, and β is our false-negative error rate, or the probability of making a Type II error, what is power?

A

Power is 1 – β, the probability of detecting a difference if one truly exists.

55
For a given α and measure of association, how can an investigator increase the power of a study?
Increase the sample size
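A short sketch of why a larger sample raises power, using only the Python standard library. It assumes a two-sided z-test comparing two group means with a known common standard deviation; the effect size (5), standard deviation (10), and group sizes are illustrative values, not taken from the cards, and `approx_power` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-sample z-test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # critical value for the chosen alpha
    se = sigma * sqrt(2 / n_per_group)     # standard error of the mean difference
    # Probability of rejecting in the direction of the true difference;
    # the negligible opposite-tail contribution is ignored.
    return z.cdf(abs(delta) / se - z_crit)


for n in (20, 50, 100):
    print(n, round(approx_power(delta=5, sigma=10, n_per_group=n), 2))
```

With alpha and the true difference held fixed, the standard error shrinks as n grows, so the printed power rises with each larger group size.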
56
Probability sampling a. Refers to several sampling strategies b. Allows investigators to generalize the results from the sample to the population c. Allows calculation of the standard error of the resulting population estimates d. All of the above e. a and b only
d. All of the above
57
A true lack of association may be difficult or impossible to distinguish from a true association that cannot be detected statistically because of inadequate _________. a. α (alpha) b. β (beta) c. Power d. Detection rates e. Error rates
c. Power
58
The use of a Geographic Information System (GIS) allows an investigator to assess all of the following except: a. The cohort effect b. Whether there is a spatial pattern c. Whether patterns co-distribute d. If risk factors differ with location e. How disease spreads
a. The cohort effect
59
In a case-control study to assess whether oral contraceptive use is associated with having a myocardial infarction, the crude odds ratio (OR) is determined to be 4.8. The investigators are concerned that smoking status might be a confounder or effect modifier, and so they stratify their analysis by smoking status. For smokers, the OR is 6.0, while for nonsmokers, the OR is 3.0; the Mantel-Haenszel (adjusted) OR is 4.7. a. Is this an example of confounding, effect modification, both, or neither? b. Which measure(s) of effect do the investigators report? Please include the numeric value(s) in your response. c. Please interpret what you report.
a. Effect modification. The stratum-specific ORs differ from each other (6.0 vs 3.0), while the adjusted OR (4.7) is nearly the same as the crude OR (4.8), so there is little evidence of confounding. b. Report stratum-specific ORs: for smokers OR = 6.0 and for nonsmokers OR = 3.0 c. Among smokers, MI cases had 6 times the odds of having used oral contraceptives compared with controls; among nonsmokers, MI cases had 3 times the odds of having used oral contraceptives compared with controls.
60
Epidemiologic models can be useful for all of the following except: a. Predicting effectiveness of programs b. Organizing and storing knowledge about a disease process c. Predicting risk or consequences of disease d. Identifying an individual’s risk factors for disease e. Developing policy
d. Identifying an individual’s risk factors for disease
61
To be effective, surveillance systems should incorporate all of the following except: a. Generation of information for action b. Disease eradication c. Ongoing data collection d. Systematic data collection e. Timely information dissemination
b. Disease eradication
62
Surveillance system design should aim at: a. Maximizing the probability of true early detection b. Incorporating as many sampling architectures as possible c. Minimizing the probability of a false-positive alarm d. All of the above e. a and c only
e. a and c only
63
It is important if treatment at the pre-symptomatic stage has a more favorable outcome than treatment initiated once the patient is symptomatic. T/F
T
64
The lead time is defined as the period in the natural history of the disease in which treatment is more effective and/or less difficult to administer. T/F
F. When disease is detected by screening, the time of diagnosis is advanced to an earlier point in the disease’s natural history. The lead time is defined as the interval by which the time of diagnosis is advanced by early detection of disease through screening compared with the usual time of diagnosis. The critical point in the natural history of a disease is the point before which treatment is more effective and/or less difficult to administer. If a disease is potentially curable, cure may be possible before this point, but not after. Although the critical point is a theoretical concept (we usually cannot identify when the critical point is reached in a given disease), it is important in that if there is no critical point, there is clearly no rationale for screening and early detection.
65
In order for a screening program to be effective, there does not need to be an accepted treatment for patients identified with the disease. T/F
F
66
If α is our false-positive error rate, or the probability of making a Type I error, and β is our false negative error rate, or the probability of making a Type II error, what is power?
Power is 1 – β, the probability of detecting a difference if one truly exists
67
For a given α and measure of association, how can an investigator increase the power of a study?
Increase the sample size
68
Probability sampling a. Refers to several sampling strategies b. Allows investigators to generalize the results from the sample to the population c. Allows calculation of the standard error of the resulting population estimates d. All of the above e. a and b only
d. All of the above
69
A true lack of association may be difficult or impossible to distinguish from a true association that cannot be detected statistically because of inadequate _________. a. α (alpha) b. β (beta) c. Power d. Detection rates e. Error rates
c. Power
70
The use of a Geographic Information System (GIS) allows an investigator to assess all of the following except: a. The cohort effect b. Whether there is a spatial pattern c. Whether patterns co-distribute d. If risk factors differ with location e. How disease spreads
a. The cohort effect
71
The mortality rate from disease X in city A is 75/100,000 in persons 65 to 69 years old. The mortality rate from the same disease in city B is 150/100,000 in persons 65 to 69 years old. The inference that disease X is two times more prevalent in persons 65 to 69 years old in city B than it is in persons 65 to 69 years old in city A is: a. Correct b. Incorrect, because of failure to distinguish between prevalence and mortality c. Incorrect, because of failure to adjust for differences in age distributions d. Incorrect, because of failure to distinguish between period and point prevalence e. Incorrect, because a proportion is used when a rate is required to support the inference
B. The two studies are describing mortality rates, and mortality can differ in two populations for many reasons other than the prevalence of the disease (e.g., other risk factors for death in one population but not the other).
72
To study the causes of an outbreak of aflatoxin poisoning, investigators conducted a case-control study with 40 cases and 80 controls. Among the 40 poisoning victims, 32 reported storing their maize inside. Among the 80 controls, 20 stored their maize inside. The resulting OR for the association between storing maize inside and aflatoxin poisoning is: a. 3.2 b. 5.2 c. 12.0 d. 33.3
c. 12

                          Cases (Poisoning)   Controls (No Poisoning)   Total
Exposed (stored inside)          32                     20                52
Unexposed (not inside)            8                     60                68
Total                            40                     80               120

Odds of exposure in cases = exposed/unexposed = 32/8 = 4
Odds of exposure in controls = exposed/unexposed = 20/60 = 1/3
OR = odds in cases / odds in controls = 4 / (1/3) = 12
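The same odds ratio can be checked in a few lines of Python. This is an illustrative sketch; `odds_ratio` is a hypothetical helper name, and the cross-product form is shown alongside the ratio-of-odds form to confirm they agree.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls


# Aflatoxin outbreak: 32 of 40 cases and 20 of 80 controls stored maize inside.
or_maize = odds_ratio(32, 8, 20, 60)
cross_product = (32 * 60) / (8 * 20)   # ad/bc gives the same quantity

print(round(or_maize, 1), round(cross_product, 1))
```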
73
A study seeks to assess birth characteristics of dairy calves in a population. Which of the following variables describes the appropriate measurement scale or type? A. Continuous B. Ordinal C. Nominal D. Dichotomous a. _______ Birthweight in kilograms b. _______ Birthweight classified as low, medium, high c. _______ Dam classified as primiparous, multiparous d. _______ Delivery type classified as natural, assisted, cesarean
a. Continuous b. Ordinal c. Dichotomous d. Nominal
74
A large study of serum cholesterol levels in patients with diabetes mellitus reveals that the parameter is normally distributed with a mean of 230 mg/dL and standard deviation of 10 mg/dL. According to these results, 95% of serum cholesterol observations in these patients lie between which of the following limits? a. 220 and 240 mg/dL b. 225 and 235 mg/dL c. 210 and 250 mg/dL d. 200 and 260 mg/dL e. 220 and 260 mg/dL
C. This question is asking for the range that contains the middle 95% of observations of a normally distributed variable (a reference range based on the SD, not a confidence interval for the mean, which would use the standard error). The lower limit is mean – 1.96*SD and the upper limit is mean + 1.96*SD. So here, the lower limit is 230 – 1.96*10 = 210.4 ≈ 210 and the upper limit is 230 + 1.96*10 = 249.6 ≈ 250.
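The limits for this card can be checked with Python's standard-library `statistics.NormalDist`, which returns the 2.5th and 97.5th percentiles of a normal distribution with the stated mean and SD. A minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

# Serum cholesterol modeled as Normal(mean=230, SD=10), as stated in the question.
cholesterol = NormalDist(mu=230, sigma=10)

lower = cholesterol.inv_cdf(0.025)   # 2.5th percentile, about mean - 1.96*SD
upper = cholesterol.inv_cdf(0.975)   # 97.5th percentile, about mean + 1.96*SD

print(round(lower, 1), round(upper, 1))
```

Rounded to whole numbers, the two percentiles match answer choice c (210 and 250 mg/dL).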
75
Veterinarians at a veterinary clinic report an increased incidence of lymphoma in dogs seen at their clinic. They note that some households in the community are exposed to chemical waste from a nearby factory. They believe that chemical waste causes lymphoma. If a study is designed to evaluate this claim, which of the following subjects are most likely to comprise the control group? a. Dogs exposed to the chemical waste that do not suffer from lymphoma b. Dogs not exposed to the chemical waste who do not suffer from lymphoma c. Dogs from the clinic that do not suffer from lymphoma d. Dogs not exposed to the chemical waste that suffer from lymphoma e. Dogs that suffered from lymphoma but got cured
C. Given that lymphoma is relatively rare and we have a suspected exposure, the best option here would be a case-control study. The cases are dogs with lymphoma. The controls would be dogs that do not have the outcome (not suffering from lymphoma). When identifying controls, we do not yet know about their exposure, so we don’t select anyone BASED on their exposure (what kind of study would we be doing that for?); we select only according to their outcome.
76
A survey asked people how often they exceed speed limits. The data are then categorized into the following contingency table of counts showing the relationship between age group and response.

Exceed speed limit?   Age under 30   Age 30 or over
Always                    100              40
Never                     100             160

In people under the age of 30, what is the risk of always exceeding the speed limit? a. 0.20 b. 0.40 c. 0.33 d. 0.50
Note that the 2x2 table is not set up in the conventional way with outcome in the columns and exposures in the rows – it’s the opposite! While this might seem tricky or like a gotcha, this is to emphasize that you need to be able to derive the formula and not just apply a memorized formula from cell numbers. d. Remember, risk = those with the outcome / everyone in your category of interest. So here, 100/(100+100) = 0.5.
77
A survey asked people how often they exceed speed limits. The data are then categorized into the following contingency table of counts showing the relationship between age group and response.

Exceed speed limit?   Age under 30   Age 30 or over
Always                    100              40
Never                     100             160

Among people under age 30, what are the odds that they always exceed the speed limit? a. 0.50 b. 2.0 c. 1.0 d. 0.05
Note that the 2x2 table is not set up in the conventional way with outcome in the columns and exposures in the rows – it’s the opposite! While this might seem tricky or like a gotcha, this is to emphasize that you need to be able to derive the formula and not just apply a memorized formula from cell numbers. c. Remember that odds = p/(1-p), with p being the probability or risk of the outcome. Here, 0.5/(1-0.5) = 1.0.
78
A survey asked people how often they exceed speed limits. The data are then categorized into the following contingency table of counts showing the relationship between age group and response.

Exceed speed limit?   Age under 30   Age 30 or over
Always                    100              40
Never                     100             160

What is the relative risk of always exceeding the speed limit for people under 30 compared to people over 30? a. 2.5 b. 4.0 c. 0.5 d. 0.3
Note that the 2x2 table is not set up in the conventional way with outcome in the columns and exposures in the rows – it’s the opposite! While this might seem tricky or like a gotcha, this is to emphasize that you need to be able to derive the formula and not just apply a memorized formula from cell numbers. a. Read the question carefully and don’t jump to the cross-product ad/bc. That calculation would give us the odds ratio. Here, it’s asking relative risk, which is risk in under 30 / risk in over 30 = (100/200)/(40/200) = 2.5.
79
A survey asked people how often they exceed speed limits. The data are then categorized into the following contingency table of counts showing the relationship between age group and response.

Exceed speed limit?   Age under 30   Age 30 or over
Always                    100              40
Never                     100             160

What is the attributable risk of being under 30 to always exceeding the speed limit? a. 0.20 b. 0.40 c. 0.60 d. 0.80
Note that the 2x2 table is not set up in the conventional way with outcome in the columns and exposures in the rows – it’s the opposite! While this might seem tricky or like a gotcha, this is to emphasize that you need to be able to derive the formula and not just apply a memorized formula from cell numbers. c. The answer choices correspond to the attributable risk percent, which is: (the risk in the exposed – risk in the non-exposed) / risk in the exposed. Here, the risk in the exposed is 100/200 = 0.5; the risk in the non-exposed is 40/200 = 0.2. AR% = (0.5-0.2)/0.5 = 0.6. (The absolute risk difference, 0.5 – 0.2 = 0.3, is not among the options.)
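The four speed-limit cards (risk, odds, relative risk, attributable risk percent) all derive from the same table, so they can be computed together. A minimal Python sketch working from the counts rather than from memorized cell letters; the dictionary keys and helper names are illustrative.

```python
# Counts from the survey table, keyed by age group (the "exposure").
always = {"under_30": 100, "30_or_over": 40}
never = {"under_30": 100, "30_or_over": 160}


def risk(group):
    """Proportion of the group that always exceeds the speed limit."""
    return always[group] / (always[group] + never[group])


def odds(group):
    """Those who always exceed divided by those who never do."""
    return always[group] / never[group]


risk_young = risk("under_30")
risk_old = risk("30_or_over")
odds_young = odds("under_30")
relative_risk = risk_young / risk_old
risk_difference = risk_young - risk_old
ar_percent = risk_difference / risk_young   # attributable risk percent

print(risk_young, odds_young, round(relative_risk, 2), round(ar_percent, 2))
```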
80
A study is conducted to assess the relationship between breed and end-stage renal disease in cats. Two groups of pathologists independently study specimens from 1,000 kidney biopsies. The first group of pathologists is aware of the breed of the patient from whom the biopsy came, while the second group is blinded as to the patient’s breed. The first group reports ‘hypertensive nephropathy’ much more frequently for domestic shorthairs than the second group. Which of the following types of bias is most likely present in this study? a. Confounding b. Nonresponse bias c. Recall bias d. Referral bias e. Observer bias
E. Because the pathologist or “observer” is not blinded, they may have a pre-existing suspicion that a certain breed is associated with the outcome and may therefore more aggressively seek to find the disease of interest in that breed
81
In a study of the association between body condition score and progression of chronic kidney disease in dogs, investigators followed dogs over time and produced the following graph. What kind of analysis does this represent, and what is the name of this graph? a. Cohort study; Meyers-Brigg curve b. Prognosis analysis; Mann-Whitney curve c. Survival analysis; Kaplan-Meier curve d. Time-to-event analysis; Bayesian curve
c. This is a survival analysis (also called time-to-event analysis), and the curve is called a Kaplan-Meier survival curve.
82
What is the approximate median survival time of dogs with the highest BCS? a. 25 days b. 200 days c. 350 days d. 600 days
c. As a reminder, the median survival time is the time point when half of the population is still alive (or, if you look at it the other way, when half of the population has died or met the outcome of interest). Look at the top curve (where BCS=7-9). Find the point on the curve where the percent surviving is 50%. Find the x-axis value for this point. That is your median survival time.
83
You are working for a local health department as head of contact-tracing for the COVID epidemic. As you contact people who may have been exposed, you need to make sure they are aware of how soon after the potential exposure they will become infectious to other people. This period represents the _______________ of the disease. a. Incubation period b. Latent period c. Clinical period d. Infectious period
b. The latent period is the period from the time of infection until the infected individual is able to transmit the infection. The incubation period is the period from the time of infection to the development of symptoms.
84
The p-value (check all that apply) a. Represents the probability of the observed results being obtained by chance b. Represents the probability that the null hypothesis is true c. Roughly represents compatibility between the data and the null hypothesis d. Represents the probability of making a type 1 error e. Is derived from the chi-square distribution f. Is large when the data are very compatible with the null hypothesis
c and f. a. Represents the probability of the observed results being obtained by chance – this is FALSE because it’s missing the all-important part of the statement “if the null hypothesis is true”. b. Represents the probability that the null hypothesis is true – this is false; the p-value represents the probability of observing an effect as extreme as or more extreme than the one we have if the null hypothesis is true. c. Roughly represents compatibility between the data and the null hypothesis – not a technical definition, but it’s the one that I find most compelling in terms of understanding the concept. Because the p-value represents the probability of observing an effect as extreme as or more extreme than the one we have obtained if the null hypothesis is true, it gives us an indication of how compatible the data we have are with the null hypothesis – if the p-value is very low, then the likelihood of observing the data if the null hypothesis is true is very low – hence, the data are not compatible with the null hypothesis. d. Represents the probability of making a type 1 error – this is false – the threshold at which we decide to reject the null hypothesis (usually 0.05) is the probability of making a type I error. e. Is derived from the chi-square distribution – not always true – the p-value CAN be derived from the chi-square distribution if our outcome and predictor variable are both categorical, but it is derived from other distributions if the outcome or predictor variables are not categorical. f. If the p-value is large, the data are very compatible with the null hypothesis, and we therefore do NOT reject the null hypothesis. If the p-value is very small (like less than 0.05), the data are not very compatible with the null hypothesis and we can then reject the null hypothesis in favor of the alternative hypothesis.
85
A randomized trial comparing the efficacy of two drugs showed a difference between the two (with a P value < 0.05). Assume that in reality, however, the two drugs do not differ. This is therefore an example of: a. Type I error (alpha error) b. Type II error (beta error) c. 1- alpha d. 1- beta e. None of the above
A. Type I error is the equivalent of a “false positive” – detecting something (i.e., an effect), when there really is nothing there. When designing hypothesis-testing studies, we “set” the parameters for type I and type II error when performing our sample size calculation (i.e., alpha and beta, where 1-beta is the power of the study). We often want to limit the occurrence of type I errors more than type II errors (when we don’t find a true association), which is why alpha is usually set to be smaller than beta (often 5% vs 10-20% in sample size calculations).
86
In cohort studies of the role of a suspected factor in the etiology of a disease, it is essential that (select all that apply): a. There be equal numbers of persons in both study groups b. At the beginning of the study, those with the disease and those without the disease have equal risks of having the factor c. The study group with the factor and the study group without the factor be representative of the general population d. The exposed and nonexposed groups under study be as similar as possible with regard to possible confounding factors
D is correct because confounding is a major problem in observational studies such as cohort studies, so if you are aware of a potential confounder, you want to make sure it’s similarly distributed in both groups so that it does not become a true confounder. For A, it’s nice if both groups have the same number of people, but it is certainly not necessary. Answer B sounds nice, and certainly both groups should be capable of having the outcome (i.e., you wouldn’t want to enroll males in a study of pregnancy), but different risks of the exposure are actually something you’re looking to study. Answer C – again, it depends what the question is. If you’re looking at an exposure that only affects a certain population, you only care about generalizability to that specific population (not the “general” population).
87
In general, how does non-differential misclassification bias affect the odds ratio or relative risk? For a bonus, explain why. a. It becomes smaller than 1. b. It becomes larger than 1. c. It gets closer to 1. d. It can go in either direction. e. It has no effect.
C. Non-differential misclassification of a dichotomous outcome will generally bias toward the null (i.e., closer to a RR or OR of 1, or no effect). This is usually because it dilutes the true effect.
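The dilution effect can be illustrated with expected counts. The sketch below assumes a hypothetical cohort (1,000 exposed with 200 true cases, 1,000 unexposed with 100 true cases) and an outcome test with 80% sensitivity and 95% specificity applied identically to both groups; all of these numbers are invented for illustration.

```python
def observed_cases(true_cases, group_size, sensitivity, specificity):
    """Expected number classified as diseased: true positives plus false positives."""
    non_cases = group_size - true_cases
    return true_cases * sensitivity + non_cases * (1 - specificity)


def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# True (error-free) classification: risks of 0.20 vs 0.10.
true_rr = risk_ratio(200, 1000, 100, 1000)

# Same error rates in exposed and unexposed = non-differential misclassification.
obs_exposed = observed_cases(200, 1000, sensitivity=0.80, specificity=0.95)
obs_unexposed = observed_cases(100, 1000, sensitivity=0.80, specificity=0.95)
observed_rr = risk_ratio(obs_exposed, 1000, obs_unexposed, 1000)

print(round(true_rr, 2), round(observed_rr, 2))
```

Because the same errors are added to both groups, the two observed risks move closer together and the observed risk ratio lands between 1 and the true value.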
88
In general, how does differential misclassification bias affect the odds ratio or relative risk? For a bonus, explain why. a. It becomes smaller than 1. b. It becomes larger than 1. c. It approaches 1. d. It can go in either direction. e. It has no effect.
D. Differential misclassification can move the numerical value of the OR or RR closer to one or further away from one. That is, it can underestimate or overestimate the strength of association (you rarely know which until it happens). To illustrate differential misclassification of outcome Rothman uses the following example: "Suppose a follow-up study were undertaken to compare incidence rates of emphysema among smokers and nonsmokers. Emphysema is a disease that may go undiagnosed without unusual medical attention. If smokers, because of concern about health effects of smoking (such as bronchitis), seek medical attention to a greater degree than nonsmokers, then emphysema might be diagnosed more frequently among smokers than among nonsmokers simply as a consequence of the greater medical attention. Unless steps were taken to ensure comparable follow-up, an information bias would result. An 'excess' of emphysema incidence would be found among smokers compared with nonsmokers that is unrelated to any biologic effect of smoking. This is an example of differential misclassification, since the underdiagnosis of emphysema, a misclassification error, occurs more frequently for nonsmokers than for smokers."
89
You have conducted a cohort study examining the effect of climate on osteoarthritis in pet dogs. Your exposure is cold vs warm winters. While you did your best to match pets by age, you are concerned that there might be some confounding by age in your study. You sort the dogs into one of two age categories, less than 7 years of age and 7 years or greater. Here are some possible results. What do you conclude in each case? Original odds ratio was 2.60, stratified odds ratio are 2.57 for the younger dogs and 2.62 for the older dogs. a. There was confounding by age and the confounder was associated with an increase in disease incidence. b. There was confounding by age and the confounder was associated with a decrease in disease incidence. c. There was no confounding by age.
C. The stratified odds ratios are almost identical to the original (crude) odds ratio; therefore, there was no confounding.
90
You have conducted a cohort study examining the effect of climate on osteoarthritis in pet dogs. Your exposure is cold vs warm winters. While you did your best to match pets by age, you are concerned that there might be some confounding by age in your study. You sort the dogs into one of two age categories, less than 7 years of age and 7 years or greater. Here are some possible results. What do you conclude in each case? Original odds ratio was 2.60, stratified odds ratio are 1.5 for the younger dogs and 1.7 for the older dogs. a. There was confounding by age and the confounder was associated with an increase in disease incidence. b. There was confounding by age and the confounder was associated with a decrease in disease incidence. c. There was no confounding by age.
A. The stratified odds ratios are >15% different from the original (crude) odds ratio; therefore, confounding was occurring. Because the stratified estimates are smaller than the crude estimate, the confounder was associated with an increase in disease incidence.
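The comparison used in the two osteoarthritis cards can be written as a small check. This is a sketch under stated assumptions: the 15% threshold comes from the card above, but expressing the change relative to the crude estimate and requiring every stratum to exceed the threshold are illustrative choices, and both function names are hypothetical.

```python
def percent_change(crude, stratum_estimate):
    """Absolute difference from the crude estimate, as a percentage of the crude."""
    return abs(crude - stratum_estimate) / crude * 100


def suggests_confounding(crude, stratum_estimates, threshold=15):
    """Flag confounding when every stratum differs from the crude by more than threshold %."""
    return all(percent_change(crude, s) > threshold for s in stratum_estimates)


# Card 89: crude OR 2.60, stratified ORs 2.57 and 2.62.
print(suggests_confounding(2.60, [2.57, 2.62]))

# Card 90: crude OR 2.60, stratified ORs 1.5 and 1.7.
print(suggests_confounding(2.60, [1.5, 1.7]))
```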
91
A group of investigators was interested in testing diagnostic ultrasound as a screen for cystic hydatid disease in sheep and goats in Kenya. The gold standard was post mortem examination of the liver and lungs. In ruminants, intestinal gases sometimes obscure the cysts so ultrasound gives some false negatives. Also, Taenia hydatigena cysts can be confused with Echinococcus granulosus cysts and so ultrasound gives some false positives. Three hundred sheep and goats were tested: 31 were positive on ultrasound and 46 were positive on post mortem examination; of these, 25 were positive on both. What is the sensitivity, specificity and positive predictive value of ultrasonography with respect to cystic hydatid disease in sheep and goats?
Hopefully you know by now that in questions like this you have to construct the 2x2 table yourself. So, (1) draw the table, (2) put in any row or column totals first, and then and only then (3) insert those values of A, B, C or D that you know and get the rest by subtraction from the column or row totals.

                    Post-Mortem
Ultrasound      Positive   Negative   Totals
Positive           25          6        31
Negative           21        248       269
Totals             46        254       300

Sensitivity = 25/(25+21) = 0.54
Specificity = 248/(248+6) = 0.98
PPV = 25/(25+6) = 0.81
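The fill-in-by-subtraction procedure described above can be sketched in Python. The function name `table_from_margins` is hypothetical; it takes the grand total, the two marginal positives, and the one known cell, and derives the remaining three cells.

```python
def table_from_margins(total, test_pos, disease_pos, both_pos):
    """Derive TP, FP, FN, TN from the margins and the single known cell."""
    tp = both_pos
    fp = test_pos - both_pos        # test positive, gold standard negative
    fn = disease_pos - both_pos     # gold standard positive, test negative
    tn = total - tp - fp - fn       # whatever remains
    return tp, fp, fn, tn


# Hydatid disease: 300 animals, 31 ultrasound positive, 46 post-mortem positive, 25 both.
tp, fp, fn, tn = table_from_margins(total=300, test_pos=31, disease_pos=46, both_pos=25)

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)

print((tp, fp, fn, tn))
print(round(sensitivity, 2), round(specificity, 2), round(ppv, 2))
```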
92
Georgiadis et al. (2000) conducted a prospective cohort study of risk factors associated with the clinical signs of iridovirus and herpesvirus-2 infections in a commercial sturgeon farm in California. They were interested in two clinical signs: mortality and runting (reduced live-weight gain). One of the risk factors was “spawn”, that is the particular mating from which the fish eggs were obtained. Here are some of their data.

              Proportion of fish dead at sale time   Proportion of runts at sale time
3rd Spawn                    0.02                                  0.16
5th Spawn                    0.25                                  0.11

What is the relative risk of mortality and runting in fish derived from the fifth spawn compared with those derived from the third spawn? a. 12.5 and 0.69 b. 0.08 and 1.45 c. 0.055 and 0.69 d. 0.005 and 0.50 e. 12.5 and 1.45
A. Note that it is important not to jump to the immediate conclusion that this is your standard 2x2 table from which one derives an odds ratio. In fact, the column headings here are, by definition, the cumulative incidence of death and runting respectively. Therefore: mortality RR = 0.25/0.02 = 12.5 and RR for runting = 0.11/0.16 = 0.69. It’s also important to be able to quickly and efficiently translate a word problem into the relevant table and perform the necessary calculation.
93
The authors of this study concluded that “spawn” was not associated with runting. Given your answer to the previous question, what are two possible reasons for this conclusion? I. The outcome was not normally distributed. II. The confidence interval of the effect estimate contained 1. III. A confounding variable was detected in the analysis. IV. Because runting is relatively common, the assumption that the odds ratio approximates the relative risk is not valid. a. I & II b. II & III c. III & IV d. I & III e. II & IV
B. A confidence interval including one indicates a non-significant association (like a p-value > 0.05), so this is a valid reason not to reject the null hypothesis. The discovery of a confounder that resulted in stratified estimates equal to 1.0 could be another reason. Answer I makes no sense, as the outcome was a dichotomous variable (runt vs not), even though the table shows a continuous outcome (proportion). Answer IV is incorrect, because the rare disease assumption applies to a case-control study where we can only calculate the odds ratio. Because this is a cohort study, we have direct access to the relative risk.
94
In many studies examining the associations between estrogens and endometrial cancer, a one-sided significance test was used. The underlying assumption justifying a one-sided rather than a two-sided test is: a) The distribution of the proportion exposed followed a “normal” pattern b) The expectation before starting the study was that estrogens cause endometrial cancer c) The pattern of association could be expressed as a straight-line function d) Type II error was the most important potential error to avoid e) Only one control group was being used
B. When using a two-tailed test, regardless of the direction of the relationship you hypothesize, you are testing for the possibility of the relationship in both directions. For example, we may wish to compare the mean of a sample to a given value x using a t-test. Our null hypothesis is that the mean is equal to x. A two-tailed test will test both if the mean is significantly greater than x and if the mean is significantly less than x. When using a one-tailed test, you are testing for the possibility of the relationship in one direction and completely disregarding the possibility of a relationship in the other direction. These investigators are therefore assuming that estrogens could only cause endometrial cancer – their test does not account for the possibility that estrogens could be protective against endometrial cancer.
95
Factors A, B and C can each individually cause a certain disease without the other two factors, but only when followed by exposure to factor X. Exposure to factor X alone is not followed by the disease, but the disease never occurs in the absence of exposure to factor X. Factor A is a ____________ and Factor X is _______________. I. A necessary cause and sufficient cause II. A necessary cause but not sufficient cause III. A sufficient but not necessary cause IV. Neither necessary nor sufficient
Factor A = IV, neither necessary nor sufficient. Factor X = II, a necessary but not sufficient cause. Note that factors that are neither sufficient nor necessary represent a more complex model, but one that probably more accurately represents the causal relationships that operate in most chronic diseases.
96
A group of investigators conducted a study to examine the correlation between dietary fat intake and breast cancer by country and produced the following graph ("dietaryfat_breastcancer"). What would you conclude when looking at this graph? a) Increased intake of dietary fat causes breast cancer b) Increased intake of dietary fat is associated with an increase in the rate of breast cancer c) Increased intake of dietary fat might be associated with an increase in the rate of breast cancer, but we cannot say for certain due to the possibility of ecological fallacy d) Decreased intake of dietary fat is associated with a low incidence of breast cancer.
C. The ecological fallacy is an important concept to understand – in which we ascribe to members of a group characteristics that they in fact do not possess as individuals. This problem arises in ecologic studies because we only have data for groups; we do not have exposure and outcome data for each individual in the population. Is there any use for ecologic studies then? Yes, they are useful for hypothesis generation. However, in and of themselves, they do not demonstrate conclusively that a causal association exists.
97
An investigator wants to study the effect of a dog’s body condition score on the likelihood of experiencing orthopedic disease such as CCL tears or osteoarthritis. Which statistical test / method should the investigator use to test her hypothesis that a higher BCS is associated with a higher likelihood of experiencing orthopedic disease? a. A chi-square test b. A t-test c. Linear regression d. Logistic regression e. Ordinal logistic regression f. Poisson regression
D. Logistic regression – the outcome is categorical (dz/no dz) and the predictor is continuous. When the logistic regression is run, we will obtain an odds ratio telling us how much every 1-unit increase in BCS changes the odds of having an orthopedic injury.
98
What if the investigator from the previous question wanted to examine the effect of BCS on the number of orthopedic injuries a dog experiences? a. A chi-square test b. A t-test c. Linear regression d. Logistic regression e. Ordinal logistic regression f. Poisson regression
F. Poisson regression – the outcome is a count (# of orthopedic injuries). The Poisson regression model will tell us the percent change in the # of occurrences of orthopedic disease for every 1-unit increase in BCS.
99
Talbot and colleagues carried out a study of sudden unexpected death in women. Refer to the following table ("ASHD_smoking") and answer the questions below. Calculate the matched-pairs odds ratio for these data. Using data from the table, un-match the pairs and calculate an unmatched odds ratio. What are the odds that the controls smoke 1+ pack/day?
Since this is a matched case-control study, the odds ratio is the quotient of the informative (discordant) cells, where exposure differed between the case and its matched control: b/c = 36/8 = 4.5.

To un-match the pairs and create a typical 2x2 table, take the marginal sums of each category. The unmatched table becomes:

                          Cases   Controls
Smoke more than 1 pack      38       10
Smoke less than 1 pack      42       70

Then you have your typical 2x2 table, and OR = ad/bc = (38*70)/(10*42) = 6.3.

Remember, odds is the ratio of the probability of one event to that of the alternative event. So here, among controls, the odds of smoking more than 1 pack = probability of smoking more than 1 pack / probability of smoking less than 1 pack = 10/70 = 0.14.
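The three calculations can be checked with a few lines of arithmetic. This sketch uses the counts given in the answer; the function names are illustrative.

```python
def matched_pairs_or(b, c):
    """Matched-pairs odds ratio: ratio of the two discordant-pair counts."""
    return b / c

def unmatched_or(a, b, c, d):
    """Odds ratio for a standard 2x2 table:
         a = exposed cases,   b = exposed controls,
         c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)

# Discordant pairs from the matched table.
or_matched = matched_pairs_or(36, 8)

# Marginal totals after un-matching the pairs.
or_unmatched = unmatched_or(a=38, b=10, c=42, d=70)

# Odds of heavy smoking among controls.
control_odds = 10 / 70

print(f"Matched-pairs OR: {or_matched:.1f}")
print(f"Unmatched OR:     {or_unmatched:.1f}")
print(f"Control odds:     {control_odds:.2f}")
```

The matched and unmatched estimates differ because un-matching ignores the pairing.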
100
Choose the appropriate statistical test: A study comparing the proportions of male and female dogs that died from a certain disease. The number of animals enrolled is large, so you can expect a relatively large number of animals in each category. a. McNemar's Test b. Chi-Square Test c. Paired T-Test d. T-Test e. ANOVA f. Logistical Regression g. Kruskal-Wallis Test
b. CHI-SQUARE TEST. The outcome is categorical and so is the exposure variable, so we would have a 2x2 table and use the chi-square test to test the hypothesis that sex is associated with death from the disease.
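For a 2x2 table, the Pearson chi-square statistic has a closed form, and with 1 degree of freedom its p-value can be computed from the complementary error function. The sketch below assumes no continuity correction, and the counts are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
         [[a, b],
          [c, d]]
    Returns the statistic and its p-value on 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts. Rows: male, female. Columns: died, survived.
chi2, p_value = chi_square_2x2(40, 60, 25, 75)
print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")
```

When expected cell counts are small, Fisher's exact test is the usual alternative; the large sample in this card is what makes the chi-square approximation appropriate.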
101
Choose the appropriate statistical test: a. McNemar's Test b. Chi-Square Test c. Paired T-Test d. T-Test e. ANOVA f. Logistical Regression g. Kruskal-Wallis Test
102
Choose the appropriate statistical test: A randomized controlled trial assessing the effect of a new drug on blood pressure in cats with hyperthyroidism. There are two arms in the trial - drug and placebo. Blood pressure is normally distributed in the cats. a. McNemar's Test b. Chi-Square Test c. Paired T-Test d. T-Test e. ANOVA f. Logistical Regression g. Kruskal-Wallis Test
d. T-TEST. We’ll have a mean blood pressure for the cats in the treatment group and a mean blood pressure for the cats in the placebo group, and since these groups are independent of each other, we do a t-test.
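As a rough illustration, the independent two-sample t statistic can be computed by hand. This sketch assumes equal variances in the two arms (the pooled, Student's version), and the blood pressure values are made up.

```python
import math
import statistics

def two_sample_t(x, y):
    """Pooled-variance (Student's) t statistic for two independent groups.
    Returns the statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    mean1, mean2 = statistics.mean(x), statistics.mean(y)
    var1, var2 = statistics.variance(x), statistics.variance(y)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, n1 + n2 - 2

# Hypothetical systolic blood pressures (mmHg), for illustration only.
drug = [150, 155, 160, 145, 150]
placebo = [165, 170, 160, 175, 170]

t_stat, df = two_sample_t(drug, placebo)
print(f"t = {t_stat:.3f}, df = {df}")
```

The statistic is then compared with the t distribution on the stated degrees of freedom. The standard library has no t distribution, so the p-value is left out of this sketch.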
103
Choose the appropriate statistical test: A randomized controlled trial assessing the effect of a new supplement on milk production in dairy cows. There are three arms in the trial - placebo, supplement low-dose, supplement high-dose. Milk production is normally distributed. a. McNemar's Test b. Chi-Square Test c. Paired T-Test d. T-Test e. ANOVA f. Logistical Regression g. Kruskal-Wallis Test
e. ANOVA. Same explanation as above, but here there are three groups, so we use ANOVA, which is the equivalent of the t-test for more than two groups when the outcome is normally distributed.
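The one-way ANOVA F statistic is the ratio of between-group to within-group mean squares. A minimal sketch with made-up milk yields for the three arms:

```python
import statistics

def one_way_anova_f(groups):
    """One-way ANOVA F statistic.
    Returns (F, between-group df, within-group df)."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)
    n_total = len(all_values)
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical daily milk yields (kg), for illustration only.
placebo = [20, 22, 24]
low_dose = [24, 26, 28]
high_dose = [28, 30, 32]

f_stat, df_b, df_w = one_way_anova_f([placebo, low_dose, high_dose])
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

A significant F only says that at least one group mean differs; pairwise comparisons need a follow-up test.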
104
Choose the appropriate statistical test: A study examining the effect of a weight loss program. The study compares pre-diet weight with post-diet weight, and weight is normally distributed. a. McNemar's Test b. Chi-Square Test c. Paired T-Test d. T-Test e. ANOVA f. Logistical Regression g. Kruskal-Wallis Test
c. PAIRED T-TEST. Here the outcome is a continuous variable (pre-diet and post-diet weight), and the measurements are paired because they come from the same person.
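A paired t-test is a one-sample t-test on the within-subject differences. The sketch below computes the statistic by hand; the weights are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """Paired t statistic on the differences (after - before).
    Returns the statistic and its degrees of freedom."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

# Hypothetical weights (kg) for the same five subjects, for illustration only.
pre_diet = [80, 92, 75, 88, 101]
post_diet = [77, 90, 74, 83, 97]

t_stat, df = paired_t(pre_diet, post_diet)
print(f"t = {t_stat:.3f}, df = {df}")
```

Working with the differences removes between-subject variation, which is why pairing is usually more powerful than treating the two sets of weights as independent groups.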
105
Choose the appropriate statistical test: A study examining the effect of track surface (sand vs grass vs dirt) on race horse winnings. Winnings data are skewed. a. McNemar's Test b. Chi-Square Test c. Paired T-Test d. T-Test e. ANOVA f. Logistical Regression g. Kruskal-Wallis Test
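g. KRUSKAL-WALLIS TEST. The outcome (winnings) is continuous and there are three independent groups, but the data are skewed, so we use the nonparametric alternative to ANOVA.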
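The Kruskal-Wallis test replaces the raw values with their ranks in the pooled sample, which is why skewed data are not a problem. This sketch computes the H statistic with tied values given their average rank and no further tie correction; the winnings are made up.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (ties get average ranks, no tie correction)."""
    pooled = sorted(v for g in groups for v in g)
    # Map each distinct value to the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    weighted_rank_sums = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    return 12 / (n * (n + 1)) * weighted_rank_sums - 3 * (n + 1)

# Hypothetical winnings (thousands of dollars), for illustration only.
sand = [2, 5, 40]
grass = [1, 3, 4]
dirt = [6, 8, 120]

h_stat = kruskal_wallis_h([sand, grass, dirt])
print(f"H = {h_stat:.3f}")
```

H is compared with a chi-square distribution on (number of groups - 1) degrees of freedom, an approximation that is rough for samples as small as these.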
106
Choose the appropriate statistical test: A study examining the effect of vaccination for Ebola virus on mortality during an outbreak. The investigators also want to adjust for a patient’s age and gender. a. McNemar's Test b. Chi-Square Test c. Paired T-Test d. T-Test e. ANOVA f. Logistical Regression g. Kruskal-Wallis Test
f. LOGISTICAL REGRESSION - Because we have multiple predictor variables here (vaccination, age, gender), we want to use multivariable regression modeling. Since the outcome here is categorical (death yes/no), we will want to use logistic regression.