Flashcards in Stats: evidence appraisal practice questions Deck (17):

1

##
You are reading an abstract summarizing the results of a clinical study in which

blood pressure was measured on 100 men with hypertension before and after treatment with a

new antihypertensive drug. The conclusion (no data given) was that the drop in mean blood

pressure following treatment was highly significant (p

###
(D) From the abstract, it is clear that the investigators were interested in reducing blood pressure by using the new drug. A one-sided test is appropriate if, prior to the conduct of the trial, we are interested in rejecting the null hypothesis of no change in favor of an alternative hypothesis that the change is in a particular direction.

2

##
You are reading an abstract summarizing the results of a clinical study in which

blood pressure was measured on 100 men with hypertension before and after treatment with a

new antihypertensive drug. The conclusion (no data given) was that the drop in mean blood

pressure following treatment was highly significant (p

###
(B) The comparison is between means (pre and post treatment) and the data are paired (two

measurements on each patient) so the appropriate test is the paired t test.

3

##
You are reading an abstract summarizing the results of a clinical study in which

blood pressure was measured on 100 men with hypertension before and after treatment with a

new antihypertensive drug. The conclusion (no data given) was that the drop in mean blood

pressure following treatment was highly significant (p

###
(D) "Highly significant, p implies that the probability is less than .0005 of observing a drop

as large or larger in BP as what was seen, when the drug actually had no effect. The fact that

significance tests do not directly relate to clinical importance or cause-and-effect rules out

A, B, and C.

4

##
Cortisol levels were measured in two independent groups of women at

childbirth. Group 1 underwent emergency Caesarean section following induced labor. Group 2

delivered by either Caesarean section or the vaginal route following spontaneous labor.

The number of women (n), mean levels, and standard deviations were as follows:

| Group | n | Mean | Std. dev. |
|---|---|---|---|
| 1 | 10 | 535 | 60 |
| 2 | 10 | 645 | 70 |

To compare the mean cortisol levels for statistical significance you would use

A. unpaired t-test

B. paired t-test

C. chi square test

D. Fisher’s exact test

E. Doesn’t matter since sample sizes are equal

###
(A) Since the object is to compare means (not proportions) and the groups are independent (not

paired or matched), the unpaired t-test should be used.
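As a rough check on this choice, the unpaired t statistic can be computed directly from the summary table. The sketch below assumes the equal-variance (pooled) form of the test, which the card does not specify, and approximates the two-sided p-value by numerically integrating the t density with Simpson's rule; the function names are illustrative only.

```python
import math

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance (pooled) two-sample t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density from 0 to |t|.

    Uses Simpson's rule, so `steps` must be even.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

# Summary statistics from the card: Group 1 and Group 2.
t_stat, df = pooled_t_from_summary(10, 535, 60, 10, 645, 70)
p_value = two_sided_p(t_stat, df)
print(round(t_stat, 2), df, round(p_value, 4))
```

The statistic comes out near 3.77 on 18 degrees of freedom, and the approximate p-value falls between 0.001 and 0.002, in line with the 0.0014 the deck quotes for these data.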

5

##
Cortisol levels were measured in two independent groups of women at

childbirth. Group 1 underwent emergency Caesarean section following induced labor. Group 2

delivered by either Caesarean section or the vaginal route following spontaneous labor.

The number of women (n), mean levels, and standard deviations were as follows:

| Group | n | Mean | Std. dev. |
|---|---|---|---|
| 1 | 10 | 535 | 60 |
| 2 | 10 | 645 | 70 |

The researchers reported that the p-value for the comparison of mean cortisol levels between

groups was 0.0014. Which of the following conclusions can NOT be drawn from this

information?

A. The difference is statistically significant at the 5% level

B. The difference is not statistically significant at the 0.1% level (α = 0.001)

C. If there is truly no difference between the two groups, the probability of observing a difference

at least as large as (645 – 535) is less than 1%.

D. Inducing labor causes reduced cortisol

E. A 95% confidence interval for the difference in cortisol between the two groups would not

include zero.

###
(D) Statistical significance alone does not imply causation. The p-value (.0014) is less than .05 so

the difference is significant at the 5% level and A is true. The p-value is greater than .001 so the

difference is not significant at the .1% level and B is true. C is the definition of a p-value. E is

true because of the relationship between p-values and confidence intervals: a 95% confidence

interval for a difference will not include the null value (no difference) if the difference is

statistically significant at the 5% level.
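The link between the p-value and the confidence interval in option E can be illustrated with the summary statistics from the card. This is a minimal sketch assuming a pooled-variance interval; the critical value 2.101 is the tabulated 97.5th percentile of the t distribution with 18 degrees of freedom, hard-coded here rather than computed.

```python
import math

n1, mean1, sd1 = 10, 535, 60   # Group 1 (from the card)
n2, mean2, sd2 = 10, 645, 70   # Group 2 (from the card)

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

T_CRIT_975_DF18 = 2.101  # assumption: table value for 18 df, not computed here

diff = mean2 - mean1
ci_low = diff - T_CRIT_975_DF18 * se
ci_high = diff + T_CRIT_975_DF18 * se
print(f"difference = {diff}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
# prints: difference = 110, 95% CI = (48.7, 171.3)
```

Both limits lie above zero, which is what a difference significant at the 5% level implies.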

6

##
A clinical trial is being planned in which a new drug (A) is to be compared to the drug in current

use (B). Patients will be randomly allocated into two groups: one group to receive drug A, the

other group drug B. Patients in each group will have systolic blood pressure (SBP)

measurements taken during a baseline period and after a prescribed period on the drug

therapy. It is planned to determine the effectiveness of the new drug by comparing the

difference in mean SBP changes (mean drop with drug A compared to mean drop with drug B)

with a t-test to determine whether the new drug (A) is better than the current drug (B) in

reducing blood pressure on average. As implied by the discussion, the investigators would use

the:

A. unpaired t-test (one-sided)

B. paired t-test (one-sided)

C. unpaired t-test (two-sided)

D. paired t-test (two-sided)

E. unpaired t-test (three-sided)

###
(A) The fact that the patients are to be "randomly allocated into two groups" implies an unpaired

design. The last sentence implies that they are interested in detecting a difference in one

direction only (new drug better than old), so a one-sided test should be done.

7

##
A summary of a randomized clinical trial of two treatments states that "no significant difference

(p > .05)" was found in treatment outcomes. Based on this, you should conclude that the

difference in treatment outcomes

A. is due to chance.

B. is due to the treatment

C. is not of clinical interest.

D. could be of clinical interest, if the sample sizes are large enough so that there is little likelihood

of missing an important difference.

E. could be clinically important, if the observed difference were large enough and if the sample

sizes are too small to yield much power to detect such a difference.

###
(E) A non-significant p-value alone is not sufficient information to rule out either chance or

treatment effect as explanations for the observed difference. Conclusions about clinical

importance are made based on the observed difference and confidence interval, which are not

reported here. If the observed difference is large enough to be clinically important, then the

sample size was too small to detect this difference as statistically significant. If the observed

difference is of little clinical importance, then one needs to ensure that the trial had sample

sizes large enough to detect any difference of clinical importance.

8

##
The following data are results from a comparative study of two diagnostic tests (A and B) for a given condition. Sixty patients known to have the condition were tested with

both diagnostic tests.

| | B (+) | B (-) | Total |
|---|---|---|---|
| A (+) | 36 | 11 | 47 |
| A (-) | 3 | 10 | 13 |
| Total | 39 | 21 | 60 |

The estimated sensitivity of test A is

A. 36/39

B. 39/60

C. 36/47

D. 47/60

E. none of the above

###
(D) Of 60 patients known to have the condition, 47 tested positive with test A; this implies the

sensitivity of test A = 47/60.
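The arithmetic can be written out from the four cells of the table. The sketch below also computes the sensitivity of test B for comparison; the variable names are illustrative.

```python
# Paired 2x2 counts from the card: rows are test A, columns are test B.
both_pos, a_pos_b_neg = 36, 11
a_neg_b_pos, both_neg = 3, 10

# All 60 patients are known to have the condition.
n_diseased = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg

sens_a = (both_pos + a_pos_b_neg) / n_diseased  # 47/60
sens_b = (both_pos + a_neg_b_pos) / n_diseased  # 39/60
print(f"sensitivity A = {sens_a:.3f}, sensitivity B = {sens_b:.3f}")
# prints: sensitivity A = 0.783, sensitivity B = 0.650
```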

9

##
The following data are results from a comparative study of two diagnostic tests (A and B) for a given condition. Sixty patients known to have the condition were tested with

both diagnostic tests.

| | B (+) | B (-) | Total |
|---|---|---|---|
| A (+) | 36 | 11 | 47 |
| A (-) | 3 | 10 | 13 |
| Total | 39 | 21 | 60 |

You wish to use a significance test to compare the sensitivities of tests A and B. An appropriate

test would be

A. unpaired t-test

B. McNemar's test

C. paired t-test

D. chi-square test for independent proportions

E. Fisher's exact test.

###
(B) Sensitivities are calculated as proportions. Since each patient had both tests administered, we

have paired data, and to compare paired proportions, McNemar's test is used.
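McNemar's test uses only the discordant pairs, here the 11 patients positive on A but negative on B and the 3 with the reverse pattern. The following is a sketch rather than a reference implementation: it returns the uncorrected chi-square statistic and an exact two-sided binomial p-value, and it assumes at least one discordant pair.

```python
from math import comb

def mcnemar(b, c):
    """McNemar's test from the two discordant counts b and c.

    Returns the uncorrected chi-square statistic (1 df) and the exact
    two-sided binomial p-value. Requires b + c > 0.
    """
    n = b + c
    chi2 = (b - c) ** 2 / n
    k = min(b, c)
    # Under the null, each discordant pair is equally likely to go either way.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_exact = min(1.0, 2 * tail)
    return chi2, p_exact

chi2, p_exact = mcnemar(11, 3)
print(round(chi2, 2), round(p_exact, 3))
# prints: 4.57 0.057
```

With only 14 discordant pairs the two versions land on opposite sides of the 5% threshold: the uncorrected statistic exceeds 3.84, the 5% critical value for 1 degree of freedom, while the exact p-value is slightly above 0.05.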

10

##
The following abstract describes a trial of the use of antibiotic prophylaxis

against gonorrhea. The study subjects were volunteers from a crew of a large naval vessel

operating in the western Pacific in 1974 who were then randomly assigned to receive either

antibiotic (100 mg minocycline) or placebo before taking liberty.

Abstract: In a prospective evaluation of antibiotic prophylaxis against gonorrhea, 1080 men

were given 200 mg of oral minocycline or placebo after sexual intercourse with prostitutes in a

Far Eastern port. Later at sea, gonococcal infection was detected in 57 of 565 men given

placebo and 24 of 515 men given minocycline (P

###
(C) The null hypothesis is always one of no difference; here the interest is to compare the

infection rates of placebo and treatment groups.

11

##
The following abstract describes a trial of the use of antibiotic prophylaxis

against gonorrhea. The study subjects were volunteers from a crew of a large naval vessel

operating in the western Pacific in 1974 who were then randomly assigned to receive either

antibiotic (100 mg minocycline) or placebo before taking liberty.

Abstract: In a prospective evaluation of antibiotic prophylaxis against gonorrhea, 1080 men

were given 200 mg of oral minocycline or placebo after sexual intercourse with prostitutes in a

Far Eastern port. Later at sea, gonococcal infection was detected in 57 of 565 men given

placebo and 24 of 515 men given minocycline (P

###
(B) Since subjects are assigned to one of two independent groups and we are comparing

proportions, the Chi-square test is appropriate.
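The chi-square statistic for the counts in the abstract (57 of 565 on placebo, 24 of 515 on minocycline) can be sketched as follows. This assumes the Pearson statistic without continuity correction, which may differ from what the original authors computed, and uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, chi2 is a squared standard normal, so the
    # upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

infected_placebo, n_placebo = 57, 565
infected_mino, n_mino = 24, 515

chi2, p = chi_square_2x2(
    infected_placebo, n_placebo - infected_placebo,
    infected_mino, n_mino - infected_mino,
)
rate_ratio = (infected_placebo / n_placebo) / (infected_mino / n_mino)
print(round(chi2, 2), round(rate_ratio, 2))
# prints: 11.44 2.16
```

The rate ratio matches the 2.16 quoted elsewhere in the deck.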

12

##
The reference to "(p

###
(D) "P

13

##
The reference to "(p

###
(A) The corresponding 95% confidence interval must include the estimated rate ratio (2.16).

Because p

14

##
You are reading a report of a clinical trial of two treatments in which a large number of

treatment outcomes (variables) were compared for "statistical significance". Such "multiple

testing" causes difficulty in interpreting reported statistically significant differences because of

A. decreased power.

B. increased probability of a Type II error.

C. decreased positive predictive value.

D. increased probability of a Type I error.

E. decreased negative predictive value.

###
(D) Even if there are no real differences between the treatments for any of the variables tested, if

the tests were performed at the 5% alpha level, we expect 5% of the conclusions from the tests

to be "false positives", i.e., claiming a difference exists when in fact none exists. The more tests

done, the higher the chance of at least one Type I error.
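How quickly the chance of at least one false positive grows can be shown with a short calculation. The sketch assumes the tests are independent and that every null hypothesis is true; outcomes within one trial are usually correlated, so the real figure may differ.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """P(at least one Type I error) across n_tests independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

for k in (1, 5, 20):
    print(k, round(prob_at_least_one_false_positive(k), 3))
```

Under these assumptions, 20 tests at the 5% level give roughly a 64% chance of at least one spurious "significant" result.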

15

##
Prior to 1982, several large outbreaks of leptospirosis occurred in troops deployed to Panama

for jungle warfare training. In a field trial of the efficacy of doxycycline to prevent leptospirosis,

doxycycline (200 mg) or placebo was administered by tablet on a weekly basis and at the

completion of training to 940 volunteers from 2 U.S. Army units deployed to Panama for

training. Twenty cases of leptospirosis occurred in the placebo group (attack rate of 4.2%)

compared to only 1 case in the doxycycline group (attack rate of 0.2%, P

###
(A) The only two tests listed that are used to compare unpaired proportions are Fisher's exact

test and the z-test. Fisher's exact test is often recommended when small proportions are to be

compared.
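For readers who want to see what Fisher's exact test computes, here is a minimal sketch of the two-sided version. The card does not give the size of each group, so the example call uses a hypothetical toy table, not the trial data; the function name is also illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    r1, r2, c1 = a + b, c + d, a + c
    total = comb(r1 + r2, c1)

    def weight(x):
        # Integer numerator of the hypergeometric probability for top-left cell x.
        return comb(r1, x) * comb(r2, c1 - x)

    observed = weight(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / total

# Hypothetical toy counts, NOT the trial data (the card omits the group sizes).
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
# prints: 0.4857
```

Working with integer numerators keeps the "no more likely than observed" comparison exact, which avoids floating-point ties being misclassified.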

16

##
Researchers at USUHS compared insulin levels after

treatment between Caucasian (CA) and African American (AA) children. Data are described in the

boxplot to the right. The most appropriate test for

comparing the two groups would be:

A. Mann-Whitney (nonparametric) test

B. Wilcoxon matched-pairs signed ranks test

C. Unpaired t test

D. Paired t test

E. Analysis of variance

###
(A) The boxplots illustrate that the data are skewed: Most of the values fall near the bottom of

the range, and there are a few extremely large observations. The mean may not be a good

summary of this type of data, and a nonparametric test should be used to compare medians

instead. Both the Mann-Whitney and Wilcoxon signed ranks tests are nonparametric, but the

Wilcoxon signed ranks test is for paired data, whereas this study compares two independent groups.

17