What are the four components needed to estimate sample size?

**Population** - the group that fits your demographic; this is the P in PICO. Can be unknown or estimated.

**Margin of error (confidence interval)** - how much error you want to allow. This defines how far above or below the population mean the sample mean may fall. Usually +/- 5%.

**Confidence level** - how confident you are that the actual mean falls within the confidence interval (90%, 95% and 99% are the most common, corresponding to Z-scores of 1.645, 1.96 and 2.576 respectively).

**Standard deviation** - how much variance to expect in the responses. The standard choice is 0.5.

Necessary Sample Size = (Z-score)² × StdDev × (1 − StdDev) / (margin of error)²
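As a rough check of the formula, here is a minimal sketch assuming the usual proportion-based form (Z² × p × (1 − p) / e²) and the conventional inputs of Z = 1.96, p = 0.5 and a 5% margin of error. The function name and the example inputs are illustrative assumptions, not part of the original notes.

```python
import math


def sample_size(z_score: float, proportion: float, margin_of_error: float) -> int:
    """n = Z^2 * p * (1 - p) / e^2, rounded up to a whole participant."""
    n = (z_score ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n)


# Assumed inputs: 95% confidence (Z = 1.96), p = 0.5, margin of error = 0.05
print(sample_size(1.96, 0.5, 0.05))
```

Using p = 0.5 gives the largest possible value of p × (1 − p), so it is the most conservative choice when the true spread is unknown.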

What is an appropriate drop-out rate for a trial?

###
If less than 80% are followed up it is generally recommended that the result is ignored.

If the drop-out rates are high, how confident can you be in the final results? What if all the drop-outs had a bad outcome?

##
What does control, experimental, and patient expected event rates mean?

**Control Event Rate (CER)**

The rate at which events occur in the control group, e.g. in an RCT of aspirin v placebo to prevent MI, a CER of 10% means that 10% of the placebo group had an MI. It is sometimes represented as a proportion (10% = 10/100 = 0.1).

**Experimental Event Rate (EER)**

The rate at which events occur in the experimental group, e.g. in the CER example above, an EER of 9% (or 0.09) means that 9% of the aspirin group had an MI.

**Patient Expected Event Rate (PEER)**

The rate of events we'd expect in a patient who received no treatment or conventional treatment.

Define Type 1 and Type 2 error in stats.

**Type 1 error** - a positive result when there is no real difference. This is a *false positive*.

**Type 2 error** - no significant difference is found when there is actually a real treatment difference. This is a *false negative*.

Small studies with a wide CI are prone to these errors.

Out of interest...

If you see an unexpected positive result (e.g. a small trial shows willow bark extract is effective for back pain) think: could this be a type 1 error? After all, every RCT has at least a 1 in 20 chance of a positive result and a lot of RCTs are published...

If a trial shows a non-significant result, when perhaps you might not have expected it, think could this be a type 2 error? Is the study under-powered to show a positive result?

Systematic reviews, which increase study power and reduce CI, are therefore very useful at reducing Type 1 and 2 error.

What is the null hypothesis?

States there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.

In a clinical trial presenting two survival curves, how is the absolute benefit of treatment best described? What is the significance of a plateau?

###
The median 'increase' in survival time (when comparing treatment to placebo or other).

The median survival is the time at which the percentage surviving is 50%. If more than half the patients are cured, there is no such point on the survival curve and the median is undefined (and often described as greater than the longest time on the curve). I like undefined medians!

Curves which flatten to a level plateau, suggest that patients are being cured, and curves which descend all the way to zero, imply that no one (or almost no one) is cured.

A median survival or survival percentage at x years won't give you the full story. The drop-off may continue past 5 years but then flatten out at 6, which means no further deaths. People fortunate enough to make it out to six or seven years may well be cured. You can't tell that from the median or 5-year survival or from any other single point. It's the shape of the survival curve that tells this story.

Read: http://cancerguide.org/scurve_basic.html for a really good summary of survival curves.

What does relative risk reduction mean?

The relative risk (RR), or risk ratio, is the ratio of the risk of an event in the experimental group compared to the control group, i.e. RR = EER/CER. The relative risk reduction (RRR) is the proportional reduction in the event rate between the experimental and control groups, i.e. RRR = (CER − EER)/CER = 1 − RR. For example, if a drug reduces your risk of an MI from 6% to 3%, it halves your risk of an MI, i.e. the RRR is 50%. But note that the ARR is only 3%.
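The MI example can be worked through in a few lines. This is a minimal sketch using the standard definitions RR = EER/CER, RRR = (CER − EER)/CER and ARR = CER − EER; the function names are illustrative.

```python
import math


def relative_risk(eer: float, cer: float) -> float:
    """Risk ratio: event rate in the experimental group over the control group."""
    return eer / cer


def relative_risk_reduction(eer: float, cer: float) -> float:
    """Proportional reduction in the event rate (equals 1 - RR)."""
    return (cer - eer) / cer


def absolute_risk_reduction(eer: float, cer: float) -> float:
    """Arithmetic difference between the two event rates."""
    return cer - eer


cer, eer = 0.06, 0.03  # MI risk falls from 6% to 3%
print(f"RR  = {relative_risk(eer, cer):.2f}")
print(f"RRR = {relative_risk_reduction(eer, cer):.0%}")
print(f"ARR = {absolute_risk_reduction(eer, cer):.0%}")
```

The same pair of event rates gives a 50% relative reduction but only a 3 percentage point absolute reduction.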

Out of interest...

Relative risks and odds ratios are used in meta-analyses as they are more stable across trials of different duration and in individuals with different baseline risks. They remain roughly constant across a range of absolute risks. Crucially, if not understood, they can create an illusion of a much more dramatic effect. Saying that this drug reduces your risk of an MI by 50% sounds great; but if your absolute risk was only 6%, this is the same as reducing it by 3 percentage points. So, saying this drug reduces your risk by 50% or by 3% are both true statements but sound very different, so guess which one drug companies tend to prefer!

Define a hazard ratio

A way of expressing the relative risk of an adverse event i.e. if an adverse event was twice as likely to happen with a particular intervention, it would have a HR of 2.

Define number needed to treat (NNT) and how to calculate it.

###
A clinically useful measure of the absolute benefit or harm of an intervention expressed in terms of the number of patients who have to be treated for one of them to benefit or be harmed.

Calculated as 1/ARR.

Example: the ARR for stroke with warfarin is 2% (= 2/100 = 0.02), so the NNT is 1/0.02 = 50.

e.g. Drug A reduces the risk of an MI from 10% to 5%. What is the NNT?

The ARR is 5% (0.05), so the NNT is 1/0.05 = 20.
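A minimal sketch of the NNT calculation, assuming the common convention of rounding up to the next whole patient. The function name is illustrative, and the small rounding guard is only there to absorb floating-point noise.

```python
import math


def number_needed_to_treat(arr: float) -> int:
    """NNT = 1 / ARR, rounded up to a whole patient (ARR given as a proportion)."""
    if arr <= 0:
        raise ValueError("ARR must be positive; a negative ARR implies harm (use NNH).")
    return math.ceil(round(1 / arr, 9))


print(number_needed_to_treat(0.02))         # warfarin example, ARR = 2%
print(number_needed_to_treat(0.10 - 0.05))  # Drug A example, ARR = 5%
```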

Define Absolute Risk Reduction (ARR)

CER - EER

- Absolute risk of an event happening (also called risk difference)

- Always expressed as a percentage.

- Looks at the difference between two event rates, i.e. the absolute risk of death from MI with placebo is 5%. With a drug it's 3%. The ARR is 2%.

- Important to determine clinical relevance.

- (Absolute Risk Increase calculates an absolute difference in bad events happening in a trial ie when the experimental treatment harms more than the control).

Explain the concept of a likelihood ratio. How do you apply them as a bedside test?

The sensitivity and specificity of a test can be combined into one measure called the likelihood ratio. The likelihood ratio for a test result is defined as the ratio between the probability of observing that result in patients with the disease in question, and the probability of that result in patients without the disease.

LR = probability of the result in patients with the disease / probability of the result in patients without the disease.

For example, among patients with abdominal distension who undergo ultrasonography, the physical sign "bulging flanks" is present in 80% of patients with confirmed ascites and in 40% without ascites (i.e., the distension is from fat or gas). The LR for "bulging flanks" in detecting ascites, therefore, is 2.0 (i.e., 80% divided by 40%). Similarly, if the finding of "flank tympany" is present in 10% of patients with ascites but in 30% with distension from other causes, the LR for "flank tympany" in detecting ascites is 0.3 (i.e., 10% divided by 30%).
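The two ascites examples reduce to a single division each. A minimal sketch, with an illustrative function name:

```python
import math


def likelihood_ratio(p_finding_with_disease: float, p_finding_without_disease: float) -> float:
    """How much more often a finding is seen in patients with the disease than without."""
    return p_finding_with_disease / p_finding_without_disease


bulging_flanks = likelihood_ratio(0.80, 0.40)  # present in 80% with ascites, 40% without
flank_tympany = likelihood_ratio(0.10, 0.30)   # present in 10% with ascites, 30% without

print(f"Bulging flanks LR = {bulging_flanks:.1f}")
print(f"Flank tympany LR  = {flank_tympany:.1f}")
```

An LR above 1 argues for the disease and an LR below 1 argues against it.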

Easy recall

LR of 2 increases probability by 15%

LR of 5 by 30%

LR of 10 by 45%

For LRs between 0 and 1, use the inverse

1/2 = 0.5 - decreases probability by 15%

1/5 = 0.2 - decreases probability by 30%

1/10 = 0.1 - decreases probability by 45%


What does a p-value mean? What are the main influencing factors?

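The probability of obtaining a result at least as extreme as the one observed if chance alone were operating (i.e. if the null hypothesis were true), e.g. p = 0.05 **means that there is a 5% chance of seeing a difference this large by chance alone**. For entirely arbitrary reasons, p < 0.05 is conventionally taken as the threshold for statistical significance.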

The size of a P value depends on two factors:

1. The magnitude of the treatment effect (relative risk, hazard ratio, mean difference, etc)

2. The size of the standard error (which is influenced by the study size, and either the number of events or standard deviation, depending on the type of outcome measure used).

Very small P values (the easiest to interpret) arise when the effect size is large and the standard error is small.

Borderline P values can occur when there is a clinically meaningful treatment effect but a large or moderate standard error—often because of an insufficient number of participants or events (the trial is referred to as being underpowered).

This is perhaps the most common cause of borderline results. Borderline P values can also occur when the treatment effect is smaller than expected, which with hindsight would have required a larger trial to produce a P value below the conventional threshold for significance.

##
Define positive and negative predictive value.

How do they differ from sensitivity and specificity?

The PPV is the percentage of patients who test positive for a disease who really do have it (true positives out of all positive tests), and the NPV is the percentage of patients who test negative who really do not have it (true negatives out of all negative tests).

PPV = a/(a+b)

Depends on the background prevalence of the disorder in the population.

If a disease is rare, the PPV will be lower (but sensitivity and specificity remain constant). Often with tests the PPV is higher in a secondary care or sieved population than it is in primary care.

The likelihood ratio takes this into account and gives the most accurate information on test accuracy.

In an example using HIV with a 10% population prevalence, we had 9900 'true positive' test results – infected persons who tested positive – and 9000 false positive results. The positive predictive value in this case is (9900)/(9900 + 9000), or 52.4% or, nearly half of its positive results were false. In a subpopulation with higher HIV prevalence, the positive predictive value would be higher, as there would be more truly HIV-positive findings compared to the constant rate of false positive results.

The negative predictive value is defined as the proportion of persons with negative test results who are correctly diagnosed.

NPV = d/(c+d)

This value, too, depends on HIV prevalence. The negative predictive value is the number of persons correctly diagnosed as HIV-negative, divided by the total number of HIV-negative findings. The 81,000 'true negative' and 100 false negative results in our example yield a negative predictive value of (81,000/81,100), or about 99.9%: a very high likelihood that a negative result indicates a truly HIV-uninfected person.
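The HIV example can be checked directly from the four counts quoted above. A minimal sketch, with illustrative function and variable names:

```python
def predictive_values(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (PPV, NPV) from the four cells of a 2x2 table."""
    ppv = tp / (tp + fp)  # true positives out of all positive tests
    npv = tn / (tn + fn)  # true negatives out of all negative tests
    return ppv, npv


# Counts from the HIV example: 10% prevalence in 100,000 people
ppv, npv = predictive_values(tp=9900, fp=9000, fn=100, tn=81000)
print(f"PPV = {ppv:.1%}")
print(f"NPV = {npv:.2%}")
```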

What is the point of an ROC curve? How is it used?

The ROC curve plots sensitivity (the true positive rate) against the false positive rate (1 − specificity).

The AUC looks at the overall ability of the test to discriminate between those individuals with the disease and those without the disease.

A truly useless test (one no better at identifying true positives than flipping a coin) has an area of 0.5 (the red line is random). The best test has an area of 1 (which is the top left corner) - remember the AUC is the AUC from the red line.

If patients have higher test values than controls, then:

The area represents the probability that a randomly selected patient will have a higher test result than a randomly selected control.

If patients tend to have lower test results than controls:

The area represents the probability that a randomly selected patient will have a lower test result than a randomly selected control.

For example: If the area equals 0.80, on average, a patient will have a more abnormal test result than 80% of the controls.

If the test were perfect, every patient would have a more abnormal test result than every control and the area would equal 1.00.

If the test were worthless, half the controls would have a higher value than an actual diseased patient and half would have a lower value, so the AUC would be 0.5.
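The "probability" reading of the AUC can be made concrete by comparing every patient with every control. This is a minimal sketch assuming higher values are more abnormal; the test values are hypothetical, and counting ties as half is a common convention rather than something stated in these notes.

```python
def auc_by_pairs(patient_values: list[float], control_values: list[float]) -> float:
    """Fraction of patient/control pairs in which the patient scores higher (ties count half)."""
    wins = 0.0
    for p in patient_values:
        for c in control_values:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_values) * len(control_values))


patients = [7, 8, 9, 10]  # hypothetical test results in people with the disease
controls = [5, 6, 7, 9]   # hypothetical test results in people without it

print(auc_by_pairs(patients, controls))
```

With complete separation of the two groups the function returns 1.0, and with identical distributions it returns 0.5.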

Define cumulative incidence. How does it differ from regular incidence?

###
Incidence (the incidence rate) is the number of new cases of a disease over time.

- Units include time

- Range is 0 to infinity

- Denominator is person-time

Cumulative incidence is a proportion.

- No units

- Range is 0 to 1

- Denominator is all at-risk in population

The cumulative incidence increases each year as the cases continue to accumulate, but the denominator for cumulative incidence (the initial population at risk) remains fixed.

- Incidence rate applies to a broader range of questions

- Kaplan-Meier provides a means to estimate cumulative incidence; it censors those with incomplete follow-up
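The difference in denominators is easiest to see with numbers. A minimal sketch using a hypothetical cohort (1,000 people at risk, 30 new cases, 2,900 person-years of follow-up); all figures and names are made up for illustration.

```python
import math


def incidence_rate(new_cases: int, person_time: float) -> float:
    """New cases per unit of person-time (here, per person-year)."""
    return new_cases / person_time


def cumulative_incidence(new_cases: int, initial_at_risk: int) -> float:
    """Proportion of the initial at-risk population who became cases."""
    return new_cases / initial_at_risk


# Hypothetical cohort, for illustration only
rate = incidence_rate(new_cases=30, person_time=2900)
risk = cumulative_incidence(new_cases=30, initial_at_risk=1000)

print(f"Incidence rate: {rate * 1000:.1f} per 1,000 person-years")
print(f"Cumulative incidence: {risk:.1%}")
```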

What does the term probability mean?

###
Probability of an event happening = Number of ways it can happen / Total number of outcomes

Probability can only ever be between 0 and 1.

For example - there are two ways a coin can land, heads or tails, it can go either way. There is a 1 in 2 or 1/2 chance of landing heads, and a 1/2 chance of landing tails. The probability of landing heads is 1 in 2.

The probability of a six-sided die landing on a 4 is 1 in 6 or 1/6. There is only one way it can happen (there is only one 4 on the die), out of 6 sides.

What does absolute risk reduction, or risk reduction mean?

Control event rate minus the experimental event rate (CER - EER)

The absolute risk is the actual, arithmetic risk of an event happening. The ARR (sometimes also called the Risk Difference) is the difference between 2 event rates e.g. AR of a MI with placebo over 5 years is 5% and with drug A is 3%, the ARR is simply 2%. This is the difference between the CER (control event rate) and the EER (experimental event rate).

e.g. Drug B reduces the chance of a stroke from 20% (CER) to 17% (EER). What is the ARR? Answer 3%.

Absolute risk increase (ARI) similarly calculates an absolute difference in bad events happening in a trial e.g. when the experimental treatment harms more patients than the control.

Knowing the absolute risk is essential when deciding how clinically relevant a study is.

##
Define subgroup analysis

What are the inherent problems with this?

What are the benefits?

- Participant data are split into subgroups to make comparisons between them, e.g. by gender or by geographical location.

- Used to investigate heterogeneous results or to answer specific questions about patient groups or types of intervention.

- May be misleading: they are observational by nature, not randomised.

- The more subgroup analyses there are, the higher the likelihood of false positives and negatives.

- Unexpected results from a subgroup analysis can be useful as a potential starting place for a subsequent clinical trial.

##
Does prespecifying a subgroup analysis help reduce the false positive/negative rate?

Why?

How can you address this?

- Prespecified subgroup analysis does not prevent this, particularly if there are a large number of prespecified subgroup analyses (referred to as multiplicity). (If 20 subgroup analyses are prespecified, then one of them would be expected to show a false positive result at the P=.05 level by chance alone.) For example, if the null hypothesis is true for each of 10 independent tests for interaction at the 0.05 significance level, the chance of at least one false positive result exceeds 40%.

- Multiplicity can be addressed by using criteria for statistical significance that are more stringent than P=.05.
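The multiplicity figures quoted above follow from two short formulas, assuming the tests are independent and the null hypothesis is true for every one of them. A minimal sketch with illustrative function names:

```python
import math


def chance_of_any_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests is falsely positive."""
    return 1 - (1 - alpha) ** n_tests


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Average number of falsely positive tests among n."""
    return n_tests * alpha


print(f"10 tests: {chance_of_any_false_positive(10):.1%} chance of at least one false positive")
print(f"20 tests: {expected_false_positives(20):.0f} false positive expected on average")
```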

##
What is the difference between prespecified subgroup analysis vs post-hoc analysis?

Is one better than the other?

Prespecified

- planned and documented before data examination.

- preferably included in study protocol

- includes endpoint, baseline characteristic, statistical method used.

Post-hoc

- hypotheses tested not specified prior to data examination

- unclear how many were undertaken

- unclear if motivated by post-hoc inspection of the data

However, **both prespecified and post hoc subgroup analyses are subject to inflated false positive rates arising from multiple testing**. Investigators should avoid the tendency to prespecify many subgroup analyses in the mistaken belief that these analyses are free of the multiplicity problem.

Define specificity.

###
Specificity is the proportion of people without the disease who test negative. A very specific test will have few false positives and be good at ruling a disease in. SpPIN means if a test is highly Specific (Sp) a Positive result rules the diagnosis in.

Specificity = true negatives / (false positives + true negatives) (d/(b+d))

In other terms, if the test result for a highly specific test is positive you can be nearly certain that they actually have the disease.

Therefore, a test with 100% specificity correctly identifies all patients without the disease. A test with 80% specificity correctly reports 80% of patients without the disease as test negative (true negatives) but 20% patients without the disease are incorrectly identified as test positive (false positives).

A test with a high sensitivity but low specificity results in many patients who are disease free being told of the possibility that they have the disease and are then subject to further investigation. A good example is the D-Dimer which is sensitive but not specific - i.e about half of people who don't have the disease will test positive.

Define sensitivity

###
The sensitivity of a clinical test refers to the ability of the test to correctly identify those patients with the disease.

Sensitivity = true positives / (true positives + false negatives) (a/(a+c))

A test with 100% sensitivity correctly identifies all patients with the disease. A test with 80% sensitivity detects 80% of patients with the disease (true positives) but 20% with the disease go undetected (false negatives). A high sensitivity is clearly important where the test is used to identify a serious but treatable disease (e.g. cervical cancer). Screening the female population by cervical smear testing is a sensitive test. However, it is not very specific and a high proportion of women with a positive cervical smear who go on to have a colposcopy are ultimately found to have no underlying pathology.
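Both definitions can be applied to a 2x2 table in a couple of lines. This is a minimal sketch that reuses the counts from the HIV example in the predictive value card (9,900 true positives, 9,000 false positives, 100 false negatives, 81,000 true negatives); the function names are illustrative.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Proportion of people with the disease who test positive: a / (a + c)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of people without the disease who test negative: d / (b + d)."""
    return tn / (tn + fp)


sens = sensitivity(tp=9900, fn=100)
spec = specificity(tn=81000, fp=9000)
print(f"Sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Note that neither calculation mixes people with and without the disease, which is why sensitivity and specificity do not change with prevalence.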

How do you calculate and interpret a positive likelihood ratio?

LR+ = The probability of an individual with disease having a positive test / The probability of an individual without disease having a positive test

You will notice that the numerator in this equation is exactly the same as the sensitivity of the test, and the denominator is the converse of specificity (1 − specificity). Thus the LR+ of a test can simply be calculated by dividing the sensitivity of the test by 1 − specificity, i.e. sensitivity / (1 − specificity).

LR+s greater than 1 mean that a positive test is more likely to occur in people with the disease than in people without the disease. LR+s less than 1 mean that a positive test is less likely to occur in people with the disease compared to people without the disease. Generally speaking, for patients who have a positive result, LR+s of more than 10 significantly increase the probability of disease (‘rule in’ disease) whilst very low LR+s (below 0.1) virtually rule out the chance that a person has the disease


How do you calculate and interpret a negative likelihood ratio?

###
LR− =The probability of an individual with the disease having a negative test / the probability of an individual without the disease having a negative test

The numerator in this equation is the converse of sensitivity (1 − sensitivity), and the denominator is equivalent to specificity. Thus the LR− of a test can be calculated by dividing 1 − sensitivity by specificity, i.e. (1 − sensitivity) / specificity.

LR−s greater than 1 mean that a negative test is more likely to occur in people with the disease than in people without the disease.

LR−s less than 1 mean that a negative test is less likely to occur in people with the disease compared to people without the disease.

Generally speaking, for patients who have a negative test, LR−s of more than 10 significantly increase the probability of disease (rule in disease) whilst a very low LR− (below 0.1) virtually rules out the chance that a person has the disease.

LR+ = sensitivity / (1 − specificity)

LR− = (1 − sensitivity) / specificity
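A minimal sketch of both ratios, using a sensitivity of 95% and a specificity of 50% as assumed example inputs (the same figures as the D-dimer question later in these notes). The function names are illustrative.

```python
import math


def positive_lr(sens: float, spec: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sens / (1 - spec)


def negative_lr(sens: float, spec: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1 - sens) / spec


lr_pos = positive_lr(0.95, 0.50)
lr_neg = negative_lr(0.95, 0.50)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.1f}")
```

A test like this is much better at ruling out (LR− of about 0.1) than ruling in (LR+ of about 2).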

##
What is pre and post-test probability?

What else do you need to estimate post-test probability, how do you do it, and what is it called?

###
The estimated probability of disease before the test result is known, is referred to as the pre-test probability, which is usually estimated on the basis of the clinician’s personal experience, local prevalence data and published reports.

The patient’s probability or chance of having the disease after the test results is known is referred to as the post-test probability. The post-test probability of disease is what clinicians and patients are most interested in as this can help in deciding whether to confirm a diagnosis, rule out a diagnosis or perform further tests.

According to Bayes' theorem, the post-test odds that a patient has a disease are obtained by multiplying the pre-test odds by the likelihood ratio of the test.

Post-test odds = pre-test odds × likelihood ratio

Post-test odds are different to probability but can be converted: odds = probability / (1 − probability), and probability = odds / (1 + odds).

##
What is Fagan's nomogram?

How is it used?

###
The Fagan’s nomogram is a graphical tool which, in routine clinical practice, allows one to use the results of a diagnostic test to estimate a patient’s probability of having disease. In this nomogram, a straight line drawn from a patient’s pre-test probability of disease (left axis) through the likelihood ratio of the test (middle axis) will intersect with the post-test probability of disease (right axis).

Hypothetical example

In a hypothetical population, the prevalence of Disease A was 10%, which means that when we randomly select a person from this population, his or her chance of having Disease A (pre-test probability) is 10%. The LR+ of Test A was earlier calculated to be about 13. As shown in Figure 2, when we draw a straight line from the pre-test probability of 10% through the likelihood ratio of 13, the line intersects with the post-test probability of about 60%.

This means that the probability of Disease A for a person in this hypothetical population increases from 10% to 60% when he or she has had a positive result for Test A.

In the same way, we can also estimate the post-test probability of a person in this population who has a negative result. You will recall that the LR− of Test A was earlier calculated to be 0.21. Joining the pre-test probability of 10% to the likelihood ratio of 0.21 on the Fagan’s nomogram, we read off a post-test probability of about 2% (Fig. 3). This means that after a negative test, a person in this population’s chance of having Disease A reduces from 10% to 2%.
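The nomogram readings can be cross-checked arithmetically by converting the pre-test probability to odds, multiplying by the likelihood ratio, and converting back. A minimal sketch with an illustrative function name:

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Pre-test probability -> odds -> apply the LR -> back to probability."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


after_positive = post_test_probability(0.10, 13)    # LR+ of Test A
after_negative = post_test_probability(0.10, 0.21)  # LR- of Test A

print(f"After a positive test: {after_positive:.0%}")
print(f"After a negative test: {after_negative:.0%}")
```

The results agree with the values read off the nomogram (about 60% and about 2%).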

A certain autosomal recessive disorder affects 1 in 1600 people; the carrier frequency is 5%. A DNA assay can identify the mutation in 80% of carriers; the false-positive rate of this assay is zero.

What is the best estimate of the positive predictive value (PPV) and negative predictive value (NPV) of this assay in screening the population for carriers?

| Option | PPV | NPV |
|---|---|---|
| A | 20% | 100% |
| B | 80% | 80% |
| C | 100% | 80% |
| D | 100% | 99% |
| E | 100% | 100% |

Answer: D

Question 47 AMP2007a
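One way to reason through the carrier question is with natural frequencies: imagine a cohort and count how many people land in each cell. This is a minimal sketch assuming a hypothetical cohort of 10,000 people; the variable names are illustrative.

```python
population = 10_000                        # hypothetical cohort size
carriers = population * 5 // 100           # carrier frequency 5%
non_carriers = population - carriers

true_positives = carriers * 80 // 100      # assay detects 80% of carriers
false_negatives = carriers - true_positives
false_positives = 0                        # false-positive rate is zero
true_negatives = non_carriers - false_positives

ppv = true_positives / (true_positives + false_positives)
npv = true_negatives / (true_negatives + false_negatives)

print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

With no false positives every positive result is a carrier, and the missed carriers are few relative to the large number of true negatives, so the NPV stays close to 99%.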

A test has a sensitivity of 95% and a specificity of 80%. It is used to screen for a condition with a prevalence of 1 in 100.

What will the positive predictive value be nearest to?

A. 0.2%.

B. 0.5%.

C. 1%.

D. 2%.

E. 5%.

Answer: E. 5%.

2006a Q33

A test has a sensitivity of 95% and a specificity of 90%. It is used to screen the general population for a rare condition that has a prevalence of 1 in 100,000.

What will the positive predictive value be nearest to?

A. 0.01%.

B. 0.05%.

C. 0.1%.

D. 0.5%.

E. 1%.

Answer: A. 0.01%.

2005aQ16
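The two screening questions above can be checked with the prevalence-weighted form of the PPV: true positives are sensitivity × prevalence, and false positives are (1 − specificity) × (1 − prevalence). A minimal sketch with an illustrative function name:

```python
def positive_predictive_value(sens: float, spec: float, prevalence: float) -> float:
    """PPV from sensitivity, specificity and prevalence (all as proportions)."""
    true_positives = sens * prevalence
    false_positives = (1 - spec) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


common = positive_predictive_value(0.95, 0.80, 1 / 100)     # prevalence 1 in 100
rare = positive_predictive_value(0.95, 0.90, 1 / 100_000)   # prevalence 1 in 100,000

print(f"Prevalence 1 in 100:     PPV = {common:.1%}")
print(f"Prevalence 1 in 100,000: PPV = {rare:.3%}")
```

In both cases the false positives from the large disease-free group swamp the true positives, which is why the PPV is so low despite a sensitive test.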

If the pre-test probability of a condition is known, which of the following is also needed to be able to estimate the post-test probability?

A. sensitivity.

B. specificity.

C. accuracy.

D. likelihood ratio.

E. odds.

###
D. likelihood ratio.

2006b Q47

##
An asymptomatic 45-year-old male with no risk factors for coronary artery disease (CAD) is sent by his general practitioner to have an exercise ECG. At peak exercise, the heart rate is 160/minute and there is 0.7 mm of horizontal ST depression. The relationship between pre-test and post-test probability of significant coronary artery disease is shown below according to the ECG changes occurring at maximal exercise.

What is the probability that this man has significant CAD?

A. 41-50%.

B. 31-40%.

C. 21-30%.

D. 11-20%.

E. 1-10%.

*2005aQ46 *

###
E. 1-10%.

##
A test is used to screen for a target disorder.

The best definition of the specificity of the test is "the proportion of individuals:

A. with the target disorder who have a positive test result".

B. without the target disorder who have a negative result".

C. with a positive test result who have the target disorder".

D. with a negative test result who do not have the target disorder".

E. with or without the target disorder who have a correct result".

Answer: B. without the target disorder who have a negative result".

##
The following table shows the results of the primary endpoint of a study of a new treatment for cholesterol, "notzofat". Patients were recruited in a tertiary referral hospital after a confirmed myocardial infarction.

If the same treatment and endpoint was used in a trial in a group of patients with asymptomatic hypercholesterolemia detected at screening, which of these measures of treatment effect is most likely to remain approximately the same?

A. Absolute risk reduction.

B. Cost-benefit analysis.

C. Risk-benefit analysis.

D. Number needed to treat.

E. Relative risk reduction.

2005b Q6

###
E. Relative risk reduction.

Relative risks and odds ratios are used in meta-analyses as they are more stable across trials of different duration and in individuals with different baseline risks. They remain roughly constant across a range of absolute risks. Crucially, if not understood, they can create an illusion of a much more dramatic effect. Saying that this drug reduces your risk of an MI by 50% sounds great; but if your absolute risk was only 6%, this is the same as reducing it by 3 percentage points. So, saying this drug reduces your risk by 50% or by 3% are both true statements but sound very different, so guess which one drug companies tend to prefer!

##
In a clinical trial presenting two survival curves, the absolute benefit of treatment is best described by which of the following?

A. The median increase in survival time.

B. The relative risk reduction.

C. The hazard ratio.

D. The number needed to treat.

E. The absolute risk reduction.

2007a Q9

###
A. The median increase in survival time.

The median survival is the time at which the percentage surviving is 50%. If more than half the patients are cured, there is no such point on the survival curve and the median is undefined (and often described as greater than the longest time on the curve).

Curves which flatten to a level plateau, suggest that patients are being cured, and curves which descend all the way to zero, imply that no one (or almost no one) is cured.

##
An effective treatment is available for your patient, which you believe has advantages over existing treatments. It is expensive and has no funding source, so if selected, your patient would have to pay.

The most appropriate approach is:

A. not to disclose unfunded option to avoid patient’s financial harm.

B. not to disclose unfunded option as all treatments should be funded.

C. to disclose unfunded option to offer informed choice.

D. to disclose if you judge the patient to be able to pay.

E. to recommend unfunded option to give patient the best treatment.

2007b Q28

###
C. to disclose unfunded option to offer informed choice.

Informed Consent

In the United States, Australia, and Canada, a more patient-centered approach is taken and this approach is usually what is meant by the phrase "informed consent." Informed consent in these jurisdictions requires that significant risks be disclosed, as well as risks which would be of particular importance to that patient. This approach combines an objective (the reasonable patient) and subjective (this particular patient) approach.

An informed consent can be said to have been given based upon a clear appreciation and understanding of the facts, implications, and future consequences of an action. In order to give informed consent, the individual concerned must have adequate reasoning faculties and be in possession of all relevant facts at the time consent is given. Impairments to reasoning and judgment which may make it impossible for someone to give informed consent include such factors as basic intellectual or emotional immaturity, high levels of stress such as PTSD or a severe intellectual disability, severe mental illness, intoxication, severe sleep deprivation, Alzheimer's disease, or being in a coma.

##
The most appropriate methodology to evaluate activity of a cytotoxic cancer treatment is:

A. in-vitro testing.

B. phase I study.

C. phase II study.

D. phase III study.

E. phase IV study.

###
C. phase II study.

In Phase 2 trials, the experimental treatment is given to a larger group of people (100–300) to see if it is effective and to further evaluate its safety.

What happens in a Phase 1 trial?

In Phase 1 trials, researchers test an experimental drug or treatment in a small group of people (20–80) for the first time to evaluate its safety, determine a safe dosage range, and identify side effects.

What happens in a phase 2 trial?

In Phase 2 trials, the experimental treatment is given to a larger group of people (100–300) to see if it is effective and to further evaluate its safety.

What happens in a phase 3 trial?

In Phase 3 trials, the treatment is given to large groups of people (1,000–3,000) to confirm its effectiveness, monitor side effects, compare it to commonly used treatments, and collect information that will allow it to be used safely.

What happens in a phase 4 trial?

In Phase 4 trials, postmarketing studies delineate additional information, including the treatment's risks, benefits, and optimal use.

What is the purpose of pre-clinical trials?

###
Pre-clinical trials (in-vitro and in-vivo)

- toxicity

- long term carcinogenic or other toxic effects

- pharmacodynamic studies

- pharmacokinetic studies

- determine safe starting dose.

A randomized, double-blind controlled trial compared the rate of major coronary events in diabetic patients treated with fenofibrate or placebo and followed for up to six years. The trial was negative on intention-to-treat analysis (event rates 5.2% and 5.9% respectively, p=0.16). The figure shows drop-out from and drop-in to other lipid-modifying treatment during the trial.

Assuming fenofibrate is efficacious, which of the following is most likely to explain the negative trial result?

A. The study was too small.

B. The drop-out rate was too high.

C. The drop-in rate was imbalanced.

D. The event rate was too low.

E. Treatment groups were not matched at entry.

2007a Q7.

Answer: C. The drop-in rate was imbalanced.

The drop in rate to placebo was twice that to the treatment, resulting in a false diminishing of its effect. This is an example of a type 2 error, or false negative result.

------------------------

A. Wrong. The study included over 4000 people so was likely large enough.

B. Wrong. Follow-up stayed above 80% for the 6 years, so the drop-out rate was not too high.

D. Wrong. Event rates of 5% are acceptable and the rates are nearly equal for the control and treatment groups.

E. Wrong. The trial was randomized (and the fact that it was randomized, double-blinded and analysed by intention to treat suggests it is of high quality), which should eliminate differences between the groups.

Type 1 error - a positive result when there is no real difference. This is a false positive

Type 2 error - no significant difference is found when there is actually a real treatment difference. This is a false negative

Small studies with a wide CI are prone to these errors.

A new D-dimer assay has a sensitivity for deep venous thrombosis (DVT) of 95% and a specificity of 50%. It is proposed to use it to screen a group of passengers after long distance travel. Previous data have suggested this group has a pre-test probability of DVT of 1%.

What is the best estimate of the post-test probability of DVT in an individual with a positive D-dimer result?

A. 2%.

B. 5%.

C. 10%.

D. 50%.

E. 95%.

2004aQ55

A. 2%.

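LR+ = **Sensitivity / (1 – specificity)**

= 0.95 (sensitivity) / 0.5 (1 – specificity) = 1.9, or roughly 2

Post-test odds = **Pre-test odds × Likelihood ratio**

A pre-test probability of 1% (0.01) gives pre-test odds of 0.01/0.99 ≈ 0.01, so the post-test odds are about 0.01 × 2 = 0.02. At odds this low, odds and probability are nearly equal, so the post-test probability is about 2%.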
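The odds-based calculation for this D-dimer question can be checked with a short sketch. The function names are illustrative only; the inputs are the sensitivity, specificity and pre-test probability given in the question.

```python
def probability_to_odds(p):
    """Convert a probability (0-1) to odds."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Convert odds back to a probability (0-1)."""
    return odds / (1 + odds)


def post_test_probability(pre_test_prob, sensitivity, specificity):
    """Post-test probability after a POSITIVE result, via the positive LR."""
    lr_positive = sensitivity / (1 - specificity)
    post_odds = probability_to_odds(pre_test_prob) * lr_positive
    return odds_to_probability(post_odds)


# D-dimer question: sensitivity 95%, specificity 50%, pre-test probability 1%
result = post_test_probability(0.01, 0.95, 0.50)
print(f"Post-test probability: {result:.1%}")
```

The exact answer comes out just under 2%, which matches option A; the shortcut of multiplying the pre-test probability by the LR works here only because the probability is very low.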

The following table shows the results of a study of a new diagnostic screening test for colon cancer performed in a tertiary referral hospital on patients referred with rectal bleeding.

**Subsequent diagnosis on colonoscopy**

| | Yes | No |
|---|---|---|
| Positive test | 40 | 5 |
| Negative test | 10 | 20 |

When the same test is applied to a group of asymptomatic patients in general practice, which of the following will be most likely to increase?

A. Sensitivity.

B. Specificity.

C. Negative likelihood ratio.

D. Positive predictive value.

E. Negative predictive value.

###
E. Negative predictive value.

The negative predictive value (NPV) of the test is d/(c+d), i.e. true negatives / (false negatives + true negatives).

The PPV is the percentage of patients who test positive for a disease who really do have it, and the NPV is the percentage who test negative who really do not have it.

The negative predictive value is defined as the proportion of persons with negative test results who are correctly diagnosed. This value depends on prevalence: asymptomatic patients in general practice have a much lower prevalence of colon cancer than patients referred with rectal bleeding, so the NPV rises (and the PPV falls), while sensitivity and specificity stay the same. As another example, in HIV testing the negative predictive value is the number of persons correctly diagnosed as HIV-negative divided by the total number of HIV-negative findings. 81,000 true negative and 100 false negative results yield a negative predictive value of 81,000/81,100, or about 99.9%: a very high likelihood that a negative result indicates a truly HIV-uninfected person.
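To see how the predictive values of the colon cancer test shift with prevalence, here is a minimal sketch. The sensitivity and specificity come from the table in the question; the 1% general-practice prevalence is a hypothetical figure chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# From the study table: 40 TP, 5 FP, 10 FN, 20 TN
sens = 40 / (40 + 10)   # 0.8
spec = 20 / (20 + 5)    # 0.8

hospital = predictive_values(sens, spec, prevalence=50 / 75)  # referred patients
gp = predictive_values(sens, spec, prevalence=0.01)           # assumed 1% prevalence

print(f"Hospital: PPV {hospital[0]:.2f}, NPV {hospital[1]:.2f}")
print(f"GP:       PPV {gp[0]:.2f}, NPV {gp[1]:.2f}")
```

With the same test, the NPV climbs and the PPV collapses as prevalence falls, which is why option E is the best answer.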

##
2003a Q8

A new diagnostic test for a certain disease has been evaluated. Compared with the definitive diagnostic standard, this test has a sensitivity of 100% and a specificity of 95%. The prevalence of the disorder in the population to be tested is 0.1%.

What is the best estimate of the positive predictive value of the new test?

A. B. 2%.

C. 5%.

D. 10%.

E. 25%.

###
B. 2%.

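PPV = A/(A+B)

The prevalence is 0.1%, i.e. 1/1000, so per 1000 people tested:

A (true positives) = 1

B (false positives) = 5% of 999 ≈ 50

C (false negatives) = 0

D (true negatives) ≈ 949

PPV = 1/(1+50) = 1/51 ≈ 0.02 = 2%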

The PPV is the percentage of patients who test positive for a disease who really do have it, and the NPV is the percentage who test negative who really do not have it. Importantly, these depend on the background prevalence of the disorder in the population. If a disease is rare, the PPV will be lower (but sensitivity and specificity remain constant). So the PPV of a test is often higher in a secondary care or sieved population than it is in primary care. The likelihood ratio, like sensitivity and specificity, does not depend on prevalence, so it can be combined with any pre-test probability to give the post-test probability.
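The PPV for the 2003a Q8 scenario can be worked through by counting people in a notional population. This is a sketch only: the population size of 100,000 is an arbitrary assumption and does not change the answer.

```python
def ppv_from_population(sensitivity, specificity, prevalence, population=100_000):
    """PPV obtained by counting true and false positives in a notional population."""
    diseased = population * prevalence
    healthy = population - diseased
    true_pos = diseased * sensitivity
    false_pos = healthy * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Sensitivity 100%, specificity 95%, prevalence 0.1%
ppv = ppv_from_population(1.0, 0.95, 0.001)
print(f"PPV: {ppv:.1%}")
```

Almost all positives are false positives because the disease is so rare, even though the test never misses a case.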

Below are the tabulated results of five clinical trials for different drugs using death as the primary outcome. Follow-up time is five years for all trials.

| Trial for | # (%) surviving on treatment | # (%) surviving on placebo |
|---|---|---|
| Drug A (n=200) | 30 (15%) | 20 (10%) |
| Drug B (n=600) | 12 (2%) | 3 (0.5%) |
| Drug C (n=400) | 80 (20%) | 64 (16%) |
| Drug D (n=500) | 75 (15%) | 55 (11%) |
| Drug E (n=300) | 30 (10%) | 18 (6%) |

n = the total number of patients entered in each arm of the trial

The trial for which drug shows the lowest number needed to treat (NNT)?

A. Drug A.

B. Drug B.

C. Drug C.

D. Drug D.

E. Drug E.

A. Drug A.

ARR = 5% = 0.05

NNT = 1/ARR = 1/0.05 = 20

B. ARR = 1.5% = 0.015, NNT = 1/0.015 ≈ 67

C, D & E. ARR = 4% = 0.04, NNT = 1/0.04 = 25

**Number Needed to Treat**

A clinically useful measure of the absolute benefit or harm of an intervention expressed in terms of the number of patients who have to be treated for one of them to benefit or be harmed.

Calculated as 1/ARR.

Example: The ARR of a stroke with warfarin is 2% (=2/100 = 0.02),

The NNT is 1/0.02 = 50.

e.g. Drug A reduces risk of a MI from 10% to 5%, what is the NNT?

The ARR is 5% (0.05), so the NNT is 1/0.05 = 20.

**ARR**

The absolute risk is the actual, arithmetic risk of an event happening. The ARR (sometimes also called the Risk Difference) is the difference between 2 event rates e.g. AR of a MI with placebo over 5 years is 5% and with drug A is 3%, the ARR is simply 2%. This is the difference between the CER (control event rate) and the EER (experimental event rate). Knowing the absolute risk is essential when deciding how clinically relevant a study is. Absolute risk increase (ARI) similarly calculates an absolute difference in bad events happening in a trial e.g. when the experimental treatment harms more patients than the control. e.g. Drug B reduces the chance of a stroke from 20% (CER) to 17% (EER). What is the ARR? Answer 3%.

Remember that it is expressed as a percentage.
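The NNT arithmetic for the five-drug question above can be sketched as follows. The survivor counts come from the question's table; the difference in survival between arms is treated as the absolute risk reduction for death, and the function name is illustrative.

```python
# (patients per arm, survivors on treatment, survivors on placebo)
trials = {
    "A": (200, 30, 20),
    "B": (600, 12, 3),
    "C": (400, 80, 64),
    "D": (500, 75, 55),
    "E": (300, 30, 18),
}


def number_needed_to_treat(n_per_arm, survivors_treated, survivors_placebo):
    """NNT = 1 / ARR, with ARR taken as the difference in survival proportions."""
    arr = (survivors_treated - survivors_placebo) / n_per_arm
    return 1 / arr


results = {drug: number_needed_to_treat(*data) for drug, data in trials.items()}
for drug, nnt in results.items():
    print(f"Drug {drug}: NNT = {nnt:.0f}")

lowest = min(results, key=results.get)
print(f"Lowest NNT: Drug {lowest}")
```

Note that the ARR has to be a proportion (0.05), not a percentage (5), before taking the reciprocal.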

2003a Q48

A drug trial is designed to compare the efficacy and safety of a new anti-hypertensive drug (not yet marketed) to an established drug from the same class. The trial has a randomised, parallel group design with a three-month treatment period, and is to be carried out with a total of 1000 subjects in each group, spread across 50 trial centres.

This type of trial is best described as belonging to which phase of drug development?

A. Preclinical efficacy studies.

B. Clinical phase I studies.

C. Clinical phase II studies.

D. Clinical phase III studies.

E. Clinical phase IV studies.

D. Clinical phase III studies.

Phase 0 trials are the first-in-human trials. Single subtherapeutic doses of the study drug are given to a small number of subjects (10 to 15) to gather preliminary data on the agent's pharmacodynamics (what the drug does to the body) and pharmacokinetics (what the body does to the drug).

In Phase 1 trials, researchers test an experimental drug or treatment in a small group of people (20–80) for the first time to evaluate its safety, determine a safe dosage range, and identify side effects.

In Phase 2 trials, the experimental treatment is given to a larger group of people (100–300) to see if it is effective and to further evaluate its safety, usually against a placebo.

In Phase 3 trials, the treatment is given to large groups of people (1,000–3,000) to confirm its effectiveness, monitor side effects, compare it to commonly used treatments, and collect information that will allow it to be used safely.

In Phase 4 trials, postmarketing studies delineate additional information, including the treatment's risks, benefits, and optimal use.

Pre-clinical trials (in-vitro and in-vivo)

Typically, both in vitro and in vivo tests will be performed. Studies of a drug's toxicity include which organs are targeted by that drug, as well as if there are any long-term carcinogenic effects or toxic effects on mammalian reproduction.

2003 Paper 1

A disease has an annual incidence of 15 cases per 100,000. The mean survival after diagnosis is five years.

What is the best estimate of the prevalence of this disorder?

A. 3 per 100,000.

B. 10 per 100,000.

C. 20 per 100,000.

D. 45 per 100,000.

E. 75 per 100,000.

E. 75 per 100,000.

There are 15 new cases per 100,000 each year (incidence), and patients survive for 5 years on average. At any given point in time, about 5 years' worth of new cases are still alive, so the prevalence is roughly 5 x 15 = 75 per 100,000 (prevalence ≈ incidence x mean duration of disease).
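The same reasoning as a one-line calculation. This is a sketch that assumes a steady state (constant incidence and duration) and a rare disease, which is what the approximation relies on; the function name is illustrative.

```python
def approximate_prevalence(incidence_per_100k_per_year, mean_duration_years):
    """Steady-state approximation: prevalence ≈ incidence x mean duration."""
    return incidence_per_100k_per_year * mean_duration_years


prevalence = approximate_prevalence(15, 5)
print(f"Prevalence ≈ {prevalence} per 100,000")
```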

Prevalence

- how commonly a disease occurs in a population at a given point in time

- number with disease / number examined

- in this case - 15 people per year accumulate up to 5 years, so at any one time within a 5 year period, there should be 75 per 100,000 with the disease.

Incidence

The percent of a population who will develop a disease during a specified interval, e.g. a study that found the incidence of chlamydia amongst new college students to be 2% per annum means that 2% of the students contracted chlamydia during the year.

Incidence = number of new cases of a disease during a specified time period / size of the population at risk

For example, the incidence of meningitis in the UK in 1999 could be calculated by finding the number of new meningitis cases registered during 1999 and dividing that number by the population of the UK. As this incidence rate would be very small, we tend to express it as the number of cases per 100,000 people.

Define incidence

**Incidence**

The percent of a population who will develop a disease during a specified interval, e.g. a study that found the incidence of chlamydia amongst new college students to be 2% per annum means that 2% of the students contracted chlamydia during the year.

Incidence = number of new cases of a disease during a specified time period / size of the population at risk

For example, the incidence of meningitis in the UK in 1999 could be calculated by finding the number of new meningitis cases registered during 1999 and dividing that number by the population of the UK. As this incidence rate would be very small, we tend to express it as the number of cases per 100,000 people.

Define prevalence

**Prevalence**

- how commonly a disease occurs in a population at a given point in time

- number with disease / number examined

- in this case - 15 people per year accumulate up to 5 years, so at any one time within a 5 year period, there should be 75 per 100,000 with the disease.

Example. There are 15 new cases per 100,000 each year (incidence), with a mean survival of 5 years. At any given point in time, about 5 years' worth of new cases are still alive, so the prevalence is roughly 5 x 15 = 75 per 100,000.

Define non-maleficence

###
Nonmaleficence

Primum non nocere - first, do no harm

"Given an existing problem, it may be better not to do something, or even to do nothing, than to risk causing more harm than good." It reminds the health care provider that they must consider the possible harm that any intervention might do. It is invoked when debating the use of an intervention that carries an obvious risk of harm but a less certain chance of benefit.

2002a Q31

A drug trial is designed to compare the efficacy and safety of a new antihypertensive drug (not yet marketed) to an established drug from the same class. The trial has a randomised, parallel group design with a three-month treatment period, and is to be carried out with a total of 1000 subjects in each group, spread across 50 trial centres.

This type of trial is best described as belonging to which phase of drug development?

A. Preclinical efficacy studies.

B. Clinical phase I studies.

C. Clinical phase II studies.

D. Clinical phase III studies.

E. Clinical phase IV studies.

D. Clinical phase III studies.

In Phase 3 trials, the treatment is given to large groups of people (1,000–3,000) to confirm its effectiveness, monitor side effects, compare it to commonly used treatments, and collect information that will allow it to be used safely.

2002a Q39

The principal aim of a phase I trial of a cytotoxic agent is to:

A. maintain the patient’s hope that treatment is possible.

B. determine the best schedule of administration.

C. define tumour response rate.

D. measure progression-free survival.

E. establish the maximum tolerated dose (MTD).

E. establish the maximum tolerated dose (MTD).

In Phase 1 trials, researchers test an experimental drug or treatment in a small group of people (20–80) for the first time to evaluate its safety, determine a safe dosage range (i.e. the maximum tolerated dose), and identify side effects.

2002a Q65

The diagram above demonstrates cancer survival curves after treatment A and treatment B.

With regard to survival differences demonstrated by the curves, treatment A has:

A. longer median survival and better long-term survival.

B. same median survival and better long-term survival.

C. shorter median survival and better long-term survival.

D. longer median survival and worse long-term survival.

E. same median survival and worse long-term survival.

C. shorter median survival and better long-term survival.

The median survival is the time at which the percentage surviving is 50%. If more than half the patients are cured, there is no such point on the survival curve and the median is undefined (and often described as greater than the longest time on the curve). I like undefined medians!

However, a median survival or a survival percentage at x years won't give you the full story. The drop-off may continue past 5 years but then flatten out at 6, which means no further deaths. People fortunate enough to make it out to six or seven years may well be cured. You can't tell that from the median or 5 year survival or from any other single point. It's the shape of the survival curve that tells this story.

Curves which flatten to a level plateau, suggest that patients are being cured, and curves which descend all the way to zero, imply that no one (or almost no one) is cured.

2002b Q 49

The impact of an intervention in clinical trials and in systematic reviews can be expressed in a number of ways. One increasingly used format is the number needed to treat (NNT) which indicates how many patients have to be treated with the intervention of interest compared to the control intervention in order to achieve one successful outcome. In a recent systematic review of optimal self-management for adult asthma, the intervention was found to produce a 50% reduction in hospitalisation for asthma. Approximately 10% of patients in the control group required hospitalisation compared to approximately 5% of those who received optimal self-management.

Which one of the following is the best estimate of the number needed to treat for this intervention?

A. 1.

B. 2.

C. 5.

D. 20.

E. 50.

D. 20.

ARR = 5% = 0.05

NNT = 1/ 0.05 = 20

A clinically useful measure of the absolute benefit or harm of an intervention expressed in terms of the number of patients who have to be treated for one of them to benefit or be harmed.

Calculated as 1/ARR.

Example: The ARR of a stroke with warfarin is 2% (=2/100 = 0.02),

The NNT is 1/0.02 = 50.

e.g. Drug A reduces risk of a MI from 10% to 5%, what is the NNT?.

The ARR is 5% (0.05), so the NNT is 1/0.05 = 20.

ARR

The absolute risk is the actual, arithmetic risk of an event happening. The ARR (sometimes also called the Risk Difference) is the difference between 2 event rates e.g. AR of a MI with placebo over 5 years is 5% and with drug A is 3%, the ARR is simply 2%. This is the difference between the CER (control event rate) and the EER (experimental event rate). Knowing the absolute risk is essential when deciding how clinically relevant a study is. Absolute risk increase (ARI) similarly calculates an absolute difference in bad events happening in a trial e.g. when the experimental treatment harms more patients than the control. e.g. Drug B reduces the chance of a stroke from 20% (CER) to 17% (EER). What is the ARR? Answer 3%.

Remember that it is expressed as a percentage (and thus must be converted to a decimal before calculating the NNT).

2002b Q92

In a randomised controlled trial of a cancer screening program, the most important indicator of the program’s effectiveness is demonstration in the screened subpopulation of improved:

A. case detection.

B. case detection at an earlier stage.

C. cancer-specific survival.

D. cancer mortality.

E. cancer progression-free survival.

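D. cancer mortality.

Earlier detection lengthens survival measured from the time of diagnosis even if death is not postponed (lead-time bias), so detection rates and survival can improve without any real benefit. Only a fall in cancer mortality shows that the program works.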

Screening

Screening, in medicine, is a strategy used in a population to identify an unrecognized disease in individuals without signs or symptoms. This can include individuals with pre-symptomatic or unrecognized symptomatic disease. As such, screening tests are somewhat unique in that they are performed on persons apparently in good health.

Screening interventions are designed to identify disease in a community early, thus enabling earlier intervention and management in the hope to reduce mortality and suffering from a disease. Although screening may lead to an earlier diagnosis, not all screening tests have been shown to benefit the person being screened; overdiagnosis, misdiagnosis, and creating a false sense of security are some potential adverse effects of screening. For these reasons, a test used in a screening program, especially for a disease with low incidence, must have good sensitivity in addition to acceptable specificity.

World Health Organization guidelines, often referred to as Wilson's criteria, were published in 1968 but are still applicable today.

1. The condition should be an important health problem.

2. There should be a treatment for the condition.

3. Facilities for diagnosis and treatment should be available.

4. There should be a latent stage of the disease.

5. There should be a test or examination for the condition.

6. The test should be acceptable to the population.

7. The natural history of the disease should be adequately understood.

8. There should be an agreed policy on whom to treat.

9. The total cost of finding a case should be economically balanced in relation to medical expenditure as a whole.

10. Case-finding should be a continuous process, not just a "once and for all" project.

What is meant by the term generalisability (external validity)?

External validity is the extent to which the results of a study can be generalized to other situations and to other people.

##
In clinical trials, the main purpose of randomisation is to:

A. remove bias in the allocation of treatment.

B. increase generalisability.

C. study interaction between treatments.

D. reduce bias in measurement of the outcome.

E. increase precision.

###
A. remove bias in the allocation of treatment.

Randomization

The thinking behind random assignment is that, by randomizing treatment assignment, the group attributes for the different treatments will be roughly equivalent, so any effect observed between treatment groups can be linked to the treatment and not to a characteristic of the individuals in the group.

To ensure that the groups being compared are equivalent, patients are allocated to them randomly, i.e. by chance. If the initial selection and randomization is done properly, the control and treatment groups will be comparable at the start of the investigation; any differences between groups are chance occurrences unaffected by the conscious or unconscious biases of the investigators.

Define accuracy and precision

###
Systematic variation (bias) leads to results being inaccurate.

Random variation (chance) leads to results being imprecise.

For example, a huge observational study of thousands of patients may produce results that are precise but not accurate, whereas a small, high quality randomised controlled trial may produce results that are accurate but not precise.

What is a confidence interval (confidence limit)? What do they tell us?

###
Confidence Intervals (also called Confidence Limits)

- range of numbers within which there is a 95% chance the true result lies

- tells us if a result is statistically significant AND gives an idea of its precision

- e.g. NNT 30 (95% CI 12-56) means the result is 30 and we are 95% confident the true result is between 12 and 56. **i.e. if we repeated this trial 100 times, about 95 of the intervals calculated would contain the true result.**

- Non-significant if the interval includes the ‘no effect’ value, i.e. 1 for results expressed as ratios (e.g. RR, OR) and 0 for results expressed as differences (e.g. ARR, difference in percentages)

- Wide CI = less precise result

- Narrow CI = more precise result

- The larger the numbers in the study, the narrower the CI and the more precise the result.

- Meta-analyses pool many studies, so they tend to have the tightest CIs.

- Other factors affecting CI size: size of sample, confidence level, population variability (although a large sample size will better estimate the population parameter).
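The effect of sample size on CI width can be illustrated with a rough sketch for an absolute risk reduction. The event counts are hypothetical, and the interval uses the simple normal (Wald) approximation, which is an assumption here; other interval methods exist and may be preferred for small samples.

```python
import math


def arr_confidence_interval(events_control, n_control, events_treated, n_treated, z=1.96):
    """Return (ARR, lower, upper) using a normal-approximation 95% CI."""
    cer = events_control / n_control
    eer = events_treated / n_treated
    arr = cer - eer
    standard_error = math.sqrt(
        cer * (1 - cer) / n_control + eer * (1 - eer) / n_treated
    )
    return arr, arr - z * standard_error, arr + z * standard_error


# Hypothetical trials with the same event rates (10% vs 5%) but different sizes
small = arr_confidence_interval(10, 100, 5, 100)
large = arr_confidence_interval(100, 1000, 50, 1000)

for name, (arr, low, high) in (("small", small), ("large", large)):
    significant = not (low <= 0 <= high)  # 0 is the 'no effect' value for a difference
    print(f"{name}: ARR {arr:.3f} (95% CI {low:.3f} to {high:.3f}), significant: {significant}")
```

Both trials estimate the same ARR, but only the larger one has an interval narrow enough to exclude zero.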

What is a Cox Hazard Model?

###
A Cox model is a statistical technique for exploring the relationship between the survival of a patient and several explanatory variables.

Survival analysis is concerned with studying the time between entry to a study and a subsequent event (such as death). A Cox model provides an estimate of the treatment effect on survival after adjustment for other explanatory variables.

In addition, it allows us to estimate the hazard (or risk) of death for an individual, given their prognostic variables.

Interpreting the Cox model involves examining the coefficients for each explanatory variable. A positive regression coefficient for an explanatory variable means that the hazard is higher, and thus the prognosis worse. Conversely, a negative regression coefficient implies a better prognosis for patients with higher values of that variable.
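In practice the coefficients are usually read as hazard ratios, obtained by exponentiating them. A minimal sketch follows; the variable names and coefficient values are hypothetical and chosen only to show the direction of effect.

```python
import math


def hazard_ratio(coefficient):
    """Hazard ratio for a one-unit increase in a variable: exp(coefficient)."""
    return math.exp(coefficient)


# Hypothetical Cox regression coefficients, for illustration only
coefficients = {"age_per_decade": 0.405, "on_treatment": -0.693}

for variable, beta in coefficients.items():
    hr = hazard_ratio(beta)
    direction = "worse" if hr > 1 else "better"
    print(f"{variable}: coefficient {beta:+.3f}, HR {hr:.2f} ({direction} prognosis)")
```

A coefficient of zero gives a hazard ratio of 1, the 'no effect' value for a ratio.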

##
The following table shows the results of a study of a new diagnostic screening test for multiple sclerosis.

| Test result | Disease present: Yes | Disease present: No |
|---|---|---|
| Positive | 120 | 40 |
| Negative | 60 | 200 |

In this study, which of the following measurements of test performance has the highest calculated value?

A. Sensitivity.

B. Specificity.

C. Positive predictive value.

D. Negative predictive value.

E. Negative likelihood ratio.

###
B. Specificity.

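Specificity is the probability of a negative test among patients without the disease. A very specific test will have few false positives and be good at ruling a disease in. SpPIn means that if a test is highly Specific (Sp), a Positive result rules the diagnosis in.

The specificity of the test is d/(b+d)

Or true negatives / (false positives + true negatives). Here that is 200/(40+200) = 0.83, higher than the sensitivity (120/180 = 0.67), PPV (120/160 = 0.75), NPV (200/260 = 0.77) and negative likelihood ratio (0.33/0.83 = 0.4).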

If you have no false positives (ie a highly specific test), then you can be confident that a positive is a true positive.
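All five measures in the multiple sclerosis question can be computed from the 2x2 table in one go. This sketch uses the a/b/c/d cell convention from these notes (a = true positives, b = false positives, c = false negatives, d = true negatives); the function name is illustrative.

```python
def two_by_two_metrics(a, b, c, d):
    """Test performance from a 2x2 table: a=TP, b=FP, c=FN, d=TN."""
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "negative_lr": (1 - sensitivity) / specificity,
    }


metrics = two_by_two_metrics(a=120, b=40, c=60, d=200)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")

highest = max(metrics, key=metrics.get)
print(f"Highest value: {highest}")
```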

##
2008a Q58

In a clinical trial of memantine in Alzheimer’s disease, 252 patients were randomised to receive either memantine or placebo. There were two primary outcomes: the CIBIC-Plus (a clinician impression scale) and the ADCS-ADL (a scale of activities of daily living), and five secondary outcomes. One hundred and eighty-one subjects completed the trial. Base-line characteristics and results from the study are shown below.

The main reason that it is not possible from the results of this trial to conclude with certainty that memantine is either effective or ineffective is because:

A. P = 0.06 for one of the primary outcomes.

B. the drop out rate is too high.

C. the effect size is small.

D. there were too many secondary outcomes.

E. the groups were poorly matched.

###
B. the drop out rate is too high.

Only 181 of the 252 randomised patients (72%) completed the trial, and in the placebo group roughly 50% dropped out, which is much too high. Drop-out should be less than 20%.

A is partly correct in that this represents borderline significance, but the trial is not huge, and there is a better answer here.

C is incorrect as the effect size is statistically significant for at least one of the measures.

D is incorrect - 10 secondary outcomes will give you a 40% chance of a meaningless result becoming significant, but five is pretty reasonable

E is incorrect - the intervention and placebo groups were fairly equally matched in spite of women outnumbering men.

In spite of the statistical significance, this is an example of Type 1 error - a falsely positive result.
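The 40% figure quoted for ten secondary outcomes can be reproduced with a short sketch. It assumes each outcome is tested independently at p < 0.05, which is a simplification: outcomes in a real trial are usually correlated.

```python
def chance_of_false_positive(n_outcomes, alpha=0.05):
    """Probability that at least one of n independent tests is 'significant' by chance."""
    return 1 - (1 - alpha) ** n_outcomes


for n in (1, 5, 10):
    print(f"{n} outcomes: {chance_of_false_positive(n):.0%} chance of a spurious result")
```

Five outcomes still carry roughly a one-in-four chance of at least one spurious result under these assumptions, so secondary outcomes should be read with caution.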

##
A sales representative brings you the following abstract in an advertising pamphlet:

‘Fifty consecutive patients with dilated cardiomyopathy in NYHA class II-IV with a left ventricular ejection fraction (LVEF) of 35% or below were studied with full polysomnography over one night. The mean Apnoea-Hypopnoea Index of beta-blocker free patients was 19.8 +/- 14.2 versus 7.4 +/- 8.5 (p [...]

The sales representative suggests that beta blockers should be used in heart failure patients with sleep-disordered breathing, and that metoprolol is better than carvedilol in this situation.

Which is the strongest reason to disregard this advice?

A. The mean value may be misleading when the data range is wide.

B. The results of a single polysomnography study may not be reproducible.

C. In a nonrandomised study, differences may not be a treatment effect.

D. Patient characteristics are not given in sufficient detail.

E. The difference between carvedilol and metoprolol does not have a statistical comparison.

###
C. In a nonrandomised study, differences may not be a treatment effect.

All of the above are to some extent correct; however, the biggest problem is the lack of randomisation: the difference could have come from some lurking variable that randomisation would otherwise have balanced between the groups.

Randomization

The thinking behind random assignment is that, by randomizing treatment assignment, the group attributes for the different treatments will be roughly equivalent, so any effect observed between treatment groups can be linked to the treatment and not to a characteristic of the individuals in the group.

To ensure that the groups being compared are equivalent, patients are allocated to them randomly, i.e. by chance. If the initial selection and randomization is done properly, the control and treatment groups will be comparable at the start of the investigation; any differences between groups are chance occurrences unaffected by the conscious or unconscious biases of the investigators.

##
2008b Q66

Many medical journals require pre-registration of drug trials as a pre-requisite to considering publication of trial results. Which is the most important reason for adopting this policy?

A. Reduction in publication bias.

B. Reduction in fraudulent data.

C. Independent data analysis.

D. Reduction in "ghost writing" of trial results.

E. Increased peer review of trial design.

###
A. Reduction in publication bias.

Publication bias occurs when the publication of research results depends on their nature and direction. This is usually a bias towards reporting significant results, despite the fact that studies with significant results do not appear to be superior to studies with a null result with respect to quality of design. In an effort to decrease this problem, some prominent medical journals require registration of a trial before it commences so that unfavorable results are not withheld from publication.

##
In a randomised controlled trial of an intervention to stop smoking in patients with chronic obstructive pulmonary disease (COPD), the use of an intention-to-treat analysis is most likely to cause which one of the following?

A. Bias that overestimates the efficacy of the intervention.

B. Bias that underestimates the efficacy of the intervention.

C. Bias due to selection of subjects.

D. Bias due to measurement of the outcome.

E. Bias due to known and unknown confounders.

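B. Bias that underestimates the efficacy of the intervention.

Intention-to-treat analysis keeps patients in the group to which they were randomised whether or not they adhered to the intervention, so non-adherence dilutes the observed treatment effect.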

Define odds

###
Odds - the ratio of the number of people who incur a particular outcome to the number of people who do not incur the outcome

NOT the ratio of the number of people who incur a particular outcome to the total number of people

What's the difference between odds and probability?

###
Odds is the chance of something happening compared with the chance of it not happening, e.g. what are the odds that the day of the week is Sunday?

Odds = 1 : 6 (1 chance it is Sunday, 6 chances it is not)

Probability = 1 in 7 (1 chance it is Sunday, out of 7 possibilities)
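The Sunday example can be used to check the conversion between the two in both directions. This is a small sketch using exact fractions; the function names are illustrative.

```python
from fractions import Fraction


def probability_to_odds(p):
    """Odds = probability of the event / probability of no event."""
    return p / (1 - p)


def odds_to_probability(odds):
    """Probability = odds / (1 + odds)."""
    return odds / (1 + odds)


p_sunday = Fraction(1, 7)
odds_sunday = probability_to_odds(p_sunday)

print(f"Probability {p_sunday} corresponds to odds of {odds_sunday}")
print(f"Odds {odds_sunday} correspond to a probability of {odds_to_probability(odds_sunday)}")
```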

Does an odds ratio of > 1 or < 1 correlate with reduced risk?

###
OR < 1 = reduced risk (less likely to occur)

OR = 1 = no difference in risk

OR > 1 = increased risk (more likely to occur)

What is Attributable and Population-attributable Risk?

###
Attributable risk is the difference in rate of a condition between an exposed population and an unexposed population. Attributable risk is mostly calculated in cohort studies, where individuals are grouped by exposure status and followed over a period of time.

Attributable risk = Risk difference
= Incidence rate of the disease in the exposed group
− Incidence rate of the disease in the unexposed group
= a/(a+b) − c/(c+d)

Population attributable risk = Incidence rate in the total population
− Incidence rate in the unexposed group
= Attributable risk × Prevalence of the exposure in the population
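A worked sketch of both quantities from a cohort 2x2 table (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without). The counts are hypothetical. Population attributable risk is taken here as the incidence in the total population minus the incidence in the unexposed group, which is the form that equals attributable risk × exposure prevalence.

```python
def attributable_risk(a, b, c, d):
    """Risk difference: incidence in exposed minus incidence in unexposed."""
    return a / (a + b) - c / (c + d)


def population_attributable_risk(a, b, c, d):
    """Incidence in the total population minus incidence in the unexposed."""
    incidence_total = (a + c) / (a + b + c + d)
    incidence_unexposed = c / (c + d)
    return incidence_total - incidence_unexposed


# Hypothetical cohort: 100 exposed (30 cases), 200 unexposed (10 cases)
a, b, c, d = 30, 70, 10, 190
ar = attributable_risk(a, b, c, d)
par = population_attributable_risk(a, b, c, d)
exposure_prevalence = (a + b) / (a + b + c + d)

print(f"Attributable risk: {ar:.3f}")
print(f"Population attributable risk: {par:.3f}")
print(f"AR x exposure prevalence: {ar * exposure_prevalence:.3f}")
```

The last two printed values agree, confirming the identity in this example.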

Describe the levels and grades of evidence