Prevalence

# of existing cases of a disease at a specified time / # of people in base population at that time

Incidence

# of new cases occurring in a specific time period / # of people initially at risk

Case fatality

# of people who die of a disease / total # of people with the disease

Mortality

# of people dying of a disease in a specified time period / # of people alive during that time period
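The four frequency measures above differ only in which counts go in the numerator and denominator. A minimal sketch with made-up numbers (the town, its counts, and all function names are hypothetical, chosen only to illustrate the ratios):

```python
def prevalence(existing_cases, population):
    """Existing cases at a point in time / base population at that time."""
    return existing_cases / population


def incidence(new_cases, initially_at_risk):
    """New cases during the period / people initially at risk."""
    return new_cases / initially_at_risk


def case_fatality(deaths_from_disease, people_with_disease):
    """Deaths from the disease / everyone who has the disease."""
    return deaths_from_disease / people_with_disease


def mortality(deaths_from_disease, population_alive):
    """Deaths from the disease during the period / people alive in the period."""
    return deaths_from_disease / population_alive


# Hypothetical town of 10,000 people followed for one year (illustrative only)
population = 10_000
existing_cases = 200   # already ill at the start of the year
new_cases = 50         # developed the disease during the year
deaths = 10            # died of the disease during the year

prev = prevalence(existing_cases, population)
# People who already have the disease are not at risk of becoming a new case
inc = incidence(new_cases, population - existing_cases)
cfr = case_fatality(deaths, existing_cases + new_cases)
mort = mortality(deaths, population)

print(f"prevalence={prev:.3f} incidence={inc:.4f} "
      f"case fatality={cfr:.3f} mortality={mort:.4f}")
```

Note that case fatality and mortality share a numerator but not a denominator, which is why a rare but lethal disease can have high case fatality and low mortality.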

Years of Potential Life Lost (YPLL)

Calculated by multiplying the number of cause-specific deaths in an age group by the difference between a reference age (conventionally 75) and the midpoint of the age group, then summing across age groups
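A short sketch of the YPLL arithmetic, assuming a reference age of 75 and age groups given as (lower, upper) bounds whose midpoint is the simple average of the bounds. The death counts are invented for illustration, and the function name is hypothetical.

```python
REFERENCE_AGE = 75  # assumed reference age, as in the definition above


def ypll(deaths_by_age_group, reference_age=REFERENCE_AGE):
    """Sum deaths * (reference age - age-group midpoint) over all age groups."""
    total = 0.0
    for (low, high), deaths in deaths_by_age_group.items():
        midpoint = (low + high) / 2
        # Deaths at or above the reference age contribute no lost years
        years_lost_per_death = max(reference_age - midpoint, 0)
        total += deaths * years_lost_per_death
    return total


# Hypothetical cause-specific deaths per age group (illustrative only)
deaths = {(20, 30): 10, (40, 50): 20, (60, 70): 40}

# 10 * (75 - 25) + 20 * (75 - 45) + 40 * (75 - 65) = 500 + 600 + 400
total_ypll = ypll(deaths)
print(total_ypll)
```

Deaths in younger groups dominate the total even when they are fewer, which is the point of the measure.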

Reasons for an association between a factor and a disease

- Bias in the sampling of subjects
- Bias in the measurement of the factor
- Confounding
- Chance
- Transposition of cause and effect
- Causal

Relative Risk

Incidence of disease in the exposed / incidence of disease in the unexposed

Ex: 3.5x more likely to have cancer if you smoke

Attributable Risk

The difference between incidence of the disease in individuals with a risk factor and in those without a risk factor

AR = risk in the exposed - risk in the unexposed

Ex: In population X, 500 per 1,000 cases are due to smoking
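Relative risk and attributable risk use the same two incidences, one as a ratio and one as a difference. A minimal sketch with hypothetical cohort counts (all numbers are invented, chosen so that the relative risk comes out near 3.5):

```python
def risk(cases, total):
    """Incidence (risk) in a group: cases / people in the group."""
    return cases / total


# Hypothetical cohort counts (illustrative only)
exposed_cases, exposed_total = 70, 1000
unexposed_cases, unexposed_total = 20, 1000

risk_exposed = risk(exposed_cases, exposed_total)        # 0.07
risk_unexposed = risk(unexposed_cases, unexposed_total)  # 0.02

# Ratio of incidences: how many times more likely the exposed are to get disease
relative_risk = risk_exposed / risk_unexposed

# Difference of incidences: excess risk associated with the exposure
attributable_risk = risk_exposed - risk_unexposed

print(f"RR = {relative_risk:.2f}, AR = {attributable_risk:.3f}")
```

Here the exposed are about 3.5 times as likely to develop disease, while the excess risk is 5 cases per 100 exposed people.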

Number Needed to Treat (NNT)

NNT = 1 / attributable risk

If the attributable risk is 10 cases per 50 people (10/50 = 0.2), then NNT = 1 / 0.2 = 5: 5 people need to be treated in order to gain 1 outcome
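The NNT arithmetic can be sketched as follows. The function name is hypothetical, and rounding up to the next whole person is a common convention assumed here rather than something stated above. Exact fractions are used to avoid floating-point surprises before rounding.

```python
import math
from fractions import Fraction


def number_needed_to_treat(excess_cases, group_size):
    """NNT = 1 / attributable risk, rounded up to a whole number of people."""
    attributable_risk = Fraction(excess_cases, group_size)
    if attributable_risk <= 0:
        raise ValueError("attributable risk must be positive")
    return math.ceil(1 / attributable_risk)


# The example from the text: 10 cases out of 50 -> attributable risk of 0.2
nnt_example = number_needed_to_treat(10, 50)

# A smaller risk difference means more people must be treated per outcome
nnt_small = number_needed_to_treat(3, 100)  # 1 / 0.03 = 33.3..., rounded up

print(nnt_example, nnt_small)
```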

Population Attributable Risk (PAR)

The proportion of cases that would be prevented if the risk factor could be eliminated from the population

PAR = (total incidence - incidence in unexposed) / total incidence
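A worked sketch of the PAR formula, assuming a hypothetical population of 10,000 in which 30% are exposed. All counts and the function name are invented for illustration.

```python
def population_attributable_risk(total_incidence, unexposed_incidence):
    """(total incidence - incidence in unexposed) / total incidence."""
    return (total_incidence - unexposed_incidence) / total_incidence


# Hypothetical population (illustrative only)
exposed_total, exposed_cases = 3000, 210      # incidence 0.07
unexposed_total, unexposed_cases = 7000, 140  # incidence 0.02

total_incidence = (exposed_cases + unexposed_cases) / (exposed_total + unexposed_total)
unexposed_incidence = unexposed_cases / unexposed_total

par = population_attributable_risk(total_incidence, unexposed_incidence)
print(f"PAR = {par:.1%}")
```

With these numbers about 43% of all cases in the population would be prevented if the exposure were removed, even though only 30% of people are exposed.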

Cross-Sectional study

Usually large surveys of a representative sample group; assesses both risk factors and disease status in the present in order to draw correlations; can give information about disease prevalence and its association with different risk factors, but because exposure and disease are measured at the same time it cannot establish which came first or measure incidence

Case-control (retrospective study)

A sample of cases and controls is examined for past history of risk factors; a higher frequency of past exposure among cases than among controls is taken as evidence of an association between the risk factor and the disease

Susceptible to sampling bias and recall bias

Cohort Study

AKA Prospective, longitudinal study

A cohort (exposed and unexposed) is assembled and followed over time to determine who develops disease

Limited utility for very rare diseases or very long latencies; vulnerable to bias if loss to follow-up is unequal in exposed vs. unexposed groups

Randomized Controlled Trial (RCT)

Participants are randomized to trial arm (exposed) or control arm (unexposed) and followed to assess outcomes

High internal validity, lower external validity (participants are selected and assigned artificially and so may not be representative of the general population)

Vulnerable to cross-over; must use an intent-to-treat analysis

Sensitivity

Describes how good a screening test is at detecting true positives.

Sensitivity = TP / (TP + FN)

Specificity

Describes how good a test is at identifying true negatives, i.e. the proportion of disease-free individuals who test negative.

Specificity = TN / (TN + FP)

Positive Predictive Value (PPV)

Describes how often individuals with positive test results actually have the disease

PPV = TP / (TP + FP)

PPV increases with higher disease prevalence. Increased test specificity increases PPV.

Negative Predictive Value (NPV)

Describes how often individuals with negative test results are actually disease-free

NPV = TN / (TN + FN)

NPV increases with lower disease prevalence and increased sensitivity.
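The four test measures above all come from the same 2x2 table of test result against true disease status. A minimal sketch with invented counts follows. The function names are hypothetical, and the second function uses the standard Bayes' rule expression for PPV to show how it moves with prevalence.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV via Bayes' rule: true positives / all positives in the population."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical screening results for 1,000 people (illustrative only)
metrics = screening_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Same test (90% sensitive, 90% specific) at two different prevalences
ppv_rare = ppv_from_prevalence(0.9, 0.9, prevalence=0.01)
ppv_common = ppv_from_prevalence(0.9, 0.9, prevalence=0.20)
print(f"PPV at 1% prevalence: {ppv_rare:.3f}, at 20%: {ppv_common:.3f}")
```

Sensitivity and specificity are properties of the test, while PPV and NPV also depend on how common the disease is in the group being tested.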

Lead-time bias

If a new screening test diagnoses a disease one year earlier but treatment has no effect on survival, data will show a false 1-year improvement in survival; this is an artifact of lead-time bias

Length-time bias

Screening tests are more likely to pick up individuals with longer asymptomatic phases (because patients with rapidly progressing disease become symptomatic and are diagnosed before screening, we assume); therefore, cases detected by screening are more likely to have slowly progressing disease and longer survival regardless of whether the disease is detected by screening or not

2 methods for combining screening tests

- Parallel testing - The overall screening result is positive if any one test is positive; increases sensitivity at the expense of specificity
- Series testing - The overall screening result is positive only if all tests are positive; increases specificity at the expense of sensitivity
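The trade-off between the two combination methods can be sketched numerically. This assumes the two tests are conditionally independent given true disease status, which is a simplifying assumption not stated above, and the test characteristics and function names are invented for illustration.

```python
def parallel(sens1, spec1, sens2, spec2):
    """Positive if ANY test is positive (assumes independent tests)."""
    # A case is missed only if both tests miss it
    sensitivity = 1 - (1 - sens1) * (1 - sens2)
    # A healthy person is cleared only if both tests are negative
    specificity = spec1 * spec2
    return sensitivity, specificity


def series(sens1, spec1, sens2, spec2):
    """Positive only if ALL tests are positive (assumes independent tests)."""
    # A case is detected only if both tests detect it
    sensitivity = sens1 * sens2
    # A healthy person is mislabeled only if both tests are falsely positive
    specificity = 1 - (1 - spec1) * (1 - spec2)
    return sensitivity, specificity


# Hypothetical tests: A (80% sensitive, 90% specific), B (90% sensitive, 95% specific)
par_sens, par_spec = parallel(0.80, 0.90, 0.90, 0.95)
ser_sens, ser_spec = series(0.80, 0.90, 0.90, 0.95)

print(f"parallel: sensitivity={par_sens:.3f}, specificity={par_spec:.3f}")
print(f"series:   sensitivity={ser_sens:.3f}, specificity={ser_spec:.3f}")
```

Under this assumption, parallel testing ends up more sensitive than either test alone and series testing more specific than either test alone.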

2 types of quantitative variables

- Continuous, e.g. weight, BP, serum levels; continuous data are compared with a t-test
- Categorical (nominal or ordinal), e.g. no disease vs. disease; categorical data are compared with a chi-square test

Mode

The most commonly observed value

Median

The middle value in a data set arranged lowest to highest

Mean

The arithmetic average

Mean = sum(x) / n

Range

The highest value minus the lowest value

Variance

Variance = sum((x - mean)^2) / (n - 1)

Standard deviation

The square root of the variance
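The descriptive statistics above can be checked on a small data set with Python's standard `statistics` module. The data values are invented for illustration. `statistics.variance` and `statistics.stdev` use the n - 1 denominator, matching the variance formula above.

```python
import statistics

# Hypothetical data set (illustrative only)
data = [2, 4, 4, 5, 7, 8]

mode = statistics.mode(data)            # most commonly observed value
median = statistics.median(data)        # middle value of the sorted data
mean = statistics.mean(data)            # sum(x) / n
value_range = max(data) - min(data)     # highest value minus lowest value
variance = statistics.variance(data)    # sum((x - mean)^2) / (n - 1)
std_dev = statistics.stdev(data)        # square root of the variance

print(f"mode={mode} median={median} mean={mean} range={value_range}")
print(f"variance={variance:.2f} sd={std_dev:.3f}")
```

With an even number of values, the median is the average of the two middle values (here 4 and 5).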

Characteristics of a normal distribution

Mean = median = mode

68% of values lie within 1 standard deviation of the mean

95% of values lie within 2 standard deviations of the mean
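The coverage figures for a normal distribution can be computed directly from the standard normal curve with `statistics.NormalDist` from the Python standard library. The helper name is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def fraction_within(k):
    """Fraction of values within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_1_sd = fraction_within(1)
within_2_sd = fraction_within(2)
within_196_sd = fraction_within(1.96)

print(f"within 1 SD:    {within_1_sd:.4f}")
print(f"within 2 SD:    {within_2_sd:.4f}")
print(f"within 1.96 SD: {within_196_sd:.4f}")
```

The exact figure for 2 standard deviations is slightly above 95%; the value that gives exactly 95% is about 1.96 standard deviations, which is why 1.96 appears in 95% confidence interval formulas.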

Type I error

False Positive

The investigator wrongly concludes that there is a difference, when the difference actually occurred as a result of chance or bias

Type II error

False Negative

The investigator concludes that there is no difference, when actually there is a true difference obscured by chance or bias

P value

The probability of observing an association at least as strong as the one seen if chance alone were operating (i.e. if there were no true association)

.05 is a conventional threshold

Confidence Interval

The range of values within which you are X% confident that the true value lies

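A 95% CI corresponds to a threshold of P = .05: if the 95% CI excludes the null value (e.g. a difference of 0 or a relative risk of 1), then P < .05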
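The link between a 95% CI and the P = .05 threshold can be sketched for a single estimate with a known standard error. This assumes a normal approximation (a z-based interval and two-sided test), which is one common choice rather than the only one, and the estimates and function name are invented for illustration.

```python
from statistics import NormalDist


def ci_and_p(estimate, standard_error, null_value=0.0, confidence=0.95):
    """z-based confidence interval and two-sided P value (normal approximation)."""
    z_dist = NormalDist()
    # About 1.96 for a 95% interval
    z_crit = z_dist.inv_cdf(1 - (1 - confidence) / 2)
    lower = estimate - z_crit * standard_error
    upper = estimate + z_crit * standard_error
    z = (estimate - null_value) / standard_error
    p_value = 2 * (1 - z_dist.cdf(abs(z)))
    return lower, upper, p_value


# Hypothetical difference between two groups of 5.0, with two possible precisions
lo_precise, hi_precise, p_precise = ci_and_p(estimate=5.0, standard_error=2.0)
lo_noisy, hi_noisy, p_noisy = ci_and_p(estimate=5.0, standard_error=3.0)

print(f"SE=2: CI=({lo_precise:.2f}, {hi_precise:.2f}), P={p_precise:.3f}")
print(f"SE=3: CI=({lo_noisy:.2f}, {hi_noisy:.2f}), P={p_noisy:.3f}")
```

In the first case the interval excludes 0 and P is below .05; in the second the interval includes 0 and P is above .05, even though the estimate itself is the same.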

Effect Modification

Occurs when the effect of a risk factor on an outcome is different at different levels of a third factor; the third factor is known as the effect modifier

Ex: age, gender

Beta Value

Beta is the acceptable risk of making a type II (false negative) error, arbitrarily set at .2

Power = (1 - beta) x 100%

Therefore, a study that accepts the null hypothesis should be able to show a power of 80% or higher