# Biostats & Epi for PM Flashcards

1
Q

Fetal Death Rate Equation

A

total number of fetal deaths in a given time period/(total number of live births + fetal deaths during the same period of time) x 1000

(dividing by live births alone gives the fetal death ratio)

2
Q

Infant Mortality Rate Equation

A

total number of deaths of infants (<1 y/o) in a time period/total number of live births during the same period x 1000

3
Q

Maternal Mortality Rate Equation

A

deaths due to pregnancy related illness in a given time period/total number of live births during the same period of time x 100,000

4
Q

Neonatal Mortality Rate Equation

A

total number of deaths of neonates (<28 days old) in a given time period/total number of live births during the same period of time x 1000

5
Q

Perinatal Mortality Rate Equation

A

(neonatal deaths + fetal deaths in a given time period)/(total number of live births + fetal deaths during the same time period) x 1000
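
A minimal sketch of how these mortality rates are computed. The counts are hypothetical and the helper name `rate_per` is an illustration, not part of the deck.

```python
def rate_per(events: int, denominator: int, multiplier: int = 1000) -> float:
    """Return events / denominator, scaled by the multiplier."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return events / denominator * multiplier


# Hypothetical one-year counts (illustrative only, not real data)
live_births = 10_000
fetal_deaths = 60
neonatal_deaths = 40   # deaths at < 28 days
infant_deaths = 55     # deaths at < 1 year
maternal_deaths = 2

infant_mortality = rate_per(infant_deaths, live_births)        # per 1,000 live births
neonatal_mortality = rate_per(neonatal_deaths, live_births)    # per 1,000 live births
maternal_mortality = rate_per(maternal_deaths, live_births, 100_000)
# Perinatal: fetal deaths appear in both numerator and denominator
perinatal_mortality = rate_per(neonatal_deaths + fetal_deaths,
                               live_births + fetal_deaths)
```

Note that only the maternal rate uses a multiplier of 100,000; the others are per 1,000.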

6
Q

ecological fallacy definition

A

an association at the population level is not necessarily true at the individual level

7
Q

studies with ecological fallacy

A

ecological studies (group-level data, often cross-sectional)

8
Q

vital statistics recorded (4)

A

birth, death, marriage, divorce

9
Q

length bias definition

A

when a less aggressive disease appears to have a higher incidence because slower-moving diseases are more likely to be detected

10
Q

non differential bias is the same as

A

random error

11
Q

lead-time bias definition

A

appearance that early diagnosis of a disease prolongs survival

12
Q

Hawthorne effect definition

A

individual behavior changes when a person knows they are being observed

13
Q

regression to the mean definition

A

the further a value is from the mean, the more likely future recordings are closer to the mean

14
Q

Neyman bias definition

A

selective survival bias

cases in a study have different exposures than the ones that die

15
Q

When does stratification reduce confounding?

A

analysis stage

16
Q

3 ways to reduce confounding during the design stage

A

randomization
restriction
matching

17
Q

3 ways to reduce confounding during the analysis stage

A

standardization
stratification
statistical modeling

18
Q

Bayes theorem equation

A

(prevalence)(sensitivity) / [(prevalence)(sensitivity) + (1 - prevalence)(1 - specificity)]
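
A short sketch of this formula, which gives the positive predictive value of a test. The function name `ppv` and the input values are assumptions chosen for illustration.

```python
def ppv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Positive predictive value via Bayes' theorem."""
    true_pos = prevalence * sensitivity                # P(disease and test +)
    false_pos = (1 - prevalence) * (1 - specificity)   # P(no disease and test +)
    return true_pos / (true_pos + false_pos)


# Hypothetical test: 90% sensitive, 95% specific
low_prev = ppv(prevalence=0.01, sensitivity=0.90, specificity=0.95)   # about 0.15
high_prev = ppv(prevalence=0.20, sensitivity=0.90, specificity=0.95)  # about 0.82
```

With the same test, the PPV rises sharply as prevalence increases.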

19
Q

incidence density definition

A

number of new cases of a disease per summation of time that each person is at risk of a disease in a specified time and place

20
Q

incidence density equation

A

new cases/sum of person-time
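
A minimal sketch of the person-time calculation, assuming follow-up is recorded in years for each participant. The data are hypothetical.

```python
def incidence_density(new_cases: int, person_time: list[float]) -> float:
    """New cases divided by the summed time at risk."""
    total_time = sum(person_time)
    if total_time <= 0:
        raise ValueError("total person-time must be positive")
    return new_cases / total_time


# Hypothetical cohort: years each person was at risk before disease,
# loss to follow-up, or end of study
follow_up_years = [2.0, 3.5, 1.0, 5.0, 4.5]   # sums to 16 person-years
rate = incidence_density(new_cases=2, person_time=follow_up_years)
# rate is 0.125 cases per person-year, i.e. 12.5 per 100 person-years
```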

21
Q

central limit theorem definition

A

when there are a large amount of mutually independent random variables, the mean population will approach normal distribution (n >30)

22
Q

IQ mean and SD

A

100 +/- 15

23
Q

z-score definition

A

how many standard deviations are between an observed value and the mean

24
Q

z-score equation

A

(observed value - mean) / standard deviation

25
Q

addition rule definition (non-mutually exclusive events)

A

event 1 + event 2 - (event 1 and event 2 overlap) = probability
used for non-mutually exclusive events

26
Q

standardized mortality ratio (SMR) equation

A

observed # of deaths/expected # of deaths x 100
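
A sketch of the SMR calculation. It assumes the expected deaths come from applying reference-population death rates to the study population's strata (indirect standardization); the strata and rates below are hypothetical.

```python
def expected_deaths(stratum_sizes: list[int], reference_rates: list[float]) -> float:
    """Deaths expected if each stratum had the reference population's rate."""
    return sum(n * r for n, r in zip(stratum_sizes, reference_rates))


def smr(observed: int, expected: float) -> float:
    """Observed deaths / expected deaths x 100."""
    return observed / expected * 100


# Hypothetical study population with two age strata
sizes = [1000, 2000]
ref_rates = [0.01, 0.02]                     # reference deaths per person per year
expected = expected_deaths(sizes, ref_rates)  # 10 + 40 = 50 expected deaths
ratio = smr(observed=60, expected=expected)   # 120: 20% more deaths than expected
```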

27
Q

A

when you use a second population to extrapolate estimates

28
Q

null hypothesis definition

A

there is no difference between the variables being tested

29
Q

type 1 error definition

A

when a null hypothesis is rejected when it is actually true (ex. false-positives)

30
Q

type 2 error definition

A

when a false null hypothesis is not rejected (ex. false negatives)

31
Q

confidence interval equation

A

mean +/- 1.96(std dev/sq root N)
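
A minimal sketch of this interval for a sample mean, using the sample standard deviation. The function name and data are hypothetical, and the 1.96 multiplier assumes a 95% interval with a roughly normal sampling distribution.

```python
import math
import statistics


def ci95_mean(values: list[float]) -> tuple[float, float]:
    """95% confidence interval: mean +/- 1.96 * (SD / sqrt(N))."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)        # sample SD (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    margin = 1.96 * standard_error
    return mean - margin, mean + margin


low, high = ci95_mean([10, 12, 14, 16, 18])   # roughly (11.23, 16.77)
```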

32
Q

as prevalence increases, PPV _____ and NPV ____

A

increases, decreases

33
Q

power equation

A

1 - beta = 1 - the probability of failing to reject the null when the null is false

34
Q

3 ways to increase power

A

increase sample size
decrease beta
increase alpha (the significance threshold for rejecting Ho)

35
Q

NNT equation

A

1/ARR = 1/(risk in control group - risk in treated group)

36
Q

NNH Equation

A

1/absolute risk increase
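
A sketch of both calculations with hypothetical risks. By convention NNT and NNH are usually rounded up to a whole person when reported; the functions here return the unrounded value.

```python
def nnt(control_risk: float, treated_risk: float) -> float:
    """Number needed to treat = 1 / absolute risk reduction."""
    arr = control_risk - treated_risk
    if arr <= 0:
        raise ValueError("treatment does not reduce risk; NNT is undefined")
    return 1 / arr


def nnh(exposed_risk: float, unexposed_risk: float) -> float:
    """Number needed to harm = 1 / absolute risk increase."""
    ari = exposed_risk - unexposed_risk
    if ari <= 0:
        raise ValueError("exposure does not increase risk; NNH is undefined")
    return 1 / ari


# Hypothetical trial: event risk 20% on placebo vs 15% on treatment
needed_to_treat = nnt(control_risk=0.20, treated_risk=0.15)    # about 20
# Hypothetical adverse effect: 4% on the drug vs 2% without it
needed_to_harm = nnh(exposed_risk=0.04, unexposed_risk=0.02)   # about 50
```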

37
Q

9 components to determine causality

A
1. consistency of association
2. strength of association
3. specificity
4. temporal factors
5. coherence of explanation
6. biological plausibility
7. experimental evidence from a controlled trial
8. dose-response relationship
9. analogy
38
Q

Standard error equation

A

std dev/sq root n

39
Q

internal validity definition

A

how well a study's results reflect the true association within the study population

40
Q

external validity definition

A

how well the results of a study are generalizable to a different population

41
Q

degrees of freedom equation

A

(rows-1)(columns-1)

42
Q

chi squared equation

A

sum of [(observed - expected)^2 / expected] across all cells

expected for each cell = (row total)(column total)/grand total
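
A minimal sketch of the chi-squared statistic and its degrees of freedom for a contingency table of counts. The table is hypothetical, and turning the statistic into a p-value (which needs the chi-squared distribution) is left out.

```python
def chi_squared(table: list[list[int]]) -> tuple[float, int]:
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)   # (rows - 1)(columns - 1)
    return statistic, df


# Hypothetical 2x2 table: every expected count is 25
stat, df = chi_squared([[20, 30], [30, 20]])   # stat = 4.0, df = 1
```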

43
Q

Kappa equation

A

(observed agreement - chance agreement)/(total number - chance agreement)

observed: agreed true + agreed false
cell agreement due to chance = (row total)(column total)/(total number)
chance agreement = TT chance + FF chance
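
A sketch of kappa for two raters and a yes/no judgment, built from the cell definitions above and computed as (observed - chance)/(total - chance) on counts. The argument names and the counts are hypothetical.

```python
def cohen_kappa(both_yes: int, a_yes_b_no: int, a_no_b_yes: int, both_no: int) -> float:
    """Kappa from the four cells of a 2x2 agreement table (counts)."""
    total = both_yes + a_yes_b_no + a_no_b_yes + both_no
    observed = both_yes + both_no   # agreed true + agreed false

    # Chance agreement per cell = (row total)(column total)/total
    chance_yes = (both_yes + a_yes_b_no) * (both_yes + a_no_b_yes) / total
    chance_no = (a_no_b_yes + both_no) * (a_yes_b_no + both_no) / total
    chance = chance_yes + chance_no

    return (observed - chance) / (total - chance)


# Hypothetical ratings of 100 films by raters A and B
kappa = cohen_kappa(both_yes=40, a_yes_b_no=10, a_no_b_yes=5, both_no=45)
# observed = 85, chance = 22.5 + 27.5 = 50, kappa = 35 / 50 = 0.7
```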

44
Q

F test

A

part of ANOVA

45
Q

confounder definition

A

3rd variable associated with the exposure and the outcome

obscures the relationship between the exposure and outcome

46
Q

effect modifier definition

A

changes the relationship between exposures and outcomes

47
Q

intervening variable definition

A

a mechanism by which a causal variable leads to an outcome

48
Q

necessary cause definition

A

required for disease to occur but may not invariably lead to disease

49
Q

sufficient cause definition

A

50
Q

coefficient of determination definition

A

the proportion of variation of a dependent variable that can be explained by an independent variable

51
Q

3 examples of time-series analysis

A

cohort studies
epidemic studies
longitudinal data

52
Q

McNemar’s Test definition

A

chi-sq test for non-independent variables, allows you to analyze matched pairs or calculate before and after in the same variable

53
Q

Mann-Whitney U test definition

A

compares the medians of two groups; the nonparametric version of the t-test

54
Q

attributable risk equation

A

a/(a+b) - c/(c+d)

55
Q

relative risk equation

A

[a/(a+b)] / [c/(c+d)]

56
Q

OR equation

A

(a/c)/(b/d)
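
A sketch of attributable risk, relative risk, and odds ratio from one 2x2 table. It assumes the usual layout (a = exposed with disease, b = exposed without, c = unexposed with disease, d = unexposed without), which the cards imply but do not state. The counts are hypothetical.

```python
def two_by_two(a: int, b: int, c: int, d: int) -> dict[str, float]:
    """Measures of association from a 2x2 exposure/disease table."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "attributable_risk": risk_exposed - risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "odds_ratio": (a / c) / (b / d),   # equivalent to (a * d) / (b * c)
    }


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed get the disease
measures = two_by_two(a=30, b=70, c=10, d=90)
# attributable risk about 0.20, relative risk about 3.0, odds ratio about 3.86
```

The odds ratio sits further from 1 than the relative risk here because the outcome is common.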

57
Q

25th percentile calculation

A

position in the ordered data = (n+1)/4

58
Q

sign test definition

A

nonparametric test that compares dichotomous differences in data from matched, otherwise identical pairs; ignores the magnitude of the difference

59
Q

Nonparametric version of t-test

A

mann-whitney U test

wilcoxon rank-sum test

60
Q

Nonparametric version of paired t-test

A

Wilcoxon signed rank test

sign test

61
Q

Nonparametric version of ANOVA

A

Kruskal-wallis test

62
Q

Nonparametric version of Pearson correlation

A

spearman correlation

chi-sq

63
Q

regular categorical variable example

A

group names, M/F

64
Q

ordinal variable definition

A

group names with an order, ex. cancer stage

65
Q

continuous variable definition

A

measurements, ex. height/weight

66
Q

discrete numeric variable example

A

counts, ex. number of crashes at an intersection

67
Q

interval variable definition

A

continuous variable with no true zero

68
Q

ratio variable definition

A

continuous variable with a true 0

69
Q

variance equation

A

average squared distance from the mean

70
Q

standard deviation equation

A

square root of variance

71
Q

right skew effect on measures of central tendency

A

mean > median

tail goes to the right

72
Q

left skew effect on measures of central tendency

A

mean < median

tail goes to the left

73
Q

geometric mean for skewed data equation

A

geometric mean = e^(mean of the logs): take the natural log of each value, average the logs, then exponentiate
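
A minimal sketch of the log-average-exponentiate steps. The function name and values are hypothetical, and all values must be positive.

```python
import math


def geometric_mean(values: list[float]) -> float:
    """Exponentiate the mean of the natural logs."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires positive values")
    mean_of_logs = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_of_logs)


skewed = [1, 10, 100]                    # right-skewed hypothetical data
geo = geometric_mean(skewed)             # about 10
arithmetic = sum(skewed) / len(skewed)   # 37, pulled up by the large value
```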

74
Q

coefficient of variation equation

A

ratio of std dev to the mean x 100

SD/mean x 100
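
A short sketch showing why the coefficient of variation is useful for comparing spread across variables on different scales. The data are hypothetical.

```python
import statistics


def coefficient_of_variation(values: list[float]) -> float:
    """SD / mean x 100, using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Two hypothetical variables measured in different units
systolic_mmhg = [118, 120, 122]   # mean 120, SD 2
weight_kg = [60, 70, 80]          # mean 70, SD 10

cv_bp = coefficient_of_variation(systolic_mmhg)    # about 1.7%
cv_weight = coefficient_of_variation(weight_kg)    # about 14.3%
```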

75
Q

2 uses of coefficient of variation

A
1. compare relative data spread for 2 variables

2. evaluate precision of the measurement of a single variable

76
Q

z score definition

A

number of standard deviations a value is away from the mean

77
Q

percentile of z=0

A

50th percentile

78
Q

percentile of z=1

A

84th percentile

79
Q

percentile of z=2

A

~97.5th percentile (more precisely 97.7th; z = 1.96 gives exactly 97.5)

80
Q

z score equation

A

z = (obs value - known mean) / population std dev
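
A sketch of converting a value to a z-score and then to a percentile, using the IQ scale (mean 100, SD 15) as the example. It relies on `statistics.NormalDist` from the Python standard library (3.8+) and assumes the variable is normally distributed.

```python
from statistics import NormalDist


def z_score(observed: float, mean: float, sd: float) -> float:
    """Number of standard deviations between a value and the mean."""
    return (observed - mean) / sd


def percentile_from_z(z: float) -> float:
    """Percent of a standard normal distribution falling below z."""
    return NormalDist().cdf(z) * 100


z = z_score(observed=130, mean=100, sd=15)   # 2.0
pct = percentile_from_z(z)                   # about 97.7
pct_at_mean = percentile_from_z(0)           # 50.0
pct_at_one_sd = percentile_from_z(1)         # about 84.1
```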

81
Q

4 types of random samples

A

simple random sample
stratified random sample
cluster random sample
systematic random sample

82
Q

central limit theorem

A

distribution of sample means is approximately normal if the sample size is large enough (rule of thumb: N >= 30)

83
Q

standard deviation of distribution of the sample mean equation

A

AKA standard error

std dev/sq root N

84
Q

95% CI equation

A

sample mean +/- 2(pop sd/sq root sample size)

85
Q

two sided null and alternative hypotheses

A

H0: mu1 = mu0
HA: mu1 != mu0

86
Q

one sided null and alternative hypotheses

A
```
H0: mu1 >= mu0
HA: mu1 < mu0
OR
H0: mu1 <= mu0
HA: mu1 > mu0
```
87
Q

3 steps to hypothesis testing

A
1. calculate test statistic
2. identify probability distribution of the test statistic
3. calculate p-value from test statistic based on probability distribution
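
The three steps above can be sketched with a two-sided one-sample z-test. This assumes the population standard deviation is known and uses `statistics.NormalDist` from the standard library; the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean: float, mu0: float, sigma: float, n: int) -> tuple[float, float]:
    """Return (z statistic, two-sided p-value) for H0: mu = mu0."""
    # Step 1: calculate the test statistic
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Step 2: identify its distribution under H0 (standard normal)
    null_distribution = NormalDist(mu=0, sigma=1)
    # Step 3: p-value = probability of a result at least this extreme
    p_value = 2 * (1 - null_distribution.cdf(abs(z)))
    return z, p_value


# Hypothetical sample: mean IQ of 105 in 36 people vs a null mean of 100
z_stat, p = one_sample_z_test(sample_mean=105, mu0=100, sigma=15, n=36)
# z_stat = 2.0 and p is about 0.046, so H0 is rejected at alpha = 0.05
```
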
88
Q

How do you reduce type 1 error?

A

select a smaller alpha

89
Q

decrease alpha, sample size ___ and Power ___

A

increases, decreases

90
Q

to detect smaller differences between samples, sample size should be _____ and power should ____

A

increased; at a fixed sample size, power decreases as the difference to detect gets smaller

91
Q

4 tests available for continuous outcome/categorical predictor with 2 groups

A

t-test
Wilcoxon rank sum (NP)
mann-whitney U test
median test

92
Q

2 tests available for continuous outcome/categorical predictor with >2 groups

A

ANOVA

kruskal-wallis (NP)

93
Q

3 test available for paired continuous outcome/categorical predictor

A

paired t-test
Wilcoxon signed rank (NP)
sign test

94
Q

3 tests available for categorical outcome/categorical predictor

A

chi squared
fisher’s exact test
paired–McNemar’s chi squared

95
Q

3 tests available for continuous outcome/continuous predictor

A

Pearson’s
spearman’s (NP)
linear regression

96
Q

Test for categorical outcome/continuous predictor

A

logistic regression

97
Q

When should you use nonparametric tests? (3)

A

to convert values to rank–then analyze rank
with small sample sizes
with ordinal outcomes

98
Q

2 sample t-test use and output

A

use: compare continuous outcome between 2 groups when the data is symmetric or n>15
output: t-statistic –> p-value

99
Q

Wilcoxon rank-sum test use and output

A

use: compare continuous outcome between 2 groups when the data is skewed, small n, or ordinal data
output: rank overall –> compare sums of ranks between 2 groups

100
Q

Median test definition

A

overall median across entire sample

asks whether each value is > or < median and compares via a 2x2 table and chi-squared

101
Q

Paired t-test use

A

compare continuous outcomes in pairs

looks at mean difference of pairs then asks is it different y/n by one-sample t test

102
Q

Wilcoxon signed rank use

A

continuous outcomes in pairs when there are few pairs or data is skewed

103
Q

Sign test use

A

continuous outcomes in pairs when you don’t have numbers, only relationships

104
Q

ANOVA use

A

comparing continuous outcomes between >2 groups

105
Q

Kruskal-Wallis use

A

comparing continuous outcomes between >2 groups when you have skewed sample, small n, ordinal data
compares sums of ranks between groups

106
Q

Fisher’s exact test use

A

small sample size for categorical outcome/categorical predictor (any expected cell count <5)

107
Q

McNemar’s chi-squared use

A

chi-sq for matched or paired proportions (ex. matched case-control)

108
Q

r^2 definition

A

the amount of variability accounted for by the line of best fit

109
Q

correlation coefficient equation

A

sq root of r^2

110
Q

r=0.2 is ____ correlation

A

weak

111
Q

r=0.4 is _____ correlation

A

moderate

112
Q

r=0.8 is _____ correlation

A

strong

113
Q

Linear regression use

A

continuous outcome w continuous predictor

114
Q

F test equation

A

MSfitted/MSerror with (p-1, n-p) DFs, where p = number of model parameters including the intercept

115
Q

Multicollinearity definition

A

When 2 or more predictor variables are highly correlated

116
Q

Multicollinearity consequences (2)

A

increases standard error of beta estimates

117
Q

ANCOVA use

A

used to compare means between groups while controlling for other variables (covariates) that may be unbalanced between groups

118
Q

logistic regression use

A

categorical outcome/continuous predictor

betas are estimated from maximum likelihood–model gives the probability of the outcome

119
Q

Who determines which diseases are notifiable?

A

Council of State and Territorial Epidemiologists (CSTE)

120
Q

Sensitivity definition

A

the proportion of those who have a disease who are correctly identified as having it (SnNOut: a negative result on a highly sensitive test rules the disease out)

121
Q

Specificity definition

A

the proportion of those without a disease who are correctly identified as NOT having it (SpPIn: a positive result on a highly specific test rules the disease in)
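
A minimal sketch of sensitivity and specificity computed from test results against a gold standard, along with their complements (the false negative and false positive error rates). The counts are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of people with the disease who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of people without the disease who test negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical validation study: 100 diseased, 100 non-diseased
sens = sensitivity(true_pos=90, false_neg=10)   # 0.90
spec = specificity(true_neg=80, false_pos=20)   # 0.80

false_negative_rate = 1 - sens   # sensitivity + FN rate = 1
false_positive_rate = 1 - spec   # specificity + FP rate = 1
```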

122
Q

Multiplication rule equation

A

P(event 1 and event 2) = P(1) x P(2)

123
Q

Multiplication rule use

A

determine the probability of 2 independent events

can also use to test for independence

124
Q

Addition rule equation (mutually exclusive)

A

P(1 or 2) = P(1) + P(2)

125
Q

Addition rule equation (not mutually exclusive)

A

P(1 or 2) = P(1) + P(2) - P(1 and 2)
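
A sketch of the addition and multiplication rules using a single draw from a standard 52-card deck. The example is an illustration, not from the deck of flashcards; `fractions.Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

p_king = Fraction(4, 52)
p_queen = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_king_and_heart = Fraction(1, 52)   # the king of hearts

# Mutually exclusive events (a card cannot be both king and queen)
p_king_or_queen = p_king + p_queen                      # 2/13

# Not mutually exclusive: subtract the overlap so it is not counted twice
p_king_or_heart = p_king + p_heart - p_king_and_heart   # 4/13

# Multiplication rule as an independence check:
# rank and suit are independent if P(1 and 2) = P(1) x P(2)
independent = (p_king * p_heart == p_king_and_heart)    # True
```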

126
Q

I^2 Statistic definition

A

percentage of the total variation across study estimates that is due to heterogeneity between studies rather than chance (for meta-analysis)
If >50% –> heterogeneous

127
Q

Kaplan-Meier curve statistical test

A

log rank test

128
Q

Cox proportional hazards test

A

hazard ratios

129
Q

Common source outbreak pattern

A

a group of people become ill after being exposed to a point-source contaminant

130
Q

Continuous common source outbreak pattern

A

a common source continuously affects those who come into contact with it

131
Q

Propagated outbreak pattern

A

infection is transmitted from one person to another

132
Q

Mixed outbreak pattern

A

when a common source outbreak is complicated by person-to-person spread

133
Q

Meta-analysis output for categorical variables

A

OR

134
Q

Meta-analysis output for continuous variables

A

mean differences

135
Q

sensitivity + ______ = 1

A

false negative error rate

136
Q

specificity + _____ = 1

A

false positive error rate

137
Q

ILINet Case Definition (3)

A

fever >= 100 F (37.8 C)
cough and/or sore throat
if flu swab + ok

138
Q

How does NHANES get its data?

A

home interviews and physical examinations (PEs)