Epi/Biostats Flashcards by jenn okeeffe

Beta

Regression coefficient: the expected (average) change in Y when X (an explanatory variable) increases by one unit and the other explanatory variables are held constant

Wald Statistic

Tests whether the regression coefficient of a variable is zero

Wald statistic W = β² / Var(β), where β is the estimated coefficient
p-value = P(chi-squared with 1 df > W)

If the p-value is small, the variable associated with the regression coefficient is important (statistically significant)
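
A minimal sketch of the calculation, assuming a single coefficient (1 degree of freedom) and hypothetical values for the estimate and its standard error. The function name `wald_test` is made up for illustration.

```python
import math

def wald_test(beta_hat: float, se: float) -> tuple[float, float]:
    """Return (Wald statistic, p-value) for H0: beta = 0, with 1 df."""
    w = beta_hat ** 2 / se ** 2  # beta squared over var(beta)
    # For a chi-squared variable with 1 df, P(X > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2))
    return w, p_value

# Hypothetical estimate and standard error, not taken from any real model.
w, p = wald_test(beta_hat=0.5, se=0.2)
print(round(w, 2), round(p, 4))
```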

Likelihood Ratio Test

Test to compare two nested models: a “null” model with q variables and an extended model with p variables, where p > q

If the p-value is < 0.05, the group of p - q additional variables in the extended model is important (statistically significant)
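
As a rough sketch, the test statistic is -2 × (log-likelihood of the null model - log-likelihood of the extended model), referred to a chi-squared distribution with p - q degrees of freedom. The log-likelihood values below are hypothetical, and the p-value shortcut shown applies only when exactly one variable is added (p - q = 1).

```python
import math

def likelihood_ratio_statistic(loglik_null: float, loglik_extended: float) -> float:
    """-2 * (log-likelihood of null model - log-likelihood of extended model)."""
    return -2.0 * (loglik_null - loglik_extended)

# Hypothetical log-likelihoods from two nested fits that differ by one variable.
lr = likelihood_ratio_statistic(loglik_null=-120.0, loglik_extended=-117.0)

# With p - q = 1, the chi-squared tail probability reduces to erfc(sqrt(lr / 2)).
p_value = math.erfc(math.sqrt(lr / 2))
print(lr, round(p_value, 4))
```

For more than one added variable, a chi-squared tail function with p - q degrees of freedom would be needed instead of the one-df shortcut.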

Type I error

Probability of obtaining a significant result (rejecting the null hypothesis) when the null hypothesis is actually true - false positive

Type II error

Probability of failing to reject the null hypothesis when it is actually false - false negative

Sensitivity

Probability of a positive test given a true case: A / (A + C), where A = true positives and C = false negatives

Specificity

Probability of a negative test given a non-case: D / (B + D), where D = true negatives and B = false positives

PPV

Positive Predictive Value: probability that a person is truly a case given a positive test result: A / (A + B)

NPV

Negative Predictive Value: probability that a person is truly a non-case given a negative test result: D / (C + D)
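
The four test-accuracy measures can be sketched together from one 2x2 table. This assumes the usual cell layout: A = true positives, B = false positives, C = false negatives, D = true negatives. The counts are invented for illustration.

```python
def diagnostic_measures(a: int, b: int, c: int, d: int) -> dict[str, float]:
    """a = true pos, b = false pos, c = false neg, d = true neg."""
    return {
        "sensitivity": a / (a + c),  # positive test given case
        "specificity": d / (b + d),  # negative test given non-case
        "ppv": a / (a + b),          # case given positive test
        "npv": d / (c + d),          # non-case given negative test
    }

# Hypothetical screening results for 1,000 people.
measures = diagnostic_measures(a=90, b=30, c=10, d=870)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```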

Residual variance in regression equation

Error term

Types of bias

Confounding
Selection Bias
Information Bias

Propensity Score

Probability of a unit being assigned to a particular treatment or exposure given an observed set of covariates.

Used to reduce selection bias by equating groups on these covariates

When to use log-binomial

When the outcome is common (risk or prevalence > 10%), the risk odds ratio and prevalence odds ratio overestimate the risk ratio and prevalence ratio. Use a log-binomial model to estimate the risk ratio or prevalence ratio directly.
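
A small illustration of why the odds ratio stops approximating the risk ratio once the outcome is common. The two tables are invented, and the cell layout (a, b = exposed with and without the outcome; c, d = unexposed with and without the outcome) is an assumption of this sketch.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds in the exposed divided by odds in the unexposed."""
    return (a / b) / (c / d)

rare = dict(a=2, b=98, c=1, d=99)       # risks of 2% and 1%
common = dict(a=60, b=40, c=30, d=70)   # risks of 60% and 30%

for label, table in (("rare", rare), ("common", common)):
    print(label, round(risk_ratio(**table), 2), round(odds_ratio(**table), 2))
```

Both tables have the same risk ratio, but the odds ratio is close to it only in the rare-outcome table.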

Risk vs odds

Risk = probability of occurrence of an event or outcome
Odds = probability of occurrence of an event or outcome / probability of non-occurrence of the event or outcome
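
The two definitions convert into each other directly. A minimal sketch, with hypothetical function names:

```python
def risk_to_odds(risk: float) -> float:
    """Odds = probability of the event / probability of no event."""
    return risk / (1 - risk)

def odds_to_risk(odds: float) -> float:
    """Inverse conversion: risk = odds / (1 + odds)."""
    return odds / (1 + odds)

odds = risk_to_odds(0.2)   # a 20% risk corresponds to odds of 1 to 4
risk = odds_to_risk(odds)  # converting back recovers the risk
print(odds, risk)
```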

P-value

Probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true. Comparing the p-value with a preset significance level limits type I errors (false positives), which lead us to conclude there is an association that isn’t really there.

ICC

Intraclass correlation coefficient: the proportion of the total variance that is explained by clustering. Between-cluster variance / total variance (between-cluster + within-cluster)
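
A minimal sketch of the ratio, assuming the two variance components (between clusters and within clusters) have already been estimated, for example by a random-intercept model. The values are hypothetical.

```python
def icc(var_between: float, var_within: float) -> float:
    """Share of the total variance attributable to differences between clusters."""
    return var_between / (var_between + var_within)

# Hypothetical variance components.
print(icc(var_between=2.0, var_within=8.0))
```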

Vaccine effectiveness formula

(1 - adjusted OR) x 100%

Risk ratio formula

(a / (a+b)) / (c / (c+d))

Residuals

The difference between each observed outcome and the value predicted by the model (for a model with only group indicators, the group mean)

Kappa statistic

Measures inter-rater agreement beyond what would be expected by chance

Kappa = (Po - Pc) / (1 - Pc), where Po = observed agreement and Pc = agreement expected by chance

Kappa > 0.8 is an almost perfect level of agreement beyond chance

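A sketch of the kappa calculation for two raters, assuming a square table where `table[i][j]` counts items that rater 1 put in category i and rater 2 put in category j. The function name and the counts are hypothetical.

```python
def cohens_kappa(table: list[list[int]]) -> float:
    """Kappa = (Po - Pc) / (1 - Pc) from a square agreement table."""
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of items on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: sum over categories of row share * column share.
    p_c = sum(
        (sum(table[i]) / n) * (sum(row[i] for row in table) / n)
        for i in range(k)
    )
    return (p_o - p_c) / (1 - p_c)

# Hypothetical ratings of 100 items by two raters (positive / negative).
ratings = [[40, 10],
           [5, 45]]
kappa = cohens_kappa(ratings)
print(round(kappa, 2))
```
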
Interaction coefficient

Measures how much an association between Y and one predictor (X1) differs across levels of another predictor (X2)

Marginal

Does not include other covariates in the model

Structural model

Model for counterfactual outcome

Wilcoxon rank sum vs t-test

WRS compares the ranks of the two groups (often summarized as comparing medians). T-test compares means. WRS is more appropriate for data with outliers.

Conditional / random effects model

Includes other covariates in the model

Frequentist

Parameters are fixed, unknown true values

Bayesian

Parameters are treated as random variables that have a probability distribution

Covariance

Measure of the joint variability of two random variables. If both variables are high at the same time, covariance is positive. If one is high when the other is low, covariance is negative. The sign of covariance thus shows the direction of the linear relationship.

Correlation Coefficient

Normalized version of covariance. Shows the magnitude and thus the strength of the linear relationship
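
A short sketch of both quantities using only plain Python, assuming sample (n - 1) formulas. The data are invented so that the relationship is perfectly linear.

```python
import math

def covariance(x: list[float], y: list[float]) -> float:
    """Sample covariance: average product of deviations from the means (n - 1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def correlation(x: list[float], y: list[float]) -> float:
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises with x
y_down = [10, 8, 6, 4, 2]  # falls as x rises

print(covariance(x, y_up), correlation(x, y_up))
print(covariance(x, y_down), correlation(x, y_down))
```

The covariance changes if the units of x or y change, while the correlation stays between -1 and 1.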

Survey Weight

Value assigned to each case to indicate how much each case will count in a statistical procedure

Problems with survey weights

Almost always increase standard errors

Design Effect

Variance of an estimate under the actual survey design / variance of the same estimate under simple random sampling (SRS) with the same sample size

AIC

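Akaike Information Criterion: used to choose the best model from a set. AIC = -2(log-likelihood) + 2K, where K is the number of estimated parameters. The model with the lowest AIC is preferred.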
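
A minimal sketch of comparing two candidate models by AIC. The log-likelihoods and parameter counts are hypothetical.

```python
def aic(log_likelihood: float, k: int) -> float:
    """AIC = -2 * log-likelihood + 2 * K, with K estimated parameters."""
    return -2.0 * log_likelihood + 2.0 * k

# Hypothetical fits: model B fits slightly better but uses more parameters.
candidates = {
    "model_a": aic(log_likelihood=-100.0, k=3),
    "model_b": aic(log_likelihood=-98.0, k=6),
}
best = min(candidates, key=candidates.get)  # lowest AIC is preferred
print(candidates, best)
```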

Variance

Average of the squared differences of observations from the sample mean

Normalize distribution

Subtract the mean from each value and divide by the sd: Z = (X - μ) / σ
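
A quick sketch of the transformation, assuming the mean and sd are taken from the data themselves and treated as population values. The data are invented.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]
mu = statistics.fmean(values)      # mean of the values
sigma = statistics.pstdev(values)  # population standard deviation

# Z = (X - mu) / sigma for every value.
z_scores = [(x - mu) / sigma for x in values]
print(mu, sigma, z_scores)
```

After the transformation the values have mean 0 and sd 1.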

Central Limit Theorem

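If you have a population with mean mu and standard deviation sigma and repeatedly take sufficiently large random samples, the sampling distribution of the sample mean will be approximately normal, with mean mu and standard deviation sigma / sqrt(n), whatever the shape of the population distribution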
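
A small simulation can make this concrete. It is only a sketch: the exponential population (mean 1, sd 1), the sample size of 50, the 2,000 repetitions and the seed are all arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def sample_mean(n: int) -> float:
    """Mean of n draws from a skewed exponential population (mu = 1, sigma = 1)."""
    return statistics.fmean([rng.expovariate(1.0) for _ in range(n)])

n = 50
means = [sample_mean(n) for _ in range(2000)]

mean_of_means = statistics.fmean(means)  # should be close to mu = 1
sd_of_means = statistics.stdev(means)    # should be close to sigma / sqrt(n)
print(mean_of_means, sd_of_means, 1 / n ** 0.5)
```

Even though the population is strongly skewed, a histogram of `means` would look roughly bell-shaped.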

Confidence Interval

A range of values, computed from the sample, that would contain the population parameter in a stated proportion (for example 95%) of repeated samples

F-statistic

Tests whether two population variances are equal, using the ratio of the sample variances. P < 0.05 is evidence that they are not equal

Horvitz-Thompson estimator

Inverse probability weighting applied to samples to account for differences between the sample and target population
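
A minimal sketch of the estimator for a population total, assuming each sampled unit's inclusion probability is known. The observations and probabilities are hypothetical.

```python
def horvitz_thompson_total(values: list[float], inclusion_probs: list[float]) -> float:
    """Estimate a population total by weighting each sampled value by 1 / pi_i."""
    return sum(y / pi for y, pi in zip(values, inclusion_probs))

# Hypothetical sample: units that were unlikely to be sampled count for more.
y = [10.0, 20.0, 30.0]
pi = [0.1, 0.2, 0.5]
total = horvitz_thompson_total(y, pi)
print(total)
```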

Log-linear model

Betas are the derivative of the log of the expected value of y with respect to x: β = d log(E(y|x)) / dx
The model is linear in the parameters but not in the data
β x 100 approximates the percentage change in y when x increases by one unit, keeping other variables constant
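
The beta x 100 reading is an approximation that works well for small coefficients. A short sketch comparing it with the exact percentage change, (exp(beta) - 1) x 100, for two hypothetical coefficients:

```python
import math

def exact_percent_change(beta: float) -> float:
    """Exact % change in E(y) for a one-unit increase in x in a log-linear model."""
    return (math.exp(beta) - 1) * 100

for beta in (0.05, 0.5):  # hypothetical coefficients
    approx = beta * 100
    print(beta, approx, round(exact_percent_change(beta), 2))
```

For the small coefficient the two values are close. For the large one the approximation is noticeably too low.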

Maximum Likelihood Estimation

MLE is a method that finds the parameter values (for a normal distribution, the mean μ and the sd σ) that maximize the likelihood of the observed data, i.e. the curve that best fits the data
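
For a normal distribution the maximum likelihood estimates have a closed form, which the sketch below assumes: the sample mean, and the sd computed with n (not n - 1) in the denominator. The data are invented, and the log-likelihood function is included only to show that other parameter values fit worse.

```python
import math

def normal_mle(data: list[float]) -> tuple[float, float]:
    """Closed-form MLE of (mu, sigma) for a normal distribution."""
    n = len(data)
    mu_hat = sum(data) / n
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)
    return mu_hat, sigma_hat

def normal_loglik(data: list[float], mu: float, sigma: float) -> float:
    """Log-likelihood of the data under Normal(mu, sigma)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )

data = [2, 4, 4, 4, 5, 5, 7, 9]
mu_hat, sigma_hat = normal_mle(data)
print(mu_hat, sigma_hat)
print(normal_loglik(data, mu_hat, sigma_hat), normal_loglik(data, 5.5, 2.5))
```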

Bayesian Inference

The process of inferring properties of a population or probability distribution from data using Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
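
One familiar use of the theorem is working out the probability of disease given a positive test. The sketch below assumes hypothetical values for sensitivity, specificity and prevalence, with A = "has the disease" and B = "tests positive".

```python
def prob_disease_given_positive(sensitivity: float, specificity: float,
                                prevalence: float) -> float:
    """P(A|B) = P(B|A) * P(A) / P(B), with A = disease and B = positive test."""
    p_b_given_a = sensitivity
    p_a = prevalence
    # P(B): true positives plus false positives across the whole population.
    p_b = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    return p_b_given_a * p_a / p_b

# Hypothetical test characteristics and a rare disease.
posterior = prob_disease_given_positive(sensitivity=0.9, specificity=0.95,
                                        prevalence=0.01)
print(round(posterior, 3))
```

With a rare disease, most positive results are false positives even when the test is accurate.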

Poisson Offset

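The log of the population or person-time at risk, entered with its coefficient fixed at 1 so that the model estimates rates rather than counts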

ANCOVA

Analysis of Covariance: tests whether the means of a dependent variable are equal across levels of a categorical independent variable, adjusting for covariates. Can be used to test for interaction or effect measure modification:
1. Generate an interaction term for the model
2. Test whether the interaction term = 0
3. If the interaction term is not significant, reduce to MLR / a simpler model

Correlation vs Covariance

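Two related measures. Covariance shows the direction in which two variables vary together and depends on their units. Correlation is covariance standardized to lie between -1 and 1, so it also shows the strength of the linear relationship.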

Attributes of Surveillance System

Simplicity
Flexibility
Acceptability
Sensitivity
Positive Predictive Value
Representativeness
Timeliness

Sources of measurement error in surveys

1. The tool
2. The method of data collection
3. The interviewer
4. The respondent

Epi/Biostats Flashcards

(47 cards)