Epidemiology & Biostatistics Flashcards Preview

ACVPM prep by Heidi

Flashcards in Epidemiology & Biostatistics Deck (29):

Define Bias

Systematic error in the design, conduct, or analysis of a study that causes a mistaken estimate of the exposure’s effect on the outcome.


Explain design bias

Wrongly chosen sampling strategy or study design.


Explain conduct bias

Case enrollment, follow-up, or data collection is not carried out properly.


Explain analysis bias

The wrong statistical methods are chosen, variables are miscategorized, or modelling assumptions are violated.


What three types of bias are there?

Selection bias, confounding, and information bias (also called measurement or misclassification bias).


Define confounding

A confounder is a variable that influences both the independent variable (exposure) and the dependent variable (outcome), causing a spurious or distorted association between them.


Define sensitivity

The proportion of actual positives (e.g., truly diseased individuals) that are correctly identified as positive: TP / (TP + FN). (Also called the true positive rate, recall, or probability of detection in some fields.)


Define specificity

The proportion of actual negatives (e.g., truly disease-free individuals) that are correctly identified as negative: TN / (TN + FP). (Also called the true negative rate.)
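
The two definitions above can be sketched as a small calculation on the counts of a 2x2 table. This is only an illustrative sketch: the function name `sensitivity_specificity` and the counts are hypothetical and do not come from the deck.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 table counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical counts: 90 true positives, 10 false negatives,
# 160 true negatives, 40 false positives.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=160, fp=40)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```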


What are the four main aspects of infection control?

Surveillance (passive/active), patient contact, hygiene, education/awareness


How does the normal distribution relate to the standard deviation?

About 68% of values lie within 1 SD of the mean and about 95.4% within 2 SDs, leaving about 2.3% in each tail.
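
The coverage figures on this card can be checked with the standard library. A minimal sketch, assuming a standard normal distribution (mean 0, SD 1); the variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

within_1_sd = z.cdf(1) - z.cdf(-1)   # area between -1 SD and +1 SD
within_2_sd = z.cdf(2) - z.cdf(-2)   # area between -2 SD and +2 SD
tail_beyond_2_sd = 1 - z.cdf(2)      # area in one tail beyond +2 SD

print(f"within 1 SD: {within_1_sd:.4f}")
print(f"within 2 SD: {within_2_sd:.4f}")
print(f"each tail beyond 2 SD: {tail_beyond_2_sd:.4f}")
```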


Range of P-value and the usual significance level

Ranges from 0 to 1; the significance level is commonly 0.05.


What is Type II error?

Concluding that there is no difference when in truth there is one, i.e., failing to reject a false null hypothesis (a false negative).


What is Type I error?

Concluding that there is a difference when in fact there is none, i.e., rejecting a true null hypothesis (a false positive).


Explain power and what affects it

The ability of a test to detect a difference when there really is one, i.e., the probability of a true positive (1 - beta). Power is higher when the true difference is large, the significance level (alpha) is larger, sampling variability is low, and the sample size is large.
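
How each factor moves power can be sketched with a normal approximation. The example assumes a two-sided z-test comparing two group means with equal group sizes and a known common SD; the function name `power_two_sample_z` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample_z(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in means."""
    z = NormalDist()
    se = sd * sqrt(2 / n_per_group)        # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)      # two-sided critical value
    shift = abs(delta) / se                # true difference in SE units
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


base = power_two_sample_z(delta=5, sd=10, n_per_group=64)
bigger_effect = power_two_sample_z(delta=8, sd=10, n_per_group=64)
bigger_alpha = power_two_sample_z(delta=5, sd=10, n_per_group=64, alpha=0.10)
less_variable = power_two_sample_z(delta=5, sd=6, n_per_group=64)
bigger_sample = power_two_sample_z(delta=5, sd=10, n_per_group=128)

print(f"baseline power: {base:.2f}")
```

Under these assumptions, each of the four changes raises power relative to the baseline.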


What can linear regression be used for? How do you assess its significance and fit?

To explore the linear relationship between two continuous variables, assuming normally distributed residuals with equal variance. Use the p-value of the slope for significance and R-squared (between 0 and 1, often expressed as a percentage) to judge how well the model fits the data.
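
A minimal least-squares sketch of the quantities named on this card. The data are made up, and the function name `simple_linear_regression` is hypothetical. The standard library has no t distribution, so the sketch stops at the t statistic for the slope; turning it into a p-value would need a t distribution with n - 2 degrees of freedom from a statistics package.

```python
from math import sqrt


def simple_linear_regression(x, y):
    """Return slope, intercept, R-squared and the slope's t statistic."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares and proportion of variance explained.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / syy

    # Standard error of the slope and its t statistic (n - 2 df).
    se_slope = sqrt(ss_res / (n - 2) / sxx)
    t_stat = slope / se_slope
    return slope, intercept, r_squared, t_stat


# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared, t_stat = simple_linear_regression(x, y)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```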


Uses of logistic regression and type of curve, results and significance?

To model the log of the odds of the binary outcome of interest as a linear function of one or more predictors, X. The fitted probabilities follow a sigmoid curve. Results are reported either as coefficients (betas) or as odds ratios calculated from them (exp(coefficient)), with confidence intervals or p-values.
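
The step from coefficient to odds ratio can be sketched as follows. The coefficient and standard error are hypothetical values standing in for the output of a fitted model, and the helper names are illustrative; the confidence interval uses the usual normal approximation on the log-odds scale.

```python
from math import exp
from statistics import NormalDist


def odds_ratio_with_ci(beta, se, confidence=0.95):
    """Convert a logistic coefficient and its SE to an odds ratio with a CI."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


def predicted_probability(intercept, beta, x):
    """Sigmoid curve: map the linear log-odds to a probability."""
    log_odds = intercept + beta * x
    return 1 / (1 + exp(-log_odds))


# Hypothetical fitted values, for illustration only.
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta=0.693, se=0.25)
print(f"OR={odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```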


What tests can you use for testing or modeling means of continuous data?

t-test, ANOVA, and linear regression


What should you know to estimate the correct sample size?

Significance level (alpha = 0.05; remember the type I versus type II trade-off: the lower the level, the larger the sample), power (1 - beta, sensitivity), variance (i.e., how precise your measurements are), and effect size (the smaller the effect, the larger the sample).
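
The four inputs listed above combine in the common normal-approximation formula for comparing two means, n per group = 2 (z_alpha/2 + z_beta)^2 SD^2 / delta^2. A sketch under those assumptions (two-sided test, equal group sizes, known common SD); the function name and the numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference `delta`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = z.inv_cdf(power)           # quantile for the desired power (1 - beta)
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return ceil(n)  # round up to a whole subject


n_base = sample_size_per_group(delta=5, sd=10)
n_stricter_alpha = sample_size_per_group(delta=5, sd=10, alpha=0.01)
n_smaller_effect = sample_size_per_group(delta=2.5, sd=10)

print(n_base, n_stricter_alpha, n_smaller_effect)
```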


Ten Steps of an Outbreak Investigation

1. Determine the existence of the outbreak
2. Confirm the diagnosis
3. Define a case and count cases
4. Orient the data in terms of time, place, and person
5. Determine who is at risk of becoming ill
6. Develop a hypothesis that explains the exposure that caused disease and test this hypothesis
7. Compare the hypothesis with the established facts
8. Plan a more systematic study
9. Prepare a written report
10. Execute control and prevention measures


For example, suppose we want to estimate the survival of premature infants born at 25 weeks of gestation and we calculate a 95% CI that ranges from 64.3% to 89.5%. How can we interpret this interval?

“We are 95% confident that the interval from 64.3% to 89.5% contains the overall proportion of surviving infants in the population.” Or, put another way, “We are 95% confident that the true proportion of survival for infants born at 25 weeks of gestation is between 64.3% and 89.5%.”
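
One way such an interval can be computed is the Wald (normal-approximation) interval for a proportion. The card does not give the underlying counts, so the numbers below are hypothetical and will not reproduce the interval quoted above; the function name is also illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Wald (normal-approximation) CI for a proportion, clipped to [0, 1]."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical counts, not taken from the card: 31 survivors among 39 infants.
lower, upper = proportion_ci(successes=31, n=39)
lower99, upper99 = proportion_ci(successes=31, n=39, confidence=0.99)
print(f"95% CI: {lower:.1%} to {upper:.1%}")
```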


Describe nominal data

No natural order - gender, race, blood type


Describe ordinal data

Natural order - tumor scales, social class


Describe binary data

0-1, disease status, diagnostic test result


Describe categorical data and the tests to use with it

Either nominal, ordinal, or binary. Summarise with counts and proportions; plot with pie and bar charts; analyse with confidence intervals, chi-square tests, McNemar's test (paired), logistic regression (multiple


What classes fall into quantitative data?

Discrete data, which take integer values, and continuous data, which can take any real value.


What methods can you use for discrete count data?

Summarise: rates
Plot: trend plots, histograms
Analyse: confidence intervals, Poisson regression


What methods can you use for continuous data?

Summarise: mean, SD, median, interquartile range.
Plot: histogram, scatter plot, box plot, dot plot
Analyse: confidence interval, t-test, ANOVA, correlation, simple and multiple linear regression


Time-to-Event methods?

Summarise: median survival time, five-year survival, hazard ratio (HR)
Plot: Kaplan-Meier curves
Analyse: confidence interval of HR, log-rank test, proportional hazards regression (PHR, i.e., Cox regression)
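
The survival estimate behind a Kaplan-Meier curve can be sketched in a few lines. This is a minimal product-limit calculation on made-up follow-up data, with a hypothetical function name; it ignores confidence bands and is not a substitute for a survival analysis package.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events[i] is 1 if the event occurred at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set.
        n_at_risk -= removed
    return curve


# Made-up data: follow-up time and event indicator (1 = event, 0 = censored).
curve = kaplan_meier(times=[2, 3, 3, 5, 8, 9], events=[1, 1, 0, 1, 0, 1])
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```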


Factors that affect estimate precision

variability of outcome, sample size, desired confidence level
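
The three factors can be seen in the half-width of a confidence interval for a mean, z x SD / sqrt(n). A sketch assuming a known SD and a normal approximation; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(sd, n, confidence=0.95):
    """Half-width of a normal-approximation CI for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)


baseline = ci_half_width(sd=10, n=100)
more_variable = ci_half_width(sd=20, n=100)                       # wider
larger_sample = ci_half_width(sd=10, n=400)                       # narrower
higher_confidence = ci_half_width(sd=10, n=100, confidence=0.99)  # wider

print(f"baseline half-width: {baseline:.2f}")
```

Under these assumptions, quadrupling the sample size halves the interval width, while doubling the SD doubles it.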