Flashcards in Epidemiology & Biostatistics Deck (29):

1

## Define Bias

### Systematic error in the design, management or analysis of a study that causes a mistaken estimate of the exposure’s effect on the outcome.

2

## Explain design bias

### Wrongly chosen sampling strategy or study design.

3

## Explain conduct bias

### Case enrollment, follow-up or data collection is not carried out properly.

4

## Explain analysis bias

### The chosen statistical methods are inappropriate, variables are miscategorized, or the modelling assumptions do not hold.

5

## What three types of bias are there?

### Selection bias, confounding and information bias (also called measurement or misclassification bias)

6

## Define confounding

### A confounder is a variable that influences both the dependent and the independent variable, causing a spurious association.

7

## Define sensitivity

### The proportion of positives that are correctly identified as such. (Also called the true positive rate, the recall, or probability of detection in some fields)

8

## Define specificity

### The proportion of negatives that are correctly identified as such. (Also called the true negative rate)
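
As a minimal sketch, sensitivity and specificity can be computed from the four cells of a 2x2 diagnostic table. The function names and the counts below are hypothetical, chosen only for illustration.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: diseased subjects correctly testing positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: healthy subjects correctly testing negative."""
    return tn / (tn + fp)


# Hypothetical counts: 100 diseased (90 detected), 200 healthy (160 cleared)
tp, fn, tn, fp = 90, 10, 160, 40
print(sensitivity(tp, fn))  # 0.9
print(specificity(tn, fp))  # 0.8
```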

9

## What are the four main aspects of infection control?

### Surveillance (passive/active), patient contact, hygiene, education/awareness

10

## How does Normal distribution fit with standard deviation?

### About 68% of values lie within 1 SD of the mean and about 95.5% within 2 SDs, leaving about 2.3% in each tail.
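
These percentages can be checked against the standard normal distribution. The sketch below uses Python's standard-library `NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

within_1sd = z.cdf(1) - z.cdf(-1)   # about 0.6827
within_2sd = z.cdf(2) - z.cdf(-2)   # about 0.9545
upper_tail = 1 - z.cdf(2)           # about 0.0228 (same in the lower tail)

print(round(within_1sd, 4), round(within_2sd, 4), round(upper_tail, 4))
```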

11

## Range of P-value and the usual significance level

### Ranges from 0 to 1; the significance level is commonly 0.05.

12

## What is Type II error?

### Concluding that there is no difference when in truth there is one, i.e. failing to reject a false null hypothesis.

13

## What is Type I error?

### Concluding that there is a difference when in fact there is none, i.e. rejecting a true null hypothesis.

14

## Explain power and what affects it

### The ability of a test to find a difference when there really is a difference, i.e. a true positive. Power is higher when the outcome difference is large, the significance level (alpha) is higher (less strict), sampling variability is low and the sample size is large.
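
A rough sketch of how these factors move power, assuming a two-sided two-sample z-test with equal group sizes and a known common SD (a simplification; the function name `approx_power` is hypothetical, and the negligible opposite tail is ignored).

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power to detect a true mean difference `delta`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)       # critical value for the test
    se = sigma * sqrt(2 / n_per_group)      # SE of the difference in means
    return z.cdf(abs(delta) / se - z_crit)  # chance of exceeding z_crit


base = approx_power(delta=5, sigma=10, n_per_group=64)
print(round(base, 2))
# Each of these changes raises power relative to `base`:
print(approx_power(10, 10, 64) > base)        # larger difference
print(approx_power(5, 10, 64, 0.10) > base)   # higher alpha
print(approx_power(5, 5, 64) > base)          # lower variability
print(approx_power(5, 10, 128) > base)        # larger sample
```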

15

## What can linear regression be used for? How to test for significance of the test?

### To explore the linear relationship between two continuous variables, assuming normally distributed residuals with equal variance. Use the p-value of the slope for significance and R-squared (between 0 and 1, often expressed as a percentage) to judge how well the model fits the data.
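
As an illustration, a simple least-squares fit and its R-squared can be computed by hand. The data are made up and `fit_line` is a hypothetical helper; the slope's p-value needs the t distribution, which statistical packages provide, and is not computed here.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the fitted line
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r2 = fit_line(x, y)
print(round(slope, 2), round(intercept, 2), round(r2, 3))
```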

16

## Uses of logistic regression and type of curve, results and significance?

### To model the log of the odds of the binary outcome we are interested in as a linear function of one or more predictors, X. Sigmoid curve. Results are either coefficients (betas) or odds ratios calculated from them (exp(coefficient)), reported with confidence intervals or p-values.
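
A small sketch of the two ideas on this card: the sigmoid that maps log-odds to a probability, and the odds ratio with a Wald-type 95% confidence interval obtained by exponentiating a coefficient. The coefficient and standard error are hypothetical, as are the function names.

```python
import math


def sigmoid(log_odds: float) -> float:
    """Convert log-odds (the linear predictor) to a probability."""
    return 1 / (1 + math.exp(-log_odds))


def odds_ratio_ci(beta: float, se: float, z: float = 1.96):
    """Odds ratio and approximate 95% CI from a coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


print(sigmoid(0.0))  # 0.5: log-odds of 0 means even odds

# Hypothetical fitted coefficient for one predictor
or_, lower, upper = odds_ratio_ci(beta=0.7, se=0.25)
print(round(or_, 2), round(lower, 2), round(upper, 2))
```

If the interval excludes 1, the predictor is significant at roughly the 5% level.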

17

## What tests can you use for testing or modeling means of continuous data?

### t-test, ANOVA and linear regression

18

## What should you know to estimate the correct sample size?

### Significance level (alpha = 0.05; remember the type I versus type II trade-off: the lower the level, the larger the sample), power (1 - beta, analogous to sensitivity), variance (i.e. how precise your measurements are) and effect size (the smaller the effect, the larger the sample)
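
One common textbook approximation combines these four inputs for comparing two means: n per group = 2 (z_alpha/2 + z_beta)^2 sigma^2 / delta^2. The sketch below assumes a two-sided z-test with equal groups and a known SD; `n_per_group` and the example values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a mean difference `delta`."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(power)           # power = 1 - beta
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


print(n_per_group(delta=5, sigma=10))               # 63
print(n_per_group(delta=2.5, sigma=10))             # smaller effect, larger n
print(n_per_group(delta=5, sigma=10, alpha=0.01))   # stricter alpha, larger n
```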

19

## Ten Steps of an Outbreak Investigation

###
1. Determine the existence of the outbreak

2. Confirm the diagnosis

3. Define a case and count cases

4. Orient the data in terms of time, place, and person

5. Determine who is at risk of becoming ill

6. Develop a hypothesis that explains the exposure that caused disease and test this hypothesis

7. Compare the hypothesis with the established facts

8. Plan a more systematic study

9. Prepare a written report

10. Execute control and prevention measures

20

## Suppose we want to estimate the survival of premature infants born at 25 weeks of gestation and we construct a 95% CI that ranges from 64.3% to 89.5%. How can we interpret this interval?

### “We are 95% sure that this interval 64.3% to 89.5% contains the overall proportion of surviving infants in the population.” Or another way, “We are 95% sure that the true proportion of survival for infants born at 25 weeks of gestation is between 64.3% and 89.5%”
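
For illustration, a confidence interval for a proportion can be sketched with the simple Wald (normal approximation) formula. The counts below are hypothetical and are not the data behind the interval on this card, and the method used for that interval is not stated.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(successes: int, n: int, level: float = 0.95):
    """Normal-approximation confidence interval for a proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Hypothetical sample: 80 survivors out of 100 infants
lower, upper = wald_ci(80, 100)
print(round(lower, 3), round(upper, 3))
```

The Wald interval behaves poorly for small samples or proportions near 0 or 1, where other intervals (for example Wilson's) are usually preferred.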

21

## Describe nominal data

### No natural order - gender, race, blood type

22

## Describe ordinal data

### Natural order - tumor scales, social class

23

## Describe binary data

### 0-1, disease status, diagnostic test result

24

## Describe categorical data and tests to use with it

### Either nominal, ordinal or binary. Summarise with counts and proportions, plot with pie and bar charts, analyse with confidence intervals, chi-square, McNemar (paired), logistic regression (multiple
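
As one example of these analyses, a chi-square test on a 2x2 table can be sketched with the standard library alone. With 1 degree of freedom the p-value equals `erfc(sqrt(chi2 / 2))`. The table is hypothetical, `chi_square_2x2` is an illustrative name, and no continuity correction is applied.

```python
from math import erfc, sqrt


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square statistic and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Hypothetical table: rows = exposed / unexposed, columns = diseased / healthy
chi2, p = chi_square_2x2(20, 30, 10, 40)
print(round(chi2, 2), round(p, 3))
```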

25

## What classes fall into quantitative data?

### Discrete (integer values) and continuous (can take any real value)

26

## What methods can you use for discrete count data?

### Summarise with rates; plot with trend plots and histograms; analyse with confidence intervals and Poisson regression

27

## What methods can you use for continuous data?

###
Summarise: mean, SD, median, interquartile range.

Plot: histogram, scatter plot, box plot, dot plot

Analyse: confidence interval, t-test, ANOVA, correlation, simple and multiple linear regression

28

## Time-to-Event methods?

###
Summary: median survival time, five-year survival, hazard ratio (HR)

Plot: Kaplan-Meier curves

Analyse: confidence interval of HR, log-rank test, proportional hazards regression (PHR = Cox regression)
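
A minimal sketch of the Kaplan-Meier estimator behind those curves: at each time with at least one event, the survival estimate is multiplied by (1 - events / number at risk), and censored subjects simply leave the risk set. The function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    `events` holds 1 for an observed event and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored both leave the risk set
    return curve


# Hypothetical follow-up times (months) and event indicators
for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(t, round(s, 2))
```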

29