DL2: Quiz 1 Flashcards

Question

Standard deviation?

Answer 1

The square root of the variance
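
As a quick check of this definition, the sketch below computes a population variance and takes its square root. The data values are made up for illustration and are not part of the deck.

```python
import math
import statistics

# Illustrative values only (not from the deck).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: mean of squared deviations from the mean.
pop_variance = statistics.pvariance(data)

# Standard deviation is the square root of the variance.
pop_sd = math.sqrt(pop_variance)

# The library's own population SD should agree with the manual square root.
library_sd = statistics.pstdev(data)

print(pop_variance, pop_sd, library_sd)
```

For a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.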

Answer 2

A: Lowest observation B: lower quartile C: Median D: Upper quartile E: Highest observation

Answer 3

Organizes and summarizes data (skewness, mean, median, mode, standard dev, scatter plots)

Answer 4

Estimate population parameters, and how confident we can be in our conclusions

Answer 5

Probability sampling. Every subject has equal probability of being selected.

Answer 6

Probability sampling. Select every nth subject. Randomly selects subjects with known sampling strategies.

Answer 7

Probability sampling. Divide population into relevant strata and take random samples from each stratum.

Answer 8

Probability sampling. Divide population into clusters, randomly select a subset of the clusters, and collect data from the subjects in the selected clusters.

Answer 9

Non-probability sampling. Select subjects based on availability; not representative of population.

Answer 10

Non-probability sampling. Take all subjects who volunteer.

Answer 11

Not based on probability and susceptible to selection bias

Answer 12

Stratified:
1. Partition population into mutually exclusive, homogeneous groups based on a factor that may influence the measured variable
2. Obtain a simple random sample from each group
3. Collect data on each subject that was randomly sampled from each group
4. Heterogeneous population is split into homogeneous subpopulations (strata collection is exhaustive)
Cluster:
1. Divide population into groups
2. Obtain a simple random sample of clusters
3. Collect data on every subject in each of the randomly selected clusters (heterogeneous)
4. Useful when target of an intervention is a system rather than an individual
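
The difference between the two procedures can be sketched in a few lines. This is a minimal illustration, not a survey-grade sampler; the population, the `clinic` and `sex` fields, and both helper functions are hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 40 subjects spread over 4 clinics (clusters),
# each tagged with a sex stratum. None of this comes from the deck.
population = [
    {"id": i, "clinic": i % 4, "sex": "F" if (i // 4) % 2 == 0 else "M"}
    for i in range(40)
]

def stratified_sample(subjects, key, n_per_stratum):
    """Take a simple random sample of n_per_stratum subjects from EVERY stratum."""
    strata = {}
    for subject in subjects:
        strata.setdefault(subject[key], []).append(subject)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(subjects, key, n_clusters):
    """Randomly choose n_clusters clusters, then keep EVERY subject in them."""
    clusters = sorted({subject[key] for subject in subjects})
    chosen = set(rng.sample(clusters, n_clusters))
    return [subject for subject in subjects if subject[key] in chosen]

strat = stratified_sample(population, "sex", 5)
clust = cluster_sample(population, "clinic", 2)

print(len(strat), len(clust))
```

Stratified sampling touches every stratum but only some subjects in each; cluster sampling touches only some clusters but every subject inside them.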

Answer 13

Discrete, quantitative data that occurs independently and randomly in time at some constant mean rate. Primarily used to estimate the probability of rare events and predict the number of times an event occurs. Gives the probability that an outcome will occur a specified number of times when the number of trials is large and the probability of an occurrence is small. Ex: Used to calculate the number of deaths from lung cancer in a year in a town. The information is used to compare observed and expected values to decide whether the number of deaths from cancer is higher or lower than expected.
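
The observed-versus-expected comparison can be sketched with the Poisson probability mass function, P(X = k) = e^(-λ) λ^k / k!. The expected and observed death counts below are hypothetical numbers chosen for illustration, and the helper names are not from the deck.

```python
import math

def poisson_pmf(k, mean_rate):
    """Probability of exactly k events when the mean count is mean_rate."""
    return math.exp(-mean_rate) * mean_rate ** k / math.factorial(k)

def poisson_upper_tail(k, mean_rate):
    """Probability of k or more events: 1 - P(X <= k - 1)."""
    return 1.0 - sum(poisson_pmf(i, mean_rate) for i in range(k))

# Hypothetical town: 3 lung cancer deaths expected per year, 8 observed.
expected_deaths = 3.0
observed_deaths = 8

p_at_least_observed = poisson_upper_tail(observed_deaths, expected_deaths)
print(f"P(X >= {observed_deaths}) = {p_at_least_observed:.4f}")
```

A small tail probability suggests the observed count is higher than the expected rate would usually produce.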

Answer 14

Poisson distribution

Answer 15

A measure of the combined weight of the tails relative to the rest of the distribution

Answer 16

Mean Median Mode

Answer 17

To change skewed or unknown distributions to a normal distribution in order to calculate p-value

Answer 18

When equally sized samples are drawn from a non-normal distribution, the plotted means of the samples will approximate a normal distribution, as long as the non-normality was not due to outliers. A sufficiently large sample is generally considered 30 or more.
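
A small simulation can make this concrete. The sketch below assumes an exponential distribution (strongly right-skewed, mean 1, standard deviation 1) as the non-normal source; that choice, the seed, and the function name are illustrative assumptions.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed for repeatability

def sample_means(n_samples, sample_size):
    """Draw n_samples equally sized samples and return the mean of each."""
    means = []
    for _ in range(n_samples):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

means = sample_means(n_samples=2000, sample_size=30)

# The sample means cluster around the population mean (1.0), with spread
# close to sigma / sqrt(n) = 1 / sqrt(30).
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
theoretical_sd = 1.0 / math.sqrt(30)

print(mean_of_means, sd_of_means, theoretical_sd)
```

Plotting `means` as a histogram would show a roughly bell-shaped curve even though the individual draws are skewed.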

Answer 19

The probability of obtaining a measurement at least as extreme as the one obtained, assuming the null hypothesis is true.

Answer 20

A hypothesis that states that there is no significant difference between 2 sets of data.

Answer 21

Rejecting the null hypothesis when the null hypothesis is true (false positive)

Answer 22

Failing to reject the null hypothesis when the null hypothesis is false (false negative)

Answer 23

Significance threshold for rejecting the null hypothesis (a value between 0 and 1)

Answer 24

p < α
- A small p-value (i.e., less than alpha) is an "unlikely" result to obtain if the null hypothesis is true, allowing us to reject the null hypothesis (i.e., we see a statistically significant difference between the two groups).
- A large p-value (i.e., larger than alpha) is a "likely" result to obtain, so we fail to reject the null hypothesis (i.e., we do not see a statistically significant difference between the two groups).

Answer 25

Probability of a type II error (FN)

Answer 26

Histogram Presents data as frequency counts over some interval

Answer 27

Boxplot
1. The thin-lined box indicates the IQR: the 25th to the 75th percentiles of the data.
2. Within the thin-lined box is the bolded line: the median.
3. From both ends of the thin-lined box extend the tails (or whiskers), which show the minimum and maximum points up to 1.5 IQRs beyond the box edges.
4. The circle is an outlier, defined as a data point between 1.5 and 3.0 IQRs beyond the box edges.
5. The asterisk is an extreme outlier, defined as a data point more than 3.0 IQRs beyond the box edges.
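
The outlier rules can be sketched as a small classifier. This assumes distances are measured from the edges of the box (the quartiles), which is the usual convention, and it uses the "inclusive" quartile method from Python's `statistics` module; other software may compute quartiles slightly differently. The data and function name are illustrative.

```python
import statistics

def classify_points(values):
    """Label each value as within whiskers, outlier, or extreme outlier."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    labelled = []
    for v in values:
        # Distance outside the box; 0 for points inside the box.
        distance = max(q1 - v, v - q3, 0)
        if distance > 3.0 * iqr:
            label = "extreme outlier"   # drawn as an asterisk
        elif distance > 1.5 * iqr:
            label = "outlier"           # drawn as a circle
        else:
            label = "within whiskers"
        labelled.append((v, label))
    return labelled

# Illustrative data only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
result = dict(classify_points(data))
print(result)
```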

Answer 28

Scatterplot. Presents data from 2 variables, both measured on a continuous scale. Useful for assessing the association between 2 variables and for assessing assumptions of tests, such as linearity and absence of outliers.

Answer 29

Range of values in which we have some level of confidence the true population value will lie.
Smaller CI means less variability.
A 95% CI corresponds to a 5% alpha.
Narrow CI: little variation and more precise.
Wide CI: greater variation and less precise.

Answer 30

Directly related to p-value. Less overlap = larger difference and lower p-value. p<

Answer 31

Risk in people with risk factor/risk in people w/o risk factor RR = (a/(a+b)) / (c/(c+d))
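
The formula can be checked against a 2x2 table. The sketch assumes the usual layout (a = exposed with event, b = exposed without event, c = unexposed with event, d = unexposed without event); the counts are made up.

```python
def relative_risk(a, b, c, d):
    """RR = (a / (a + b)) / (c / (c + d)) for a 2x2 table."""
    risk_exposed = a / (a + b)      # risk in people with the risk factor
    risk_unexposed = c / (c + d)    # risk in people without the risk factor
    return risk_exposed / risk_unexposed

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed had the event.
rr = relative_risk(a=20, b=80, c=10, d=90)
print(rr)
```

An RR above 1 means the event is more common in the exposed group; an RR below 1 means it is less common.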

Answer 32

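ARR = |EER - CER|: the absolute difference between the risk in the experimental group and the risk in the control group

Answer 33

RRR = |EER - CER| / CER: the absolute difference between the risk in the experimental group and the risk in the control group, divided by the risk in the control group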

Answer 34

NNT = 1/ARR (absolute risk reduction)

Answer 35

NNH = 1/ARI (absolute risk increase)
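
These risk measures are all derived from the same two event rates, so they can be computed together. The sketch below is a minimal illustration; the function name, dictionary keys, and event rates are hypothetical.

```python
def risk_difference_summary(eer, cer):
    """Summarise experimental (EER) vs control (CER) event rates."""
    diff = eer - cer
    abs_diff = abs(diff)
    summary = {
        "absolute_difference": abs_diff,        # ARR or ARI
        "relative_difference": abs_diff / cer,  # RRR when EER < CER
    }
    if diff < 0:
        summary["NNT"] = 1 / abs_diff   # treatment lowers risk: 1 / ARR
    elif diff > 0:
        summary["NNH"] = 1 / abs_diff   # treatment raises risk: 1 / ARI
    return summary

# Hypothetical trial: 15% events on treatment vs 20% on control.
benefit = risk_difference_summary(eer=0.15, cer=0.20)

# Hypothetical adverse effect: 4% on treatment vs 2% on control.
harm = risk_difference_summary(eer=0.04, cer=0.02)

print(benefit, harm)
```

In practice NNT and NNH are usually reported rounded up to the next whole patient.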

Answer 36

(a/c)/(b/d) = ad/bc Ratio of the odds of an exposure in the case group to the odds of an exposure in the control group
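
Both forms of the expression give the same number, which the sketch below verifies. It assumes a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls; the counts are made up.

```python
def odds_ratio(a, b, c, d):
    """OR = (a / c) / (b / d), which simplifies to (a * d) / (b * c)."""
    return (a * d) / (b * c)

# Hypothetical case-control counts.
a, b, c, d = 30, 10, 70, 90

odds_exposure_cases = a / c       # odds of exposure among cases
odds_exposure_controls = b / d    # odds of exposure among controls

or_from_odds = odds_exposure_cases / odds_exposure_controls
or_cross_product = odds_ratio(a, b, c, d)

print(or_from_odds, or_cross_product)
```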

Answer 37

Observes development of disease in exposed and unexposed groups

Answer 38

Select subjects with the event; compare presence of the risk factor in cases with the event to controls without the event

Answer 39

1. RR CI contains 1: no difference in risk. Do not reject H0. 2. RR entire CI > 1: risk in intervention group > risk in control group. Reject H0. 3. RR entire CI < 1: risk in intervention group < risk in control group. Reject H0.

Answer 40

1. OR CI contains 1: no difference in odds. Do not reject H0. 2. OR entire CI > 1: odds in case (or event) group > odds in control group. Reject H0. 3. OR entire CI < 1: odds in case (or event) group < odds in control group. Reject H0.

Answer 41

Parametric

Answer 42

Non-parametric

Answer 43

1. Data don't seem to follow a distribution 2. Assumptions underlying parametric tests are not met 3. Data appear to be very skewed 4. Data have significant outliers

Answer 44

1. Paired t-test 2. Unpaired t-test 3. Pearson correlation 4. One way ANOVA

Answer 45

1. Wilcoxon rank-sum test 2. Mann-Whitney U test 3. Spearman correlation 4. Kruskal-Wallis test

Answer 46

Compares 2 different variables for the same group

Answer 47

Compares outcomes on the same variable for 2 different groups

Answer 48

a: one-tailed (5%) b: two-tailed (2.5% in each tail)

Answer 49

Test for differences between means; the larger the statistic, the greater the difference between the groups.
Independent sample: compares means of 2 groups.
Paired: compares means from the same group at different times.
One sample: compares the mean of one group to a known mean.
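
The three variants can be sketched as plain functions that return the t statistic only (no p-value). The independent-samples version here is the pooled-variance (Student) form, which assumes equal variances in the two groups; that choice, the function names, and the data are assumptions for illustration.

```python
import math
import statistics

def one_sample_t(sample, known_mean):
    """t = (sample mean - known mean) / (s / sqrt(n))."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - known_mean) / standard_error

def paired_t(before, after):
    """Paired test: a one-sample test on the within-subject differences."""
    differences = [a - b for b, a in zip(before, after)]
    return one_sample_t(differences, 0.0)

def independent_t(group1, group2):
    """Pooled-variance t statistic for two independent groups."""
    n1, n2 = len(group1), len(group2)
    pooled_var = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.fmean(group1) - statistics.fmean(group2)) / standard_error

# Illustrative data only.
t_one = one_sample_t([1, 2, 3, 4, 5], known_mean=2)
t_paired = paired_t(before=[1, 2, 3, 4, 5], after=[2, 4, 3, 6, 5])
t_indep = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])

print(t_one, t_paired, t_indep)
```

The statistic is then compared against a t distribution with the appropriate degrees of freedom (n - 1 for the one-sample and paired forms, n1 + n2 - 2 for the pooled form).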

Answer 50

A measure of the amount of independent data that can be used to estimate a parameter. Determines the probability distributions of the test statistics of hypothesis tests. The number of data points that are free to vary.

Answer 51

1. Number of groups compared 2. Number of parameters needed to estimate the standard deviation

Answer 52

1. Random samples 2. Categorical data (counts) 3. Non-Parametric 4. Tests whether a categorical variable is related to another

Answer 53

1. Random samples 2. Categorical data (counts) 3. Non-Parametric 4. Tests whether data is representative of the full population. 5. Compares observed data to a theoretical model
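
Comparing observed counts to a theoretical model comes down to the statistic χ² = Σ (O - E)² / E. The sketch below computes it for a hypothetical die rolled 60 times under a fair-die model; the counts and function name are illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for faces 1-6 over 60 rolls.
observed_counts = [8, 12, 10, 10, 9, 11]
expected_counts = [10] * 6   # fair-die model: 60 rolls / 6 faces

chi2 = chi_square_statistic(observed_counts, expected_counts)
degrees_of_freedom = len(observed_counts) - 1

# The tabulated critical value for df = 5 at alpha = 0.05 is about 11.07;
# a statistic below it gives no evidence against the model.
print(chi2, degrees_of_freedom)
```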

Answer 54

frequency with expected frequency

Answer 55

Branch of stats for analyzing the expected duration of time until an event occurs. Must deal with censored data.

Answer 56

1. event doesn't occur during study period 2. subject lost to follow up 3. subject dies from something other than studied cause

Answer 57

Non-parametric survival analysis method: no assumptions about how event probability changes over time. 1. Censoring is independent of event probability 2. Survival probabilities are comparable in early and later recruited subjects 3. Censoring is not more likely in one group than another
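
Assuming this card describes the Kaplan-Meier (product-limit) estimator, a minimal sketch of the calculation is below. At each event time the running survival probability is multiplied by (1 - events / number at risk); censored subjects leave the risk set without counting as events. The follow-up data and function name are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    times  -- follow-up time for each subject
    events -- True if the event occurred, False if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        event_count = 0
        leaving = 0
        # Group all subjects who share this follow-up time.
        while i < len(data) and data[i][0] == current_time:
            event_count += int(bool(data[i][1]))
            leaving += 1
            i += 1
        if event_count:
            survival *= 1 - event_count / at_risk
            curve.append((current_time, survival))
        # Events and censored subjects both leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier(
    times=[2, 3, 3, 5, 8],
    events=[True, True, False, True, False],
)
print(curve)
```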

Answer 58

The relative risk of complications based on comparison of event rates.

Answer 59

Every patient randomized enters the primary analysis

Answer 60

Analysis includes only those patients who strictly adhered to the protocol. Identifies effect under ideal conditions.

Answer 61

Key way data from multiple papers is summarized in a single image


(90 cards)