DL2: Quiz 1 Flashcards

(90 cards)

1
Q

Define sensitivity?

A

TP/(TP+FN)
The proportion of patients with the disease who test positive (true positives over all patients who have the disease)

2
Q

Define specificity?

A

TN/(TN+FP)
The proportion of patients without the disease who test negative (true negatives over all patients who do not have the disease)

3
Q

Define PPV?

A

Probability that people who test positive have the disease

4
Q

Define NPV?

A

Probability that people who test negative do not have the disease
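
The four test-accuracy measures on the cards above can be computed from the counts of a 2x2 table. A minimal Python sketch, assuming the usual labels TP, FP, FN, TN; the function name and the counts are hypothetical and invented for illustration only:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        # True positives over everyone who has the disease
        "sensitivity": tp / (tp + fn),
        # True negatives over everyone who does not have the disease
        "specificity": tn / (tn + fp),
        # True positives over everyone who tests positive
        "ppv": tp / (tp + fp),
        # True negatives over everyone who tests negative
        "npv": tn / (tn + fn),
    }


# Hypothetical counts, not taken from any real study.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are read down the disease columns, while PPV and NPV are read across the test-result rows, which is why the predictive values change with prevalence.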

5
Q

Define population?

A

All possible subjects of interest to the study

6
Q

Define sample?

A

A subset of the population that is meant to represent the population

7
Q

Define statistic?

A

A number that represents a property of the sample

8
Q

Define ratio?

A

One number divided by another

9
Q

Define proportion?

A

ratio (a part divided by the whole)

10
Q

Define probability?

A

The chance of an event occurring

11
Q

Define risk?

A

Probability of an event occurring

12
Q

Define rate?

A

A proportion measured over a time period

13
Q

Define incidence?

A

new cases that occurred/population at risk

Proportion of people who develop a condition during a time period

14
Q

Define prevalence?

A

existing cases (new and old)/total population

Proportion of people who have a condition at one point in time
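
A short Python sketch contrasting incidence (new cases over the population at risk during a period) with prevalence (all existing cases over the total population at one point in time). The town and its numbers are hypothetical, and the example assumes nobody dies, recovers, or moves during the year:

```python
def incidence(new_cases, population_at_risk):
    """New cases during the period divided by the population at risk."""
    return new_cases / population_at_risk


def prevalence(existing_cases, total_population):
    """All existing cases at one point in time divided by the total population."""
    return existing_cases / total_population


# Hypothetical town of 1000 people; 50 already have the condition on January 1.
total_population = 1000
cases_at_start = 50
new_cases_this_year = 19

# Only people without the condition at the start are at risk of becoming new cases.
population_at_risk = total_population - cases_at_start

yearly_incidence = incidence(new_cases_this_year, population_at_risk)
year_end_prevalence = prevalence(cases_at_start + new_cases_this_year, total_population)
print(yearly_incidence, year_end_prevalence)
```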

15
Q

Qualitative data?

A

Categorical
Nominal: pertaining to names
Ordinal: categories have an order or rank

16
Q

Quantitative data?

A

Continuous
Interval: No absolute zeros (addition and subtraction)
Ratio: has absolute zero, no negative numbers (multiply and divide)

17
Q

Independent variable?

A

The one we can manipulate

18
Q

Dependent variable?

A

The one we measure

19
Q

Covariates/Confounders?

A

Any variable other than the chosen independent variable that may affect the dependent variable

20
Q

Mean?

A

Sum of all observations/number of observations

21
Q

Median?

A

Middle number when observations are placed in numerical order

22
Q

Mode?

A

Most frequent observation

23
Q

Range?

A

Highest value minus lowest

24
Q

Variance?

A

Subtract the mean from each measurement, square each result, then average the squared differences (divide their sum by n − 1 for a sample)

25
Standard dev?
The square root of the variance
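
The measures of central tendency and spread on the cards above (mean, median, mode, range, variance, standard deviation) can be checked with Python's standard `statistics` module. A minimal sketch; the data values are hypothetical, and `variance` and `stdev` here are the sample versions (dividing by n − 1):

```python
import statistics

# Hypothetical measurements, invented for illustration only.
data = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(data)          # sum of observations / number of observations
median = statistics.median(data)      # middle value (average of the two middle values here)
mode = statistics.mode(data)          # most frequent observation
value_range = max(data) - min(data)   # highest value minus lowest
variance = statistics.variance(data)  # sum of squared deviations / (n - 1)
std_dev = statistics.stdev(data)      # square root of the sample variance

print(mean, median, mode, value_range, variance, std_dev)
```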
26
A: Lowest observation
B: Lower quartile
C: Median
D: Upper quartile
E: Highest observation
27
Descriptive stats?
Organizes and summarizes data (skewness, mean, median, mode, standard dev, scatter plots)
28
Inferential stats?
Estimate population parameters, and how confident we can be in our conclusions
29
Simple random sampling?
Probability sampling
Every subject has an equal probability of being selected
30
Systematic random sampling?
Probability sampling
Select every nth subject
Randomly selects subjects with known sampling strategies
31
Stratified sampling?
Probability sampling
Divide population into relevant strata and take random samples from each stratum
32
Cluster sampling?
Probability sampling
Divide population into clusters and randomly select a subset of the clusters
33
Convenience sampling?
Non-probability sampling
Select subjects based on availability; not representative of the population
34
Volunteer sampling?
Non-probability sampling
Take all subjects who volunteer
35
Why is probability better than non-probability sampling?
Non-probability sampling is not based on probability and is susceptible to selection bias
36
Stratified vs cluster sampling
Stratified:
1. Partition the population into mutually exclusive, homogeneous groups based on a factor that may influence the measured variable
2. Obtain a simple random sample from each group
3. Collect data on each subject that was randomly sampled from each group
4. A heterogeneous population is split into homogeneous subpopulations (the collection of strata is exhaustive)
Cluster:
1. Divide the population into groups (clusters)
2. Obtain a simple random sample of clusters
3. Collect data on every subject in each of the randomly selected clusters (clusters are heterogeneous)
4. Useful when the target of an intervention is a system rather than an individual
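
The difference between the two procedures can be sketched in Python with the standard `random` module. This is only an illustration: the function names, the group labels, and the subject IDs are hypothetical, and the same toy groups are used both as strata and as clusters.

```python
import random


def stratified_sample(strata, n_per_stratum, rng):
    """Take a simple random sample of subjects from EVERY stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep EVERY subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: three groups of ten subject IDs each.
groups = {
    "A": list(range(0, 10)),
    "B": list(range(10, 20)),
    "C": list(range(20, 30)),
}

strat = stratified_sample(groups, n_per_stratum=3, rng=rng)
clust = cluster_sample(groups, n_clusters=2, rng=rng)
print(strat)
print(clust)
```

Stratified sampling returns a few subjects from all three groups; cluster sampling returns all subjects from only two of the groups.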
37
What type of distribution?
Normal
38
What type of distribution?
Binomial
39
Poisson distribution?
Discrete, quantitative data that occur independently and randomly in time at some constant mean rate.
Primarily used to estimate the probability of rare events and predict the number of times an event occurs.
Gives the probability that an outcome will occur a specified number of times when the number of trials is large and the probability of an occurrence is small.
Ex: Used to calculate the number of deaths from lung cancer in a year in a town. The information is used to compare observed and expected values to decide whether the number of deaths from cancer is higher or lower than expected.
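
The Poisson probability of seeing exactly k events when the mean rate is λ is P(k) = e^(−λ) λ^k / k!. A minimal Python sketch of that formula; the function name and the rate of 2 deaths per year are hypothetical values chosen only for illustration:

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Hypothetical: a town expects a mean of 2 deaths per year from a rare cancer.
expected_per_year = 2.0

# Probability of observing each count from 0 to 5 deaths in a year.
for k in range(6):
    print(k, round(poisson_pmf(k, expected_per_year), 4))

# Probability of observing 5 or more deaths (1 minus the probability of 0 to 4).
p_five_or_more = 1 - sum(poisson_pmf(k, expected_per_year) for k in range(5))
print(p_five_or_more)
```

A very small `p_five_or_more` would suggest that an observed count of 5 is higher than expected under the assumed rate.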
40
What type of distribution?
Poisson distribution
41
What causes skewness?
Outliers
42
Kurtosis?
A measure of the combined weight of the tails relative to the rest of the distribution
43
Mean Median Mode
44
What is the purpose of data transformation?
To change skewed or unknown distributions to a normal distribution in order to calculate p-value
45
What is the central limit theorem?
When equally sized samples are drawn from a non-normal distribution, the plotted means of the samples will approximate a normal distribution, as long as the non-normality was not due to outliers.
A sufficiently large sample is generally considered to be 30 or more.
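
A small simulation can illustrate the idea. This Python sketch draws many samples of size 30 from a skewed (exponential) population and looks at the sample means; the population, the seed, and the number of samples are arbitrary assumptions made for illustration:

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, rng):
    """Mean of each of n_samples random samples (drawn with replacement)."""
    return [statistics.mean(rng.choices(population, k=sample_size))
            for _ in range(n_samples)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population: 10,000 draws from an exponential distribution.
population = [rng.expovariate(1.0) for _ in range(10_000)]

means = sample_means(population, sample_size=30, n_samples=2_000, rng=rng)

# The sample means cluster around the population mean with a much smaller spread.
print("population mean:", statistics.mean(population))
print("mean of sample means:", statistics.mean(means))
print("population SD:", statistics.stdev(population))
print("SD of sample means:", statistics.stdev(means))
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly right-skewed.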
46
What is a p-value?
The probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true.
47
What is the null hypothesis?
A hypothesis that states that there is no significant difference between 2 sets of data.
48
Type 1 error?
Rejecting the null hypothesis when the null hypothesis is true
False positive
49
Type 2 error?
Failing to reject (accepting) the null hypothesis when the null hypothesis is false
False negative
50
What is 𝛂?
Significance level: the cutoff (between 0 and 1) that the p-value must fall below to reject the null hypothesis
Probability of a type I error (FP)
51
When would you reject the null?
P<𝛂
- A small p-value (i.e., less than alpha) is an "unlikely" result to obtain if the null hypothesis is true, allowing us to reject the null hypothesis (i.e., we see a statistically significant difference between the two groups).
- A large p-value (i.e., larger than alpha) is a "likely" result to obtain, so we fail to reject the null hypothesis (i.e., we do not see a statistically significant difference between the two groups).
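
The decision rule can be written as a one-line comparison. A minimal Python sketch; the function name and the default alpha of 0.05 are assumptions for illustration:

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p is below alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide(0.03))              # below the assumed alpha of 0.05
print(decide(0.20))              # above alpha
print(decide(0.03, alpha=0.01))  # the same p-value under a stricter alpha
```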
52
What is β?
Probability of a type II error (FN)
53
What type of graph? What does it do?
Histogram
Presents data as frequency counts over some interval
54
What type of graph? What are its components?
Boxplot
1. The thin-lined box indicates the IQR: the 25th to the 75th percentiles of the data.
2. Within the thin-lined box is the bolded line: the median.
3. From both ends of the thin-lined box extend the tails (or whiskers), which show the minimum and maximum points up to 1.5 IQRs beyond the edges of the box.
4. The circle is an outlier, defined as a data point between 1.5 and 3.0 IQRs beyond the edges of the box.
5. The asterisk is an extreme outlier, defined as a data point more than 3.0 IQRs beyond the edges of the box.
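
The outlier rules can be sketched in Python. This assumes the common convention in which distances are measured from the edges of the box (the quartiles); quartile conventions also differ between software packages, so the values from `statistics.quantiles` may not match a given plotting tool exactly. The function name and data are hypothetical:

```python
import statistics


def classify_point(value, q1, q3):
    """Label a value by how many IQRs it lies beyond the edges of the box."""
    iqr = q3 - q1
    # Distance beyond the nearest box edge; 0 if the value is inside the box.
    distance = max(q1 - value, value - q3, 0)
    if distance > 3.0 * iqr:
        return "extreme outlier"
    if distance > 1.5 * iqr:
        return "outlier"
    return "within whiskers"


# Hypothetical data with one very large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 30]
q1, median, q3 = statistics.quantiles(data, n=4)

for value in data:
    print(value, classify_point(value, q1, q3))
```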
55
What type of graph? What are its components?
Scatterplot
Presents data from 2 variables, both measured on a continuous scale
Useful for assessing the association between 2 variables and for checking test assumptions such as linearity and absence of outliers
56
Confidence interval?
Range of values within which we have some level of confidence that the true population value lies
A smaller CI means less variability
A 95% CI corresponds to an alpha of 5%
Narrow CI: little variation and more precise
Wide CI: greater variation and less precise
57
What does overlap of CI box plots mean?
Directly related to the p-value
Less overlap = larger difference and lower p-value
p<
58
Calculate risk ratio?
Risk in people with risk factor/risk in people without risk factor
RR = (a/(a+b)) / (c/(c+d))
59
Calculate absolute reduction or increase?
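Absolute risk difference = EER − CER (risk in experimental group minus risk in control group)
A negative difference is an absolute risk reduction (ARR = CER − EER); a positive difference is an absolute risk increase (ARI = EER − CER)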
60
Calculate relative risk reduction?
RRR = (CER − EER)/CER = ARR/CER
(Risk in control group minus risk in experimental group)/risk in control group
61
Calculate number needed to treat?
NNT = 1/ARR (absolute risk reduction)
62
Calculate number needed to harm?
NNH = 1/ARI (absolute risk increase)
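
The risk measures on the cards above can be worked through from one 2x2 table. A minimal Python sketch, assuming a = experimental group with event, b = experimental group without event, c = control group with event, d = control group without event; the function name and the trial counts are hypothetical:

```python
def risk_measures(a, b, c, d):
    """Risk measures from a 2x2 table (a, b = experimental; c, d = control)."""
    eer = a / (a + b)  # experimental event rate
    cer = c / (c + d)  # control event rate
    arr = cer - eer    # absolute risk reduction (negative means risk increased)
    return {
        "EER": eer,
        "CER": cer,
        "RR": eer / cer,    # risk ratio
        "ARR": arr,
        "RRR": arr / cer,   # relative risk reduction
        # Number needed to treat; undefined when there is no risk difference.
        "NNT": 1 / arr if arr != 0 else float("inf"),
    }


# Hypothetical trial: 10 of 100 treated patients and 20 of 100 controls have the event.
result = risk_measures(a=10, b=90, c=20, d=80)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

When the treatment increases risk, the same arithmetic with the sign reversed gives the absolute risk increase, and 1/ARI gives the number needed to harm.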
63
Calculate odds of risk factor in cases (with event)?
a/c
64
Calculate odds of risk factor in control (no event)?
b/d
65
Calculate odds ratio?
(a/c)/(b/d) = ad/bc
Ratio of the odds of an exposure in the case group to the odds of an exposure in the control group
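
A short Python check that the two forms of the odds ratio agree. It assumes the cell labels implied by the cards above: a = cases with the risk factor, b = controls with the risk factor, c = cases without it, d = controls without it. The function name and the counts are hypothetical:

```python
def odds_ratio(a, b, c, d):
    """Odds of exposure in cases (a/c) divided by odds of exposure in controls (b/d)."""
    odds_in_cases = a / c
    odds_in_controls = b / d
    return odds_in_cases / odds_in_controls


# Hypothetical case-control study.
a, b, c, d = 30, 10, 20, 40

or_from_odds = odds_ratio(a, b, c, d)
or_cross_product = (a * d) / (b * c)  # the ad/bc shortcut
print(or_from_odds, or_cross_product)
```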
66
Cohort studies?
Observes development of disease in exposed and unexposed groups
67
Case control studies?
Select subjects with the event, then compare the presence of the risk factor in cases (with the event) to controls (without the event)
68
CI interpretation?
1. RR CI contains 1: no statistically significant difference in risk. Do not reject H0.
2. Entire RR CI > 1: risk in intervention group > risk in control group. Reject H0.
3. Entire RR CI < 1: risk in intervention group < risk in control group. Reject H0.
69
OR interpretations?
1. OR CI contains 1: no statistically significant difference in odds. Do not reject H0.
2. Entire OR CI > 1: odds in case (or event) group > odds in control group. Reject H0.
3. Entire OR CI < 1: odds in case (or event) group < odds in control group. Reject H0.
70
______________ tests make assumptions about the parameters of the population distribution from which the sample data is taken.
Parametric
71
______________ tests do NOT make assumptions about data distribution. However, they do require groups to have approximately the same dispersion.
Non-parametric
72
When should non-parametric tests be used?
1. Data don't seem to follow a distribution
2. Assumptions underlying parametric tests are not met
3. Data appear to be very skewed
4. Data have significant outliers
73
Types of parametric tests?
1. Paired t-test
2. Unpaired t-test
3. Pearson correlation
4. One-way ANOVA
74
Types of non-parametric equivalents?
1. Wilcoxon signed-rank test
2. Mann-Whitney U test (Wilcoxon rank-sum test)
3. Spearman correlation
4. Kruskal-Wallis test
75
What is a paired variable?
Compares 2 different variables for the same group
76
What are dependent variables?
Compare outcomes on the same variable for 2 different groups
77
Which is which?
a: one-tailed (5% in one tail)
b: two-tailed (2.5% in each tail)
78
What are t-tests? Types?
Test for differences between means; the larger the statistic, the greater the difference between the groups
Independent sample: compares means of 2 groups
Paired: compares means from the same group at different times
One sample: compares the mean of one group to a known mean
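
As an illustration of the independent-sample case, the t statistic is the difference between the two group means divided by its standard error. A minimal Python sketch using the pooled-variance (equal variances assumed) form; the function name and the data are hypothetical, and converting t to a p-value would need a t-distribution table or a statistics library:

```python
import math
import statistics


def pooled_t_statistic(group1, group2):
    """Independent-sample t statistic and degrees of freedom (equal variances assumed)."""
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / standard_error, n1 + n2 - 2


# Hypothetical measurements from two separate groups.
t_stat, df = pooled_t_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(t_stat, df)
```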
79
Degrees of freedom?
A measure of the amount of independent data that can be used to estimate a parameter
Defines the probability distributions of the test statistics of hypothesis tests
Number of data points that are free to vary
80
What is degrees of freedom dependent on?
1. Number of groups compared
2. Number of parameters needed to estimate the standard deviation
81
How do you test for independence?
1. Random samples
2. Categorical data (counts)
3. Non-parametric
4. Tests whether a categorical variable is related to another
82
How do you test for goodness of fit?
1. Random samples
2. Categorical data (counts)
3. Non-parametric
4. Tests whether data are representative of the full population
5. Compares observed data to a theoretical model
83
What does chi square observe?
Compares observed frequency with expected frequency
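
The chi-square statistic sums (observed − expected)² / expected over all categories. A minimal Python sketch for a goodness-of-fit setting; the function name and the counts are hypothetical, and the p-value lookup (which needs the chi-square distribution) is left out:

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 100 subjects across 4 categories, expected to be evenly split.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # categories minus 1 for goodness of fit
print(chi2, degrees_of_freedom)
```

The larger the statistic, the further the observed counts are from what the theoretical model predicts.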
84
What is survival analysis?
Branch of stats for analyzing the expected duration of time until an event occurs
Must deal with censored data
85
What are the causes of censored data?
1. Event doesn't occur during the study period
2. Subject lost to follow-up
3. Subject dies from something other than the studied cause
86
What is kaplan-meier analysis? Assumptions?
Non-parametric survival analysis method: no assumptions about how event probability changes over time.
1. Censoring is independent of event probability
2. Survival probabilities are comparable in early and later recruited subjects
3. Censoring is not more likely in one group than another
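
The Kaplan-Meier estimate multiplies, at each event time, the fraction of at-risk subjects who survive that time; censored subjects leave the at-risk group without counting as events. A minimal Python sketch of that product-limit calculation; the function name and the follow-up data are hypothetical:

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time where an event occurred.

    times:  follow-up time for each subject
    events: 1 if the event was observed at that time, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the at-risk group.
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up (months); 0 marks a censored subject.
curve = kaplan_meier(times=[1, 2, 2, 3, 4, 5], events=[1, 1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 4))
```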
87
What is hazard ratio?
The relative risk of complications based on comparison of event rates.
88
What is intention to treat?
Every randomized patient enters the primary analysis, in the group to which they were originally assigned
89
What is per-protocol?
Analysis includes only those patients who strictly adhered to the protocol
Identifies effect under ideal conditions
90
When would you use a forest plot?
To summarize data from multiple papers in a single image (e.g., in a meta-analysis)