DL2: Quiz 1 Flashcards

1
Q

Define sensitivity?

A

TP/(TP+FN)
The proportion of pt with dz who test +

2
Q

Define specificity?

A

TN/(TN+FP)
The proportion of pt without dz who test -

3
Q

Define PPV?

A

Probability that people who test positive have the disease

4
Q

Define NPV?

A

Probability that people who test negative do not have the disease
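
As a quick check on the four definitions above, here is a minimal sketch that computes them from a 2x2 table. It assumes the standard formulas (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), PPV = TP/(TP+FP), NPV = TN/(TN+FN)); the function name and the counts are hypothetical, chosen only for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased who test positive
        "specificity": tn / (tn + fp),  # non-diseased who test negative
        "ppv": tp / (tp + fp),          # positives who truly have the disease
        "npv": tn / (tn + fn),          # negatives who truly do not
    }

# Hypothetical counts, for illustration only
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
print(metrics)
```

Sensitivity and specificity divide by disease status, while PPV and NPV divide by test result, which is why the predictive values shift with prevalence.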

5
Q

Define population?

A

All possible subjects of interest to the study

6
Q

Define sample?

A

A subset of the population that is meant to represent the population

7
Q

Define statistic?

A

A number that represents a property of the sample

8
Q

Define ratio?

A

One number divided by another

9
Q

Define proportion?

A

ratio (a part divided by the whole)

10
Q

Define probability?

A

The chance of an event occurring

11
Q

Define risk?

A

Probability of an event occurring

12
Q

Define rate?

A

Proportion measured over a time period

13
Q

Define incidence?

A

new cases that occurred/population at risk

Proportion of people who develop a condition during a time period

14
Q

Define prevalence?

A

existing cases/total population

Proportion of people who have a condition at one point in time

15
Q

Qualitative data?

A

Categorical
Nominal: pertaining to names
Ordinal: categories have an order or rank

16
Q

Quantitative data?

A

Continuous
Interval: No absolute zeros (addition and subtraction)
Ratio: has absolute zero, no negative numbers (multiply and divide)

17
Q

Independent variable?

A

The one we can manipulate

18
Q

Dependent variable?

A

The one we measure

19
Q

Covariates/Confounders?

A

Any variable other than the chosen independent variable that may affect the dependent variable

20
Q

Mean?

A

Sum of all observations/number of observations

21
Q

Median?

A

Middle number when observations are placed in numerical order

22
Q

Mode?

A

Most frequent observation

23
Q

Range?

A

Highest value minus lowest

24
Q

Variance?

A

Subtract the mean from each measurement, square the results, then average the squared differences (divide by n-1 for a sample)

25
Q

Standard dev?

A

The square root of the variance
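
The summary measures on the preceding cards (mean, median, mode, range, variance, standard deviation) can be sketched with Python's standard `statistics` module. The data values are hypothetical, and `statistics.variance` is the sample variance (n-1 in the denominator), which is an assumption about which form the course intends.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical observations

mean = statistics.mean(data)          # sum of observations / number of observations
median = statistics.median(data)      # middle value once sorted
mode = statistics.mode(data)          # most frequent observation
data_range = max(data) - min(data)    # highest value minus lowest
variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)      # square root of the sample variance

print(mean, median, mode, data_range, variance, std_dev)
```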

26
Q
A

A: Lowest observation
B: lower quartile
C: Median
D: Upper quartile
E: Highest observation

27
Q

Descriptive stats?

A

Organizes and summarizes data (skewness, mean, median, mode, standard dev, scatter plots)

28
Q

Inferential stats?

A

Estimate population parameters, and how confident we can be in our conclusions

29
Q

Simple random sampling?

A

Probability sampling

Every subject has equal probability of being selected

30
Q

Systematic random sampling?

A

Probability sampling

Select every nth subject
Randomly selects subjects with known sampling strategies

31
Q

Stratified sampling?

A

Probability sampling

Divide population into relevant strata and take random samples from each stratum

32
Q

Cluster sampling?

A

Probability sampling

Divide population into clusters and randomly select a subset of the clusters

33
Q

Convenience sampling?

A

Non-Probability sampling

Select subjects based on availability, not representative of population

34
Q

Volunteer sampling?

A

Non-Probability sampling

Take all subjects who volunteer

35
Q

Why is probability better than non-probability sampling?

A

Non-probability sampling is not based on probability and is susceptible to selection bias

36
Q

Stratified vs cluster sampling

A

Stratified:
1. Partition population into mutually exclusive homogeneous groups based on a factor that may influence the measured variable
2. Obtain a simple random sample from each group
3. Collect data on each subject that was randomly sampled from each group
4. A heterogeneous population is split into homogeneous subpopulations (the collection of strata is exhaustive)

Cluster:
1. Divide population into groups
2. Obtain a simple random sample of clusters
3. Collect data on every subject in each of the randomly selected clusters (heterogeneous)
4. Useful when target of an intervention is a system rather than individual
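
The contrast between the two procedures can be sketched in a few lines. This is only an illustration: the group names, subject labels, and function names are hypothetical, and the same dictionary of groups is reused as strata and as clusters purely to keep the example short.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of subjects from EVERY stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep EVERY subject in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
groups = {
    "clinic_A": ["A1", "A2", "A3", "A4"],
    "clinic_B": ["B1", "B2", "B3", "B4"],
    "clinic_C": ["C1", "C2", "C3", "C4"],
}

stratified = stratified_sample(groups, n_per_stratum=2, rng=rng)
clustered = cluster_sample(groups, n_clusters=2, rng=rng)
print(stratified)
print(clustered)
```

Stratified sampling touches every group but only some subjects in each; cluster sampling touches only some groups but every subject in them.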

37
Q

What type of distribution?

A

Normal

38
Q

What type of distribution?

A

Binomial

39
Q

Poisson distribution?

A

Discrete, quantitative data that occurs independently and randomly in time at some constant mean rate.

Primarily used to estimate the probability of rare events and predict the number of times an event occurs

Give probability that an outcome will occur a specified number of times when the number of trials is large and probability of an occurrence is small

Ex: Used to calculate number of deaths from lung cancer in a year in a town. Info is used to compare observed and expected values to decide if the number of deaths from cancer is higher or lower than expected
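
A minimal sketch of the observed-versus-expected comparison described in the example, using the Poisson probability mass function P(X = k) = e^(-λ) λ^k / k!. The expected and observed counts are hypothetical numbers, not data from any real town.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_upper_tail(k, lam):
    """Probability of k or more events when the mean count is lam."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))

expected_deaths = 4.0  # hypothetical expected deaths per year
observed_deaths = 9    # hypothetical observed deaths this year

p_at_least_observed = poisson_upper_tail(observed_deaths, expected_deaths)
print(round(p_at_least_observed, 4))
```

A small upper-tail probability suggests the observed count is higher than the expected rate alone would explain.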

40
Q

What type of distribution?

A

Poisson distribution

41
Q

What causes skewness?

A

Outliers

42
Q

Kurtosis?

A

A measure of the combined weight of the tails relative to the rest of the distribution

43
Q
A

Mean
Median
Mode

44
Q

What is the purpose of data transformation?

A

To change skewed or unknown distributions to a normal distribution in order to calculate p-value

45
Q

What is central limit theorem?

A

When equally sized samples are drawn from a non-normal distribution, the plotted mean from each sample will approximate a normal distribution as long as the non-normality was not due to outliers

Sufficiently large sample is generally considered 30 or more
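
The theorem can be illustrated with a small simulation, sketched here under assumed settings: a right-skewed (exponential) population, samples of size 30, and 2000 repeated samples. All of these numbers are illustrative choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed population (exponential, mean about 1)
population = [rng.expovariate(1.0) for _ in range(20000)]

# Draw many equally sized samples and record each sample's mean
sample_size = 30
sample_means = [statistics.mean(rng.sample(population, sample_size))
                for _ in range(2000)]

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)
print("population mean:", round(pop_mean, 3))
print("mean of sample means:", round(statistics.mean(sample_means), 3))
print("SD of sample means:", round(statistics.stdev(sample_means), 3))
print("population SD / sqrt(n):", round(pop_sd / sample_size ** 0.5, 3))
```

The sample means cluster around the population mean with a spread close to the population SD divided by the square root of the sample size, even though the population itself is skewed.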

46
Q

What is p-value?

A

The probability of obtaining a measurement at least as extreme as the one obtained, assuming the null hypothesis is true.

47
Q

What is null-hypothesis?

A

A hypothesis that states that there is no significant difference between 2 sets of data.

48
Q

Type 1 error?

A

Rejecting the null hypothesis when the null hypothesis is true

False positive

49
Q

Type 2 error?

A

Failing to reject the null hypothesis when the null hypothesis is false

False negative

50
Q

What is 𝛂?

A

Significance level: the threshold (0-1) the p-value must fall below to reject the null hypothesis; also the probability of a type I error

51
Q

When would you reject the null?

A

P<𝛂
- a small p-value (i.e., less than alpha) is an “unlikely” result to obtain if the null hypothesis is true, allowing us to reject the null hypothesis (i.e., we see a statistically significant difference in the two groups).
- a large p-value (i.e., larger than alpha) is a “likely” result to obtain if the null hypothesis is true, so we fail to reject the null hypothesis (i.e., we do not see a statistically significant difference in the two groups).

52
Q

What is ß?

A

Probability of a type II error (FN)

53
Q

What type of graph? What does it do?

A

Histogram

Presents data as frequency counts over some interval

54
Q

What type of graph? What are its components?

A

Boxplot
1. Thin lined box indicates the IQR – the 25th to the 75th percentiles of the data.
2. Within the thin lined box is the bolded line – the median.
3. From both ends of the thin lined box is the tail (or whiskers) which shows the minimum and maximum points within 1.5 IQRs of the box edges (the quartiles).
4. The circle is an outlier, defined as a data point between 1.5 and 3.0 IQRs beyond the box edges.
5. The asterisk is an extreme outlier, defined as a data point more than 3.0 IQRs beyond the box edges.
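
The boxplot components can be computed directly. This sketch assumes the common convention that the 1.5 and 3.0 IQR fences are measured from the box edges (the quartiles), and it uses the "inclusive" quartile method of Python's `statistics.quantiles`; other software may compute quartiles slightly differently. The data are hypothetical.

```python
import statistics

data = [10, 12, 13, 14, 15, 16, 17, 18, 20, 30, 45]  # hypothetical values

q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences measured from the box edges (assumed convention)
inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr

inside = [x for x in data if inner_low <= x <= inner_high]
whisker_low, whisker_high = min(inside), max(inside)

# Circles: beyond the inner fence but within the outer fence
outliers = [x for x in data
            if (outer_low <= x < inner_low) or (inner_high < x <= outer_high)]
# Asterisks: beyond the outer fence
extreme_outliers = [x for x in data if x < outer_low or x > outer_high]

print(q1, median, q3, iqr)
print(whisker_low, whisker_high, outliers, extreme_outliers)
```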

55
Q

What type of graph? What are its components?

A

Scatterplot
Presents data from 2 variables both measured on a continuous scale

Useful for assessing the association between 2 variables and assessing assumptions of tests such as linearity and absence of outliers

56
Q

Confidence interval?

A

Range of values in which we have some level of confidence the true population value will lie

Smaller CI means less variability

A 95% CI corresponds to an alpha of 5%

Narrow CI: little variation and more precise
Wide CI: Greater variation and less precise
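
To see why a narrow CI signals more precision, here is a minimal sketch of a normal-approximation CI for a mean (mean ± z × SD/√n). It assumes the sample is large enough for the normal approximation; the function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def mean_ci(mean, sd, n, confidence=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a 95% CI
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Same mean and SD, different sample sizes (hypothetical numbers)
ci_small_n = mean_ci(mean=100, sd=15, n=25)
ci_large_n = mean_ci(mean=100, sd=15, n=100)
print(ci_small_n)
print(ci_large_n)
```

With the same SD, quadrupling the sample size halves the width of the interval.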

57
Q

What does overlap of CI box plots mean?

A

Directly related to p-value

less overlap = larger difference and lower p-value
p < alpha = reject null and statistically significant

More overlap = smaller difference and higher p-value
p > alpha = fail to reject null and no statistical significance

58
Q

Calculate risk ratio?

A

Risk in people with risk factor/risk in people w/o risk factor

RR = (a/(a+b)) / (c/(c+d))

59
Q

Calculate absolute reduction or increase?

A

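ARR (absolute risk reduction) or ARI (absolute risk increase)

|EER-CER|

Absolute difference between risk of experimental and risk of control (a reduction when EER < CER, an increase when EER > CER)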

60
Q

Calculate relative risk reduction?

A

RRR
|Risk of experimental-risk of control|/ risk of control

|EER-CER|/CER

61
Q

Calculate number needed to treat?

A

NNT
1/ARR (absolute risk reduction)

62
Q

Calculate number needed to harm?

A

NNH
1/ARI (Absolute risk increase)
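
The risk formulas on the last few cards can be worked through together. This sketch assumes the 2x2 layout implied by RR = (a/(a+b)) / (c/(c+d)): a and b are the exposed/experimental subjects with and without the event, c and d are the control subjects with and without the event. The counts are hypothetical, and the absolute difference is taken so the same code covers a reduction or an increase.

```python
def risk_measures(a, b, c, d):
    """Risk-based effect measures from a 2x2 table.

    a, b: experimental group with / without the event
    c, d: control group with / without the event
    """
    eer = a / (a + b)  # experimental event rate
    cer = c / (c + d)  # control event rate
    abs_diff = abs(eer - cer)  # ARR if EER < CER, ARI if EER > CER
    return {
        "EER": eer,
        "CER": cer,
        "RR": eer / cer,
        "ARR_or_ARI": abs_diff,
        "RRR": abs_diff / cer,
        "NNT_or_NNH": 1 / abs_diff,  # undefined when the two risks are equal
    }

# Hypothetical trial: 10/100 events on treatment, 20/100 on control
results = risk_measures(a=10, b=90, c=20, d=80)
print(results)
```

Here the risk is halved (RR 0.5, RRR 0.5), the absolute reduction is 0.1, and about 10 patients need to be treated to prevent one event.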

63
Q

Calculate odds of risk factor in cases (with event)?

A

a/c

64
Q

Calculate odds of risk factor in control (no event)?

A

b/d

65
Q

Calculate odds ratio?

A

(a/c)/(b/d) = ad/bc

Ratio of the odds of an exposure in the case group to the odds of an exposure in the control group

66
Q

Cohort studies?

A

Observes development of disease in exposed and unexposed groups

67
Q

Case control studies?

A

Select subjects with event, compare presence of risk factor in cases with event to controls without event

68
Q

CI interpretation?

A
  1. RR CI contains 1: no difference in risk. Do not reject H0.
  2. RR entire CI > 1: risk in intervention group > risk in control group.
  3. RR entire CI < 1: risk in intervention group < risk in control group.
69
Q

OR interpretations?

A
  1. OR CI contains 1: no difference in odds. Do not reject H0.
  2. OR entire CI > 1: Odds in Case(or event) group > odds in control group. Reject H0
  3. OR entire CI < 1: Odds in Case (or event) group < odds in control group. Reject H0
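
A minimal sketch tying the odds ratio formula to the CI interpretation rules above. It assumes the layout a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls, and uses the standard log-based (Woolf) approximation for the 95% CI, which the cards themselves do not specify. The counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI (log method)."""
    odds_cases = a / c      # odds of risk factor in cases
    odds_controls = b / d   # odds of risk factor in controls
    odds_ratio = odds_cases / odds_controls  # equals ad/bc
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical case-control counts
odds_ratio, lower, upper = odds_ratio_ci(a=30, b=10, c=70, d=90)

if lower > 1:
    conclusion = "odds higher in cases; reject H0"
elif upper < 1:
    conclusion = "odds lower in cases; reject H0"
else:
    conclusion = "CI contains 1; do not reject H0"
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2), conclusion)
```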
70
Q

______________ tests make assumptions about the parameters of the population distribution from which the sample data is taken.

A

Parametric

71
Q

______________ tests do NOT make assumptions about data distribution. However, they do require groups to have approximately the same dispersion.

A

Non-parametric

72
Q

When should non-parametric test be used?

A
  1. Data don’t seem to follow distribution
  2. Assumptions underlying parametric tests are not met
  3. Data appear to be very skewed
  4. Data has significant outliers
73
Q

Types of parametric tests?

A
  1. Paired t-test
  2. Unpaired t-test
  3. Pearson correlation
  4. One way ANOVA
74
Q

Types of non-parametric equivalent?

A
  1. Wilcoxon signed-rank test
  2. Mann-Whitney U test
  3. Spearman correlation
  4. Kruskal Wallis test
75
Q

What is paired variable?

A

Compares 2 different variables for the same group

76
Q

What is dependent variables?

A

Compare outcomes on the same variable for 2 different groups

77
Q

Which is which?

A

a: one tailed (5%)
b: 2 tailed (2.5%)

78
Q

What are t-tests? Types?

A

Test for differences between means; the larger the statistic, the greater the difference between the groups

Independent sample: compares means of 2 groups
Paired: compares means from same group at different times
One sample: compares the mean of one group to known mean
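
The independent-sample and paired statistics can be sketched by hand with the standard library. This assumes the pooled-variance (Student) form for the independent test, which is one common choice; the function names and data are hypothetical, and turning t into a p-value would additionally need a t-distribution table or a statistics package.

```python
import math
import statistics

def independent_t(x, y):
    """Pooled-variance t statistic and degrees of freedom for 2 groups."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(x) - statistics.mean(y)) / se, n1 + n2 - 2

def paired_t(before, after):
    """t statistic and degrees of freedom for the same group measured twice."""
    diffs = [a - b for b, a in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se, len(diffs) - 1

# Hypothetical data
t_ind, df_ind = independent_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
t_pair, df_pair = paired_t(before=[10, 12, 9, 11], after=[12, 15, 10, 13])
print(t_ind, df_ind)
print(t_pair, df_pair)
```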

79
Q

Degrees of freedom?

A

A measure of the amount of independent data that can be used to estimate a parameter

Determines the probability distributions of the test statistics of hypothesis tests

Number of data points which are free to vary

80
Q

What is degrees of freedom dependent on?

A

1. Number of groups compared
2. Number of parameters needed to estimate the standard deviation

81
Q

How do you test for independence?

A
  1. Random samples
  2. Categorical data (counts)
  3. Non-Parametric
  4. Tests whether a categorical variable is related to another
82
Q

How do you test for goodness of fit?

A
  1. Random samples
  2. Categorical data (counts)
  3. Non-Parametric
  4. Tests whether data is representative of the full population.
  5. Compares observed data to a theoretical model
83
Q

What does chi square observe?

A

Compares observed frequency with expected frequency
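
A minimal sketch of the chi-square statistic, the sum of (observed - expected)² / expected over all cells, applied to a test of independence. The expected counts use the usual row total × column total / grand total rule; the function names and the table are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square(observed, expected):
    """Chi-square statistic for matching lists of observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical 2x2 table of counts (rows: exposure, columns: outcome)
table = [[30, 10],
         [20, 40]]
expected = expected_counts(table)

flat_observed = [cell for row in table for cell in row]
flat_expected = [cell for row in expected for cell in row]
statistic = chi_square(flat_observed, flat_expected)
dof = (len(table) - 1) * (len(table[0]) - 1)
print(expected, round(statistic, 3), dof)
```

For a goodness-of-fit test the same `chi_square` function applies, with the expected counts taken from the theoretical model instead of the table margins.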

84
Q

What is survival analysis?

A

Branch of stats for analyzing the expected duration of time until an event occurs

Must deal with censored data

85
Q

What are the causes of censored data?

A
  1. event doesn’t occur during study period
  2. subject lost to follow up
  3. subject dies from something other than studied cause
86
Q

What is kaplan-meier analysis? Assumptions?

A

Non-Parametric survival analysis method – no assumptions about how event probability changes over time.

  1. Censoring is independent of event probability
  2. Survival probabilities are comparable in early and later recruited subjects
  3. Censoring is not more likely in one group than another
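
A minimal sketch of the Kaplan-Meier product-limit estimate: at each event time, the running survival probability is multiplied by (1 - events / number at risk), and censored subjects simply leave the risk set. The function name and follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time for each subject
    events: 1 if the event occurred at that time, 0 if censored
    """
    survival = 1.0
    at_risk = len(times)
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        n_leaving = sum(1 for ti in times if ti == t)  # events plus censored
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Hypothetical follow-up data (time in months)
times = [2, 3, 3, 5, 8, 8, 10]
events = [1, 1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored subjects contribute to the number at risk up to their censoring time, which is how the method uses partial follow-up without treating censoring as an event.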
87
Q

What is hazard rate?

A

The relative risk of complications based on comparison of event rates.

88
Q

What is intention to treat?

A

Every patient randomized enters the primary analysis

89
Q

What is per-protocol?

A

Analysis includes only those patients who strictly adhered to the protocol

Identifies effect under ideal conditions

90
Q

When would you use a forest plot?

A

Key way data from multiple papers is summarized in a single image