Week 5 Flashcards

1
Q

Quantitative data: Identifying the appropriate classification scheme for variables can help determine what?

A

what types of statistical methods are appropriate for describing data or for making inferences.

2
Q

Quantitative data: Identifying the appropriate classification scheme for variables can help determine what types of statistical methods are appropriate for describing data or for making inferences. Give 2 examples of these schemes and describe them.

A

1) Discrete
Variables take a limited, countable set of values (for example, “yes” or “no”)
Alive or dead
Number of hospitalizations
2) Continuous
Infinite number of values within a given range
Age
Body weight

3
Q

What are the exceptions to quantitative data rules?

A

Exceptions to the rules:
Ordinal data may imply an underlying continuous scale when large numbers of categories are present.
Example: using a pain scale of 0 – 100
Interval/ratio data may be discrete if the variable can only take on integer values.

4
Q

List 3 measures of central tendency (Quantitative Data)

A

Mean – average; applicable with interval and ratio values
Median – represents central tendency better than mean if outliers are present
Mode – not commonly used in clinical research
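A minimal sketch of why the median can represent central tendency better than the mean when outliers are present. The lengths of stay below are hypothetical values chosen for illustration; they do not come from the course material.

```python
import statistics

# Hypothetical lengths of stay in days; the last value is an outlier.
lengths_of_stay = [2, 3, 3, 4, 5, 6, 40]

mean_los = statistics.mean(lengths_of_stay)      # pulled upward by the outlier
median_los = statistics.median(lengths_of_stay)  # middle value, robust to the outlier
mode_los = statistics.mode(lengths_of_stay)      # most frequent value

print("mean:", mean_los, "median:", median_los, "mode:", mode_los)
```

Here the single 40-day stay drags the mean well above the typical patient, while the median stays near the bulk of the data.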

5
Q

List and describe some measures of dispersion (Quantitative Data)

A

Range – difference between the highest and lowest values in the data
Interquartile range (IQR) – restricted to values that lie within the middle 50% of the distribution
Variance – average squared deviation of the values from the mean; describes how far the values of a variable lie from the mean
Standard deviation – square root of variance
Coefficient of variation – standard deviation / mean
Skewness – indicates that the data are not evenly distributed around the mean: more of the data are concentrated on one side of the mean, and the “tail” on the opposite side is longer
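The dispersion measures above can be computed with Python's standard library. This is a sketch on hypothetical values; note that statistical packages use slightly different quartile conventions, so the IQR may differ a little between tools.

```python
import statistics

values = [4, 8, 15, 16, 23, 42]  # hypothetical measurements

value_range = max(values) - min(values)

# quantiles(n=4) returns the three quartile cut points (default "exclusive" method).
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

variance = statistics.variance(values)  # sample variance (divides by n - 1)
sd = statistics.stdev(values)           # square root of the sample variance
cv = sd / statistics.mean(values)       # coefficient of variation

print(value_range, iqr, variance, round(sd, 2), round(cv, 2))
```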

6
Q

Quantitative Data: List some ways to organize and visualize

A

Tables: Frequency table
Plots: Box & whisker plot
Graphs
Charts: Bar Chart, Pie Chart

7
Q

Describe box and whisker plots

A

IQR = box = middle 50% of the distribution
Median noted by line in the middle of the box
Minimum and maximum values noted by whiskers
Mean noted by diamonds
Useful when making comparisons across different groups that may not have equivalent underlying distributions

8
Q

Data: Define rates and give examples

A

Proportions over a specific time-period with a base (multiplier)
Morbidity and mortality
Incidence and prevalence

9
Q

Describe morbidity and mortality rates

A

1) Centers for Disease Control and Prevention’s National Center for Health Statistics Data:
-2,437,163 deaths in the United States in 2009
-US population of 307,024,820 yields a mortality rate of 793.8 deaths per 100,000 population in the United States in 2009
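The arithmetic behind this rate can be checked directly: deaths divided by population, multiplied by the base of 100,000. The variable names are illustrative only.

```python
deaths = 2_437_163        # US deaths in 2009 (from the card)
population = 307_024_820  # US population in 2009 (from the card)
base = 100_000            # multiplier ("per 100,000 population")

mortality_rate = deaths / population * base
print(round(mortality_rate, 1))  # 793.8
```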

10
Q

Define incidence and prevalence

A

1) Incidence: measures the risk of developing an outcome
2) Prevalence: measures the probability of having an outcome
-More appropriate to say “point prevalence” compared to “prevalence rate” since this is a proportion and not a rate

11
Q

True or false: You can’t have a test that’s 100% specific and 100% sensitive

A

True

12
Q

Describe diagnostic data

A

1) How do you truly identify patients that have an outcome (true positive) and avoid detection of an outcome in patients that do not have it (false positive)?
2) Sensitivity (true positive rate)
3) Specificity (true negative rate)
1 – false positive rate = specificity (true negative rate)
4) Most tests used to diagnose diseases have a sensitivity of 80% and specificity of 90%
-Sensitivity is gained at the expense of specificity and vice versa
-Receiver operating characteristic curves further define this tradeoff
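A small sketch of how sensitivity and specificity fall out of a 2x2 table of test results. The counts are hypothetical and were picked to reproduce the 80% and 90% figures mentioned above; the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """True positive rate: share of patients WITH the outcome who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """True negative rate: share of patients WITHOUT the outcome who test negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 patients with the outcome, 100 without.
tp, fn = 80, 20
tn, fp = 90, 10

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
false_positive_rate = fp / (fp + tn)

print(sens, spec, 1 - false_positive_rate)  # specificity equals 1 - false positive rate
```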

13
Q

Describe

A

The area under the curve for ROC-1 is closer to 1 and further from the chance diagonal. It would identify a larger number of true positives.

Point A: larger proportion of patients with the outcome are detected and there are more false positives

Point B: detects fewer true positives and fewer false positives

14
Q

Diagnostic data: Describe the ADA Standards of Care (2022)

A

Is a BMI ≥ 25 kg/m2 an appropriate risk factor for diabetes for all ethnicities?
“…BMI cut points fall consistently between 23 and 24 kg/m2 (sensitivity of 80%) for nearly all Asian American subgroups (with levels slightly lower for Japanese Americans).”
“An argument can be made to push the BMI cut point to lower than 23 kg/m2 in favor of increased sensitivity; however, this would lead to an unacceptably low specificity (13.1%).”

15
Q

Describe Normal distribution (bell curve)

A

1) z distribution / z scores
-Mean = 0
-SD = 1
-Continuous variables
2) t distribution / Student’s t distribution
-Modification to the standard normal distribution (or z distribution) when the sample size is relatively small
-Useful whenever the actual population standard deviation is not known or when a good estimate is not available
-Continuous variables

16
Q

Statistical Distribution (other than standard distr.); describe:
1) Binomial distribution
2) F distribution
3) Poisson distribution
4) Gamma distribution

A

1) Mutually exclusive categories of data
2) Analysis of variance (ANOVA) and linear regression analysis
3) Useful for detecting rare outcomes
4) Variable of interest is interval or ratio but is very highly skewed

17
Q

Stat. distribution: List the 3 main bullet points of the Central limit theorem

A

1) The mean of all sample means will equal the population mean
2) The standard deviation of the sampled means is equal to the standard error of the mean
3) As the sample size increases, the distribution of the sample means will approach a normal distribution regardless of the underlying distribution of the variable
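The three points above can be illustrated with a small simulation. This is a sketch under stated assumptions: the "population" is an exponential distribution with rate 1 (so its mean and standard deviation are both 1, and it is strongly skewed), and the sample size and number of samples are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

POP_MEAN = 1.0  # mean of an exponential distribution with rate 1
POP_SD = 1.0    # its standard deviation is also 1

n = 50              # size of each sample
num_samples = 2000  # how many samples to draw

# Draw many samples from the skewed population and record each sample mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to POP_MEAN
sd_of_means = statistics.stdev(sample_means)    # should be close to the standard error
standard_error = POP_SD / n ** 0.5

print(round(mean_of_means, 3), round(sd_of_means, 3), round(standard_error, 3))
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the underlying variable is highly skewed.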

18
Q

Define and describe Statistical inference. When are Latin or Greek letters used?

A

1) Process of analyzing data from a sample and using those results to infer the related values in the source or target population
2) Data related to the population of interest are referred to as parameters and are usually represented by Greek letters
3) Data from a sample are referred to as statistics and are represented by Latin letters

19
Q

Inference:
1) Define stat. estimation
2) Define hypothesis testing

A

1) Process by which estimates of the population parameters are generated from sample statistics with a focus on generating precise estimates with minimal bias
2) Making a conclusion about a hypothesized difference or relationship using observations from the sample

20
Q

Inference: Define and describe Statistical estimation

A

1) Mean, median, and standard deviation are basic examples of estimation
2) Point estimation: One single value is estimated for the statistical quantity of interest (e.g., mean)
3) Interval estimation
-Confidence intervals
-Inference of the precision of an estimate
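A sketch of point versus interval estimation for a mean. The HbA1c values are hypothetical. To stay within the standard library, the interval uses a normal (z) critical value; with a sample this small, a t critical value would normally be used instead and would give a somewhat wider interval.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical HbA1c values (%) from a small sample.
hba1c = [7.1, 6.8, 7.4, 8.0, 7.7, 6.9, 7.3, 7.6, 7.2, 7.0]

confidence = 0.95
# Two-sided critical value from the standard normal distribution (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

point_estimate = mean(hba1c)                      # point estimation: one single value
standard_error = stdev(hba1c) / len(hba1c) ** 0.5

# Interval estimation: a range that conveys the precision of the estimate.
lower = point_estimate - z * standard_error
upper = point_estimate + z * standard_error

print(round(point_estimate, 2), (round(lower, 2), round(upper, 2)))
```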

21
Q

Hypothesis testing: What makes a good hypothesis?

A

Declarative
Describes a relationship between two or more variables

22
Q

Hypothesis testing: What is the Null hypothesis (H0)? Describe

A

1) No relationship or no difference between the variables of interest
2) Conclusions of studies are made with respect to the null hypothesis
Reject
Fail to reject

23
Q

Hypothesis testing: what is the opposite of the null hypothesis?

A

Alternative hypothesis (HA)

24
Q

Hypothesis testing: What are the 2 types of errors? What do they have in common?

A

Type I error (α)
Type II error (β)
Both of these errors represent quantities for which the researcher sets acceptable levels a priori, when designing the study and before the data are analyzed.

25
Inference: Hypothesis testing: Describe Nondirectional tests
1) Two-sided or two-tailed
2) Comparing means between two groups
3) Difference
Null hypothesis = no difference between groups (e.g., treatment vs. placebo)
If α = 0.05, 2.5% will be distributed to each side / tail
4) Equivalence
Null and alternative hypotheses must be “reversed” compared to a test of difference
Assess using two one-sided statistical tests
If α = 0.05, then CI = 90% for equivalence testing
26
Inference: Hypothesis testing: Describe directional tests
One-sided or one-tailed
Comparing means between two groups
Superiority
Non-inferiority
27
List 3 approaches to hypothesis testing
Traditional testing
Confidence intervals
Bayesian approach
28
Describe the steps of traditional hypothesis testing
Step 1: Convert research question into null and alternative hypothesis
Step 2: Select appropriate statistical test
Step 3: Select α
Step 4: Calculate test statistic
Step 5: Draw a conclusion
29
Describe step 1 of traditional hypothesis testing
Step 1: Convert research question into null and alternative hypothesis
Research questions may be stated as directional, so look closely at whether two-sided or one-sided statistical tests were used to determine if the hypothesis testing approach was nondirectional or directional
Two-sided approach is more conservative
30
Describe step 3 of traditional hypothesis testing
Set a priori (before the data are analyzed); 0.05 or 5% is standard
31
What step of traditional hypothesis testing do computers do?
Step 4: Calculate test statistic
32
Describe step 5 of traditional hypothesis testing and give an example
1) p-values: if less than the a priori α, you would consider the results statistically significant
2) Example: p = 0.027 and α = 0.05
This means that there is a 2.7% chance (probability) of finding a result of this magnitude (or larger) assuming the null hypothesis is true
Smaller p-values indicate increasingly strong evidence against the null
p = 0.45 does not suggest strong evidence against the null, while p = 0.001 would denote strong evidence against the null hypothesis
p-value does not indicate strength or size of finding
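A sketch of the step 5 decision rule. It assumes, for illustration only, that the test statistic follows a standard normal (z) distribution and that the test is two-sided; the z value of 2.21 is hypothetical and was chosen because it gives a p-value of about 0.027, as in the example.

```python
from statistics import NormalDist


def two_sided_p_value(z_statistic):
    """Probability of a result at least this extreme (in either tail) if the null is true."""
    return 2 * (1 - NormalDist().cdf(abs(z_statistic)))


alpha = 0.05        # type I error rate selected before the analysis
z_statistic = 2.21  # hypothetical test statistic

p_value = two_sided_p_value(z_statistic)
reject_null = p_value < alpha

print(round(p_value, 3), "reject null" if reject_null else "fail to reject null")
```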
33
Hypothesis testing: Describe Confidence intervals
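1) No difference between groups corresponds to a null value of 1 for ratio measures (e.g., relative risk, odds ratio) and 0 for difference measures (e.g., difference in means)
2) If the interval contains the null value of interest, then the null hypothesis is not rejected at a given significance level
3) If the interval does not contain the null value, the null is rejected at a given significance level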
34
Bayesian hypothesis testing is not __________ used in biomedical literature
commonly
35
Bivariate Analysis: 1) What is it? 2) What does it look at?
1) Analysis with just two variables
2) Looking at outcomes between two groups (control and experimental)
36
Give a hypothetical example of a study with bivariate analysis
1) “Randomized controlled trial: effect of standards of care with exercise compared to standards of care without exercise on hemoglobin glycation for patients with type two diabetes mellitus.”
-Null hypothesis – there is no difference in HbA1c results between the group exercising and the group not exercising
P – patients with type II DM at your healthcare facility
I – exercise + standards of care
C – standards of care
O – HbA1c
37
Describe Bivariate analysis
1) Independent variable is discrete (nominal)
Standards of care
Standards of care + exercise
2) Dependent variable is continuous (ratio)
3) Use nondirectional statistical test
Null hypothesis says there is no difference between groups
4) Preferred test is independent groups t-test; tests the differences in the means of two groups.
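A sketch of the independent groups t statistic, assuming equal variances in the two groups (the pooled-variance form). The HbA1c values and the function name are hypothetical. The standard library has no t distribution, so only the statistic and degrees of freedom are computed; a statistics package would supply the p-value.

```python
from statistics import mean, variance


def independent_groups_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical end-of-study HbA1c values (%).
standards_of_care = [7.9, 8.1, 7.6, 8.4, 7.8, 8.0]
with_exercise = [7.2, 7.5, 7.0, 7.7, 7.4, 7.1]

t_statistic, degrees_of_freedom = independent_groups_t(standards_of_care, with_exercise)
print(round(t_statistic, 2), degrees_of_freedom)
```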
38
Describe the types of continuous data
39
Bivariate analysis: What if we had three arms in our study, such as standards of care; standards of care + exercise; and standards of care + diet counseling?
Analysis of variance (ANOVA) is useful to compare the means of three or more groups
An ANOVA used to compare two groups would produce the same results as a nondirectional t-test
40
What if we want to compare the proportions (nominal data) of patients at goal for two groups (standards of care & standards of care + exercise)?
1) Use chi-square test of homogeneity; nondirectional test of difference of proportions; will see 2x2 contingency tables to analyze data
Can also be used to look at proportions of two or more groups
2) Use Fisher exact test if sample size is too small
How do I know if sample size is too small? A common rule of thumb is an expected count below 5 in any cell of the contingency table
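A sketch of the chi-square statistic for a 2x2 contingency table, without a continuity correction. The counts and the function name are hypothetical. It also returns the expected counts, which is what the common rule of thumb (an expected count below 5 in any cell) uses to flag a sample that is too small for the chi-square test.

```python
def chi_square_2x2(table):
    """Chi-square statistic and expected counts for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    return statistic, expected


# Hypothetical counts: rows = study arm, columns = [at goal, not at goal].
observed = [[30, 20],   # standards of care + exercise
            [18, 32]]   # standards of care

chi_square, expected_counts = chi_square_2x2(observed)
too_small = min(min(row) for row in expected_counts) < 5  # consider Fisher exact test if True

print(round(chi_square, 2), too_small)
```

For a 2x2 table there is 1 degree of freedom, and a statistic above 3.841 corresponds to p < 0.05.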
41
Describe the types of discrete data
42
Example of bivariate analysis: What if you want to compare standards of care to standards of care + diet counseling + exercise? And now, you want to measure baseline A1c and A1c at the end of the study.
1) If the comparison groups used for the statistical comparison involve more than one measurement on the same patients, then the groups are considered dependent (the sets of observations are said to be paired)
2) The test statistic for comparing baseline to posttreatment involves differences in means between dependent groups; therefore, a dependent groups or paired t-test is used
3) If you have more than two groups, you can use repeated-measures ANOVA
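A sketch of the paired (dependent groups) t statistic, which works on the within-patient differences rather than on the two sets of values separately. The baseline and end-of-study values and the function name are hypothetical.

```python
from statistics import mean, stdev


def paired_t(baseline, post_treatment):
    """Paired t statistic and degrees of freedom from within-patient differences."""
    differences = [post - pre for pre, post in zip(baseline, post_treatment)]
    n = len(differences)
    standard_error = stdev(differences) / n ** 0.5
    return mean(differences) / standard_error, n - 1


# Hypothetical HbA1c values (%) for the same five patients at two time points.
baseline_a1c = [8.2, 7.9, 8.5, 8.0, 7.7]
end_of_study_a1c = [7.6, 7.5, 7.9, 7.8, 7.0]

t_statistic, degrees_of_freedom = paired_t(baseline_a1c, end_of_study_a1c)
print(round(t_statistic, 2), degrees_of_freedom)
```

A negative statistic here reflects a drop from baseline, because each difference is computed as end of study minus baseline.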
43
HbA1c is __________ data. If you're just looking at end results, is it paired?
continuous; unpaired
44
Give 2 examples of additional statistical tests for bivariate analysis
Wilcoxon-Mann-Whitney
Study differences between two independent groups on a variable that has an ordinal level of measurement
Kruskal-Wallis
Extends the Wilcoxon-Mann-Whitney test to more than two groups when the groups are independent
45
Bivariate analysis: How do you determine if the drop in A1c is a response associated with the amount of time spent exercising?
1) Pearson correlation coefficient (r)
-Measure of how two variables are linearly related for continuous data
-A coefficient of +1 indicates a perfect positive linear relationship between the two variables
-A coefficient of -1 indicates that there is perfect negative linear correspondence between the two variables
-An r of 0 means that there is no overall linear relationship between two variables
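A sketch of the Pearson correlation coefficient computed from its definition. The exercise and HbA1c figures are hypothetical, as is the function name.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their individual variation."""
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / (s_xx * s_yy) ** 0.5


# Hypothetical data: weekly exercise minutes and change in HbA1c (percentage points).
exercise_minutes = [30, 60, 90, 120, 150, 180]
a1c_change = [-0.1, -0.3, -0.2, -0.5, -0.6, -0.8]

r = pearson_r(exercise_minutes, a1c_change)
print(round(r, 2))  # negative: more exercise goes with a larger drop in HbA1c
```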
46
Bivariate Analysis: Describe the Spearman rank correlation coefficient
Very similar to Pearson interpretation
Can be used with interval and ratio data after transformation into ranks
47
Bivariate Analysis: The primary factors used to determine which statistical test to use are?
1) Independence or dependence of the groups (samples)
2) Level of measurement of the dependent (outcome) variable
3) Assumptions on which specific statistical tests are based
48
Quantitative: List and describe 2 kinds of continuous data
1) Interval data: body temperature; a change in 1 degree in either direction is the same defined interval; 80 degrees F is not twice as hot as 40 degrees F
Ranked data with meaningful difference between numbers and no true (absolute) zero
2) Ratio data: body weight; a change in 1 pound in either direction is the same defined interval; 80 pounds weighs twice as much as 40 pounds
Interval data with a true zero
49
Give 2 kinds of discrete data (Quantitative)
1) Nominal data: dead or alive
2) Ordinal data: Likert-scales
-Ranked data
50
US population of 307,024,820 yields a mortality rate of _________ deaths per 100,000 population in the United States in 2009
793.8
51
Incidence and prevalence: ADA Standards of Care 2022 “The prevalence of ________________ therapy in hospitalized patients can approach 10%, and these medications can induce hyperglycemia in patients with and without antecedent diabetes.”
glucocorticoid
52
Differentiate sensitivity and specificity
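Sensitivity (true positive rate): proportion of patients with the outcome who test positive
Specificity (true negative rate): proportion of patients without the outcome who test negative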
53
Give the highlights of ADA Standards of Care (2022)
1) “…BMI cut points fall consistently between 23 and 24 kg/m2 **(sensitivity of 80%)** for nearly all Asian American subgroups (with levels slightly lower for Japanese Americans).”
2) “An argument can be made to push the BMI cut point to lower than 23 kg/m2 in favor of increased sensitivity; however, this would lead to an unacceptably **low specificity (13.1%)**.”
54
Central limit theorem says what?
As you increase the sample size, the sample mean gets closer to the true population parameter, and the distribution of sample means approaches a normal distribution
55
One generally accepted rule of thumb is a sample size of at least _____ is sufficiently large as long as there are only moderate departures from the normal distribution.
30 (people)
56
What is a type 1 error? What do you look at?
Type I error (α)
Rejecting the null hypothesis when the null hypothesis is true
Look at p-values
57
Define P values and describe
p-values
Probability of finding the results of the study (or more extreme results), assuming the null hypothesis is true
The smaller the p-value, the less likely it is that a result this extreme occurred by random chance alone
The level of confidence for a given study is generally related to the type I error rate such that the confidence level is one minus the type I error rate, or 1 – α. Thus, a type I error rate of 0.05 (or 5%) leads to a 95% confidence level.
58
Describe type 2 errors? What do you look at?
1) Type II error (β): Failing to reject the null hypothesis when the null hypothesis is false
2) Power: Probability of rejecting the null hypothesis given that the null hypothesis is actually false (power = 1 – β)
Tells us the likelihood of correctly detecting a difference that truly exists
Typical values for the type II error rate are 20% and 10%, which translate to power levels of 80% and 90%, respectively
59
Step 5 of trad. hypothesis testing: Draw a conclusion: p-values: if less than the a priori α, you would consider the results statistically _____________
significant
60
Inference: Significance: Differentiate between statistical and clinical significance
1) Statistical: p-values
2) Clinical
-Will the results of a study change how you practice?
-A study can have statistical significance and no clinical significance
-A study cannot have clinical significance if it does not have statistical significance
-NNT and NNH are commonly considered in determining clinical significance
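A sketch of how the number needed to treat (NNT) is usually derived: the reciprocal of the absolute risk reduction. The event rates and the function name are hypothetical. By convention the result is rounded up to the next whole patient when reported.

```python
def number_needed_to_treat(control_event_rate, treatment_event_rate):
    """NNT = 1 / absolute risk reduction (control event rate minus treatment event rate)."""
    absolute_risk_reduction = control_event_rate - treatment_event_rate
    if absolute_risk_reduction <= 0:
        raise ValueError("No risk reduction: NNT is not defined (consider NNH instead).")
    return 1 / absolute_risk_reduction


# Hypothetical event rates: 20% of control patients and 15% of treated patients
# experience the outcome.
nnt = number_needed_to_treat(0.20, 0.15)
print(round(nnt))  # about 20 patients treated to prevent one event
```

The number needed to harm (NNH) follows the same arithmetic, using the increase in the rate of an adverse event.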
61
True or false: A study cannot have clinical significance if it does not have statistical significance
True