Final: Ch 11-20 Flashcards

(173 cards)

1
Q

Numerical Variables from a Single Sample

When is Ȳ normally distributed?

A

whenever:
- Y is normally distributed, OR
- n is large

2
Q

Numerical Variables from a Single Sample

If Ȳ is normally distributed, what can we convert its distribution to?

A

standard normal distribution

3
Q

Numerical Variables from a Single Sample

What does a standard normal distribution do?

A

gives a probability distribution of the difference between a sample mean and the population mean

4
Q

Numerical Variables from a Single Sample

What is used to calculate the confidence interval of the mean?

A

t-distribution

5
Q

What does a one-sample t-test do?

A

compares the mean of a random sample from a normal population with the population mean proposed in a null hypothesis

6
Q

What are the hypotheses for a one-sample t-test?

A

H0: mean of the population is µ0
HA: mean of the population is not µ0

7
Q

What is the degrees of freedom for a one-sample t-test?

A

df = n-1
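As a sketch of how this works in practice, the t statistic and df for a one-sample t-test can be computed by hand and checked against scipy. The sample values and null mean below are invented for illustration:

```python
import math

from scipy import stats

# hypothetical sample and null-hypothesis mean (illustrative values only)
sample = [9.2, 10.5, 8.8, 11.1, 9.9, 10.3, 9.5, 10.8]
mu0 = 10.0
n = len(sample)

# t = (Ybar - mu0) / (s / sqrt(n)), with df = n - 1
ybar = sum(sample) / n
s = math.sqrt(sum((y - ybar) ** 2 for y in sample) / (n - 1))
t_manual = (ybar - mu0) / (s / math.sqrt(n))

result = stats.ttest_1samp(sample, popmean=mu0)
print(n - 1, t_manual, result.statistic)
```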

8
Q

What are the assumptions of a one-sample t-test? (2)

A
- variable is normally distributed
- sample is a random sample

9
Q

Tests that compare means have what type of variables?

A

one categorical and one numerical variable

10
Q

Paired vs. 2-sample t-tests

A

paired comparisons: allow us to account for a lot of extraneous variation

  • ie. before and after treatment
  • ie. upstream and downstream of power plant
  • ie. identical twins – one with treatment, one without treatment
  • ie. getting an earwig out of each ear – compare tweezers to hot oil

2-sample comparisons: sometimes easier to collect data for

11
Q

What are paired comparisons?

A

data from the two groups are paired

  • each member of pair shares much in common with the other, except for the tested categorical variable
  • there is one-to-one correspondence between the individuals in the two groups
  • in each pair, there is one member that has one treatment/group and another who has another treatment/group
12
Q

What do we use to compare two groups in paired comparisons?

A

use mean of the difference between the two members of each pair

13
Q

What is a paired t-test?

A

one sample t-test on the differences
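This equivalence is easy to verify numerically – a minimal sketch, with invented before/after values:

```python
from scipy import stats

# hypothetical before/after measurements for 6 paired individuals (invented)
before = [12.1, 14.3, 11.8, 13.5, 12.9, 14.0]
after = [12.8, 14.9, 12.0, 14.2, 13.1, 14.8]

diffs = [a - b for a, b in zip(after, before)]

paired = stats.ttest_rel(after, before)       # paired t-test
one_sample = stats.ttest_1samp(diffs, 0.0)    # one-sample t-test on differences

# both approaches give the same t statistic and P-value
print(paired.statistic, one_sample.statistic)
print(paired.pvalue, one_sample.pvalue)
```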

14
Q

What does a paired t-test do?

A

compares mean of the differences to a value given in null hypothesis

for each pair, calculate the difference

15
Q

What is the number of data points in a paired t-test?

A

number of pairs – NOT number of individuals

16
Q

What is the degrees of freedom for a paired t-test?

A

df = number of pairs - 1

17
Q

What are the assumptions of a paired t-test?

A
- pairs are chosen at random
- differences (NOT individuals) have normal distribution

18
Q

What does a 2-sample t-test do?

A

compares means of a numerical variable between two populations

19
Q

What is the degrees of freedom for a 2-sample t-test?

A
df1 = n1 - 1
df2 = n2 - 1
20
Q

What are the assumptions of a 2-sample t-test? (3)

A
  • both samples are random samples
  • both populations have normal distributions
  • variance of both populations is equal
21
Q

What does Welch’s t-test do?

A

compares means of two groups without requiring the assumption of equal variance

22
Q

What is different about the degrees of freedom for Welch’s t-test compared to other tests?

A

degrees of freedom is not necessarily an integer

23
Q

Wrong Way to Make Comparison of Two Groups

A

“Group 1 is significantly different from a constant, but Group 2 is not. Therefore Group 1 and Group 2 are different from each other.”

24
Q

What does Levene’s test do?

A

compares variances of two (or more) groups

use R to calculate

25
What does the F test do?
most commonly used test to compare variances
26
Why do we usually use Levene's test instead of F test?
F test is very sensitive to its assumption that both distributions are normal
27
What are the 2 tests that compare variances?
- Levene's test | - F test
28
What 2 tests can conduct two-sample comparisons?
2-sample t-test or Welch’s t-test
30
What does 2-sample t-test and Welch’s t-test both assume?
normally distributed variables
31
What assumption differs between 2-sample t-test and Welch’s t-test?
- 2-sample t-test assumes equal variance | - Welch’s t-test does NOT assume equal variance
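The practical consequence can be sketched with scipy's `ttest_ind`, which runs either test depending on `equal_var` (group values below are invented, with unequal spread and unequal n):

```python
from scipy import stats

# hypothetical groups with unequal spread and unequal n (illustrative values)
group1 = [5.1, 5.4, 4.9, 5.2, 5.0, 5.3]
group2 = [6.0, 7.5, 4.2, 8.1, 5.5, 6.9, 7.8, 4.9]

pooled = stats.ttest_ind(group1, group2, equal_var=True)   # 2-sample t-test
welch = stats.ttest_ind(group1, group2, equal_var=False)   # Welch's t-test

# the two tests give different P-values because Welch's does not
# pool the variances (and its df need not be an integer)
print(pooled.pvalue, welch.pvalue)
```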
32
What can you compare the means of two groups using? (2)
- mean of paired differences | - mean difference between two groups
33
What are the assumptions of all t-tests? (2)
- random sample(s) | - populations are normally distributed | - (for 2-sample t-test only) populations have equal variances
34
What are methods to detect deviations from normality? (4)
- previous data / theory | - histograms | - quantile plots | - Shapiro-Wilk test
35
What does normal data look like in a quantile plot?
points form an approximately straight line
36
What is the Shapiro-Wilk Test used for?
to test statistically whether a set of data comes from a normal distribution
37
What do you do when assumptions are not true? (5)
- if sample sizes are large, sometimes parametric tests work OK anyway | - transformations | - non-parametric tests | - permutation tests | - bootstrapping
38
Why do parametric tests on large samples work relatively well even for non-normal data?
means of large samples are normally distributed – rule of thumb: if n > ~50, then normal approximations may work
39
What parametric test is ideal when assumptions are not true?
Welch’s t-test – if sample sizes are equal and large, then even a 10x difference in variance is approximately OK, but Welch’s is still better
40
What are data transformations?
change each data point by some simple mathematical formula, then carry out the test on the transformed data
41
When is log transformation useful? (3)
- variable is likely to be the result of multiplication or division of various components | - frequency distribution of data is skewed right | - variance seems to increase as mean gets larger (in comparisons across groups)
42
What are some other types of transformations? (3)
- arcsine transformation | - square-root transformation | - reciprocal transformation
43
What are characteristics of valid transformations? (3)
- require same transformation be applied to each individual | - have one-to-one correspondence to original values | - have monotonic relationship with original values (ie. larger values stay larger)
44
What should you consider when choosing transformations? (3)
- must transform each individual in the same way | - transformed values must still carry biological meaning | - you CANNOT keep trying transformations until P < 0.05
45
What do non-parametric ("distribution-free") methods assume?
assume less about underlying distributions
46
What do parametric methods assume?
assume a distribution or a parameter
47
What are some non-parametric tests? (3)
- sign test | - RANKS | - Mann-Whitney U test
48
What does the sign test do?
compares data from one sample to a constant
49
How is a sign test conducted?
- for each data point, record whether individual is above (+) or below (–) hypothesized constant | - use binomial test to compare result to ½
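The procedure can be sketched with scipy's binomial test (data values and the hypothesized constant below are invented for illustration):

```python
from scipy import stats

# hypothetical data compared to a hypothesized constant of 50 (invented)
data = [52, 48, 55, 60, 49, 57, 61, 53, 47, 58]
constant = 50

above = sum(1 for x in data if x > constant)
below = sum(1 for x in data if x < constant)  # ties would be dropped

# binomial test of the number of '+' signs against p = 1/2
result = stats.binomtest(above, n=above + below, p=0.5)
print(above, below, result.pvalue)
```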
50
Does sign test have high or low power?
has very low power – therefore it is likely to NOT reject a false null hypothesis
51
What does it mean for a test to have high power?
more power → more information → higher ability to reject false null hypothesis
52
What is RANKS?
used by most non-parametric methods – rank each data point in all samples from lowest to highest (ie. lowest data point gets rank 1, next lowest gets rank 2, …)
53
What does the Mann-Whitney U test do?
compares central tendencies of two groups using ranks (equivalent to Wilcoxon rank sum test)
54
How is a Mann-Whitney U Test conducted?
1. rank all individuals from both groups together in order (for example, smallest to largest)
2. sum the ranks for all individuals in each group → R1 and R2
3. calculate U1: number of times an individual from population 1 has lower rank than an individual from population 2, out of all pairwise comparisons
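The pairwise-comparison count can be checked against scipy's implementation – a minimal sketch with invented, tie-free samples (note scipy's reported U counts the pairs in the opposite direction, so the two counts sum to n1 × n2):

```python
from scipy import stats

# hypothetical samples with no ties (illustrative values)
pop1 = [1.1, 2.3, 2.9, 4.0]
pop2 = [3.1, 3.8, 5.2, 6.0, 7.4]

# out of all n1 x n2 pairwise comparisons, count how often a
# population-1 individual ranks below a population-2 individual
u_lower = sum(1 for a in pop1 for b in pop2 if a < b)

res = stats.mannwhitneyu(pop1, pop2, alternative="two-sided")
print(u_lower, res.statistic, res.pvalue)
```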
55
What are the assumptions of the Mann-Whitney U Test? (2)
- both samples are random samples | - both populations have the same shape of distribution – only necessary when using Mann-Whitney to compare means
56
What is a permutation test used for?
for hypothesis testing on measures of association – can be done for any test of association between two variables
57
How is a permutation test conducted?
1. variable 1 from an individual is paired with variable 2 data from a randomly chosen individual – this is done for all individuals
2. estimate is made on randomized data
3. whole process is repeated numerous times – distribution of randomized estimates is null distribution
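These steps can be sketched with numpy, here using the correlation coefficient as the measure of association (the paired measurements are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical paired measurements (illustrative values)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9])

observed = np.corrcoef(x, y)[0, 1]

# shuffle y relative to x (without replacement, so each data point is
# used exactly once per permuted data set) to build the null distribution
null = np.array([np.corrcoef(x, rng.permutation(y))[0, 1]
                 for _ in range(2000)])

# two-sided P-value: fraction of permuted estimates at least as extreme
p = np.mean(np.abs(null) >= abs(observed))
print(observed, p)
```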
58
What does it mean if permutation tests are done without replacement?
all data points are used exactly once in each permuted data set
59
What are the goals of experiments? (2)
- eliminate bias | - reduce sampling error (increase precision and power)
60
What are some design features that reduce bias? (3)
- controls | - random assignment to treatments | - blinding
61
What is a control?
group which is identical to the experimental treatment in all respects aside from the treatment itself
62
What is random assignment?
individuals are randomly assigned to treatments
63
How does random assignment reduce bias?
averages out effects of confounding variables
64
What is blinding?
preventing the experimenter (or patient) from knowing which treatment is given to whom
65
How do the results of unblinded studies compare to blinded studies?
unblinded studies usually find much larger effects (sometimes 3x higher) – shows the bias that results from lack of blinding
66
How can you reduce sampling error?
increase signal-to-noise ratio – if ‘noise’ is smaller, it is easier to detect a given ‘signal’; can be achieved with smaller s or larger n
67
What are some design features that reduce the effects of sampling error? (4)
- replication | - balance | - blocking | - extreme treatments
68
What is replication?
carry out study on multiple independent objects
69
What is balance?
nearly equal sample sizes in each treatment
70
What is blocking?
grouping of experimental units – within each group, different experimental treatments are applied to different units
71
How do extreme treatments reduce effects of sampling error?
stronger treatments can increase the signal-to-noise ratio
72
How does balance reduce effects of sampling error?
increases precision – for a given total sample size (n1 + n2), standard error is smallest when n1 = n2
73
How does blocking reduce effects of sampling error?
allows extraneous variation to be accounted for – it is therefore easier to see the signal through the remaining noise
74
Blocking
75
What does ANOVA (analysis of variance) do?
compares means of more than two groups – asks whether any of the means is different from any other (is the variance among groups greater than 0?)
76
How does ANOVA compare to a t-test?
like t-test, but can compare more than two groups
78
What are the hypotheses for ANOVA?
H0: all populations have equal means (variance among groups = 0)
HA: at least one population mean is different
79
What is ANOVA with 2 groups mathematically equivalent to?
two-tailed 2-sample t-test
80
In ANOVA, under the null hypothesis, why should the sample mean of each group vary?
because of sampling error
81
In ANOVA, what is the standard error?
standard deviation of sample means (when true mean is constant)
82
In ANOVA, if null hypothesis is not true, what should variance among groups be?
variance among groups should be equal to variance due to sampling error plus real variance among population means – under the null hypothesis, variance among sample means is captured by the standard error alone, so if at least one group has a different population mean, variance among groups exceeds that expected from sampling error
83
ANOVA What is k?
number of groups
84
ANOVA What is MSgroup?
mean squares group
85
ANOVA What is MSerror?
mean squares error
86
What is the test statistic for ANOVA?
F
87
ANOVA What should F be if null hypothesis is true?
1
88
ANOVA What is F if null hypothesis is false?
F > 1 (but must take into account sampling error – F calculated from data will often be greater than one even when null is true, therefore we must compare F to null distribution)
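A minimal sketch with scipy's `f_oneway` (group values invented; the second group is deliberately shifted so F comes out well above 1). It also checks the equivalence noted earlier: with only 2 groups, ANOVA's F equals the square of the two-tailed 2-sample t statistic:

```python
from scipy import stats

# hypothetical measurements for three groups (illustrative values)
g1 = [10.2, 11.1, 9.8, 10.5, 10.9]
g2 = [12.4, 13.0, 12.1, 12.8, 13.3]
g3 = [10.0, 10.7, 9.9, 10.4, 10.8]

f_stat, p = stats.f_oneway(g1, g2, g3)
print(f_stat, p)  # F well above 1 here, since g2 is clearly shifted

# with 2 groups, ANOVA is mathematically equivalent to a
# two-tailed 2-sample t-test: F = t^2 and the P-values match
f2, p2 = stats.f_oneway(g1, g3)
t = stats.ttest_ind(g1, g3, equal_var=True)
print(f2, t.statistic ** 2, p2, t.pvalue)
```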
89
What is an ANOVA table?
convenient way to keep track of important calculations – scientific papers often report ANOVA results in ANOVA tables
90
What are the assumptions of ANOVA? (3)
- random samples | - normal distributions for each population | - equal variances for all populations
91
What is the Kruskal-Wallis Test?
non-parametric test similar to a single-factor ANOVA – uses ranks of the data points
92
What is a factor?
categorical explanatory variable
93
What is multiple-factor ANOVA?
ANOVAs can be generalized to look at more than one categorical variable at a time | - can ask whether each categorical variable affects a numerical variable | - can ask whether categorical variables interact in affecting the numerical variable
94
Multiple-factor ANOVA Graphs
95
ANOVA What are fixed effects?
treatments are chosen by experimenter – not a random subset of all possible treatments | - things we care about | - ie. specific drug treatments, specific diets, season
96
ANOVA What are random effects?
treatments are a random sample from all possible treatments | - things that can affect response variable, but we don’t care too much about | - ie. family, location
97
ANOVA What is the difference in statistics for fixed or random effects for single-factor ANOVA?
no difference
98
What is 2-factor ANOVA?
tests multiple hypotheses – ie. no difference based on North and South alone
99
Multiple Comparisons What is the equation for probability of Type I error in N tests?
1 - (1-𝛼)^N
ie. for 20 tests, probability of at least one Type I error is ~64%
- Type I error rate for each test = 𝛼
- Pr[not making a Type I error | null is true] = 1-𝛼
- Pr[not making a Type I error on N tests | null is true] = (1-𝛼)^N
- Pr[at least one Type I error] = 1 - (1-𝛼)^N
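The arithmetic is easy to check directly (and shows why dividing 𝛼 by the number of tests, as in the Bonferroni correction, keeps the family-wide error rate below 𝛼):

```python
# probability of at least one Type I error across N independent tests,
# each run at significance level alpha
alpha = 0.05

def family_error(alpha, n_tests):
    # Pr[at least one Type I error] = 1 - (1 - alpha)^N
    return 1 - (1 - alpha) ** n_tests

p20 = family_error(alpha, 20)
print(p20)  # ~0.64 for 20 tests at alpha = 0.05

# with alpha' = alpha / N per test, the family-wide rate stays below alpha
print(family_error(alpha / 20, 20))
```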
100
Multiple Comparisons What happens to the probability of type I error every time you do a test?
probability increases | - do too many tests → probability gets too high | - do more tests → will find something that is statistically significant due to chance
101
What is the Bonferroni Correction for multiple comparisons?
uses a smaller 𝛼 value: 𝛼' = 𝛼 / (number of tests)
102
What does the Tukey-Kramer test do?
compares all group means to all other group means to find which groups are different from which others
103
When are Tukey-Kramer tests done?
after finding evidence for differences/variation among means with single-factor ANOVA
104
What are the hypotheses for Tukey-Kramer test?
H0: 𝜇1 = 𝜇2
H0: 𝜇1 = 𝜇3
H0: 𝜇2 = 𝜇3
etc.
105
What is the probability of making at least one Type I error in Tukey-Kramer test?
probability of making at least one Type I error throughout the course of testing all pairs of means is no greater than the significance level (𝛼)
106
Tukey-Kramer Graph
107
Why do we use Tukey-Kramer instead of a series of two-sample t-tests? (3)
- multiple comparisons would cause t-tests to reject too many true null hypotheses - Tukey-Kramer adjusts for the number of tests - Tukey-Kramer also uses information about variance within groups from all the data, so it has more power than t-test with Bonferroni correction
108
What is the parameter for correlation?
⍴ (rho) value is between -1 and 1
109
What is the estimate for correlation?
correlation coefficient (r): describes relationship between two numerical variables
110
What is the coefficient of determination (r^2)?
describes proportion of variation in one variable that can be predicted from the other variable
111
What is covariance in relation to variance?
variance is a special case of covariance: Var(X) = Cov(X, X)
112
What are the assumptions of correlation tests? (3)
- random sample | - X is normally distributed with equal variance for all values of Y | - Y is normally distributed with equal variance for all values of X
113
Correlation What does it mean if ⍴ = 0?
- r is normally distributed with mean = 0 | - whenever the sampling distribution is normal and the standard error is estimated, use t | - if ⍴ ≠ 0, the sampling distribution of r is asymmetric
114
What is Spearman's Rank correlation?
alternative to Pearson’s correlation that does not make so many assumptions
115
Correlation What is attenuation?
estimated correlation will be lower if X or Y are estimated with error
116
What does correlation depend on?
range
117
Are species independent data points?
NO
118
What is a similarity between correlation and regression?
both compare two numerical variables
119
What is a difference between correlation and regression?
each asks a different question: | - correlation – symmetrical | - regression – asymmetrical
120
What does regression do?
predicts Y from X (one variable from another)
121
What does linear regression assume? (3)
- random sample | - Y is normally distributed for all values of X, and the variance of Y is the same for all values of X | - relationship between X and Y can be described by a line
122
Parameters of Linear Regression – graphs
123
What is the equation for the estimated regression line?
Y = a + bX
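A minimal sketch of fitting Y = a + bX by least squares with scipy (X and Y values below are invented). One handy check: for a least-squares line with an intercept, the residuals always sum to zero:

```python
from scipy import stats

# hypothetical X and Y values (illustrative)
x = [1, 2, 3, 4, 5, 6]
y = [2.2, 4.1, 5.8, 8.3, 9.9, 12.1]

fit = stats.linregress(x, y)
a, b = fit.intercept, fit.slope  # estimated regression line: Y = a + bX

# residual = observed Y - predicted Y; least squares minimizes the
# sum of squared residuals, and with an intercept they sum to zero
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print(a, b, sum(residuals))
```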
124
What is the least squares regression line?
line that minimizes the sum of squared residuals
125
What is a residual?
residual = observed Y - predicted Y | for every X value, Ŷ (the value of Y predicted by the regression line) is the value of Y right on the line
126
Regression What does the coefficient of determination (r^2) do?
gives the proportion of variance in Y explained by the regression line
127
Regression What do you need to be cautious about?
unwise to extrapolate beyond range of the data
128
What are the hypotheses for regression?
H0: 𝛽 = 0 HA: 𝛽 ≠ 0
129
Regression What is the degrees of freedom for residual?
df = n - 2
130
What are confidence bands?
confidence intervals for predictions of mean Y
131
What are prediction intervals?
confidence intervals for predictions of individual Y
132
How can non-linear relationships be 'fixed' (turned linear)? (3)
- transformations - quadratic regression - splines
133
What do residual plots do?
help assess assumptions
134
What should the residual plot look like?
- mean of the population is right on the line, with variance around it | - residuals should be roughly the same size across all values of X (centred around 0, with equal positives and negatives) | - residuals should be spread out along the line, about the same distance from the line on average for every X
135
Polynomial Regression Why should you NOT fit a polynomial with too many terms? (3)
- sample size should be at least 7x the number of terms | - very unlikely that a new X would fall on the line | - tradeoff between fit and prediction error: a polynomial with many terms fits your particular data set better, but has larger prediction error
136
What does logistic regression do?
tests for relationship between a numerical variable (as the explanatory variable) and a binary variable (as the response variable) | ie. does the dose of a toxin affect probability of survival? | ie. does the length of a peacock's tail affect its probability of getting a mate?
137
What is publication bias?
papers are more likely to be published if P < 0.05 – causes bias in science reported in literature
138
What are computer-intensive methods for hypothesis testing?
- simulation | - randomization
139
What are computer-intensive methods for confidence intervals?
bootstrap
140
What is simulation?
simulates sampling process on computer many times – generates null distribution from estimates done on simulated data; the computer assumes the null hypothesis is true
141
What is the equation for likelihood?
L(hypothesis A | data) = P[data | hypothesis A]
142
What does likelihood NOT care about?
other data sets – ONLY cares about the specific data set we have
143
What does likelihood capture?
captures level of surprise – prefer models that make the data less surprising, ie. have higher likelihood
144
Does likelihood consider more than one possible hypothesis?
yes
145
What is the law of likelihood?
a particular data set supports one hypothesis better than another if the likelihood of that hypothesis is higher than the likelihood of the other hypothesis – therefore we try to find the hypothesis with maximum likelihood (least surprising data); all estimates we have learned so far are also maximum likelihood estimates
146
What are the 2 ways to find the maximum likelihood?
- calculus | - computer calculations
147
How to Find Maximum Likelihood Calculus
ie. maximum value of L(p = x) is found when x = ⅜ – note that this is the same value we would have gotten by methods we already learned
148
How to Find Maximum Likelihood Computer Calculations
1. input likelihood formula to computer
2. plot value of L for each value of x
3. find largest L
149
What does hypothesis testing by likelihood do?
compares likelihood of the maximum likelihood estimate to that of the null hypothesis – uses the log-likelihood ratio
150
What is the test statistic for hypothesis testing by likelihood?
χ² = 2 × (log-likelihood ratio)
151
What is the degree of freedom for hypothesis testing by likelihood?
df = number of variables fixed to make null hypothesis
152
When producing a 95% confidence interval for the difference between the means of two groups, under what circumstances can a violation of the assumption of equal standard deviations be ignored?
two-sample t-tests and confidence intervals are robust to violations of equal standard deviations as long as:
- sample sizes of the two groups are roughly equal
- standard deviations are within a factor of three of one another
153
What is the justification for including extreme doses well outside the range of exposures encountered by people at risk in a dose-response study on animals of the effects of a hazardous substance? What are the problems with this approach?
- extreme doses increase power, and so enhance the probability of detecting an effect - however, effects of a large dose might be very different from effects of a smaller, more realistic dose - if an effect is detected, then studies of the effects of more realistic doses would be the next step
154
What does randomization do?
removes effects of confounding variables
155
What does blinding do?
avoids unconscious bias
156
What happens if a study has a poor control?
increases possibility of confounding by unmeasured variables
157
What are planned vs. unplanned comparisons?
unplanned comparisons – intended to search for differences among all pairs of means
planned comparisons – must be few and identified as crucial in advance of gathering and analyzing the data
158
The largest pairwise difference between means, that between the “medium” and “isolated” treatments, is statistically significant. How is this possible, given that neither of these two means is significantly different from the means of the other two groups?
failure to reject a null hypothesis that the difference between a given pair of means is zero does not imply that the means are equal, because power is not necessarily high, especially when the differences are small
if the means of the “medium” and “isolated” treatments differ from one another, then one or both of them must differ from the means of the other two groups, but we don’t know which
159
What quantity would you use to describe the fraction of the variation in expression levels explained by group differences?
R^2
160
Earwig density on an island and the proportion of males with forceps are estimates, so the measurements of both variables include sampling error. In light of this fact, would the true correlation between the two variables tend to be larger, smaller, or the same as the measured correlation?
earwig density and the proportion of males with forceps on an island are estimates, so both variables are measured with sampling error
measurement error tends to decrease the estimated correlation (attenuation)
therefore, the true correlation is expected to be higher on average than the estimated correlation
161
How do you analyze assumptions of linear regression in scatter plot?
- residuals are symmetric and don’t show any obvious non-normality | - variance of the residuals does not appear to change greatly for different values of X
162
What is a least squares regression line?
minimizes the sum of squared differences between the predicted Y-values on the regression line for each X and the observed Y-values
163
What are residuals?
differences between predicted Y-values on the estimated regression line, and the observed Y-values
164
What does the MSresidual measure?
variance of the residuals
165
Linear Regression What does R^2 measure?
fraction of the variation in Y that is explained by X
166
The data set depicted in the graph includes one conspicuous outlier on the far right. If you were advising the forensic scientists who gathered these data, how would you suggest they handle the outlier?
- first, check the data to ensure this individual was not entered incorrectly
- perform the analysis with and without the outlier included in the data set to determine whether it has an influence on the outcome
- if it has a big influence, then it is probably wise to leave it out and limit predictions to the range of X-values between 0 and about 200 (and urge them to obtain more data at the higher X-value)
167
What do confidence bands measure?
give the confidence interval for the predicted Y for a given X
168
Which bands would provide the most relevant measure of uncertainty?
prediction interval, because it measures uncertainty when predicting Y of a single individual
169
What is ANCOVA?
(analysis of covariance) compares many slopes
170
What are the hypotheses of ANCOVA?
H0: 𝛽1 = 𝛽2 = 𝛽3 = 𝛽4 = 𝛽5… (multiple null hypotheses) HA: at least one of the slopes is different from another
171
What is bootstrapping?
method for estimation (and confidence intervals) | - often used for hypothesis testing too | - often used in evolutionary trees
172
What is the method for bootstrapping?
- for each group, randomly pick with replacement an equal number of data points from the data of that group | - with this bootstrap dataset, calculate a bootstrap replicate estimate
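These two steps can be sketched with numpy, here bootstrapping a percentile confidence interval for a group mean (the sample values are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical sample from one group (illustrative values)
data = np.array([4.1, 5.3, 3.8, 6.2, 5.0, 4.7, 5.9, 4.4, 5.5, 5.1])

# resample with replacement, same n as the original, many times;
# each resample yields one bootstrap replicate estimate of the mean
boot_means = np.array([rng.choice(data, size=len(data), replace=True).mean()
                       for _ in range(5000)])

# percentile 95% confidence interval for the mean
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(lo, hi)
```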
173
Why are paired samples analyzed differently than separate samples?
two individuals in a pair share many things in common with each other but differ from members of other pairs
whatever variation these shared factors cause in the response variable is factored out in the difference between them
by looking at the differences, we potentially avoid much of the error variance in the data
separate samples do not share these properties