250A Final Flashcards

1
Q

Power = ….?

formula

A

Power = 1 - Beta

where Beta = type II error rate

2
Q

Ignoring NCP, it’s possible to calculate the a priori power in a two group independent samples t-test. What information would you need and how would you perform the hand calculation?

A

Information needed:
sample size, alpha level (to get critical value of t), predicted mean of alternative distribution
Calculation:
You would find the critical value in the null distribution, and then find the area to the right of that value in the alternative distribution

3
Q

What is the noncentrality parameter, what is its general formula, and how does its value translate into the power of a statistical test?

A

The noncentrality parameter is how far the alternative distribution is offset from the null distribution – how different sampling dists are in null and alternative
Its general formula is: delta = d * sqrt(n)
A larger delta translates to more power, because delta grows with both effect size and n, each of which increases power
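The calculation in the two cards above can be sketched in Python. This is a normal-approximation sketch (an assumption to keep the arithmetic simple — the exact calculation uses the noncentral t distribution), and `approx_power` is a made-up helper name:

```python
from statistics import NormalDist

def approx_power(d, n, alpha=0.05):
    """Approximate a priori power (one-tailed) from d and n, using the
    normal approximation -- the exact calculation uses the noncentral t."""
    z = NormalDist()
    delta = d * n ** 0.5             # noncentrality: delta = d * sqrt(n)
    z_crit = z.inv_cdf(1 - alpha)    # critical value under the null
    # power = area to the right of the critical value under the
    # alternative distribution, which is shifted over by delta
    return 1 - z.cdf(z_crit - delta)

power = approx_power(d=0.5, n=25)    # delta = 0.5 * sqrt(25) = 2.5
```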

4
Q

How does alpha rate affect power of a statistical test?

A

As alpha increases, the critical value moves closer to 0, which decreases Beta and so increases power.

5
Q

How does one-tailed vs two-tailed test affect power?

A

Power is higher for a one-tailed test (because all of alpha sits on one side, putting the critical value closer to 0)

6
Q

How does sample size affect power?

A

As sample size increases, SE decreases, which decreases the overlap between the null and alternative sampling distributions and so increases power

7
Q

How does effect size affect power?

A

As effect size increases, the means of the null and alternative distribution get farther apart, which means there is less overlap between them and power increases

8
Q

How does what kind of test is used affect power?

A

Parametric tests are more powerful than non-parametric tests

Dependent are more powerful than independent

9
Q

Why is a one sample t test more powerful than a two sample t test (all else being equal)?

A

Because the SE is lower: a single sample mean has one source of sampling error, while the SE of the difference between two independent means combines the sampling error of both samples

10
Q

How is the variance sum law useful in computing a priori power in a dependent samples t test?

A

We can get a better estimate of the variance of the difference scores if we use both samples (remember we don't use the pretest variance in effect size calculations), so we can use the variance sum law to combine the two sample variances and their correlation

11
Q

Use variance sum law to explain why a dependent samples t-test is more powerful than an independent samples t test. What is the critical role of correlation between treatment conditions?

A

By the variance sum law, the variance of the difference scores is s1² + s2² − 2·r·s1·s2. You subtract off 2·r·s1·s2, so conditions that are more highly correlated subtract a larger value, making the SE smaller
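A minimal numeric sketch of that subtraction (the values and the `se_diff` helper name are hypothetical):

```python
from math import sqrt

def se_diff(s1, s2, r, n):
    """SE of the mean difference score via the variance sum law:
    var(X1 - X2) = s1^2 + s2^2 - 2*r*s1*s2."""
    var_diff = s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2
    return sqrt(var_diff / n)

# the more correlated the conditions, the more gets subtracted off
se_uncorrelated = se_diff(10, 10, 0.0, 25)
se_correlated = se_diff(10, 10, 0.8, 25)
```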

12
Q

What does a correlation (e.g., r = .25) tell you about the relationship between two variables?

A

r tells you the degree of linear relation between x and y.

13
Q

How does covariance relate to correlation?

A

Covariance is the average product of deviation scores, and tells you the degree to which two variables are related. But its scale changes as a function of S_x · S_y, making it hard to interpret.
To get correlation, we divide: r = cov(x, y) / (S_x · S_y), where S_x · S_y is the maximum possible covariance. Correlation is standardized covariance! Since covariance is the average product of deviation scores, and standardized deviation scores are z-scores, r is the average product of the z-scores — how well an observation's z-scores on X and Y match
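Both views of r can be checked numerically — a sketch using population SDs (divisor n) so that "average product of z-scores" comes out exactly; the function names are made up:

```python
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """r as the average product of z-scores (population SDs, divisor n)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return mean((x - mx) / sx * ((y - my) / sy) for x, y in zip(xs, ys))

def pearson_r_cov(xs, ys):
    """r as covariance divided by its maximum possible value, S_x * S_y."""
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

r1 = pearson_r([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
r2 = pearson_r_cov([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
```

The two definitions agree exactly, since dividing covariance by S_x · S_y is the same as standardizing both variables first.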

14
Q

Explain what positive, negative, and zero correlation means.

A

Positive correlation: when an obs is above the mean on X, it also tends to be above the mean on Y
Negative correlation: when an obs is below the mean on X, it tends to be above the mean on Y
0 correlation: when an obs is above the mean on X, there is no systematic expectation for where it is on Y

15
Q

What is adjusted r and why is it needed?

A

Adjusted r is needed to correct for small n: when n is small, r is a biased estimator of the population correlation rho.

16
Q

What does correlation squared mean?

A

R² is the proportion of variance in Y that can be explained by variation in the levels of X.
Knowing X reduces our uncertainty in Y by R² × 100%

17
Q

When we test a null hypothesis that rho equals some nonzero value, we need to perform the Fisher transformation. Why?

A

When rho ≠ 0, the sampling distribution of r is skewed (not normal) and its SE is not easily estimated
So we transform r to r′, which is approximately normally distributed around rho′ and has an easily estimated SE!

18
Q

How would you calculate a confidence interval around an observed correlation with the Fisher transformation?

A

First transform r to r′, solve for the confidence limits on rho′, and then convert the limits back to the rho scale
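A sketch of that procedure, assuming the usual SE of r′, 1/sqrt(n − 3) (`fisher_ci` is a made-up helper name):

```python
from math import atanh, tanh, sqrt
from statistics import NormalDist

def fisher_ci(r, n, conf=0.95):
    """CI for rho: transform r to r' = atanh(r), build a normal CI on the
    r' scale with SE = 1/sqrt(n - 3), then back-transform with tanh."""
    r_prime = atanh(r)
    se = 1 / sqrt(n - 3)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return tanh(r_prime - z * se), tanh(r_prime + z * se)

lo, hi = fisher_ci(r=0.5, n=50)   # note: the interval is NOT symmetric around .5
```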

19
Q

What is the general effect of dichotomizing a continuous variable on a correlation?

A

Dichotomizing: e.g., median split into high group and low group on a continuous variable
This reduces correlation and power because you’ve made your effect smaller.
The more extreme the split, the lower r will be compared to what it would have been if we’d left the variable continuous
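A quick simulation sketch of the attenuation (the generated data, seed, and helper name are all arbitrary):

```python
import random
from statistics import mean, pstdev

def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

random.seed(1)
x = [random.gauss(0, 1) for _ in range(2000)]
y = [xi + random.gauss(0, 1) for xi in x]           # built-in linear relation

median_x = sorted(x)[len(x) // 2]
x_split = [1 if xi > median_x else 0 for xi in x]   # median split: high vs low

r_continuous = pearson_r(x, y)
r_dichotomized = pearson_r(x_split, y)              # attenuated relative to r_continuous
```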

20
Q

How is correlation affected by range restriction?

A

Restricting range results in lower correlation because there are fewer values at the extremes and we need values at extremes to get a strong correlation (they have higher Z scores!)

21
Q

How is correlation affected by mixed (heterogeneous) populations?

A

This is when the combined groups differ in mean and/or variance

Combining heterogeneous groups can inflate the correlation and even produce a spurious correlation!

22
Q

How is correlation affected by outliers?

A

Correlation is very sensitive to outliers because they have large z-scores (and r is the average product of z-scores)

23
Q

What are the conventions for effect sizes for correlations?

A

Effect size conventions:
Small = .10
Medium = .30
Large = .50

24
Q

Under what circumstances may you want to transform a Cohen’s d from an experiment into a correlation coefficient? What would such a correlation indicate?

A

We would do this if we wanted to compare effects from an experiment to an observational study – compare results of two different kinds of studies

25
Q

What variances are between (treatment) variance and within (error) variance estimating?

A

Between (treatment) variance: estimates the error variance plus the variance due to true differences among the group means — E(MSbetween) = sigma_error² + n·sigma_tau². Under the null it estimates only the error variance
Within (error) variance: estimates the variance within each group. Since we assume all groups have equal variance, this is an estimate of the population error variance

26
Q

What are the factors that influence between and within variance estimates?

A

Within: affected by variability of each group, but UNAFFECTED by variability between groups, e.g., whether or not null is true
Between: affected by sampling error (error variance = within variance estimate) and variability between groups, e.g., whether or not the null is true

27
Q

What’s the expected value of MSE and MSTreatment if null is true? If null is false?

A

If the null is true, we expect the estimates to be the same. If the null is false, MSTreatment will be larger.

28
Q

What does sigma_t^2 refer to, and what role does it play in expected mean squares?

A

It is the variance of the true population means, and it is a component of E(MStreat). If the null is true, the means are all equal and do not vary, so sigma_tau² = 0, which makes E(MStreat) = E(MSerror)

29
Q

Why is MSwithin also called MSerror?

A

Because it represents the error you’d get if you predicted each value by its group mean
it’s the variance of scores around group means

30
Q

What are the expected values for the F ratio when the null is true or false?

A

When the null is true, E(F) ≈ 1 (exactly df_error / (df_error − 2))

When the null is false, E(F) > 1

31
Q

What does it mean to say that SStotal can be partitioned into two parts in one way ANOVA? Why can’t we divide up total variance?

A

We can break up the deviation of each observation from the grand mean into two components: the observation’s deviation from its group mean, and group’s deviation from the grand mean.
Only SS are additive; we can't add the variances directly because they're based on different degrees of freedom.

32
Q

What assumptions are there in one-way ANOVA?

A
  1. Homogeneity of variance: we need this to form a pooled estimate of population variance
  2. Normality of errors: assume the DV is normally distributed around its mean for each group. We need this for the F distribution to work – normal is the only dist where mean and var are independent, so our test of means will be unaffected by size of variance if this assumption holds.
  3. Independence of observations/errors: this is needed for SEs to be correct
33
Q

What are the options for addressing violations in assumptions? What are general findings regarding the consequences of violating these assumptions?

A

You can do a Levene test to evaluate homogeneity of variance. If this is violated, Box found another way to calculate F. Or use Welch’s F.
If you violate independence, your DFwithin and MSwithin are wrong!
Generally ANOVA is pretty robust to violating assumptions

34
Q

Correspondence between:

Between, Within, Error, and Treatment

A
Treatment = Between
Error = Within
35
Q

Properties of both r-family effect size indices in ANOVA.

A

eta²: proportional reduction in error from knowing treatment group — how much of the overall variability in the dependent variable is attributable to the treatment effect. It's biased (too high), but very intuitive.
omega²: (nearly) unbiased if you have fixed effects. It's a better estimator but has less intuitive appeal

36
Q

What are the conventions for a d-family effect size in one way ANOVA? Explain how effect size is defined in this context.

A

Conventions:
0.10, 0.25, 0.40
Effect size (Cohen's f) is defined as the sd of the expected population means divided by the expected within-cell sd, sqrt(MSE)

37
Q

Describe the process of calculating power in a one-way ANOVA.

A

The NCP is still the displacement of the alternative distribution from the null. Under the null, E(F) ≈ 1, so the NCP is the displacement of the alternative from 1 (it depends on the true differences between means).
Then we get phi′ (Cohen's cursive f), an effect size measure: the sd of the expected population means divided by the expected within-cell sd, sqrt(MSerror)
Then we get phi = phi′ · sqrt(n), which we take (with the dfs) to the power charts

38
Q

Define error rate per comparison and familywise error rate.

A

PC: probability of type I error on any given comparison
FW: probability that a family of conclusions will contain at least one Type I error
if we only do one comparison, PC = FW

39
Q

According to the Bonferroni inequality, what is the max value that FW error rate can be?

A

PC <= FW <= num comparisons * alpha

40
Q

Can a t-test be rejected but the omnibus null F-test not? Why would this happen?

A

This can happen because the F test considers all the group means at once: a difference involving just one group gets diluted across the treatment degrees of freedom, which can wash out the overall F even when a focused t-test on that comparison is significant

41
Q

Describe the distinction between a priori planned comparisons and post-hoc tests. Why can’t we just do a bunch of independent group t-tests?

A

A priori comparisons are chosen before the data are collected
Post hoc tests happen after the experimenter has collected the data, looked at the means, and noted where larger/smaller differences may be.
We can’t just do a bunch of t tests because that would inflate Type I error rate.

42
Q

What is the advantage of doing follow-up tests in one-way ANOVA relative to 2 group t-test?

A

You get a better estimate of the population variance with ANOVA than with a two-sample t-test, and more error degrees of freedom, so ya get more power!

43
Q

All ANOVA follow-up tests, whether planned or post hoc, pairwise, or complex, are some variant of a t-test. In what ways is that statement true?

A

Well, planned comparisons and post hoc comparisons of two means are just t-tests.
And complex comparisons generalize the t-test: you compare one group, or one set of groups, with another set of groups.

44
Q

What is a linear contrast?

A

A linear combination of means is a weighted sum of the treatment means
When we impose the restriction that the sum of the weights = 0, we get a linear contrast
This can represent the difference between the means of treatment 1 and the average of the means of treatments 2 and 3.

45
Q

What are orthogonal contrasts?

A

Orthogonal contrasts are contrasts where the vectors containing weights are orthogonal to one another – knowing that x1 > (x2 + x3)/2 does not tell you anything about whether x3 > x2 or x4 > x5 –> independent (orthogonal)
SScontrasts sum to SStreat

46
Q

How do you specify a contrast? What is the standard set and why is it important?

A

To specify a contrast choose coefficients whose sum = 0. Form two sets of treatments and for weights use the reciprocal of the number of treatment groups in each set. Arbitrarily make one of them negative and there ya go.
The standard set is the set of coefficients whose sum of absolute values = 2 and it’s important because it’s relevant when estimating effect sizes.

47
Q

What is the tree method?

A

Let’s say we have five treatment groups. The omnibus F test where we compare all group means is the trunk. Then break it into two sets: X1 & X2 vs X3 & X4 & X5 – these are branches.
But you can't compare means across separate branches, because contrasts that cross branches would not be orthogonal to the contrasts already made.
So now we break down one limb into further contrasts: X3 vs X4 & X5.
Now we break down the final limb into one more contrast: X4 vs X5
for these to be orthogonal, the pairwise products of coefficients must sum to 0
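The contrast and orthogonality conditions above are easy to check numerically; the weights below are the tree-example sets and the helper names are made up:

```python
def is_contrast(weights):
    """A linear contrast: the weights sum to zero."""
    return abs(sum(weights)) < 1e-9

def are_orthogonal(c1, c2):
    """Orthogonal (with equal n): pairwise products of coefficients sum to 0."""
    return abs(sum(a * b for a, b in zip(c1, c2))) < 1e-9

# the tree example: {X1, X2} vs {X3, X4, X5}, then X3 vs {X4, X5}, then X4 vs X5
c1 = [1/2, 1/2, -1/3, -1/3, -1/3]
c2 = [0, 0, 1, -1/2, -1/2]
c3 = [0, 0, 0, 1, -1]
```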

48
Q

Are non-orthogonal contrasts valid and interpretable? Are there any constraints that must be placed on the interpretation?

A

Yes, they are valid and interpretable. The only constraint is that the contrasts are not independent so they are slightly biased. Interpret with caution

49
Q

Describe the rationale behind the Bonferroni-Dunn t′ and how this test is applied to a real data problem.

A

Rationale: if you perform c tests, the FW error rate cannot exceed alpha·c. So if we set a new alpha level of alpha/c for each comparison, alpha becomes the new maximum FW error rate!
How is this applied to real data? You do the standard t-test procedure using the formula for t′, but get your critical t from software cause ain't nobody got time for them kinds of t tables. Reject when the observed t′ exceeds the critical value for alpha/c (equivalently, when the p-value is at most alpha/c).
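The rationale reduces to a few lines of arithmetic (the p-values below are hypothetical; in practice you compare the observed t′ against the critical t at alpha/c):

```python
def bonferroni_dunn(p_values, fw_alpha=0.05):
    """Test each of c comparisons at fw_alpha / c, capping the FW error rate."""
    per_comparison_alpha = fw_alpha / len(p_values)
    return [p <= per_comparison_alpha for p in p_values]

# three comparisons, so each is tested at .05 / 3 = .0167
decisions = bonferroni_dunn([0.004, 0.020, 0.049])
```

Note that 0.020 and 0.049 would be "significant" at a per-comparison alpha of .05 but fail the stricter .0167 cutoff.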

50
Q

What is the dilemma regarding choice for a denominator for effect size for contrasts?

A

There are 3 possible estimates for the sd: 1) sqrt(MSE), 2) the sqrt of the average of the variances of the groups being contrasted, or 3) the sd of one group designated as the control
Usually use sqrt(MSE)

51
Q

Describe rationale behind HSD test and how it differs from studentized range (q).

A

HSD uses the studentized range q for its comparisons, but q_HSD is always the maximum value of q_r: if there are 5 means, all differences are tested as if they were 5 steps apart. In HSD you make ALL the pairwise comparisons, whereas a plain q test makes just one. Testing at the maximum r fixes the FW error rate at alpha against all possible nulls, not just the complete null (at the cost of some power)

52
Q

How is HSD test applied to a real data problem?

A

First arrange the means in order of increasing magnitude. Get q_r from the appendix, plug it into the equation, and solve for the smallest difference between means that would be significant. Then find all the differences between means and compare each against that value.
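A sketch of that procedure (the means are hypothetical, and `q_crit=3.79` stands in for the appendix table value rather than being computed):

```python
from math import sqrt
from itertools import combinations

def hsd_threshold(q_crit, ms_error, n_per_group):
    """Smallest mean difference declared significant: q * sqrt(MSerror / n)."""
    return q_crit * sqrt(ms_error / n_per_group)

# hypothetical group means, MSerror, and n per group
means = [4.2, 5.1, 7.9, 8.3]
hsd = hsd_threshold(q_crit=3.79, ms_error=4.0, n_per_group=10)
significant = [(a, b) for a, b in combinations(sorted(means), 2) if b - a > hsd]
```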

53
Q

In parametric tests, why do we need assumptions about normality and equality of variance?

A

We need to assume normality so that the test statistic follows its theoretical distribution and we can use the sample mean and SD as estimators of the population parameters
We need to assume homogeneity of variance so that we can pool the variance estimates

54
Q

What are bootstrapping and randomization tests used for? What assumptions do these procedures make?

A

They don't make a priori assumptions about the shape of the population, so the validity of the test is not affected by violations of normality
Bootstrapping's assumption: the population is distributed like our sample data
We can use these to get confidence intervals for statistics we don't have formulas for (e.g., Cohen's d, the median), whose SEs are hard to derive

55
Q

What is the difference between sampling from the data with and without replacement? When would each be used?

A

Sampling with replacement: used in bootstrapping. In this case it’s possible to sample the same data point more than once.
Sampling w/o replacement: used in randomization (permutation) tests. It's not possible to sample the same data point more than once.

56
Q

Compare ordinary parametric confint for a mean (median, d, etc.) and bootstrap approach.

A

In the ordinary parametric approach: Get data. Get an estimate of your parameter. Use the formula: sampleStatistic ± criticalValue * SE_sampleStatistic
In bootstrapping:
1. Assume pop is distributed like your sample.
2. Draw large number of new samples of size n from that population
3. For each sample, compute the statistic of interest
4. Then examine the empirically derived sampling distribution and find relevant cut off values.
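The steps above can be sketched as a percentile bootstrap (the data and the `bootstrap_ci` helper name are hypothetical):

```python
import random
from statistics import median

def bootstrap_ci(data, stat=median, n_boot=5000, conf=0.95, seed=0):
    """Percentile bootstrap: resample WITH replacement (the sample stands
    in for the population), compute the statistic each time, and read the
    cutoffs off the empirical sampling distribution."""
    rng = random.Random(seed)
    boots = sorted(stat(rng.choices(data, k=len(data))) for _ in range(n_boot))
    lo_idx = int((1 - conf) / 2 * n_boot)
    hi_idx = int((1 + conf) / 2 * n_boot) - 1
    return boots[lo_idx], boots[hi_idx]

data = [3, 5, 7, 8, 9, 12, 13, 15, 18, 21]   # hypothetical sample
lo, hi = bootstrap_ci(data)
```

Any statistic can be passed as `stat` — that's the whole appeal when no SE formula exists.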

57
Q

How is a randomization test used to derive a null hypothesis distribution when there is a two-group dependent t-test situation?

A

Dependent –> interested in difference scores. Observation: if null is true and x1 = x2, a difference score is equally likely to be positive and negative.
So resample w/o replacement repeatedly, randomly assigning positive and negative values to the differences. Calculate whatever statistic you’re interested in each time, and construct the null distribution of that statistic.
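A sketch of the sign-flipping procedure, using the mean difference as the statistic (the difference scores and helper name are hypothetical):

```python
import random
from statistics import mean

def signflip_p(diffs, n_perm=5000, seed=0):
    """Randomization test for dependent samples: under the null, each
    difference score is equally likely to be + or -, so flip signs at
    random to build the null distribution of the mean difference."""
    rng = random.Random(seed)
    observed = abs(mean(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = [d * rng.choice((1, -1)) for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1
    return hits / n_perm            # two-tailed p-value estimate

diffs = [2.1, 1.8, 3.0, 0.5, 2.4, 1.1, 2.9, 0.8]   # hypothetical pre-post differences
p = signflip_p(diffs)
```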

58
Q

How is a randomization test used to derive a null hypothesis distribution when there is a two-group INdependent t-test situation?

A

If the null is true, each obs is equally likely to have come from treatment or control group.
So resample w/o replacement, randomly assigning observations to each group (using the appropriate sample sizes). Calculate your desired statistic each time you resample, plot the results, and there ya go: the null sampling distribution of whatever statistic ya want.
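A sketch of the reshuffling procedure, using the difference between group means as the statistic (the data and helper name are hypothetical):

```python
import random
from statistics import mean

def shuffle_p(group1, group2, n_perm=5000, seed=0):
    """Randomization test for independent samples: under the null, the
    group labels are arbitrary, so reshuffle the pooled observations into
    groups (without replacement) to build the null distribution of the
    difference between means."""
    rng = random.Random(seed)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    observed = abs(mean(group1) - mean(group2))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n1]) - mean(pooled[n1:])) >= observed:
            hits += 1
    return hits / n_perm

treatment = [12, 15, 14, 17, 13, 16]   # hypothetical scores
control = [10, 11, 9, 12, 10, 13]
p = shuffle_p(treatment, control)
```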

59
Q

How would you perform a randomization test of the null hypothesis in ANOVA?

A

Resample w/o replacement, randomly assigning cases to conditions (keeping appropriate n_j in mind)
Calculate F each time and construct your empirical F distribution. Compare observed F to the empirical distribution and see how likely/unlikely it is.

60
Q

Why are traditional non-parametric tests like the Wilcoxon thought of as permutation (randomization) tests, at least with small n?

A

Because, with small n, the exact null distribution can be built by permuting the ranks — the test is a permutation/randomization of ranks

61
Q

What null is tested in Wilcoxon rank-sum test?

A

Analog of independent t-test. The null is broader: two samples were drawn at random from identical populations (not just populations with the same mean)

62
Q

What null is tested in Wilcoxon matched-pairs signed ranks test?

A

Analog of the dependent t test. The null is that the two samples were drawn from identical populations, or from symmetric populations with the same mean. Specifically: the distribution of difference scores in the population is symmetric around 0.

63
Q

What null is tested in Kruskal-Wallis one-way ANOVA? What is the sampling distribution under the null?

A

Analog of one-way ANOVA. Null: all samples were drawn from identical populations. Under the null, the test statistic H is distributed approximately as chi-squared with k − 1 df

64
Q

What do we do when n is large for Wilcoxon tests?

A

Distribution of Ws approaches normal as sample size increases so we can calculate the mean and SE of the distribution. Then we can calculate a z statistic

65
Q

Why do we need to calculate Ws and Ws’ for Wilcoxon rank sum test?

A

Ws’ is needed when the smaller group has a larger sum of ranks

67
Q

How do you conduct a Wilcoxon matched-pairs signed-ranks test?

A

If there’s no dif between groups, we expect ~equal numbers of pos and neg difference scores of ~equal magnitude.
If there is a dif, we have two predictions: 1) most changes will be in the same direction. 2) changes in the opposite direction will be small.
Steps: 1. calculate dif scores for each pair. 2) rank dif scores without regard to sign. 3) assign sign of dif to ranks themselves. 4) sum positive and negative ranks separately.
test statistic: T = smaller of abs value of two sums
evaluated against tables
T is a discrete distribution so we have to choose our desired critical value
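The T calculation in the steps above can be sketched as follows (a simplification: zero differences are dropped and ties in |d| get average ranks; the function name is made up):

```python
def wilcoxon_t(pairs):
    """Wilcoxon matched-pairs signed-ranks T for a list of (x1, x2) pairs:
    rank the |differences|, reattach the signs, then take the smaller of
    the absolute sums of positive and negative ranks."""
    diffs = [a - b for a, b in pairs if a != b]     # drop zero differences
    by_mag = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(by_mag):                          # average ranks for ties in |d|
        j = i
        while (j + 1 < len(by_mag)
               and abs(diffs[by_mag[j + 1]]) == abs(diffs[by_mag[i]])):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[by_mag[k]] = avg
        i = j + 1
    pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(pos, neg)
```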

68
Q

How do you conduct a Kruskal-Wallis one-way ANOVA?

A

Rank all scores w/o regard to group and sum ranks for each group. If null is true we expect sum of ranks to be ~equal for all groups.
Test statistic: H is a measure of how much Ri’s differ from one another
Evaluate this against chi-sq(df = k - 1)
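The H computation can be sketched as follows (a simplification that assumes no tied scores — ties would need average ranks and a correction factor):

```python
def kruskal_wallis_h(groups):
    """H from rank sums: H = 12/(N(N+1)) * sum(R_j^2 / n_j) - 3(N+1),
    where R_j is the sum of ranks in group j. Assumes no tied scores."""
    scores = sorted(s for g in groups for s in g)
    rank = {s: i + 1 for i, s in enumerate(scores)}   # rank over ALL scores
    n_total = len(scores)
    term = sum(sum(rank[s] for s in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * term - 3 * (n_total + 1)
```

When the groups' rank sums are identical, H comes out to 0, as the card predicts.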

69
Q

What is the sampling distribution under the null for Kruskal-Wallis?

A

Chi squared with degrees of freedom k - 1

70
Q

Is there any point in Kruskal-Wallis once you know how to do a randomization test or are they testing slightly different things?

A

They test essentially the same thing. Kruskal-Wallis is a generalization of the Wilcoxon rank-sum test, which is itself just a permutation test on ranks.