250A Final Flashcards
Power = ….?
Formula:
Power = 1 - Beta
where Beta = type II error rate
Ignoring NCP, it’s possible to calculate the a priori power of a two-group independent-samples t-test. What information would you need, and how would you perform the hand calculation?
Information needed:
sample size, alpha level (to get the critical value of t), and the predicted mean of the alternative distribution (which comes from the predicted effect size)
Calculation:
You would find the critical value of t in the null distribution, then find the area beyond that critical value under the alternative distribution; that area is the power
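A minimal numeric sketch of that recipe (hypothetical n = 30 per group, alpha = .05 two-tailed, predicted d = 0.5), using the "ignoring NCP" shortcut of treating the alternative distribution as a shifted central t:

```python
from scipy import stats

n1 = n2 = 30
alpha = 0.05
d = 0.5                                     # predicted effect size

df = n1 + n2 - 2
se_units = (1 / n1 + 1 / n2) ** 0.5         # SE in standard-deviation units
alt_mean = d / se_units                     # predicted mean of the alternative dist
t_crit = stats.t.ppf(1 - alpha / 2, df)     # critical value in the null dist

# Power = area of the alternative distribution beyond the critical value
power = 1 - stats.t.cdf(t_crit - alt_mean, df)
print(round(power, 2))                      # ~0.47
```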
What is the noncentrality parameter, what is its general formula, and how does its value translate into the power of a statistical test?
The noncentrality parameter is how far the alternative distribution is offset from the null distribution – how different the sampling distributions are under the null and under the alternative
Its general formula is: delta = d * sqrt(n)
A larger delta translates to more power because delta is large when the effect size and n are large, both of which increase power
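A hedged sketch of delta in action for a one-sample test (hypothetical d = 0.5, n = 30), using the noncentral t to get the power:

```python
from scipy import stats

d, n = 0.5, 30
delta = d * n ** 0.5                        # noncentrality parameter
df = n - 1
t_crit = stats.t.ppf(0.975, df)             # alpha = .05, two-tailed

# Power = area of the noncentral t (the alternative dist) beyond t_crit
power = 1 - stats.nct.cdf(t_crit, df, delta)
print(round(power, 2))                      # ~0.75; larger delta -> more power
```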
How does alpha rate affect power of a statistical test?
As alpha increases, the critical value moves closer to 0, Beta decreases, and power increases.
How does one-tailed vs two-tailed test affect power?
Power is higher for a one-tailed test (all of alpha sits on one side, so the critical value is closer to 0)
How does sample size affect power?
As sample size increases, SE decreases, which decreases the overlap between the null and alternative sampling distributions, so power increases
How does effect size affect power?
As effect size increases, the means of the null and alternative distribution get farther apart, which means there is less overlap between them and power increases
How does what kind of test is used affect power?
Parametric tests are more powerful than non-parametric tests
Dependent-samples tests are more powerful than independent-samples tests
Why is a one sample t test more powerful than a two sample t test (all else being equal)?
Because the SE is lower: a one-sample test has sampling error from only one mean (SE = s / sqrt(n)), while a two-sample test adds the error variances of two means (SE = sqrt(s1^2/n1 + s2^2/n2)), as in the comparison below
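Worked comparison with made-up numbers (same s and n in each case):

```python
# Same s and same n per sample, but the two-sample SE of a
# difference is sqrt(2) times larger than the one-sample SE.
s, n = 10.0, 25
se_one = s / n ** 0.5                        # one-sample SE of the mean
se_two = (s**2 / n + s**2 / n) ** 0.5        # SE of the difference of two means
print(se_one, se_two)                        # 2.0 vs ~2.83
```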
How is the variance sum law useful in computing a priori power in a dependent samples t test?
The dependent-samples test runs on difference scores, so we need the variance of the differences. The variance sum law lets us estimate that variance from both samples’ variances and their correlation (remember we don’t use the pretest variance in effect size calculations), so it helps us get the SE for the power calculation
Use variance sum law to explain why a dependent samples t-test is more powerful than an independent samples t test. What is the critical role of correlation between treatment conditions?
Variance sum law: s_D^2 = s_1^2 + s_2^2 – 2*r*s_1*s_2. You subtract off 2*r*s_1*s_2, so conditions that are more highly correlated subtract a larger value, making the SE of the difference smaller (and the test more powerful)
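Quick numeric illustration with hypothetical numbers:

```python
# Variance of difference scores via the variance sum law.
s1, s2, r = 10.0, 10.0, 0.8
var_diff = s1**2 + s2**2 - 2 * r * s1 * s2   # variance of the difference scores
print(var_diff)                              # 40.0, vs. 200.0 when r = 0
```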
What does a correlation (e.g., r = .25) tell you about the relationship between two variables?
r tells you the degree of linear relation between x and y.
How does covariance relate to correlation?
Covariance is the average product of deviation scores, and it tells you the degree to which the two variables are related. But its scale changes as a function of S_x * S_y (the units of the variables), making its magnitude hard to interpret.
To get correlation, we divide: r = cov(x, y) / (S_x * S_y), where S_x * S_y is the maximum possible covariance. Correlation is standardized covariance! Since covariance is the average product of deviation scores, and standardized deviation scores are z scores, r is the average product of the z scores – how well an obs’ z scores on X and Y match
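A small sketch with made-up data showing the z-score identity (population, divide-by-n, SDs so it comes out exact):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

zx = (x - x.mean()) / x.std()                # np.std defaults to ddof=0
zy = (y - y.mean()) / y.std()
r = (zx * zy).mean()                         # average product of z scores

print(round(r, 3))                           # 0.8
print(round(np.corrcoef(x, y)[0, 1], 3))     # matches numpy's r
```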
Explain what positive, negative, and zero correlation means.
Positive correlation: when an obs is above the mean on X, it also tends to be above the mean on Y
Negative correlation: when an obs is above the mean on X, it tends to be below the mean on Y
0 correlation: when an obs is above the mean on X, there is no systematic expectation for where it is on Y
What is adjusted r and why is it needed?
Adjusted r is needed to correct for small n: when n is small, r is a biased estimator of the population correlation rho.
What does correlation squared mean?
R^2 is the percentage of variance in Y that can be explained by variations in the levels of X.
Knowing X reduces our uncertainty about Y by R^2 (e.g., r = .50 gives R^2 = .25, a 25% reduction)
When we test a null hypothesis that rho is some value other than 0, we need to perform Fisher transformations. Why?
When rho ≠ 0, the sampling distribution of r is not normal (it is skewed) and its SE is not easily estimated
So we transform r to r’, which is normally distributed around rho’ and has an estimable SE!
How would you calculate a confidence interval around an observed correlation with Fisher transformations?
First solve for the confidence limits on rho’ (r’ ± z_crit * SE, where SE = 1/sqrt(n – 3)), then convert the limits back to the r scale
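A hedged sketch of that recipe, assuming SE(r’) = 1/sqrt(n – 3) and hypothetical r = .45, n = 50:

```python
import numpy as np
from scipy import stats

r, n, alpha = 0.45, 50, 0.05
r_prime = np.arctanh(r)                      # Fisher's r-to-r' transform
se = 1 / np.sqrt(n - 3)
z = stats.norm.ppf(1 - alpha / 2)

lo, hi = np.tanh(r_prime - z * se), np.tanh(r_prime + z * se)
print(round(lo, 3), round(hi, 3))            # limits back on the r scale
```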
What is the general effect of dichotomizing a continuous variable on a correlation?
Dichotomizing: e.g., median split into high group and low group on a continuous variable
This reduces the correlation and power because you throw away the information about differences within each group, making the observed effect smaller.
The more extreme the split, the lower r will be compared to what it would have been if we’d left the variable continuous
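A simulation sketch of the median-split case (made-up rho = .60):

```python
import numpy as np

rng = np.random.default_rng(1)
n, rho = 10_000, 0.6
x = rng.normal(size=n)
y = rho * x + (1 - rho**2) ** 0.5 * rng.normal(size=n)

x_split = (x > np.median(x)).astype(float)       # high vs. low group
print(round(np.corrcoef(x, y)[0, 1], 2))         # ~0.60 continuous
print(round(np.corrcoef(x_split, y)[0, 1], 2))   # ~0.48 after the split
```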
How is correlation affected by range restriction?
Restricting the range lowers the correlation because there are fewer values at the extremes, and we need values at the extremes (with their large Z scores!) to get a strong correlation
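A simulation sketch (made-up rho = .60), keeping only the middle half of X:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=10_000)
y = 0.6 * x + 0.8 * rng.normal(size=10_000)      # rho = .60

keep = (x > np.quantile(x, 0.25)) & (x < np.quantile(x, 0.75))
print(round(np.corrcoef(x, y)[0, 1], 2))                 # ~0.60 full range
print(round(np.corrcoef(x[keep], y[keep])[0, 1], 2))     # ~0.27 restricted
```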
How is correlation affected by mixed (heterogeneous) populations?
This is when the groups vary in mean and variance
Combining heterogeneous groups inflates the correlation and can bring about a spurious correlation! le gasp
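A simulation sketch: two made-up subgroups with r ≈ 0 inside each, but shifted means on both variables:

```python
import numpy as np

rng = np.random.default_rng(3)
g1x, g1y = rng.normal(0, 1, 500), rng.normal(0, 1, 500)   # r ~ 0 within
g2x, g2y = rng.normal(3, 1, 500), rng.normal(3, 1, 500)   # r ~ 0 within

x = np.concatenate([g1x, g2x])
y = np.concatenate([g1y, g2y])
print(round(np.corrcoef(x, y)[0, 1], 2))   # ~0.69 despite r ~ 0 within groups
```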
How is correlation affected by outliers?
Correlation is super sensitive to outliers because they have big Z scores
What are the conventions for effect sizes for correlations?
Effect size conventions:
Small = .10
Medium = .30
Large = .50
Under what circumstances may you want to transform a Cohen’s d from an experiment into a correlation coefficient? What would such a correlation indicate?
We would do this to compare effects from an experiment with effects from an observational/correlational study – to put results from two different kinds of studies on a common metric. Such a correlation (a point-biserial r) indicates the strength of the relation between group membership (e.g., treatment vs. control) and the outcome
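One common conversion (for equal group sizes) is r = d / sqrt(d^2 + 4); the constant changes when the groups are unequal. A tiny sketch with a hypothetical d = 0.8:

```python
d = 0.8
r = d / (d**2 + 4) ** 0.5   # d-to-r conversion, equal n per group
print(round(r, 2))          # ~0.37
```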
What variances are between (treatment) variance and within (error) variance estimating?
Between (treatment) variance: estimated from the variability among the group means; it estimates error variance plus any true treatment variance
Within (error) variance: estimates the variance within each group. Since we assume all groups have equal variance, pooling the groups gives an estimate of the population (error) variance
What are the factors that influence between and within variance estimates?
Within: affected by the variability inside each group, but UNAFFECTED by variability between groups, i.e., by whether or not the null is true
Between: affected by sampling error (error variance – the same thing the within estimate tracks) AND by variability between the group means, i.e., by whether or not the null is true
What’s the expected value of MSE and MSTreatment if null is true? If null is false?
If the null is true, we expect the two estimates to be about the same (both estimate error variance). If the null is false, MSTreatment will be larger.
What does sigma_t^2 refer to, and what role does it play in expected mean squares?
It is the variance of the true population means (the treatment effects), and it is a component of the expected mean square for treatment: E(MSTreatment) = sigma_e^2 + n * sigma_t^2. If the null is true, the means are all equal and do not vary, so sigma_t^2 = 0 and E(MSTreatment) = E(MSError)
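A simulation sketch of these expectations (hypothetical k = 3 groups, n = 20 each, sigma = 1, one mean shifted by 0.5 so the null is false):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n, sigma = 3, 20, 1.0
true_means = [0.0, 0.0, 0.5]                 # null is false: sigma_t^2 > 0

ms_treat, ms_error = [], []
for _ in range(2000):
    groups = [rng.normal(m, sigma, n) for m in true_means]
    grand = np.mean(np.concatenate(groups))
    ms_treat.append(n * sum((g.mean() - grand) ** 2 for g in groups) / (k - 1))
    ms_error.append(np.mean([g.var(ddof=1) for g in groups]))

print(round(np.mean(ms_treat), 2))   # ~2.67 = sigma_e^2 + n * sigma_t^2
print(round(np.mean(ms_error), 2))   # ~1.00 = sigma_e^2
```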