Significance and Power Flashcards
(23 cards)
What does NHST stand for?
Null Hypothesis Significance Testing.
What is the main goal of NHST?
To decide whether to reject the null hypothesis by assessing how unlikely the observed data would be if it were true.
What does a p-value tell us?
The probability of getting the observed results (or more extreme) if the null hypothesis is true.
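As a rough illustration of this definition, the sketch below computes a two-tailed p-value from a z statistic using only Python's standard library. The helper name `two_tailed_p_value` and the choice of a z test are assumptions made for the example, not something the cards specify.

```python
from statistics import NormalDist


def two_tailed_p_value(z_stat: float) -> float:
    """Probability of a z statistic at least this extreme (in either
    direction) if the null hypothesis is true. Hypothetical helper."""
    standard_normal = NormalDist()  # null distribution: mean 0, sd 1
    upper_tail = 1 - standard_normal.cdf(abs(z_stat))
    return 2 * upper_tail


# A z statistic near 1.96 sits right at the conventional .05 threshold.
print(round(two_tailed_p_value(1.96), 3))
# A z statistic of 0 is as unsurprising as possible under the null.
print(round(two_tailed_p_value(0.0), 3))
```

Note that the function only describes how surprising the data are under the null; it says nothing about how likely the null itself is.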
What doesn’t a p-value tell us?
It doesn’t tell us the probability that the null hypothesis is true, and it doesn’t measure the size or practical importance of an effect.
What is a Type I error?
Incorrectly rejecting the null hypothesis when it is actually true (false positive).
What is a Type II error?
Failing to reject the null hypothesis when it is false (false negative).
What is statistical power?
The probability of correctly detecting an effect if one truly exists.
How is power calculated?
Power = 1 – β (where β is the probability of a Type II error).
What is the typical target for statistical power?
0.8 or 80%.
What are the three main factors that affect power?
Effect size, sample size, and alpha level.
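To see how these three factors move power, here is a minimal sketch that approximates the power of a two-tailed, two-group comparison with a normal (z) approximation. The function `approx_power`, its parameter names, and the use of Cohen's d as the effect size are assumptions for illustration; dedicated tools use the t distribution and will give slightly different numbers.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-tailed, two-group z test.
    Hypothetical helper; d is a standardised mean difference."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = d * sqrt(n_per_group / 2)   # expected z statistic under the true effect
    # Probability that the statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


baseline = approx_power(d=0.5, n_per_group=64, alpha=0.05)
print(f"baseline (d=0.5, n=64, alpha=.05): {baseline:.2f}")
print(f"larger effect (d=0.8):             {approx_power(0.8, 64, 0.05):.2f}")
print(f"smaller sample (n=30):             {approx_power(0.5, 30, 0.05):.2f}")
print(f"stricter alpha (alpha=.01):        {approx_power(0.5, 64, 0.01):.2f}")
```

Changing one argument at a time shows the direction of each effect: power rises with effect size and sample size, and falls when alpha is made stricter.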
What is effect size?
A standardised measure of the magnitude of an effect.
How does effect size relate to sample size and power?
Larger effect sizes need fewer participants to achieve the same power.
How does sample size affect power?
More participants generally increase power.
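The trade-off between effect size and sample size can be sketched by solving for the sample size that reaches a target power. The example below uses the standard normal-approximation formula for a two-tailed, two-group comparison; the function name `n_per_group` and the effect sizes tried are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Approximate participants needed per group (normal approximation).
    Hypothetical helper; d is a standardised mean difference."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Smaller effects need far more participants for the same 80% power.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} per group")
```

Because this is a z approximation, calculations based on the t distribution typically report slightly larger sample sizes.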
What is alpha (α)?
The significance threshold, chosen before testing: the maximum acceptable probability of a Type I error (commonly 0.05).
What happens to power if you lower alpha (e.g., from .05 to .01)?
Power decreases (all else being equal) because a stricter threshold makes it harder to reach significance.
What is the familywise error rate?
The probability of making at least one Type I error across a set (family) of tests; it grows as the number of tests increases.
What is the Bonferroni correction?
A method to control the familywise error rate by dividing the alpha level by the number of tests.
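A short sketch of the arithmetic behind the last two cards: how the familywise error rate grows with the number of tests, and how a Bonferroni-adjusted alpha brings it back under the nominal level. The function names are hypothetical, and the familywise formula assumes the tests are independent.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n_tests tests.
    Assumes the tests are independent (an assumption of this sketch)."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha after a Bonferroni correction."""
    return alpha / n_tests


n_tests = 10
uncorrected = familywise_error_rate(0.05, n_tests)
adjusted_alpha = bonferroni_alpha(0.05, n_tests)
corrected = familywise_error_rate(adjusted_alpha, n_tests)

print(f"uncorrected familywise rate: {uncorrected:.3f}")
print(f"Bonferroni per-test alpha:   {adjusted_alpha:.3f}")
print(f"corrected familywise rate:   {corrected:.3f}")
```

With ten tests at .05 each, the uncorrected familywise rate is roughly 40%, while the corrected rate stays just below .05.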
What is a one-tailed test?
A test of a directional hypothesis: it looks for an effect in one predicted direction only (e.g., Group A > Group B).
What is a two-tailed test?
A test that looks for any difference without specifying direction.
Which type of test is generally more powerful: one-tailed or two-tailed?
One-tailed tests, provided the effect is in the predicted direction (an effect in the opposite direction cannot be detected).
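The comparison can be sketched numerically with a normal approximation for a two-group design. The function `z_test_power` and its `tails` parameter are assumptions for this example, and the one-tailed version is taken to predict a positive effect.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(d: float, n_per_group: int,
                 alpha: float = 0.05, tails: int = 2) -> float:
    """Approximate power of a two-group z test. Hypothetical helper.
    With tails=1 the test only rejects for effects in the positive direction."""
    z = NormalDist()
    shift = d * sqrt(n_per_group / 2)  # expected z statistic under the true effect
    if tails == 1:
        # All of alpha sits in the upper tail, so the cutoff is lower.
        return z.cdf(shift - z.inv_cdf(1 - alpha))
    z_crit = z.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


print(f"two-tailed, d = 0.5:  {z_test_power(0.5, 64, tails=2):.2f}")
print(f"one-tailed, d = 0.5:  {z_test_power(0.5, 64, tails=1):.2f}")
# If the true effect runs the other way, the one-tailed test almost never rejects.
print(f"one-tailed, d = -0.5: {z_test_power(-0.5, 64, tails=1):.4f}")
print(f"two-tailed, d = -0.5: {z_test_power(-0.5, 64, tails=2):.2f}")
```

The one-tailed test gains power in the predicted direction, but its power collapses when the true effect is in the opposite direction, whereas the two-tailed test is unaffected by the sign.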
Why are within-subjects designs more powerful than between-subjects designs?
They remove between-participant variability from the error term, because each participant takes part in every condition and so acts as their own control.
What tool is commonly used for calculating power?
G*Power.
What can G*Power be used for?
To calculate power, required sample size, or effect size.