Week 6 - Assumptions of ANOVA Flashcards
What is a Type I error?
A Type I error is falsely concluding that you've found something significant when in fact there was nothing going on.
How do you know what the Type I error rate is?
The alpha level you set (the criterion at which you decide something is significant) is identical to your Type I error rate.
If alpha is 0.05, your Type I error rate is 5%.
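The link between alpha and the Type I error rate can be seen in a quick simulation sketch (hypothetical numbers; a t-test stands in for any significance test): when the null hypothesis is really true, the proportion of "significant" results converges on alpha.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_sims = 10_000

# Simulate experiments where the null hypothesis is TRUE:
# both groups are drawn from the same population.
false_positives = 0
for _ in range(n_sims):
    a = rng.normal(loc=0, scale=1, size=30)
    b = rng.normal(loc=0, scale=1, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

# The false-positive (Type I) rate hovers around alpha
print(false_positives / n_sims)
```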
What is a Type II error?
When you falsely accept the null hypothesis and reject a true research hypothesis, thereby concluding there's nothing going on when in fact there is something going on.
Type II error rate is signified by beta (β)
What happens if you lower your alpha level?
It decreases the chance of making a Type I error but increases the chance of making a Type II error.
What are the ANOVA assumptions?
- DV should be measured on a metric scale
- Independence of observations
- Normality of Distributions
- Homogeneity of variance
What does the assumption stating the DV should be measured on a metric scale mean?
If your DV is not measured on a metric scale (if it's not a true number), don't use an ANOVA.
Metric scale = has to behave like a true scale: you have to be able to meaningfully add, subtract and divide the values.
What is the 'independence assumption'?
- States that it is not possible to predict one score in the data from any other score
- The only way you can know the probability of events happening is if you know they are independent of one another
- A requirement for calculating any p-value
In a between-groups design, how do you ensure the independence assumption is met?
- Random assignment of participants to groups (levels of IV).
- Random selection of participants from the population/s of interest (particularly important with some types of IV where random allocation is impossible).
- Each participant contributes only 1 score to the analysis (this may be the mean of many observations).
- Each participant’s score is independent – i.e. not influenced by any other participant’s score
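Random assignment (the first point above) is easy to script; a minimal sketch with made-up participant IDs and two hypothetical levels of the IV:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical participant pool
participants = [f"P{i:02d}" for i in range(1, 13)]

# Shuffle the pool, then split it into the two levels of the IV,
# so group membership is unrelated to any participant characteristic
shuffled = list(rng.permutation(participants))
groups = {"control": shuffled[:6], "treatment": shuffled[6:]}
print(groups)
```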
What is the normality assumption?
States that
- the samples are drawn from normally distributed populations and
- the error component is normally distributed within each treatment group (level of IV)
How can you check the normality assumption? (What conditions should the data meet to satisfy the normality assumption?)
- See if there are a similar number of participants in each condition.
- See if there are at least 10-12 participants in each condition.
- Check that the departure from normality (skewness or kurtosis) is similar in each condition (you don't want one condition massively positively skewed and another massively negatively skewed)
To see whether the normality assumption has been breached, what is the best thing to inspect?
frequency histograms for each experimental condition
For a completely normal population, what would your skewness value be?
0: normally distributed data (or any symmetrically distributed data)
- Positive/negative values: distribution skewed positively/negatively
For a completely normal population, what would your kurtosis value be?
0 (as reported by SPSS) for normally distributed data (or any distribution that doesn't have more outliers than a normal distribution)
*In mathematical terms, kurtosis equals 3 for a normal distribution; however, SPSS subtracts the 3 to give 0.
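Both values can be checked directly; a sketch with scipy on a simulated sample (note that scipy's kurtosis(), like SPSS, subtracts the 3 by default so a normal distribution gives roughly 0):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
normal_sample = rng.normal(loc=50, scale=10, size=100_000)

skewness = stats.skew(normal_sample)
kurt = stats.kurtosis(normal_sample)  # fisher=True by default: normal -> ~0

print(skewness, kurt)  # both approximately 0 for a normal sample
```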
What are the two tests in ANOVA that check the normality assumption?
Shapiro-Wilk
Kolmogorov-Smirnov
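Both tests are available in scipy; a sketch on one simulated condition (for both, a significant p (< .05) suggests a breach of normality; note that running the K-S test with parameters estimated from the same data makes its p-value somewhat lenient):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
condition_scores = rng.normal(loc=100, scale=15, size=50)

# Shapiro-Wilk test of normality
w_stat, p_shapiro = stats.shapiro(condition_scores)

# Kolmogorov-Smirnov test against a standard normal
# (standardise the scores first)
z = (condition_scores - condition_scores.mean()) / condition_scores.std(ddof=1)
d_stat, p_ks = stats.kstest(z, "norm")

print(round(p_shapiro, 3), round(p_ks, 3))
```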
What are outliers?
*Of more potential impact on our statistics than the shape of our distribution per se is the problem of outliers.
*An outlier is an extreme score at one or both ends of our distribution
Why are outliers a problem?
▪ Can inflate our measures of variance
▪ Can also affect our mean (an outlier drags the mean towards itself)
▪ Potentially caused by spurious (incorrect) data
– e.g., eyeblink on the part of the participant or an error in the procedure
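A toy illustration of both problems (hypothetical scores; the 25 plays the role of a spurious outlier):

```python
import numpy as np

scores = np.array([4.0, 5.0, 5.0, 6.0, 5.0, 4.0, 6.0, 5.0])
with_outlier = np.append(scores, 25.0)  # e.g. one spurious trial

print(scores.mean(), scores.var(ddof=1))              # 5.0, ~0.57
print(with_outlier.mean(), with_outlier.var(ddof=1))  # mean dragged up, variance inflated
```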
In an ANOVA, do all your sample sizes need to be the same?
No, this is a myth.
You could have 50 in one group and 150 in another if you wanted.
How do you deal with outliers?
- First decide why it is an outlier… it may reflect an out-of-range value or a participant who is not part of the targeted population
Some solutions to problems of outliers:
* Remove them from the data (common, but a potentially problematic solution)
- Transform data to remove the influence of outliers (make the outliers smaller)
- Use a non-parametric test (e.g., based on ranked data)
- Run the analysis with and without the outliers and see if they affect your results. If not, report this and report the ANOVA as usual
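The last option above can be sketched like this (made-up scores; here the extreme score inflates the within-group variance enough to mask an otherwise clear group difference):

```python
from scipy import stats

group_a = [4.0, 5.0, 5.0, 6.0, 5.0, 4.0]
group_b = [7.0, 8.0, 7.0, 9.0, 8.0, 30.0]  # 30 looks like spurious data

# Run the one-way ANOVA with and without the suspect score
f_with, p_with = stats.f_oneway(group_a, group_b)
f_without, p_without = stats.f_oneway(group_a, group_b[:-1])

print(round(p_with, 4), round(p_without, 4))
```

If the two runs disagree, as here, the outlier is driving the result and needs to be dealt with before you report the ANOVA.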
What is homogeneity of variance?
-The assumption stating that all comparison groups have the same variance
- You need to check that the variances within your distributions are approximately equal so that it is safe to pool them
- A rule of thumb is that the largest variance should be no more than 4 times the smallest variance
- Breaches of homogeneity can affect the type I error rate
- Breaches of the homogeneity assumption are compounded by very unequal group sizes.
- There are a number of tests for breaches of homogeneity - e.g. Levene’s Test provided in SPSS
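Both the 4× rule of thumb and Levene's test can be checked with scipy (simulated groups; the third group is given a deliberately larger spread):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
group_a = rng.normal(loc=10, scale=2, size=40)  # variance ~4
group_b = rng.normal(loc=10, scale=2, size=40)
group_c = rng.normal(loc=10, scale=6, size=40)  # variance ~36, ~9x larger

# Rule of thumb: largest variance vs smallest
variances = [g.var(ddof=1) for g in (group_a, group_b, group_c)]
ratio = max(variances) / min(variances)

# Levene's test: a significant p (< .05) flags a breach of homogeneity
stat, p = stats.levene(group_a, group_b, group_c)
print(round(ratio, 1), round(p, 4))
```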
What does a significant Levene's test (p < .05) mean?
means variance in each group is significantly different and the homogeneity assumption is breached
*you want it to be non-significant so the assumption is met
How do you deal with breaches of the homogeneity assumption?
- If you have equal group sizes and breach is minor (i.e., largest group variance isn’t more than 4 × smallest), you can run an ANOVA as it is robust to minor breaches of homogeneity
- Run the ANOVA but use a lower alpha level to control for the possible impact on the type I error rate
- Use an alternate statistical test which does not have the homogeneity assumption (e.g., nonparametric test)
- Transform the data to remove the heterogeneity and run the ANOVA on the transformed data
- Perform a robust test (e.g., Welch test or Brown-Forsythe test)
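For two groups, Welch's version of the t-test shows the idea behind the robust option (scipy offers Welch's t-test via equal_var=False; a one-way Welch ANOVA is not in scipy itself, so this two-group example is only a sketch of the principle):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
group_a = rng.normal(loc=10, scale=1, size=30)
group_b = rng.normal(loc=12, scale=5, size=30)  # much larger variance

# Student's t-test pools the variances (assumes homogeneity);
# Welch's test does not, so it is robust to this kind of breach
t_pooled, p_pooled = stats.ttest_ind(group_a, group_b)
t_welch, p_welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print(round(p_pooled, 4), round(p_welch, 4))
```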
How does lowering the alpha level help deal with breaches in homogeneity?
➢ Breaches of homogeneity in ANOVA cause overestimation of the true value of F.
➢ This means we make more type I errors.
➢ Lowering the alpha level (e.g., .025 rather than .05) reduces the type I error rate.
➢ Thus, the effect of the breach of homogeneity can be reduced by using a lower alpha level.
What is a distribution-free test?
If you are ever stuck with a problem of normality or a problem of homogeneity of variance, look around for a distribution-free alternative (also called nonparametric tests or distribution-free tests)
These tests have less restrictive assumptions about the distributions used
How do distribution-free tests / nonparametric tests work?
- Most of these work by converting each score to a rank.
- Ranks are spread out evenly so the shape of the distribution will always be rectangular (= no normality assumption and no problem with outliers).
- There are specific rank-order tests for various hypothesis-testing situations.
- In rank order tests we are comparing ranks rather than scores.
- Some also compare medians rather than means when describing group differences.
*Converts a number (say a score) to a rank, so if you are the 2nd highest your number would be 2. Extreme scores therefore lose the ability to be very large or very small and drag the mean one way or the other.
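A sketch of the rank idea using the Kruskal-Wallis test (the rank-based counterpart of a one-way ANOVA), with made-up scores:

```python
from scipy import stats

group_a = [12, 14, 11, 15, 13]
group_b = [22, 25, 21, 24, 120]  # 120 is an extreme outlier

# Converted to ranks, 120 simply becomes the highest rank (10),
# so it cannot drag a mean or inflate a variance
ranks = stats.rankdata(group_a + group_b)

# Kruskal-Wallis compares the groups on these ranks
h_stat, p = stats.kruskal(group_a, group_b)
print(list(ranks), round(p, 4))
```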