Week 7 - Planned Comparisons and Post Hoc Tests Flashcards
Why does the F ratio not paint the whole picture?
The F ratio only tells us that there is a difference somewhere between the means. We need an analysis that helps determine where the difference(s) are
What are the two basic approaches to comparisons?
- A priori (or planned) comparisons
- Post hoc comparisons
What are a priori (or planned) comparisons?
- If we have a strong theoretical interest in certain groups and an evidence-based specific hypothesis regarding these groups, then we can test these differences up front
- Come up with these before you do your study
- Seek to compare only groups of interest
- There is no real need to do the overall ANOVA; we do it because of tradition. Hence, reports often start with the F test and progress to planned comparisons
- It is better to have an a priori hypothesis than to rely on post hoc comparisons
What are post hoc comparisons?
- If you cannot predict exactly which means will differ, you should do the overall ANOVA first to see if the IV has an effect, then run post hoc comparisons (post hoc = after the fact/ANOVA)
- These seek to compare all groups to each other to explore differences.
- Less refined – more exploratory.
What are the two types of a priori/planned comparisons?
Simple
Complex
What is a simple a priori comparison?
comparing one group to just one other group
What is a complex a priori comparison?
comparing a set of groups to another set of groups
* In SPSS we create complex comparisons by assigning weights to different groups
How do you conduct an a priori comparison (how do you weight it)?
Create 2 sets of weights
- 1 for the first set of means
- 1 for the second set of means
- Assign a weight of zero to any remaining groups
- Set 1 gets positive weights
- Set 2 gets negative weights
- They must sum to 0
A simple rule that always works: the weight for each group is equal to the number of groups in the other set
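As an illustration of the weighting rule and the contrast test, here is a minimal Python sketch (Python rather than SPSS, and the group data are made up; the contrast t-test with a pooled error term is the standard formula):

```python
import numpy as np
from scipy import stats

# Hypothetical scores for four groups
groups = [np.array([12., 14, 11, 15]),   # group 1
          np.array([13., 12, 14, 13]),   # group 2
          np.array([18., 20, 19, 17]),   # group 3
          np.array([22., 21, 23, 20])]   # group 4

# Complex contrast: set 1 = {group 1, group 2}, set 2 = {group 3, group 4}.
# Rule: each group's weight = number of groups in the OTHER set;
# set 1 gets positive weights, set 2 negative, so they sum to 0.
weights = np.array([2, 2, -2, -2])
assert weights.sum() == 0

means = np.array([g.mean() for g in groups])
ns = np.array([len(g) for g in groups])

# Pooled error term (MS within) from all groups, as in the overall ANOVA
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
df_within = ns.sum() - len(groups)
ms_within = ss_within / df_within

# Contrast value and its t-test:
# t = sum(w_i * mean_i) / sqrt(MS_within * sum(w_i^2 / n_i))
psi = (weights * means).sum()
se = np.sqrt(ms_within * ((weights ** 2) / ns).sum())
t = psi / se
p = 2 * stats.t.sf(abs(t), df_within)
print(f"t({df_within}) = {t:.2f}, p = {p:.4f}")
```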
What are the assumptions of a priori/planned comparisons?
- Planned comparisons are subject to the same assumptions as the overall ANOVA - particularly homogeneity of variance, since we use a pooled error term.
- Fortunately, when SPSS runs the t-tests for our contrasts it gives us the output for homogeneity assumed and homogeneity not assumed
- If homogeneity is not assumed, SPSS adjusts the df of our critical F to control for any inflation of Type I error
What are orthogonal contrasts?
- One particularly useful kind of contrast analysis is where each contrast tests something completely different from the other contrasts
Principle:
Once you have compared one group (e.g., A) with another (e.g., B), you don't compare them again.
Example
Groups 1,2,3,4
Contrast 1 = 1,2 vs 3,4
Contrast 2 = 1 vs 2
Contrast 3 = 3 vs 4
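To check orthogonality, multiply the weights of two contrasts element by element and sum; if the result is zero (assuming equal group sizes), the contrasts are orthogonal. A quick sketch of this check for the example above:

```python
import numpy as np

# Weights for the three contrasts over groups 1-4
c1 = np.array([1, 1, -1, -1])   # contrast 1: groups 1,2 vs 3,4
c2 = np.array([1, -1, 0, 0])    # contrast 2: group 1 vs 2
c3 = np.array([0, 0, 1, -1])    # contrast 3: group 3 vs 4

# Each pair has a zero dot product, so the k-1 = 3 contrasts are orthogonal
for a, b in [(c1, c2), (c1, c3), (c2, c3)]:
    print(np.dot(a, b))   # prints 0 for every pair
```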
Cool things about orthogonal contrasts
- A set of k-1 orthogonal contrasts (where k is the number of groups) accounts for all of the differences between groups
- According to some authors, a set of k-1 planned contrasts can be performed without adjusting the Type I error rate
Post-Hoc comparisons
- Let’s say we had good reason to believe that sleep deprivation would impact performance but did not know at exactly what level of sleep deprivation this would occur. So, we had no specific hypothesis about what difference would emerge between which conditions.
- In this case, planned comparisons would not be appropriate
- Here you would perform the overall F analysis first
- If overall F is significant, we need to perform post-hoc tests to determine where the differences actually are
What do post hoc comparisons seek to compare?
Post-hoc tests seek to compare all possible combinations of means
* This will lead to many pair-wise comparisons
* e.g., With 4 groups, 6 comparisons
* 1v2, 1v3, 1v4, 2v3, 2v4, 3v4
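The count is the number of unordered pairs, k(k-1)/2; a quick sketch:

```python
from itertools import combinations

k = 4                                 # number of groups
pairs = list(combinations(range(1, k + 1), 2))
print(len(pairs))                     # k*(k-1)//2 = 6
print(pairs)                          # (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)
```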
How do post hoc comparisons increase the risk of Type I errors?
- So, as we know when we find a significant difference there is an alpha chance that we have made a Type I error.
- The more tests we conduct the greater the Type I error rate
What is the error rate per experiment (PE)?
The total number of Type I errors we are likely to make in conducting all the tests required in our experiment.
* The PE error rate <= alpha x number of tests
* <= means it could be as high as that value
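A small sketch of how this bound grows with the number of tests (the exact familywise rate for independent tests, 1 - (1 - alpha)^m, is added here for comparison and is not from the flashcards):

```python
alpha = 0.05

for m in [1, 3, 6, 10]:                     # number of tests conducted
    pe_bound = alpha * m                    # PE error rate upper bound
    familywise = 1 - (1 - alpha) ** m       # exact rate if tests are independent
    print(f"{m:2d} tests: PE <= {pe_bound:.2f}, familywise = {familywise:.3f}")
```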
How do you restore the Type I error rate back to .05 (5%) when conducting multiple tests?
So when we need to conduct several tests, what should we do about the rising Type I error rate?
* If many tests are required, then a Bonferroni-adjusted alpha level may be used
What is a Bonferroni adjustment?
- Divide alpha by the number of tests to be conducted (e.g., .05/2 = .025 if 2 tests are to be conducted).
- Assess each follow up test using this new level (i.e. .025)
- Maintains PE error at .05, but this will reduce the power of your comparisons a lot!
Remember as we decrease alpha (by making our test more conservative) we also decrease power (chances of detecting a true effect)
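A sketch of the adjustment, both by hand and via statsmodels' multipletests (the p-values are made up):

```python
from statsmodels.stats.multitest import multipletests

alpha = 0.05
p_values = [0.020, 0.030, 0.004]    # hypothetical p-values from 3 follow-up tests

# By hand: assess each test against alpha divided by the number of tests
adjusted_alpha = alpha / len(p_values)           # .05 / 3 ~ .0167
print([p < adjusted_alpha for p in p_values])    # [False, False, True]

# Equivalent via statsmodels, which scales the p-values up instead
reject, p_adjusted, _, _ = multipletests(p_values, alpha=alpha, method='bonferroni')
print(reject, p_adjusted)                        # same decisions
```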
What are alternatives to the Bonferroni adjustment (other ways of controlling the Type I error rate)?
- There are several statistical tests that systematically compare all means whilst controlling for Type I error
- LSD (least significant difference): actually no adjustment; you just ignore the problem (not recommended)
- Tukey's HSD (Honestly Significant Difference): popular as the best balance between control of the EW (experimentwise) error rate and power (i.e., Type I vs Type II error)
- Newman-Keuls: gives more power but less stringent control of the EW error rate
- Scheffé test: most stringent control of the EW error rate, as it controls for all possible simple and complex contrasts
- And many others you can find out about at your leisure
What is the best one of these tests to use?
Tukey’s test is very common and recommended.
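If you are working in Python rather than SPSS, SciPy provides a Tukey HSD implementation (scipy.stats.tukey_hsd, available in SciPy 1.8+); a minimal sketch with made-up data:

```python
from scipy.stats import tukey_hsd

# Hypothetical error scores for three sleep-deprivation conditions
g_0h = [12, 14, 11, 15, 13]     # 0 hours of deprivation
g_24h = [15, 17, 16, 14, 18]    # 24 hours
g_48h = [21, 23, 20, 22, 24]    # 48 hours

# Compares every pair of groups while controlling the experimentwise error rate
result = tukey_hsd(g_0h, g_24h, g_48h)
print(result)   # table of pairwise mean differences with p-values
```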
When do you use planned comparisons vs post-hoc tests (and how)?
- If your hypothesis predicts specific differences between means:
- Assess assumptions
- Perform ANOVA
- Consider what comparisons will test your specific hypotheses
- Perform planned comparisons needed to test these predictions
- If your hypothesis does not predict specific differences between means:
- Assess assumptions
- Perform ANOVA
- If ANOVA is significant then perform post-hoc tests
- If ANOVA is not significant then don’t do post-hoc tests
What is a meta analysis?
When a researcher finds many papers in the literature about a specific topic, they take the individual statistics from each paper, put them in a spreadsheet, aggregate these statistics, and then run a statistical test on the aggregated data
Effect size philosophy
A significant F simply tells us that there is a difference between means, i.e., that the IV has had some effect on the DV
- It does not tell us how big this difference is.
- It does not tell us how important this effect is.
- An F significant at .01 does not necessarily imply a bigger or more important effect than an F significant at .05.
- The significance of F depends on the sample size and the number of conditions, which determine the F comparison distribution
What does effect size tell us?
If I took the overall variability in my criterion variable (e.g., target accuracy), how much of that variability could I explain on the basis of how much sleep deprivation you've had?
Effect size summarizes the strength of the treatment effect:
- Eta squared (η²)
- Indicates the proportion of the total variability in the data accounted for by the effect of the IV.
What does η² tell us?
- This result says that __% of the variability in errors is due to the effect of manipulating whatever our IV is
For example, one could say that 65% of the variability in errors is due to the effect of manipulating sleep deprivation.
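As a worked sketch, η² = SS_between / SS_total; the data below are made up, so the printed proportion is illustrative (the 65% above is the lecture's example, not this output):

```python
import numpy as np

# Hypothetical error scores for three sleep-deprivation groups
groups = [np.array([2., 3, 2, 4]),
          np.array([5., 6, 5, 7]),
          np.array([9., 8, 10, 9])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# Eta squared: proportion of total variability accounted for by the IV
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((all_scores - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total
print(f"eta squared = {eta_squared:.2f}")
```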