Flashcards in 8. Planned Comparisons and Post Hoc Tests and Power and Effect Deck (87):
what does the ANOVA tell us?
that there is a difference somewhere between the means
how do we determine where the difference(s) are?
with a priori and Post Hoc comparisons
when do you decide on an a priori test?
before the experiment, to test a specific hypothesis
when are post hoc comparisons made?
after assessing the F ratio
when should a priori tests be used?
if we have a strong theoretical interest in certain groups and have evidence-based specific hypotheses regarding these groups, then we can test these differences using a priori tests
what sort of tests are a priori?
planned comparisons or t-tests
what do a priori tests seek to compare?
only groups of interest
when should post hoc comparisons be used?
if we cannot predict exactly which means will differ.
what should be done before doing a post hoc comparison?
the overall ANOVA to see if the independent variable has an effect
what does post hoc mean?
after the fact
what does post hoc comparisons seek to do?
compare all groups to each other to explore differences, thus comparing all possible combinations of means
what are the characteristics of a post hoc comparison?
less refined and less specific than planned comparisons, since all possible pairs of means are compared rather than only the groups of interest
what is an omnibus test?
the initial F ratio (the overall ANOVA)
what are planned comparisons also known as?
a priori comparisons (contrasts)
what is weighting our group means?
we assign weights, or contrast coefficients (c), to the group means (M) we wish to compare
what is the point of weighting our group means?
how we communicate with SPSS
how would we weight groups 1 and 2 when comparing them?
a weight (c_1) of 1 to the mean of group 1 (M_1)
a weight (c_2) of -1 to the mean of group 2 (M_2)
a weight of 0 to groups 3 and 4, as they are not in the analysis we are conducting
true or false: weights and contrasts are the same thing?
true: weights and contrast coefficients are the same thing
what must the sum of all coefficients be when weighting?
0 (zero)
why must the sum of all coefficients be 0?
because this is SPSS's way of knowing that everything is fair and balanced. Groups (or sets of groups) which are being compared in a hypothesis must have equal, but opposite coefficients / weights
i.e. one group would be 1 and the other -1
what happens to the weights of groups when we are lumping them together in a hypothesis?
they must be given equal coefficients of the same sign
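The weighting rules above can be sketched in a few lines of Python; the four groups and both hypotheses below are hypothetical illustrations:

```python
# Contrast coefficients (weights) for four groups, as described above.
# Both hypotheses are hypothetical examples.

# Hypothesis 1: compare group 1 vs group 2; groups 3 and 4 get 0
# because they are excluded from this comparison.
simple_contrast = [1, -1, 0, 0]

# Hypothesis 2: compare groups 1+2 lumped together against groups 3+4;
# lumped groups get equal coefficients of the same sign.
complex_contrast = [1, 1, -1, -1]

# The sum of each set of coefficients must be 0 (equal but opposite
# weights for the groups, or sets of groups, being compared).
for contrast in (simple_contrast, complex_contrast):
    assert sum(contrast) == 0, "coefficients must sum to 0"

print(sum(simple_contrast), sum(complex_contrast))  # 0 0
```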
what coefficient must the groups not being compared be assigned?
0, as they are excluded from the comparison
what is the equation to test the significance of contrasts?
F_contrast = MS_contrast / MS_within
what is used for the error term in the F test for a contrast?
MS within from our ANOVA
how do we calculate the MS_contrast?
in a similar way to SS_between; the df is always 1
why is the df always 1 for a comparison F?
because we are only comparing two means (or two groups of means).
df = number of groups - 1
thus df = 2-1 = 1
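A minimal sketch of the contrast F test, assuming equal group sizes and the standard formula SS_contrast = n(Σ cM)^2 / Σ c^2; the group means, n, and MS_within are made-up example numbers:

```python
# Sketch: significance test for a planned contrast (equal n per group).
# Means, n, and MS_within are hypothetical example values.
means = [10.0, 14.0, 12.0, 11.0]   # M_1..M_4
c = [1, -1, 0, 0]                  # contrast: group 1 vs group 2
n = 10                             # participants per group
ms_within = 8.0                    # error term from the overall ANOVA

psi = sum(ci * mi for ci, mi in zip(c, means))         # contrast value
ss_contrast = n * psi ** 2 / sum(ci ** 2 for ci in c)  # SS for the contrast
ms_contrast = ss_contrast / 1      # df_contrast is always 1, so MS = SS
f_contrast = ms_contrast / ms_within

print(f_contrast)  # 10.0
```

Compare f_contrast against F_critical with df = (1, df_within) to assess significance.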
what is the MS comparison the same as?
the SS comparison, because the df is 1 (MS = SS / 1)
when is the difference between two means not significant?
F_observed ≤ F_critical
when is the difference between two means significant?
F_observed > F_critical
what are the assumptions for planned comparisons?
the same as the overall ANOVA:
all samples are independent
normality of the distribution
homogeneity of variance
How does SPSS help us overcome the homogeneity of variance assumption with planned comparisons?
when it runs the t-test for our contrasts it gives us the output for homogeneity assumed and homogeneity not assumed
If homogeneity is not assumed SPSS adjusts the df of our F critical to control for any inflation of type 1 error
what happens with error when we find a significant difference?
there is a chance we have a type 1 error
the more tests we conduct..?
the greater the type 1 error rate
what is the error rate per comparison (PC)?
the type 1 error associated with each individual test we conduct
what is the symbol of the PC error rate?
α (alpha)
what is the error rate per experiment (PE)?
it is the total number of type 1 errors we are likely to make in conducting all the tests required in our experiment
what is the equation for the per experiment (PE) error rate?
PE = α × number of tests
so when we need to conduct a number of tests, what should we do about the rising type 1 error rate?
a Bonferroni adjusted α level may be used
how does a Bonferroni adjusted α level work?
divide α by the number of tests to be conducted (e.g. .05 / 2 = .025 if two tests are being conducted), then assess the follow-up statistic using the new α level
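The per experiment error rate and the Bonferroni adjustment amount to one line of arithmetic each; a sketch using the deck's own example of α = .05 with two tests:

```python
# Per experiment type 1 error rate and Bonferroni adjusted alpha.
alpha = 0.05
n_tests = 2

per_experiment_rate = alpha * n_tests   # PE = alpha x number of tests
bonferroni_alpha = alpha / n_tests      # adjusted alpha for each test

print(per_experiment_rate, bonferroni_alpha)  # 0.1 0.025
```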
what happens when the overall F is not significant before performing a post hoc comparison?
STOP there and go back to the drawing board
what are the statistical tests that compare all means whilst controlling for type 1 error? (don't need to remember all of these)
LSD - least significant difference
Tukey's HSD - Honestly Significant Difference
Bonferroni adjusted comparisons
what is the most commonly recommended statistical test that compares all means?
Tukey's, because it is a good balance between power and error
what does a significant F not tell us?
how big the effect the IV has on the DV is
does not tell us how important this effect is
what is the significance of F dependent on?
the sample size and the number of conditions which determines the F comparison distribution
what is the statistic that summarises the strength of the treatment effect (IV)?
eta squared (η^2)
what does eta squared indicate?
the proportion of the total variability in the data accounted for by the effect of the IV
what does SS_Between represent?
what we can explain (due to IV)
what does SS_Total represent?
the total variability in the data (what we can and cannot explain)
what does the ratio of SS_between / SS_total give?
the proportion of the total variance seen in the DV that the IV accounts for
what is the equation of eta squared?
η^2 = (t^2) / (t^2 + df)
this is equal to SS_between / SS_total
what is the full equation of eta squared?
η^2 = [ (S^2_between)(df_between) ] / [ (S^2_between)(df_between) + (S^2_within)(df_within) ]
which is also SS between / SS total
if a research article does not report effect size and we wish to calculate it, what is the equation that can be used?
η^2 = [ (F)(df_between) ] / [ (F)(df_between) + df_within ]
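A sketch of that calculation; the F ratio and degrees of freedom are hypothetical numbers as they might appear in an article:

```python
# Estimate eta squared from a reported F ratio and its degrees of freedom.
# F and the dfs are hypothetical values from an imagined article.
f_ratio = 5.0
df_between = 3
df_within = 36

eta_sq = (f_ratio * df_between) / (f_ratio * df_between + df_within)
print(round(eta_sq, 3))  # 0.294, a large effect on Cohen's (1977) scale
```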
what does eta squared provide?
the percentage of variability in scores that is due to the effect of manipulating the IV
what are things to be aware of when measuring effect size for use in ANOVA?
It is a descriptive statistic, not an inferential statistic, so it is not the best indicator of the effect size in the population
It tends to be an overestimate of the effect size in the population
what does eta squared range from?
0 to 1
what is the scale that Cohen (1977) propose for effect size?
.01 = small effect
.06 = medium effect
greater than .14 = large effect
what is the statistic that provides the effect size for a comparison between two means?
Cohen's d
what does Cohen's d tell us?
how many SDs apart we estimate the two population means we're comparing to be, e.g. we compare the mean of the control group to each condition, or the largest and smallest dose
what is the formula for Cohen's d?
Cohen's d = (µ1 - µ2) / population standard deviation
where is the SD obtained from when calculating Cohen's d?
the descriptive statistics for the group
what is a quick way to calculate the estimated population SD for Cohen's d?
square root of the error term (MS_within)
what is the scale for cohen's d?
Small effect = 0.20,
medium = 0.50,
large = 0.80
what is the overall mathematical formula for Cohen's d?
Cohen's d = (M1 - M2) / √MS_within
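A sketch of that formula; the two group means and MS_within are hypothetical example values:

```python
import math

# Cohen's d for two means, estimating the population SD as sqrt(MS_within).
# The means and MS_within are hypothetical example numbers.
m1, m2 = 14.0, 10.0
ms_within = 8.0

d = (m1 - m2) / math.sqrt(ms_within)
print(round(d, 2))  # 1.41, a large effect (>= 0.80) on Cohen's scale
```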
where do we obtain MS_within from?
the overall ANOVA (it is the error term)
what may a theoretically important IV account for?
only a small proportion of the variability in the data
what may a theoretically unimportant IV account for?
a large proportion of the variability in the data
when is eta squared used?
when reporting an F ratio
when is Cohen's d used?
when reporting t-tests or post hoc comparisons
what is the type 1 error rate when using α = .05?
5% (1 in 20)
why is replication of studies very important?
in any experiment, with any decision, we don't know whether we have made a correct decision or an error
what does shifting the α level from .05 to .01 do?
reduces type 1 error but increases type 2 error
what is the most common way of achieving balance between these errors?
through estimating the power of the experiment
what is the definition of power?
probability of finding a significant effect when one exists in the population
how is power conceptualised?
Power = 1- β
what is sensitivity?
the ability of an experiment to detect a treatment effect when one actually exists
what is power?
it is a quantitative index of sensitivity which tells us the probability that our experiment will detect this effect
what does Keppel (1992) argue that the power should be?
greater than .8 to ensure an experiment can pick up a moderate effect
what is ensuring adequate power an issue of?
experimental design
how can power be increased?
o Raising the α level (at the cost of more Type I errors; raising α decreases β and so increases power)
o Reducing error variance (good design and measures)
o Increasing the sample size
o Increasing the number of conditions or groups
o Increasing the treatment effect size (good manipulations)
why not use the largest n possible?
o Not always cheap or easy to use large samples,
o We need to know what is the acceptable minimum sample size to pick up a specific effect.
what are the two main situations that we are concerned about power in?
o When we do not find a significant effect but there is evidence that we may have made a Type II error.
o When we are planning a new experiment and wish to ensure that we have adequate power to pick up the effect of our IV.
what should we do before running an experiment to give it adequate power (greater than .8)?
we should determine the sample size that will allow this
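One way to check whether a candidate sample size gives adequate power is a Monte Carlo sketch, assuming numpy and scipy are available; the population means, SD, and per-group n below are hypothetical planning values:

```python
import numpy as np
from scipy import stats

# Monte Carlo estimate of power for a one-way ANOVA.
# Population means, SD, and n per group are hypothetical planning values.
rng = np.random.default_rng(0)
pop_means = [0.0, 0.0, 0.5]   # assumed treatment effect in the population
sd = 1.0
n_per_group = 30              # candidate sample size per group
alpha = 0.05
n_sims = 2000

significant = 0
for _ in range(n_sims):
    groups = [rng.normal(m, sd, n_per_group) for m in pop_means]
    _, p = stats.f_oneway(*groups)   # overall ANOVA on simulated data
    if p < alpha:
        significant += 1

power = significant / n_sims  # probability of detecting the assumed effect
print(power)
```

If the estimate falls below .8, a larger n (or a stronger manipulation) is needed.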
how can we estimate the required sample size?
by estimating the magnitude of the treatment effect
how can we find the magnitude of the treatment effect?
o past research
o a pilot study
o an estimate of the minimum difference between means that you consider relevant or important (often used in clinical experiments) .
what does a tiny effect size indicate?
that there is no evidence that the IV has an effect
what does it mean when effect size is reasonable but power is low?
that there may be a power issue, and we could try to rerun the experiment with adequate power