Midterms Flashcards
(20 cards)
What are the three approaches to an ANOVA
- Regression
- Comparison of ANOVA models
- Partitioning sum of squares
What is a one-way between-subjects design?
Groups are independent from one another and vary on a single factor
What are the assumptions of a one-way between-subjects ANOVA?
- homogeneity of variance: as long as the largest group variance is less than 4-5 times the smallest group variance, and the samples are roughly equal in size and not too small, the test should be fine; if the variances are too different, adjust the F cutoff score so that it is more stringent (see the R sketch after this card)
- normality: the test is robust unless groups are heavily skewed in opposite directions, or skewed with a small or unequal sample size
- independence of observations between and within groups
- When these assumptions hold, ANOVA is the most powerful omnibus test
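A minimal R sketch of the 4-5x rule of thumb, using made-up scores and group labels (all names are hypothetical):

# Rule-of-thumb check of homogeneity of variance on made-up data.
scores <- c(4, 5, 6, 7, 3,  5, 8, 9, 7, 6,  2, 4, 3, 5, 4)
group  <- factor(rep(c("A", "B", "C"), each = 5))
group_vars <- tapply(scores, group, var)   # variance of each group
max(group_vars) / min(group_vars)          # should stay under roughly 4-5 when the n's are equal and not too small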
Cookbook for hypothesis testing
- Establish null and alternative hypothesis
- Collect data
- Choose test statistic F with known distribution:
* F = between-group variance / within-group variance
(if the null is true, the ratio should be about 1)
* F is always positive and the distribution is positively skewed
* F doesn’t indicate which means differ
* Distribution varies with dfbetween and dfwithin
- Calculate the test statistic from the observed data, F as fo:
* But first you must check normality, homogeneity, independence
* When calculating F: (SSbetween/dfbetween) / (SSwithin/dfwithin) = MSbetween/MSwithin ~ F(dfbetween, dfwithin)
SStotal = SSbetween + SSwithin
* SStotal = sum of (xi - grand mean)^2
* SSbetween = sum of Nj (mean of group j - grand mean)^2
* SSwithin = sum of (xi - mean of group j)^2
* dfbetween = # of groups - 1
* dfwithin = total N - # of groups
- Calculate the probability p = P(F >= fo):
* critical score = qf(0.95, dfbetween, dfwithin)
* p = 1 - pf(fo, dfbetween, dfwithin)
- If p < alpha, reject the null (a worked R sketch follows this card)
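A minimal R sketch of the whole cookbook on made-up data (the scores, groups, and object names are all hypothetical); it computes the sums of squares by hand and then checks the result against the built-in aov() function.

# One-way between-subjects ANOVA computed by hand on made-up data.
scores <- c(4, 5, 6, 7, 3,  5, 8, 9, 7, 6,  2, 4, 3, 5, 4)
group  <- factor(rep(c("A", "B", "C"), each = 5))

grand_mean  <- mean(scores)
group_means <- tapply(scores, group, mean)
n_j         <- tapply(scores, group, length)

SS_between <- sum(n_j * (group_means - grand_mean)^2)
SS_within  <- sum((scores - group_means[group])^2)
SS_total   <- sum((scores - grand_mean)^2)            # equals SS_between + SS_within

df_between <- nlevels(group) - 1
df_within  <- length(scores) - nlevels(group)

F_obs <- (SS_between / df_between) / (SS_within / df_within)   # MS_between / MS_within
qf(0.95, df_between, df_within)        # critical F at alpha = .05
1 - pf(F_obs, df_between, df_within)   # p-value = P(F >= F_obs)

summary(aov(scores ~ group))           # same F and p from the built-in function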
What are the advantages and disadvantages of partitioning of sums / regression / ANOVA model comparisons?
Partitioning of sums:
* Advantages: computational simplicity
* Disadvantages: not easily transferable to advanced models; new formulas have to be derived for each design, so conceptual integration is poor; problematic for repeated-measures and unequal-n designs
Regression:
* Advantages: can use continuous and categorical variables as predictors
* Disadvantages: difficulty in dummy coding factorial designs, hard to create contrasts, inflexible choice of error terms, does not generalize easily to repeated-measures designs
Model comparison:
* Advantages: can be generalized to advanced models, well suited for repeated measure designs, same basic formula for all designs
* Disadvantages: not as familiar
What are the components of the GLM?
observed value = baseline + sum of effects of allowed-for factors + sum of effects of all other factors (error)
What’s the df for model comparisons?
# of independent observations - # of independent parameters
How do you calculate F in model comparisons?
F = ((Er - Ef) / (dfR - dfF)) / (Ef / dfF)
cutoff = qf(0.95, dfR - dfF, dfF)
p = 1 - pf(F, dfR - dfF, dfF) (see the R sketch after this card)
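A minimal R sketch of the model-comparison route on made-up data (all names hypothetical): fit the restricted (intercept-only) and full (separate group means) models with lm(), build F from their errors, and check against anova().

# Model comparison on made-up data: restricted vs. full model.
scores <- c(4, 5, 6, 7, 3,  5, 8, 9, 7, 6,  2, 4, 3, 5, 4)
group  <- factor(rep(c("A", "B", "C"), each = 5))

restricted <- lm(scores ~ 1)       # null model: one grand mean for everyone
full       <- lm(scores ~ group)   # full model: a separate mean per group

E_r  <- sum(resid(restricted)^2)   # error of the restricted model
E_f  <- sum(resid(full)^2)         # error of the full model (= SSwithin)
df_r <- df.residual(restricted)    # N - 1 parameter
df_f <- df.residual(full)          # N - (number of groups) parameters

# E_r - E_f equals SSbetween; E_f / df_f is the weighted average of the
# within-group variances (MSwithin), i.e. the denominator of F.
F_obs <- ((E_r - E_f) / (df_r - df_f)) / (E_f / df_f)
qf(0.95, df_r - df_f, df_f)        # critical value
1 - pf(F_obs, df_r - df_f, df_f)   # p-value

anova(restricted, full)            # same F and p from the built-in comparison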
In model comparison F what does the denominator represent?
weighted average of the within-group variances
What determines the adequacy of a model?
How much must error increase before we consider the restricted model (the null) significantly worse than the full model: F = ((Er - Ef)/(dfR - dfF)) / (Ef/dfF)
What is Er - Ef equal to?
SSbetween
Why is it that you can get significant results with very small effect size if you have a large n?
Large n will lead to very small standard error estimates (denominator) even if the difference between the means (numerator) is small
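A minimal R simulation sketch of this point (the populations and sizes are made up): the two population means differ by a trivially small amount, but with n = 10,000 per group the test usually comes out significant anyway.

# Tiny effect, huge n: the standard error shrinks, so p-values get small.
set.seed(1)
n  <- 10000
g1 <- rnorm(n, mean = 0.00, sd = 1)
g2 <- rnorm(n, mean = 0.05, sd = 1)   # standardized difference of only about 0.05
scores <- c(g1, g2)
group  <- factor(rep(c("A", "B"), each = n))
summary(aov(scores ~ group))          # typically significant despite the trivial effect size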
What are ways to measure effect size?
mean differences; estimated effect parameters; standardized difference between means; correlational measures
What are the standardized differences between means?
Cohen's d and Cohen's f
They can be reported with confidence intervals (CIs)
Why is omega squared smaller than R^2?
R^2 tends to overestimate the strength of the effect, so omega squared corrects for it by making the numerator smaller and the denominator bigger
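A minimal R sketch of the contrast on made-up data (all names hypothetical): compute R^2 (eta squared) from the ANOVA table, then apply the omega-squared correction.

# R^2 (eta squared) vs. omega squared from a one-way ANOVA table.
scores <- c(4, 5, 6, 7, 3,  5, 8, 9, 7, 6,  2, 4, 3, 5, 4)
group  <- factor(rep(c("A", "B", "C"), each = 5))

tab <- summary(aov(scores ~ group))[[1]]   # row 1 = between (group), row 2 = within (Residuals)
SS_between <- tab[1, "Sum Sq"]
SS_within  <- tab[2, "Sum Sq"]
df_between <- tab[1, "Df"]
MS_within  <- tab[2, "Mean Sq"]
SS_total   <- SS_between + SS_within

R2     <- SS_between / SS_total                                           # tends to overestimate
omega2 <- (SS_between - df_between * MS_within) / (SS_total + MS_within)  # smaller numerator, bigger denominator
c(R2 = R2, omega2 = omega2)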
When homogeneity of variance is violated and the group with the smaller variance has the larger n, what happens to the F value and the error rate?
The F value becomes inflated: the pooled estimate of the population variance is weighted toward the small variance, so the denominator of F becomes too small, which increases your chance of a Type I error.
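A minimal R simulation sketch of that scenario (made-up groups; both population means are equal, so every rejection is a Type I error): the small-variance group gets the large n, and the empirical Type I error rate climbs above the nominal .05.

# Unequal n + unequal variance: the small-variance group has the large n.
set.seed(2)
type1 <- replicate(2000, {
  a <- rnorm(80, mean = 0, sd = 1)   # n = 80, variance 1
  b <- rnorm(20, mean = 0, sd = 3)   # n = 20, variance 9
  scores <- c(a, b)
  group  <- factor(rep(c("A", "B"), times = c(80, 20)))
  summary(aov(scores ~ group))[[1]][1, "Pr(>F)"] < 0.05   # TRUE = Type I error
})
mean(type1)   # empirical Type I error rate, typically well above .05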
Ways to check normality and homogeneity of variance
Normality: Shapiro-Wilk test, K-S test, skewness, kurtosis
Homogeneity of variance: Levene's test, O'Brien's test
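A minimal R sketch of some of these checks on made-up data (names hypothetical); shapiro.test() and ks.test() are base R, while Levene's test assumes the car package is installed (O'Brien's test has no base-R function).

# Normality and homogeneity checks on made-up data.
scores <- c(4, 5, 6, 7, 3,  5, 8, 9, 7, 6,  2, 4, 3, 5, 4)
group  <- factor(rep(c("A", "B", "C"), each = 5))
res <- resid(aov(scores ~ group))

shapiro.test(res)                          # Shapiro-Wilk test of normality on the residuals
ks.test(as.numeric(scale(res)), "pnorm")   # K-S test against a standard normal (warns about ties)
if (requireNamespace("car", quietly = TRUE)) car::leveneTest(scores ~ group)   # Levene's test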
Alternatives to consider if ANOVA assumptions are violated
Transform the data, choose another analytical method that is more robust to assumption violations, use the median instead of the mean, or remove outliers
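A minimal R sketch of two of these alternatives on made-up, positively skewed data (names hypothetical): a log transform followed by the usual ANOVA, and a rank-based test that is more robust to the violations.

# Two fallbacks when ANOVA assumptions look shaky.
scores <- c(1, 2, 2, 3, 15,  2, 3, 4, 5, 20,  1, 1, 2, 3, 12)   # positively skewed
group  <- factor(rep(c("A", "B", "C"), each = 5))

summary(aov(log(scores) ~ group))   # transform the data, then run the usual ANOVA
kruskal.test(scores ~ group)        # rank-based alternative (Kruskal-Wallis)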
What are the four building blocks of an experiment?
UTOS
units, treatments, observations/measures, settings
List the four types of validity
- Statistical conclusion validity: validity of inferences about correlations/covariation
- Construct validity: validity of inferences about the higher-order constructs
- Internal validity: validity of inferences about causality between A and B
- External validity: validity of inferences about whether the cause-and-effect relationship holds over variation in context