Lecture 10 Flashcards
(17 cards)
omnibus effect
whether a significant difference exists somewhere across the dataset as a whole (the overall test does not say which specific groups differ)
does random noise exist between columns and within columns?
Yes: whatever "random noise" exists between columns also exists across rows (i.e., within columns),
so in a completely random data set the between-to-within ratio is expected to be around 1
what happens if we increase all the data in one column of a completely random data set by 2?
the variance between columns is now increased, whereas the variance within columns is unchanged;
the values are no longer completely random
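The two cards above can be sketched in Python (made-up random data; the seed and array shape are my own assumptions): the between-to-within ratio hovers near 1 for pure noise, and shifting one column by a constant raises the between-column mean square while leaving the within-column mean square untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(30, 3))  # 30 rows x 3 columns of pure noise

def ms_between_within(d):
    """Mean square between columns and mean square within columns."""
    n_rows, n_cols = d.shape
    grand = d.mean()
    ms_between = n_rows * ((d.mean(axis=0) - grand) ** 2).sum() / (n_cols - 1)
    ms_within = ((d - d.mean(axis=0)) ** 2).sum() / (d.size - n_cols)
    return ms_between, ms_within

b0, w0 = ms_between_within(data)
print(b0 / w0)  # ratio hovers around 1 for random data

shifted = data.copy()
shifted[:, 0] += 2  # add a constant to every value in one column
b1, w1 = ms_between_within(shifted)
print(b1 / w1)  # between-column MS grows; within-column MS is unchanged
```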
Between column variability
how the columns differ from one another
analysis of variance
ANOVA
statistical test in which two or more groups are compared by taking the ratio of
between-group variance to within-group variance
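A quick sketch of this definition using SciPy's one-way ANOVA on three hypothetical groups (the scores are invented for illustration):

```python
from scipy import stats

# Three made-up groups with similar spread but different means
group_a = [4, 5, 6, 5, 4]
group_b = [6, 7, 8, 7, 6]
group_c = [9, 10, 11, 10, 9]

# f_oneway returns the F statistic (between/within variance ratio) and p value
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f_stat, p_value)
```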
Within column variability
how scores within each column vary from one another
ANOVA ratios if H0 or H1
If the null is true, this ratio is expected to be = 1. If the null is false, this ratio is expected to be > 1
When the null hypothesis is false, the existence of an effect will shift at least one of the group
means; this changes the between-group variance, but the within-group variance is unchanged as long as the effect is a constant shift
ANOVA s² if H0 or H1
when the null is true: s²-between and s²-within are expected to be equal
when the null is false: s²-between > s²-within
F ratio
the ratio of between-group variance to within-group variance.
The F distribution is derived from the t distribution: with two groups, F is the t distribution squared.
The mean of F is approximately 1, and the mean of t is 0
ANOVA assumptions
- Data is normally distributed for all conditions
- There is homogeneity of variance
- There is independence of observations
winsorizing
replacing extreme data points with less extreme values from the dataset, typically using percentiles
t vs f
t and F are functionally equivalent. If a dataset with two groups is analyzed as either a t test or
an ANOVA, the obtained F value will equal t², and the critical F value will equal the critical t squared
- but ANOVA can handle more than 2 groups
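The F = t² relationship can be verified directly on two made-up groups (data invented for illustration):

```python
from scipy import stats

g1 = [2.1, 3.4, 2.8, 3.0, 2.5]
g2 = [3.9, 4.2, 3.6, 4.5, 4.0]

# Independent-samples t test (pooled variance) vs one-way ANOVA
t_stat, _ = stats.ttest_ind(g1, g2)
f_stat, _ = stats.f_oneway(g1, g2)

# With exactly two groups, F equals t squared
print(t_stat ** 2, f_stat)
```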
source table
breaks the model up into its individual source components;
here, the between and within s² (described as mean square, or MS)
are also split into their SS and df components
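A sketch of building a source table by hand, using the same kind of hypothetical three-group data (group values are my own):

```python
import numpy as np

groups = [np.array([4, 5, 6, 5, 4]),
          np.array([6, 7, 8, 7, 6]),
          np.array([9, 10, 11, 10, 9])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, n_total = len(groups), all_scores.size

# SS components: between uses group means vs the grand mean,
# within uses each score vs its own group mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between, df_within = k - 1, n_total - k
ms_between, ms_within = ss_between / df_between, ss_within / df_within
f_ratio = ms_between / ms_within

print("Source    SS      df  MS")
print(f"Between  {ss_between:6.2f}   {df_between}  {ms_between:6.2f}")
print(f"Within   {ss_within:6.2f}  {df_within}  {ms_within:6.2f}")
print(f"F = {f_ratio:.2f}")
```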
anova and multiple regression relationship
ANOVA is to t what multiple regression is to Pearson’s r.
partial correlation
completely controls for a third variable by removing its relationship with both variables of interest
semi-partial correlation
controls for the third variable by removing its relationship with only one of the variables
things to consider when having more than one predictor variable
- these predictor variables may be related to one another. If they are correlated with one
another, and thus not independent, it can affect the overall regression model. In this case, partial
correlation and semi-partial correlation can be used to piece apart the influence of the overlap
- it is important to ensure that all X variables are linearly related to the criterion (outcome) variable, as
well as linearly related to one another
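The partial and semi-partial ideas above can be sketched from pairwise Pearson correlations (the simulated variables and the textbook formulas below are my own illustration, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
x2 = rng.normal(size=100)
x1 = x2 + rng.normal(size=100)      # predictors correlated with each other
y = x1 + 0.5 * x2 + rng.normal(size=100)

r = np.corrcoef([y, x1, x2])
r_y1, r_y2, r_12 = r[0, 1], r[0, 2], r[1, 2]

# partial: remove x2's relationship from BOTH y and x1
partial = (r_y1 - r_y2 * r_12) / np.sqrt((1 - r_y2**2) * (1 - r_12**2))
# semi-partial: remove x2's relationship from x1 ONLY
semipartial = (r_y1 - r_y2 * r_12) / np.sqrt(1 - r_12**2)

print(partial, semipartial)
```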