Flashcards in final exam Deck (109):
The matched-pairs (dependent samples) t test is the most powerful analysis for a two-sample design if in fact the samples are correlated
the population variances of the repeated measurements are equal; the population correlations among all pairs of measures are equal. Violation of the assumption of sphericity is serious: it inflates the Type I error rate
The results of this table show that the hypothesis of sphericity was not rejected (p = .073 > .05). Therefore, we can conclude that the sphericity assumption was met. If the assumption of sphericity is not met (e.g., p < .05), the F test must be corrected
Other assumptions of repeated measures
Normality: repeated measures designs (within-subjects ANOVA) assume that the scores in all conditions are normally distributed. Independence: it is NOT assumed that the scores of a given subject are independent of each other, since the whole point of the analysis is that they are dependent
A general purpose test for use with discrete/nominal variables Focus on the number of different categories Categories have no order relation (larger/smaller) to each other (e.g., male/female; university major)
2 chi square tests
Two kinds of chi-square tests: (1) Distribution shape tests: the goodness-of-fit test or one-way classification test, and the homogeneity test. (2) Independence tests, or contingency-table tests (a×b tables); these are often incorrectly called "association tests" (they are really another kind of goodness-of-fit test) or "two-way classification" tests. Both kinds of tests compare the observed frequency of categories with an expected frequency of categories. The expected frequencies are usually derived from the null hypothesis
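As a minimal sketch of both kinds of tests (all counts here are made up), using scipy:

```python
# Both kinds of chi-square tests with scipy; all counts are hypothetical.
import numpy as np
from scipy import stats

# 1) Goodness-of-fit (one-way classification): observed counts in 6 categories
#    against expected counts derived from a uniform null hypothesis.
observed = np.array([30, 14, 34, 45, 57, 20])
expected = np.full(6, observed.sum() / 6)
chi2_gof, p_gof = stats.chisquare(observed, f_exp=expected)

# 2) Independence (contingency-table) test on an a*b table; expected
#    frequencies are derived from the row and column totals.
table = np.array([[10, 20],
                  [30, 40]])
chi2_ind, p_ind, dof, exp = stats.chi2_contingency(table)
print(chi2_gof, p_gof)
print(chi2_ind, p_ind, dof)
```

Both calls return the χ² statistic and its p value; `chi2_contingency` also returns the degrees of freedom and the table of expected frequencies.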
d-family: based on one or more measures of the differences between groups or levels of the independent variable. r-family: some sort of correlation coefficient between the independent and dependent variables
**As degrees of freedom increase, a smaller V is interpreted as a larger effect.
Effect size, R family measure
Not meaningful here (phi); use a d-family measure instead, such as the odds ratio
Assumptions mixed design
Normality for the between‐subjects IVs Homogeneity of Variance
Independence… So we need to test for sphericity
When and why do we use ANCOVA? To test for differences between group means when we know that an extraneous variable affects the outcome variable; it is used to control known extraneous and confounding variables. Advantages: (1) Reduces error variance: by explaining some of the unexplained variance (SS error), the error variance in the model can be reduced. (2) Greater experimental control: by controlling known confounds, we gain greater insight into the effect of the predictor variable(s)
In the study, the math aptitude test score is the covariate
The ANCOVA assumes a linear relationship between the covariate and the dependent variable and there is no interaction between the covariate and treatments
A one‐way ANCOVA was conducted. The independent variable, teaching methods, included three levels: Method A, Method B, and Method C. The dependent variable was Math Achievement score and the covariate was the math aptitude test score
When do we use nonparametric procedures?
When normality cannot be assumed When data cannot be transformed to normality When methods based on other non‐normal distributions are not available or appropriate When there is not sufficient sample size to assess the form of the distribution
Generally assumes a random sample from a normal distribution
Generally measured on interval or ratio scales
Often requires large sample sizes (e.g., n >30) to appeal to normality
Few assumptions about the population distribution
May be measured on categorical, ordinal, interval, or ratio scales
Can be small sample
A statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample, most often with the purpose of deriving robust estimates of standard errors and confidence intervals of a population parameter like a mean
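A minimal sketch of the bootstrap idea, using simulated data and numpy only:

```python
# Bootstrap the sampling distribution of the mean: resample the original
# sample with replacement many times, then read off the standard error and
# a 95% percentile confidence interval. The data are simulated.
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=40)   # "original sample"

boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(5000)
])
se = boot_means.std(ddof=1)                       # bootstrap standard error
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(se, ci_low, ci_high)
```

The same recipe works for any estimator (median, correlation, etc.): replace `.mean()` with the statistic of interest.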
When the data set is large (e.g., n > 100) it often makes little sense to use nonparametric statistics at all because when the samples become very large, then the sample means will follow the normal distribution even if the respective variable is not normally distributed in the population
If you have a significant interaction in a 3-way ANOVA, you should ignore any main effects you might have.
False; but with a significant interaction, the main effects must be interpreted in light of the interaction, because it is unclear what they mean on their own.
Unequal sample sizes in a factorial analysis of variance are dealt with just like equal sample sizes for hand calculations
False; with unequal sample sizes the effects are no longer independent, so the calculations must be modified
Three main effects and three interactions are tested in a factorial design with three independent variables
False; three IVs give four interactions: A×B, A×C, B×C, and A×B×C
If we have a repeated measures design with subjects receiving four levels of a treatment, we assume that the correlations among the levels will be about the same
True; this is the sphericity (compound symmetry) assumption: the covariances among the levels are assumed to be equal, not zero.
Changing a 1-tailed test to a 2-tailed test will affect power.
True; splitting α across two tails reduces power to detect an effect in the predicted direction.
Type 1 error
α is the probability of falsely rejecting the null hypothesis when it is true. (β is the probability of a Type II error: failing to reject a false null.)
In a one-way ANOVA,
SS treatment refers to the variation due to the treatment (differences between the group means)
SS error refers to the variation you cannot explain (differences within groups)
After collecting a measurement of blood pressure from two different treatment groups (medicated and control), you should run a dependent samples t test.
False; the two groups contain different participants, so an independent-samples t test is appropriate.
After administering a coffee supplement, you measure resting heart rate in your participants. You know that the average resting heart rate for the population is M = 71, σ = 1.2, and should therefore run a one-sample t test with your data.
False; because the population standard deviation σ is known, a one-sample z test is appropriate.
positively skewed, not symmetric.
The relationship between two variables in the population must be linear to run regression.
effect size chi square, correlation between two variables.
for repeated measures anova
ANOVA can test the difference between multiple means, unlike T tests which can only test the difference between two means.
The unit of measurement for nominal data is frequency
True, whole numbers
A sampling distribution is a frequency distribution.
True; it is a frequency distribution of sample means.
The shape of an F distribution is impacted by two degrees of freedom (between and within).
True; the F distribution has both, and you use these to find the critical value, which tells us significance.
Nonparametric procedures are used when normality cannot be assumed and the data cannot be transformed to normality.
In an ANOVA, the degree of variability in the populations does not have to be equivalent as long as the sample size is large and the sample groups contain around the same number of participants.
True; if the samples are large and the groups are about equal in size, ANOVA is robust to violations of homogeneity of variance, and power remains high.
The values on a chi square distribution are always positive.
chi square assumption
Independence of cases
Sufficiently large expected frequencies
After conducting a z test,
it tells you how many SDs the score is from the mean.
They expect that 60% of 1st year students, 50% of 2nd year students, 30% of 3rd year students, and 20% of 4th year+ students
chi-square, because you are comparing expected and observed frequencies.
Researchers are looking at the effect of time spent watching TV on perceived happiness. They have one group of 5 people watch 3 hours of TV per day for 2 weeks and another group of 5 people watch 1 hour of TV per day for 2 weeks. After 2 weeks they administer a questionnaire that measures overall happiness.
independent samples t test
If the researchers added a third group of 5 people that watched 6 hours of TV a day, they should then use what type of test
one way anova
While running a dependent-sample t test you must assume that the matching between the samples is real, otherwise known as the assumption of;
correlation; you have to assume that participant A in condition one is matched with participant A in condition two.
The following tests do NOT assume homogeneity of variance;
chi-square, because it works only with frequencies; there are no variances to be homogeneous.
Regression Analysis is not robust to the violation of the assumption of;
linearity; expected frequencies are an assumption of chi-square, not regression.
The Homogeneity of Slopes is an assumption associated with which of the following tests?
ANCOVA; it assumes the regression slope relating the covariate to the DV is the same in every group.
Chi-square assumptions
Each sample is a random sample from its population
Considered inappropriate to conduct if violated, but some argue it is robust if violated
Independence of cases
All cells must have expected frequencies of at least 5, or at least 5 times as many individuals as categories (or cells)
Omega squared (ω²)
This is a better estimate of the percent of overlap in the population; it corrects for the size of the error term and the number of groups
The proportion of variance accounted for by the regression model
Multiple R2 is the equivalent of eta‐squared Adjusted R2 is the equivalent of omega‐squared
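A sketch of this parallel, using a one-predictor regression on simulated data (numpy only; the adjusted R² formula is the standard one, with n observations and p predictors):

```python
# R-squared vs. adjusted R-squared for a one-predictor regression on
# simulated data; adjusted R-squared penalizes for the number of predictors.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 2 * x + rng.normal(size=50)

slope, intercept = np.polyfit(x, y, 1)
pred = intercept + slope * x
ss_res = ((y - pred) ** 2).sum()
ss_tot = ((y - y.mean()) ** 2).sum()

n, p = len(y), 1
r2 = 1 - ss_res / ss_tot                       # like eta-squared
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)  # like omega-squared
print(r2, adj_r2)
```

Adjusted R² is always at most R², and the gap grows as predictors are added without explaining more variance.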
Two branches of statistics
Descriptive statistics Summarize and describe a group of numbers Inferential statistics Try to infer information about a population by using information gathered by sampling
The standard normal distribution
Has a mean of 0 and a standard deviation of 1. Such a distribution is often designated N(0, 1)
Why use Z scores?
Can compare distributions with different means and SDs. Used to "standardize" variables, which allows variables to be compared to one another even when measured on different scales
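A small sketch of standardization with made-up values on two different scales:

```python
# Standardizing two variables measured on different scales; the values are
# made up. After standardizing, both have mean 0 and SD 1.
import numpy as np

height_cm = np.array([160.0, 170.0, 180.0, 190.0])
weight_kg = np.array([55.0, 70.0, 85.0, 100.0])

def z_scores(x):
    return (x - x.mean()) / x.std(ddof=1)   # z = (x - mean) / SD

z_h = z_scores(height_cm)
z_w = z_scores(weight_kg)
print(z_h, z_w)
```

Because both standardized variables now share the same unit (standard deviations), a height z score can be compared directly with a weight z score.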
Terms used to describe distributions
Kurtosis (heavy or light tailed)
Modality (uni-, bi-, or multi-modal)
Considered normally distributed if |Z| < Zα=.05 = 1.96
Here are factors that affect the power of a test
the size of alpha
1 or 2 tailed test
the separation of the means, μ (i.e., the effect size)
the size of the population variance, standard deviation
the sample size,
“df” is a mathematical concept that involves the amount of freedom you have to substitute various values in an equation
In general, t scores are not normally distributed, but the distribution of t scores is symmetric and has a mean of zero. It varies in shape according to the degrees of freedom. The discrepancy between the t distribution and the z distribution gets worse as n gets smaller.
With n ≥ 30 observations, t‐distribution approximates z‐distribution
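This convergence can be seen directly from the two-tailed critical values:

```python
# The two-tailed t critical value approaches the z critical value (~1.96)
# as the degrees of freedom grow.
from scipy import stats

t_crit = {df: stats.t.ppf(0.975, df) for df in (5, 30, 1000)}
z_crit = stats.norm.ppf(0.975)
print(t_crit, z_crit)
```

The t critical value is always a little larger than the z value (heavier tails), and by df = 1000 the difference is negligible.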
Pearson's r assumptions
Linearity: the relationship between the two variables in the population is a linear one
The residuals at each level of the predictor should have the same variance.
Not as big of a deal if violated
designed to correct for errors introduced when the population SD is estimated from the sample.
has nothing to do with the null being true or false
E = (R × C) / N :
Divide the total for the row by the total number of participants, then multiply by the column total.
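The formula can be checked on a small hypothetical table via an outer product:

```python
# Expected cell frequencies E = (row total * column total) / N for a
# hypothetical 2x3 contingency table, computed via an outer product.
import numpy as np

observed = np.array([[20, 30, 50],
                     [30, 20, 50]])
row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
n = observed.sum()

expected = np.outer(row_totals, col_totals) / n
print(expected)
```

Note that the expected table has the same row and column totals as the observed one, as it must.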
Odds vs risk:
Risk is the number affected divided by the total participants within that treatment. The odds are slightly different: the number affected divided by the number unaffected. E.g., the odds would be heart attacks / no heart attacks, while the risk would be heart attacks / total.
*For the odds ratio and risk ratio, the control group is always on top, e.g., RR = risk(no aspirin) / risk(aspirin)
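A numeric sketch of risk vs. odds and their ratios. The counts approximate the classic aspirin/heart-attack study quoted in many textbooks; treat them as illustrative:

```python
# Risk vs. odds, and the corresponding ratios with the control group on top.
no_aspirin = {"attack": 189, "no_attack": 10845}   # control group
aspirin = {"attack": 104, "no_attack": 10933}

def risk(g):   # attacks / total in the group
    return g["attack"] / (g["attack"] + g["no_attack"])

def odds(g):   # attacks / non-attacks in the group
    return g["attack"] / g["no_attack"]

risk_ratio = risk(no_aspirin) / risk(aspirin)   # control on top
odds_ratio = odds(no_aspirin) / odds(aspirin)
print(risk_ratio, odds_ratio)
```

When the event is rare, as here, the odds ratio and risk ratio are close; they diverge as the event becomes common.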
ANOVA is the analysis of variance:
it deals with differences between and among sample means, but it imposes no restriction on the number of means, and it allows us to deal with two or more independent variables at the same time.
Assumptions of one-way
*Homogeneity of variance (homoscedasticity): each of our populations has the same variance; therefore the error variance (e) is unrelated to any treatment differences
Normality: scores on DV are normally distributed around their mean. In other words, that error is normally distributed within conditions.
Independence: is that observations are independent of one another. This demonstrates why it is important to randomly assign subjects to groups.
MS error or MS within:
this estimate does not depend on the truth or falsity of the null hypothesis.
MS treatment or MS between:
is an estimate of the population variance only as long as the null is true; it is based on the differences between the treatment means and the grand mean. Thus, if the two estimates agree, we have support for the truth of the null, and if they disagree, we have support for its falsity.
DF: between treatments is always k-1,
where k is the number of treatments. df error is the total df minus the treatment df (i.e., N − k); equivalently, it is the sum of the degrees of freedom within each treatment.
F is obtained by dividing MS treatment by MS error.
A ratio close to 1 means the two MS terms are measuring the same thing, so you do not reject the null. With α = 0.05, we reject the null if the obtained F exceeds the critical value (equivalently, if p < .05). Rejecting the null means rejecting the claim that the population treatment means are all equal.
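The whole computation can be sketched by hand on made-up scores and checked against scipy:

```python
# One-way ANOVA "by hand" on made-up scores: SS/MS for treatment and error,
# F = MS_treat / MS_error, checked against scipy.stats.f_oneway.
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5, 6, 5]),
          np.array([7.0, 8, 6, 7]),
          np.array([9.0, 10, 8, 9])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, n_total = len(groups), all_scores.size

ss_treat = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_treat = ss_treat / (k - 1)          # df_treat = k - 1
ms_error = ss_error / (n_total - k)    # df_error = N - k
f_hand = ms_treat / ms_error

f_scipy, p = stats.f_oneway(*groups)
print(f_hand, f_scipy, p)
```

The hand-computed F and the scipy F agree exactly; the p value comes from the F(k − 1, N − k) distribution.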
*A significant F
means that not all population means are equal.
Missing at random:
someone drops out of the experiment for reasons unrelated to the treatment; the main cost is reduced power.
** If the null is true,
then MS error and MS treat are estimating the same thing, thus they would be equal.
Noticeably unequal sample sizes
make the experiment less robust to heterogeneity of variance.
Equal sample sizes have advantages for both power and protection against Type I errors.
**The fact that an analysis of variance has produced a significant F
simply tells us that there are differences among the means of treatments that cannot be attributed to error.
Eta squared (η²):
SS treatment / SS total. But because the means are sample means, subject to sampling error, eta squared is biased; it is the most biased of these measures.
Omega squared (ω²):
ω² is less than η² because η² is positively biased.
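Both measures can be computed from the same SS terms; here with made-up sums of squares (k groups, N observations):

```python
# Eta-squared vs. omega-squared from the SS terms of a one-way ANOVA
# (made-up sums of squares; k groups, N observations in total).
ss_treat, ss_error = 32.0, 6.0
k, n_total = 3, 12

ss_total = ss_treat + ss_error
ms_error = ss_error / (n_total - k)

eta_sq = ss_treat / ss_total                                        # biased upward
omega_sq = (ss_treat - (k - 1) * ms_error) / (ss_total + ms_error)  # corrected
print(eta_sq, omega_sq)
```

The ω² formula subtracts the error expected by chance from SS treatment, which is why it always comes out smaller than η².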
Error rate per comparison:
the probability of a Type I error on a single comparison. For example, if we run a t test between two groups and reject the null because t exceeds t.05, then we are working at a per-comparison rate of .05.
A priori comparisons (t-tests) :
Multiple t tests allow us to compare one group with another group, whereas linear contrasts allow us to compare one group or a set of groups with another group or set of groups.
Bonferroni t: based on Bonferroni inequality,
which states that the probability of occurrence of one or more events can never exceed the sum of their individual probabilities. This means that if we make three comparisons, each at a .05 probability, the Type I error probability can never exceed .15.
the Bonferroni t
runs a regular t test but evaluates the result against a modified critical value of t that has been chosen so as to limit the FW.
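A sketch of the adjustment itself (the df value is hypothetical):

```python
# Bonferroni adjustment: divide the familywise alpha by the number of
# comparisons, then use the stricter per-comparison critical t.
from scipy import stats

alpha_fw, n_comparisons = 0.05, 3
alpha_pc = alpha_fw / n_comparisons            # per-comparison alpha ~ .0167

df = 20                                        # hypothetical error df
t_bonf = stats.t.ppf(1 - alpha_pc / 2, df)     # Bonferroni critical t
t_usual = stats.t.ppf(1 - alpha_fw / 2, df)    # ordinary critical t at .05
print(alpha_pc, t_bonf, t_usual)
```

Each of the three t tests is evaluated against the larger `t_bonf`, which is what keeps the familywise error rate at or below .05.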
*Bonferroni could be used for logistic regression and for analysis of variance.
Compares one treatment mean with each of the other treatment means, but lacks the flexibility of other tests.
Conditional/ simple effect:
the effect of one factor for those observations at only one level of another factor.
*The more effects you test,
the higher the familywise error rate will be.
*A failure to reject the null
does not mean that the means are equal; it just means that they are not sufficiently different for us to know which one is larger.
*As a general rule
a factorial design is more powerful than a one-way design only when the extra factors can be thought of as refining or purifying the error term. Therefore, when extra factors account for variance that would normally be incorporated into the error term, the factorial design is more powerful.
*When you add a factor that is random,
you may decrease power.
the IV and DV are the same across time, gender, old and young; therefore sampling error would change the F value here.
*Normally when we are examining an omnibus F
we use an r-family measure (eta squared).
Eta squared is a good descriptive statistic but a poor inferential statistic.
The theory for w square
allows us to differentiate between fixed, random, and mixed models.
when we are looking closely at differences among individual groups.
*Advantage of repeated measures design
is that they allow us to reduce variability by using a common subject pool. Thus we can remove subject differences from our error term.
When cells are independent, the covariances are always zero; this is why independent designs need only homogeneity of variance. But with repeated measures the covariances are not zero, so we have to assume that they are equal. So when we have sphericity, the Fs are valid.
for assumption of sphericity.
Repeated measures has two problems:
it assumes compound symmetry, and it requires complete data. If a participant misses one testing session but appears for all the others, they have to be eliminated from the study. The alternative approach that avoids these problems is mixed models.
Pearson’s chi square,
it is based on the χ² distribution.
to ask whether the deviations from what would be expected by chance are large enough to lead us to conclude that responses weren’t random
are the frequencies that you would expect if the null hypothesis were true.
Evaluation of χ²:
with 1 df, the critical value of χ² as found in the appendix is 3.84. Because our obtained value of 7.71 exceeds the critical value, we reject the null hypothesis that the variables are independent of each other.
On the other hand, if our obtained value were 0.324, it would fall below 3.84, and thus we would not reject the null that the variables are independent of each other.
One or two tail:
If we think of χ², it is a one-tailed test: the null is rejected only if our obtained χ² is in the extreme upper tail of the distribution.
chi square weakness
A critical weakness of chi-square is that it is nondirectional.
From SPSS output you look at adjusted R².
R² is the squared correlation coefficient and serves as the regression effect size. You are using real scores to make predictions; R tells you how related the predicted and observed scores are.
Madj_g = M_g − b × (Mc_g − Mc_t)
Note: Madj_g is the adjusted mean of the DV for each group; M_g is the unadjusted mean of the DV for each group; Mc_g is the mean of the covariate for each group; Mc_t is the mean of the covariate for all groups; b is the pooled within-group regression slope
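As a numeric sketch of the formula (all group means and the slope b are made up):

```python
# Adjusted group means Madj_g = M_g - b * (Mc_g - Mc_t) with hypothetical
# group means, covariate means, and pooled within-group slope b.
m_dv = {"A": 75.0, "B": 80.0, "C": 70.0}    # DV mean per group (M_g)
m_cov = {"A": 48.0, "B": 55.0, "C": 47.0}   # covariate mean per group (Mc_g)
b = 0.9                                     # pooled within-group slope

mc_grand = sum(m_cov.values()) / len(m_cov)  # covariate grand mean (Mc_t)
m_adj = {g: m_dv[g] - b * (m_cov[g] - mc_grand) for g in m_dv}
print(m_adj)
```

Groups whose covariate mean is above the grand mean are adjusted downward, and groups below it are adjusted upward, which is exactly what "controlling for the covariate" does to the group comparison.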
3-way ANOVA means that there are three IVs and one DV. How many main effects? Three.
How many interactions? Four in total:
Three 2‐way interactions
One 3‐way interaction