Lecture 6 Flashcards by Gabriel Deros

What are the ANOVA assumptions?

Normality
Homogeneity of variance
Independence of observations
DV measured on an interval or ratio scale
X (IV) & Y (DV) are linearly related

Explain the normality assumption of ANOVA.

For any value of x (the IV, i.e. within each group), the raw scores (y) are approximately normally distributed.
In other words, the raw scores are normally distributed within each group. To check, do a frequency distribution of the raw scores for each group.
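
A frequency distribution per group can be tabulated in a few lines. This is only a sketch: the group names, the scores, and the `frequency_distribution` helper are made up for illustration and do not come from the lecture.

```python
from collections import Counter

def frequency_distribution(scores, bin_width=1.0):
    """Count how many scores fall into each bin of width bin_width."""
    counts = Counter((score // bin_width) * bin_width for score in scores)
    return dict(sorted(counts.items()))

# Hypothetical raw scores for two groups (invented numbers).
groups = {
    "control":   [4, 5, 5, 6, 6, 6, 7, 7, 8],
    "treatment": [6, 7, 7, 8, 8, 8, 9, 9, 10],
}

# Print a crude text histogram for each group.
for name, scores in groups.items():
    print(name)
    for bin_start, count in frequency_distribution(scores).items():
        print(f"  {bin_start:>4}: {'*' * count}")
```

A roughly symmetric, single-peaked pile of stars in each group is what the normality assumption expects.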

What is the effect of violation of the normality assumption of ANOVA on type I and type II errors?

Type I error:

Non-normality has only a slight effect on type I error, even for very skewed or kurtotic (peaked) distributions.
e.g. compare nominal alpha (the alpha we set, which equals the type I error rate when all assumptions are met) vs actual alpha (the type I error rate when one or more assumptions are violated).

In really non-normal populations, when nominal alpha = .05, actual alpha = .055 or .06. If nominal alpha ~ actual alpha, what do we say?

We say F is robust to violations of the assumptions.

Therefore F is robust with respect to the normality assumption.

What are the reasons that F is robust with respect to the normality assumption?

The sampling distribution of the mean will be normally distributed if:

a) the raw scores are normally distributed in the population.
b) the raw scores in the population are skewed but n is large: the sampling distribution of the mean approaches a normal distribution as n increases (n greater than or equal to 30 or so).
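
Point b) can be checked with a small simulation. The sketch below assumes an exponential population (chosen only because it is strongly skewed), 5000 replications, and a fixed seed; none of these choices come from the lecture.

```python
import random
import statistics

def skewness(values):
    """Population skewness: mean cubed deviation divided by the sd cubed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)

def sample_means(n, replications=5000, seed=1):
    """Means of many samples of size n drawn from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(replications)]

skew_small_n = skewness(sample_means(n=2))
skew_large_n = skewness(sample_means(n=30))
print(f"skewness of sample means, n=2:  {skew_small_n:.2f}")
print(f"skewness of sample means, n=30: {skew_large_n:.2f}")
```

The skewness of the sample means should be clearly smaller at n = 30 than at n = 2, which is the sense in which the sampling distribution approaches normality.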

Define standard error of the mean:

The standard deviation of the sampling distribution of the mean.
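
As a rough check of this definition, one can simulate a sampling distribution and compare its standard deviation with the familiar formula σ/√n. The population mean, σ, n, and the number of replications below are assumed values picked for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)
sigma, n = 10.0, 25          # assumed population sd and sample size

# Draw many samples of size n and keep each sample's mean.
means = [statistics.fmean([rng.gauss(50.0, sigma) for _ in range(n)])
         for _ in range(4000)]

empirical_se = statistics.pstdev(means)   # sd of the simulated sampling distribution
theoretical_se = sigma / math.sqrt(n)     # sigma / sqrt(n)

print(f"empirical SE:   {empirical_se:.2f}")
print(f"theoretical SE: {theoretical_se:.2f}")
```

The two values should agree closely, with the simulated one wobbling a little from run to run if the seed is changed.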

When would you use a non-parametric test? Why?

When the population is very skewed. Because non-parametric tests are distribution free, which means they don’t have the normality assumption.

What effect does lack of normality have on power?

Only a slight effect (a few hundredths).

- Exception: lack of normality due to platykurtosis (a flattened distribution) does affect power, especially if n is small.

How does one check for normality?

Check via frequency distributions

- If there is a big violation of normality with a small n, conduct a non-parametric test, for which the shape of the distribution doesn’t matter.

What are some examples of non-parametric tests?

Chi-square
Mann-Whitney
Wilcoxon
Kruskal-Wallis
Friedman

Describe the Homogeneity (homoscedasticity) of variance assumption

i.e. the variance (the error variance, aka within-group variance) is unaffected by the treatment, i.e. the IV.
Other names for this error variance: MSerror, MSwithin, S/A, error due to chance, variability due to chance, etc.
i.e. σ²1 = σ²2 = σ²3 etc.
In other words, for every value of x, the variance of y is the same.

Illustration of heteroscedasticity

Scores (y axis) and independent variable (x axis)

Each group’s scores grouped together above each group

Under what circumstances is F robust for unequal variances?

If n’s are equal or approximately equal.

When is heterogeneity of variance an issue?

Only an issue if:

- n’s are sharply unequal and a test shows that the variances are sharply unequal.

What is meant by approximately equal n?

largest n/smallest n < 1.5

What is meant by approximately equal σ²? (variance)

Largest variance/smallest variance ≤ 3
If the ratio is greater than 3, we have sharply unequal σ².
i.e. if Fmax > 3, the variances are sharply unequal.
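
The two rules of thumb (largest n/smallest n and largest variance/smallest variance) can be computed directly. The sketch below uses invented data and hypothetical helper names; treating an n ratio of exactly 1.5 as "sharply unequal" is an assumption, since the cards only say that a ratio below 1.5 counts as approximately equal.

```python
import statistics

def n_ratio(groups):
    """Largest group size divided by smallest group size."""
    sizes = [len(g) for g in groups]
    return max(sizes) / min(sizes)

def variance_ratio(groups):
    """Fmax: largest sample variance divided by smallest sample variance."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)

def heterogeneity_is_a_concern(groups, n_cutoff=1.5, var_cutoff=3.0):
    """True only when both the n's and the variances are sharply unequal."""
    return n_ratio(groups) >= n_cutoff and variance_ratio(groups) > var_cutoff

# Invented scores: group 1 has n = 6, group 2 has n = 3.
groups = [
    [10, 12, 14, 16, 18, 20],
    [14, 15, 16],
]

print(n_ratio(groups))                      # 2.0
print(variance_ratio(groups))               # 14.0
print(heterogeneity_is_a_concern(groups))   # True
```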

When is heterogeneity an issue for type I error? (case 1)

Case 1: If the largest variance is associated with the group with smallest n
F is liberal. i.e. actual alpha is going to be greater than nominal alpha.
i.e. falsely reject H0 too often
Solution: adjust nominal alpha downwards, e.g. to .025, so that actual alpha is approximately .05.

When is heterogeneity an issue for type I error? (case 2)

Case 2: If largest variance is associated with the group with largest n
F is conservative.
i.e. actual alpha is less than nominal alpha.
So people usually don’t make an adjustment.
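
Both cases can be illustrated with a small Monte Carlo sketch in which H0 is true (all population means are 0), so every rejection is a type I error. Everything specific here is an assumption made for illustration: two groups with n = 10 and n = 40, standard deviations of 3 and 1, 2000 replications, and a critical value of about 4.04 for F(1, 48) at alpha = .05 taken from a standard F table.

```python
import random
import statistics

def one_way_f(groups):
    """F = MS_between / MS_within for a one-way between-subjects design."""
    all_scores = [y for g in groups for y in g]
    grand_mean = statistics.fmean(all_scores)
    means = [statistics.fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

F_CRIT = 4.04   # assumed: approximate .05 critical value of F(1, 48)

def actual_alpha(designs, replications=2000, seed=0):
    """Share of H0-true experiments in which F exceeds F_CRIT.

    designs: one (n, sd) pair per group; every population mean is 0.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        groups = [[rng.gauss(0.0, sd) for _ in range(n)] for n, sd in designs]
        if one_way_f(groups) > F_CRIT:
            rejections += 1
    return rejections / replications

case_1 = actual_alpha([(10, 3.0), (40, 1.0)])   # largest variance with smallest n
case_2 = actual_alpha([(40, 3.0), (10, 1.0)])   # largest variance with largest n
print(f"case 1 actual alpha: {case_1:.3f}")
print(f"case 2 actual alpha: {case_2:.3f}")
```

Under these assumptions, case 1 should come out well above the nominal .05 (F is liberal) and case 2 well below it (F is conservative).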

Explain the independence of observations assumption of ANOVA

Observations within each group are independent of one another.
Usually satisfied if unrelated subjects are run individually and alone.
REALLY IMPORTANT*

Why is the independence of observations assumption of ANOVA so important?

Because even small violations have a substantial effect on both alpha and power.

How is dependence measured?

Intraclass correlation

Explain the DV is measured on an interval or ratio scale assumption of ANOVA

Check definitions against actual DV used.

If DV is nominal or ordinal, conduct a different type of statistical test. e.g. Chi square test

Explain the X (IV) & Y (DV) are linearly related assumption of ANOVA

i.e. a subject’s score is composed of 3 parts:
1. general effect (grand mean)
2. an effect that is unique and constant within a given treatment.
3. An effect that is unpredictable (random error & individual differences).

Give the linear model of the fifth assumption for ANOVA

Yij = μ + αj + eij
where Yij = the score of the ith subject in the jth group
μ = grand mean
αj = treatment effect for the jth group
and eij = random error for the ith subject in the jth group
so:
score = general effect + treatment effect + error
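
The three parts of the linear model can be estimated from data and added back together. The sketch below uses invented scores for three equal-sized groups, and `decompose` is a hypothetical helper; sample means stand in for the population quantities.

```python
import statistics

def decompose(groups):
    """Split each score into (grand mean, treatment effect, error)."""
    all_scores = [y for scores in groups.values() for y in scores]
    grand_mean = statistics.fmean(all_scores)                # estimate of mu
    parts = []
    for name, scores in groups.items():
        group_mean = statistics.fmean(scores)
        treatment_effect = group_mean - grand_mean           # estimate of alpha_j
        for y in scores:
            error = y - group_mean                           # estimate of e_ij
            parts.append((name, y, grand_mean, treatment_effect, error))
    return parts

# Invented scores for three treatment groups.
groups = {
    "a1": [3, 5, 4],
    "a2": [8, 6, 7],
    "a3": [10, 12, 11],
}

for name, y, mu, alpha, e in decompose(groups):
    print(f"{name}: {y} = {mu:.2f} {alpha:+.2f} {e:+.2f}")
```

Each printed row adds up to the original score, which is all the linear model claims.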

Define an outlier

a data point which is very different from the rest of the data. Outliers can have a dramatic effect on results

When removing outliers, what must be done?

Must explain why they were removed, this information must be shared.

What causes outliers?

1. Human error (e.g. data entry)
2. Instrumentation
3. Subjects significantly different from the rest of the sample, perhaps from a different population.
Therefore need to detect and remove outliers.

How do you detect outliers for small samples?

- The largest possible z score in a data set is bounded by (n-1)/√n. E.g. for n = 10, the largest possible z score is 2.846. Therefore, for small samples, scrutinize any data point with a z score greater than or equal to 2.5.
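
The bound is easy to verify numerically. The sketch below assumes z scores are computed with the sample standard deviation (n - 1 in the denominator), which is the version the (n-1)/√n bound applies to; the data set is an artificial worst case with nine identical scores and one deviant score.

```python
import math
import statistics

def max_possible_z(n):
    """Upper bound on any z score in a sample of size n (sample sd, n - 1 denominator)."""
    return (n - 1) / math.sqrt(n)

def z_scores(data):
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)      # sample standard deviation
    return [(x - mean) / sd for x in data]

print(round(max_possible_z(10), 3))          # 2.846

# Artificial worst case: nine identical scores and one deviant score.
extreme = [0] * 9 + [1]
print(round(max(z_scores(extreme)), 3))      # 2.846
```

Because no score in a sample of 10 can exceed z = 2.846, a cutoff of z > 3 would never flag anything there, which is why the lower cutoff of 2.5 is suggested for small samples.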

How do you detect outliers for large, normally distributed samples?

- About 99.7% of scores in a normal distribution fall within three standard deviations of the mean. Therefore z scores > 3 (in absolute value) should be scrutinized. Note: if n > 100, some z scores > 3 will occur by chance. A criterion of z > 3 is also reasonable for non-normal distributions, but it could be extended to z > 4.
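
A minimal sketch of the z > 3 screen, assuming the sample standard deviation and invented scores; `flag_outliers` is a hypothetical helper, and a flagged score is only a candidate for scrutiny, not an automatic removal.

```python
import statistics

def flag_outliers(data, threshold=3.0):
    """Return the scores whose absolute z score exceeds the threshold."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)      # sample standard deviation
    return [x for x in data if abs(x - mean) / sd > threshold]

# Invented scores: twenty ordinary values and one extreme value.
scores = [9, 11] * 10 + [50]
print(flag_outliers(scores))                  # [50]
print(flag_outliers(scores, threshold=4.0))   # [50]
```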

What happens when subjects are run after analyses?

Running additional subjects after the analyses tends to increase variability, which decreases the probability of finding significance. And if N is really large, you tend to get statistical significance no matter what, even when there is no practical significance.

Why does running participants after analyses and having a very large N get statistical significance almost no matter what?

Because as N increases, the standard error of the mean (σȲ = σ/√N) decreases.
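
To see the effect, hold a trivially small difference fixed and let N grow. The population sd of 15 and the 1-point difference below are assumed values, and the z statistic shown is for a single sample mean compared against a known population mean.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 15.0        # assumed population sd
difference = 1.0    # assumed small, fixed difference between sample mean and mu

for n in (25, 100, 400, 10000):
    se = standard_error(sigma, n)
    z = difference / se
    print(f"N={n:>6}  SE={se:.2f}  z={z:.2f}")
```

The same 1-point difference gives z = 0.33 at N = 25 but z = 6.67 at N = 10000, far past the usual 1.96 cutoff, even though its practical size has not changed.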
