unit 3 - ch 12 - Distribution & 1-Way ANOVA Flashcards

1
Q

Analysis of variance

A

1 sample: H0: μ = #
2 samples: H0: μ1 = μ2

2
Q

anova

A

Compares 3 or more sample means
Numerator of the variance is the basis of comparison

You can compare 2 at a time, but we don't want to because it inflates alpha
Ex: compare water to tequila
H0: μW = μT
α = 0.05 → 5%
Once all 6 pairs of the 4 drinks have been compared, α grows to 1 − 0.95^6 ≈ 0.265 → 26.5%
Inflated alpha increases the chance of a Type I error

3
Q

when does alpha inflate

A

When you compare the samples 2 at a time (which is why we don't want to)
Ex: compare water to tequila
H0: μW = μT
α = 0.05 → 5%
Once all 6 pairs of the 4 drinks have been compared, α grows to 1 − 0.95^6 ≈ 0.265 → 26.5%
Inflated alpha increases the chance of a Type I error
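
The 26.5% figure can be reproduced with a short sketch. It assumes the pairwise tests are independent, which is the usual simplification behind this number; the function name `familywise_alpha` is illustrative and not from the notes.

```python
from math import comb

def familywise_alpha(k, alpha=0.05):
    """Chance of at least one Type I error when every pair of k groups is tested.

    Assumes the pairwise tests are independent (a simplification).
    """
    n_tests = comb(k, 2)               # number of 2-at-a-time comparisons
    return 1 - (1 - alpha) ** n_tests  # 1 - P(no false rejection in any test)

print(comb(4, 2))                     # 6 pairs for water, boba, energy, tequila
print(round(familywise_alpha(4), 3))  # 0.265
```

With only 2 groups there is a single test and the function returns 0.05 again, which is why the problem only shows up at 3 or more samples.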

4
Q

ANOVA's primary advantage over multiple 2-sample tests:

A

ANOVA does not inadvertently inflate alpha
Keeps α at 0.05

5
Q

Assumptions for 1-way ANOVA (test assumptions)

A

The null is true
At least interval level data
The CLT is satisfied
Random and independent
The variances are equal

6
Q

Are the variances equal

A

Exactly 2 samples
2 or more samples

Sample sizes can be unequal

7
Q

exactly 2 samples

s1 = s2
n1 = n2

A

TSEV (pooled)

8
Q

exactly 2 samples

s1 = s2
n1 =/= n2

A

TSUE (non-pooled)

9
Q

exactly 2 samples

s1 =/= s2
n1 = n2

A

TSUE (non-pooled)

10
Q

exactly 2 samples

s1 =/= s2
n1 =/= n2

A

TSUE (non-pooled)

11
Q

3 or more samples

sample sizes

n1 = n2 = nk

A

1-way ANOVA can be used even if the variances are substantially unequal
(largest variance up to about 4x-5x the smallest)

12
Q

sample sizes

not all n are equal

A

1-way ANOVA can be used only if the variances are substantially equal
(largest variance within about 1x-2x the smallest)

13
Q

anova drink example

type of drink → ?
water, boba, energy, tequila
#s

A

→ Factor, classification, treatment

14
Q

anova drink example

type of drink
water, boba, energy, tequila → ?
#s

A

→Categories

15
Q

anova drink example

type of drink
water, boba, energy, tequila
#s → ?

A

→ Criterion variable

16
Q

Factor →

A

qualitative (nominal)

17
Q

Categories →

A

qualitative (nominal)

18
Q

Criterion variable →

A

quantitative (interval or ratio)

19
Q

data variation - 1. Within (data is varied within water etc.)

A

Due to chance, randomness, error

20
Q

data variation - 2. Between (data is varied)

A

1st vs 3rd column etc
Due to factor, classification, treatment
Type of drink (ex)

21
Q

Old Example - Wife and husband

A

Variation between husband and wife group and within wife and within husband
Vertical and horizontal

22
Q

the big picture

[ look at picture on docs ]

A

28 data points in all, making up the 4 samples combined

23
Q

the big picture

[ look at picture on docs ]
Variation is in the middle X double bar =

A

grand mean

24
Q

the big picture

[ look at picture on docs ]
(red dot) X =

A

An individual data point; its distance to X double bar is a measure of variation

25
the big picture [ look at picture on docs ] X to X double bar =
SStotal
This distance is made up of two distances (within and between)
26
the big picture [ look at picture on docs ] X to X bar (its own sample's mean) =
SSwithin
Sum of the squares within
The remaining distance is X bar to X double bar
27
the big picture [ look at picture on docs ] X bar to x double bar = SSbetween
Sum of squares between
28
SSwithin (X to xbar)
Variation due to chance
A sample with a lot of dispersion lengthens the sum of squares within
A sample with little dispersion shortens the sum of squares within
Without squaring first, the distances would sum to 0 (we don't want this), so we square before we sum
SSwithin = Σ(x − xbar)^2
Sum of the squares (SS)
29
SSbetween (Xbar to X double bar)
The relationship the samples have to each other
If the 4 samples are spread out, SSbetween lengthens
If the 4 samples are stacked on top of each other, SSbetween shortens
The relationship between the samples controls the shape
SSbetween = Σ n(xbar − x double bar)^2, with each squared distance weighted by its sample size n
30
Test stat =
between term / within term
Total variation is divided into these two
31
Underlying theory of anova
Total variation can be partitioned into 2 distinct parts, between and within, and those 2 components can be compared to determine which is affecting the data to a greater degree
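
A small sketch of the partition, using made-up scores for the four drinks (the numbers are hypothetical, not the class data set). It computes the three sums of squares directly from their definitions and shows that between plus within adds up to total.

```python
# Hypothetical scores (levels cleared) for four drinks; NOT the class data.
samples = {
    "water":   [4, 5, 6],
    "boba":    [5, 6, 7],
    "energy":  [6, 7, 8],
    "tequila": [9, 10, 11],
}

def mean(values):
    return sum(values) / len(values)

all_scores = [x for group in samples.values() for x in group]
grand_mean = mean(all_scores)  # x double bar

# x to x double bar
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
# x to its own sample mean (x bar)
ss_within = sum((x - mean(g)) ** 2 for g in samples.values() for x in g)
# x bar to x double bar, weighted by sample size
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples.values())

print(ss_total, ss_within, ss_between)  # 50.0 8.0 42.0
```

Here most of the total variation sits in the between term, which is the pattern that leads to a large test statistic.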
32
Wide dispersion between samples: between is big (numerator) and within is small (denominator), so between increases / within decreases = big TS =
reject the null
33
Tight dispersion between samples: between is small (numerator) / within is big (denominator) = small TS =
fail to reject the null
Small numerator: small distance between samples
Large denominator: large within, due to chance, randomness, and error
34
The HT in excel (with 1-way ANOVA) Step 1 = run HT
H0: μWater = μBoba = μTequila = μEnergy
The alternative is NOT the same statement written with ≠ signs; that is WRONG because some of the means can still be equal to each other
H1: not all population means are equal OR "at least one population mean is different"
Either wording works, but H1 is written as a sentence
35
The HT in excel (with 1-way ANOVA) Step 2 = alpha
α = 0.10
36
The HT in excel (with 1-way ANOVA) Step 3 = Test Stat - F value
Single factor → type of drink
Source of variation rows with an SS column: Between, Within, Total
SS = sum of the squares = Σ(x − xbar)^2, aka the numerator of the variance formula
37
(Count − 1) x variance =
Σ(x − xbar)^2 for that sample, the sum of the squares used in ANOVA
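
A quick check of this relationship on one hypothetical sample (the tequila scores below are made up). Because the sample variance divides the sum of squares by n − 1, multiplying the variance by Count − 1 recovers the sum of squares.

```python
from statistics import mean, variance

tequila = [9, 10, 11]  # hypothetical sample, not the class data

# Sum of squares computed directly from the definition
ss_direct = sum((x - mean(tequila)) ** 2 for x in tequila)

# Sum of squares recovered from the summary columns (Count and Variance)
ss_from_summary = (len(tequila) - 1) * variance(tequila)

print(ss_direct, ss_from_summary)  # 2 2
```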
38
df, between =
k-1 (k = number of samples/categories)
39
df, within =
nt-k (nt = total number of data points, k = number of samples)
40
df, total =
nt-1
41
df column is
additive
42
MS =
ss / df
43
F =
MSbetween / MSwithin
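
A minimal sketch of how the table columns fit together, assuming hypothetical SS values (42 between, 8 within) from 4 samples with 12 data points in total. The function name `anova_table` is illustrative. It takes df between as the number of samples minus 1 and F as the ratio of the two mean squares.

```python
def anova_table(ss_between, ss_within, k, n_total):
    """Build the df, MS, and F entries of a 1-way ANOVA table.

    k is the number of samples; n_total is the total number of data points.
    """
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between  # MS = SS / df
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": df_between + df_within,  # df column is additive
        "ss_total": ss_between + ss_within,  # SS column is additive
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical inputs, not the class data
table = anova_table(ss_between=42, ss_within=8, k=4, n_total=12)
print(table["df_between"], table["df_within"], table["df_total"])  # 3 8 11
print(table["F"])  # 14.0
```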
44
SS is
additive (as the lines show)
45
f-table requires
dfbetween and dfwithin
46
The HT in excel (with 1-way ANOVA) Step 4 = CV =
P-value or F crit
47
The HT in excel (with 1-way ANOVA) Step 5 = decision
P < α = REJECT
P > α = FTRN
TS > CV = REJECT
TS < CV = FTRN
One-tail, right-tail test: the TS is always positive because everything is squared (think of the formula)
48
The HT in excel (with 1-way ANOVA) Step 6 = Summary
Not all population means are equal
We rejected, so different drinks impact people's ability to get through the levels of the game
Tequila did NOT impair people from progressing through the game; it actually made them better. That is because ___ (social game, or a game that rewards risk or aggression)
The factor (type of drink) is the reason for the variation
THE FACTOR IS A PART OF THE CONCLUSION, NOT RANDOMNESS
Look at the means: tequila is higher than the other means! This is how we know the means are not all equal and why we may be REJECTING THE NULL
49
Student EX: How big is F for F
B/W = TS
When the null is true the TS is around 1, and it is always positive
Far above 1 = reject; close to 1 = FTR null
Ho says the groups compared are equal; if they are equal, between and within are about the same size, so TS is about 1
Num and denom can each be big or small
The F distribution is skewed, with a df for the numerator and a df for the denominator
Steep decline because F can't go negative, but it can get larger
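
The "around 1 and never negative" behavior can be seen in a small simulation. This is only a sketch under assumed settings: 4 samples of 7 values each, all drawn from the same normal population (mean 50, standard deviation 10, both made up) so that the null is true by construction.

```python
import random
from statistics import mean, median

def f_statistic(samples):
    """F = MSbetween / MSwithin for a list of samples."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

random.seed(0)  # fixed seed so the run is repeatable
f_values = []
for _ in range(2000):
    # Null is true: every sample comes from the same hypothetical population
    samples = [[random.gauss(50, 10) for _ in range(7)] for _ in range(4)]
    f_values.append(f_statistic(samples))

print(min(f_values) >= 0)         # F is never negative
print(round(median(f_values), 2)) # typical F under the null, in the neighborhood of 1
print(round(max(f_values), 2))    # long right tail: occasional large values
```

Most simulated values bunch up near 1 while a few stretch far to the right, which is the skewed shape described on this card.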
50
"Analysis of Variance" (abbreviated ANOVA)
For hypothesis tests comparing averages among more than two groups, statisticians have developed a method called
51
variances
The purpose of a one-way ANOVA test is to determine the existence of a statistically significant difference among several group means. The test actually uses variances to help determine if the means are equal or not.
52
In order to perform a one-way ANOVA test, there are five basic assumptions to be fulfilled:
Each population from which a sample is taken is assumed to be normal. All samples are randomly selected and independent. The populations are assumed to have equal standard deviations (or variances). The factor is a categorical variable. The response is a numerical variable.
53
F distribution
The distribution used for the hypothesis test is a new one. It is called the F distribution, invented by George Snedecor but named in honor of Sir Ronald Fisher, an English statistician. The F statistic is a ratio (a fraction). There are two sets of degrees of freedom; one for the numerator and one for the denominator.
54
To calculate the F ratio, two estimates of the variance are made.
1. Variance between samples: An estimate of σ^2 that is the variance of the sample means multiplied by n (when the sample sizes are the same).
2. Variance within samples: An estimate of σ^2 that is the average of the sample variances (also known as a pooled variance).
SSbetween = the sum of squares that represents the variation among the different samples
SSwithin = the sum of squares that represents the variation within samples that is due to chance
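
A short sketch of the two estimates for the equal-sample-size case, using made-up data (four hypothetical samples of 3 scores each). It follows the two definitions above literally: n times the variance of the sample means, and the average of the sample variances.

```python
from statistics import mean, variance

# Hypothetical samples of equal size n = 3; NOT the class data
samples = [[4, 5, 6], [5, 6, 7], [6, 7, 8], [9, 10, 11]]
n = len(samples[0])

sample_means = [mean(s) for s in samples]
sample_variances = [variance(s) for s in samples]

# 1. Variance between samples: n times the variance of the sample means
ms_between = n * variance(sample_means)

# 2. Variance within samples: average of the sample variances (pooled)
ms_within = mean(sample_variances)

f_ratio = ms_between / ms_within
print(round(ms_between, 6), ms_within, round(f_ratio, 6))
```

When the sample sizes differ, this shortcut no longer applies and the estimates are computed as SS / df instead.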
55
"sum of squares"
To find a "sum of squares" means to add together squared quantities that, in some cases, may be weighted. We used sum of squares to calculate the sample variance and the sample standard deviation in Chapter 2 Descriptive Statistics. MS means "mean square." MSbetween is the variance between groups, and MSwithin is the variance within groups.