analysis of variance tests

the null hypothesis that all groups/treatments have equal population means, H_0: µ1 = µ2 = µ3 = ... = µk

ANOVA compares

two estimated components of variation: MS_error and MS_groups

MS_error

Error mean square: variation among samples in the same group (variance within groups), also called MS_within

MS_groups

Group mean square: variation among samples that belong to different groups (variance between groups), also called MS_between

if the null hypothesis is true

MS_error and MS_groups should be about the same, so the F-ratio should be close to 1

F-ratio

MS_groups / MS_error

MS_groups >> MS_error

the F-ratio is much greater than 1, indicating significant differences among the population means; the null hypothesis of no difference can be rejected

MS_groups not significantly larger than MS_error

null hypothesis cannot be rejected

ANOVA results p ≤ 0.05

at least one group mean differs from the others, but the ANOVA does not tell us which group differs

If p ≤ 0.05

use post-hoc tests to find out which groups are significantly different from which others, e.g. the Tukey-Kramer test

Squirrel study

red squirrel litter size declines with density due to: (1) reduced per capita food availability, which reduces fecundity; (2) increased territorial interactions among individuals, which reduce the surplus energy available for reproduction

explanatory variables in squirrel study

Treatments: squirrel removal, food addition, habitat type

what were the levels of each treatment variable?

squirrel removal (add, control); food addition (add, control); habitat type (Douglas-fir, lodgepole pine)

ANOVA

Analysis Of VAriance

ANOVA uses what distribution

F-distribution to assess whether the calculated F-ratio is significant

t =

± the square root of F, when only two groups are compared (t² = F)
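
The relationship between t and F can be checked numerically for two groups. A minimal sketch in plain Python, using made-up numbers (not the squirrel data): it computes the pooled-variance two-sample t statistic and the one-way F-ratio for the same two groups, so the two can be compared.

```python
from math import sqrt
from statistics import mean, variance

# Made-up data for two groups (illustration only)
a = [2, 3, 4, 5]
b = [4, 6, 6, 8]
n_a, n_b = len(a), len(b)

# Pooled sample variance; with k = 2 groups this is MS_error
pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)

# Two-sample t statistic (equal variances assumed)
t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

# One-way ANOVA F-ratio for the same two groups (df_groups = k - 1 = 1)
grand_mean = mean(a + b)
ms_groups = (n_a * (mean(a) - grand_mean) ** 2
             + n_b * (mean(b) - grand_mean) ** 2) / (2 - 1)
f_ratio = ms_groups / pooled_var

print(round(t, 3), round(f_ratio, 3), round(t ** 2, 3))
```

Here t is about -2.40 and F is about 5.77, which equals t squared. The sign of t is lost in F, which is why t is plus or minus the square root of F.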

simplest case of ANOVA

one-way (single-factor) ANOVA: 1 response variable and 1 treatment variable, with k ≥ 3 groups to compare (k = # of groups)

response variable

litter size

is pseudoreplication an issue?

Yes: there were multiple litter size measurements for each treatment, and using every one as a separate data point would be pseudoreplication. The points within one group are subsamples, so we had to average them within each group.
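
Averaging subsamples can be sketched in a few lines of plain Python. The unit labels and litter sizes below are hypothetical, and the sketch assumes each measurement carries a label for the unit it was taken from; only the averaged values would then go into the ANOVA.

```python
from collections import defaultdict
from statistics import mean

# (unit label, litter size) pairs; labels and values are hypothetical
records = [
    ("plot_A", 3), ("plot_A", 4), ("plot_A", 5),
    ("plot_B", 2), ("plot_B", 4),
]

# Collect the subsamples that belong to the same unit
by_unit = defaultdict(list)
for unit, litter_size in records:
    by_unit[unit].append(litter_size)

# One mean litter size per unit: these are the data points for the ANOVA
unit_means = {unit: mean(values) for unit, values in by_unit.items()}
print(unit_means)
```

Five raw measurements collapse to two data points, one per unit, so the sample size reflects the number of independent units rather than the number of subsamples.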

how to enter data

each column is a variable (one per treatment factor, plus the response); the levels of each factor are coded (1, 2)

How to run ANOVA

Stat- ANOVA- General Linear Model- Fit General Linear Model; response- mean litter; factors- habitat, food, squirrel; model- all single terms and their combinations (interactions); graphs- four in one; storage- residuals
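
"All singles and combinations" for three factors means three main effects, three two-way interactions, and one three-way interaction. A small sketch in plain Python that enumerates them; the `*` notation for interactions is only a labelling choice made here, not the software's syntax.

```python
from itertools import combinations

factors = ["habitat", "food", "squirrel"]

# Every non-empty combination of factors is one term in the full model
terms = [
    "*".join(combo)
    for size in range(1, len(factors) + 1)
    for combo in combinations(factors, size)
]

for term in terms:
    print(term)
```

This lists 7 terms in total: habitat, food, squirrel, habitat*food, habitat*squirrel, food*squirrel, and habitat*food*squirrel.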

options dialog box

select adjusted (Type III) sums of squares

comparisons dialog box

enter pairwise comparisons; activate the Tukey, CL, and test options (these are the post hoc comparisons)

is there any point in doing a post hoc?

if there are only 2 levels, then probably not

storage dialog box

activate residuals

Factor plots

give an indication of the strength of possible interactions; enter the variables in both the main effects plot box and the interactions plot box

testing homogeneity

Stat- ANOVA- Test for Equal Variances; response data- residuals (stored from the original ANOVA); factors- habitat/squirrel/food; read Levene's test
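
To see what Levene's test computes, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the software's implementation: the function name `levene_w` is hypothetical, the residuals are made up, and the sketch centres on the group median (the variant often called Brown-Forsythe); these notes do not say which centring the menu option uses. The statistic is a one-way ANOVA F-ratio computed on absolute deviations from each group's centre.

```python
from statistics import mean, median

def levene_w(groups, center=median):
    """Levene-type statistic: one-way ANOVA F on absolute deviations."""
    # Absolute deviation of each value from its own group's centre
    z = []
    for g in groups:
        c = center(g)
        z.append([abs(y - c) for y in g])

    k = len(z)                                  # number of groups
    n_total = sum(len(zi) for zi in z)          # N, all values pooled
    z_grand = mean(v for zi in z for v in zi)   # grand mean of deviations

    z_means = [mean(zi) for zi in z]
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Made-up residuals for two levels of one factor
narrow = [-1, 0, 1]
wide = [-4, 0, 4]
w = levene_w([narrow, wide])
print(round(w, 3))
```

A larger W means the groups differ more in spread. W is compared against an F-distribution with k - 1 and N - k degrees of freedom, and a non-significant result is consistent with homogeneous variances.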

How to state results

H_0: There is no effect of habitat on mean squirrel litter size. H_a: There is an effect of habitat on mean squirrel litter size. Result: F = 100, df = 1, 17, P < 0.001. Conclusion: Reject the null hypothesis and conclude that habitat has an effect on mean squirrel litter size.

how to split data up in excel

data- text to columns- delimited- next- comma- finish
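
The same comma split can be done outside a spreadsheet. A minimal sketch using Python's standard `csv` module; the column names and values are hypothetical and follow the coded-factor layout (1, 2) described in these notes.

```python
import csv
import io

# Hypothetical comma-delimited text, as it might arrive in a single column
raw = "habitat,food,squirrel,mean_litter\n1,2,1,3.5\n2,1,2,2.75\n"

# csv.reader splits each line on commas, like Text to Columns with "Comma"
rows = list(csv.reader(io.StringIO(raw)))
header, data = rows[0], rows[1:]

print(header)
for row in data:
    print(row)
```

Every value comes back as text, so the response column would still need converting to numbers before analysis.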

interpreting output

F-value: the F-ratio (variation between groups relative to variation within groups). P-values: show which main effects and interaction(s) are significant. R-squared: fraction of variation explained by groups. Coefficients: the amount by which the response increases/decreases for each treatment or combination of treatments.

R^2 =

SS_groups / SS_total
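
As a worked example of this ratio, a minimal sketch in plain Python with made-up numbers (not the squirrel data):

```python
from statistics import mean

# Made-up data: three groups of three observations
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
all_y = [y for g in groups for y in g]
grand_mean = mean(all_y)

# SS_total: every observation against the grand mean
ss_total = sum((y - grand_mean) ** 2 for y in all_y)

# SS_groups: every group mean against the grand mean, weighted by group size
ss_groups = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

r_squared = ss_groups / ss_total
print(round(ss_groups, 3), round(ss_total, 3), round(r_squared, 4))
```

For these numbers SS_groups = 26 and SS_total = 32, so R^2 = 0.8125: about 81% of the variation is explained by group differences.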

R^2 = 0.43

43% of the variation in the response is explained by differences among groups (the treatments); the other 57% is error, i.e. variance not explained by the explanatory variable

R^2 range

[0,1]

R^2 = 0

group means are identical (or very similar when R^2 is close to 0); essentially all variability is within groups

R^2 measures

fraction of variation in Y that is explained by group differences

R^2 = 1

the explanatory variable explains all of the variation in Y (or most of it when R^2 is close to 1); there is little or no variation within groups

SS

sums of squares separate 2 sources of variation in the data: (1) deviations between each observation and its group mean (SS_error); (2) deviations between each group mean and the grand mean (SS_groups)
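
Written out, the partition looks like this. The notation is an assumption introduced here, not taken from the notes: Y_ij is observation j in group i, n_i is the size of group i, Ȳ_i is the mean of group i, and Ȳ is the grand mean.

```latex
\begin{aligned}
SS_{\text{groups}} &= \sum_{i=1}^{k} n_i \left(\bar{Y}_i - \bar{Y}\right)^2 \\
SS_{\text{error}}  &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_i\right)^2 \\
SS_{\text{total}}  &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}\right)^2
                    = SS_{\text{groups}} + SS_{\text{error}}
\end{aligned}
```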

MS_group =

SS_groups / df_groups, where df_groups = k - 1 and k is the number of groups; represents variation among sampled individuals belonging to different groups

MS_error =

SS_error / df_error, where df_error = N - k and N = total # of data points in all groups; this is the pooled sample variance, the variation among individuals within the same group

variance ratio

F = MS_groups / MS_error
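
The whole chain from sums of squares to the F-ratio can be checked by hand on a small data set. A minimal sketch in plain Python, using made-up numbers (not the squirrel data) and a hypothetical helper name `one_way_anova`:

```python
from statistics import mean

def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of groups."""
    k = len(groups)                           # number of groups
    n_total = sum(len(g) for g in groups)     # N, all data points pooled
    grand_mean = mean(y for g in groups for y in g)
    group_means = [mean(g) for g in groups]

    # SS_groups: group means against the grand mean, weighted by group size
    ss_groups = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, group_means))
    # SS_error: each observation against its own group mean
    ss_error = sum((y - m) ** 2
                   for g, m in zip(groups, group_means) for y in g)

    df_groups = k - 1
    df_error = n_total - k
    ms_groups = ss_groups / df_groups
    ms_error = ss_error / df_error
    return {
        "df_groups": df_groups, "df_error": df_error,
        "MS_groups": ms_groups, "MS_error": ms_error,
        "F": ms_groups / ms_error,
    }

# Made-up data: three groups of three observations
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

For these groups SS_groups = 26 on 2 df and SS_error = 6 on 6 df, so MS_groups = 13, MS_error = 1, and F = 13. Judging whether that F is significant still requires the F-distribution with (2, 6) degrees of freedom, which this sketch does not compute.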

sources of variation

groups (treatments), error

mean squares

group mean square, error mean square

should we be concerned with small departures from normal

ANOVA is robust to deviations from normality, especially if sample size is large

Tukey-Kramer tests

compare one pair of group means at a time, across all pairs, while controlling the overall (family-wise) Type I error rate
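
The number of pairs grows quickly with the number of levels, k(k - 1)/2 in total. A small sketch in plain Python that lists the pairs a pairwise procedure would have to compare; the helper name `pairs` and the four-level factor are hypothetical.

```python
from itertools import combinations

def pairs(levels):
    """All unordered pairs of levels that a pairwise test would compare."""
    return list(combinations(levels, 2))

# Habitat has two levels in the squirrel study: a single pair
habitat_pairs = pairs(["Douglas-fir", "lodgepole pine"])

# A hypothetical factor with four levels: six pairs
four_level_pairs = pairs(["A", "B", "C", "D"])

print(len(habitat_pairs), len(four_level_pairs))
```

With two levels there is exactly one pair, so there is nothing extra for a pairwise procedure to sort out.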

with only 2 levels of each treatment, is there any point in doing a post hoc comparison test?

Probably not: post hoc tests compare the means of every pair of levels, and with two levels there is only one pair, which the ANOVA F-test for that factor already compares.