One-way ANOVA Flashcards
Differences in mean weight between 2 cattle breeds?
cattle
## # A tibble: 30 x 2
##    breed  weight
##  1 Breed1   188.
##  2 Breed2   148.
##  3 Breed1   180.
##  4 Breed2   146.
##  5 Breed1   199.
##  6 Breed2   153.
##  7 Breed1   191.
##  8 Breed2   135.
##  9 Breed1   196.
## 10 Breed2   151.
## # ... with 20 more rows
Two sample t-test
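fit <- t.test(weight ~ breed, data = cattle, var.equal = TRUE) # assumed call: equal-variance two-sample t-test
fit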
Code for manual calculation of mean and standard deviation?
library(dplyr)
cattlesummary <- cattle %>% # using the cattle data,
  group_by(breed) %>% # group by the breed variable, then
  # calculate the mean and sd per group:
  summarise(mean_wt = mean(x = weight, na.rm = TRUE),
            sd_wt = sd(weight, na.rm = TRUE))
cattlesummary
## # A tibble: 2 x 3
##   breed  mean_wt sd_wt
## 1 Breed1    196.  10.6
## 2 Breed2    154.  12.3
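The group means and standard deviations are exactly what the equal-variance two-sample t statistic is built from. Below is a minimal sketch in base R, assuming simulated data: the `sim` data frame, the seed and the group size of 15 are illustrative assumptions, not the original cattle values.

```r
# Simulated stand-in for the cattle data (NOT the original values)
set.seed(1)
sim <- data.frame(
  breed  = rep(c("Breed1", "Breed2"), each = 15),
  weight = c(rnorm(15, mean = 196, sd = 11), rnorm(15, mean = 154, sd = 12))
)

# Per-group sample size, mean and standard deviation
n <- tapply(sim$weight, sim$breed, length)
m <- tapply(sim$weight, sim$breed, mean)
s <- tapply(sim$weight, sim$breed, sd)

# Pooled variance and the equal-variance t statistic
df_pooled  <- sum(n) - 2
var_pooled <- sum((n - 1) * s^2) / df_pooled
t_manual <- unname((m["Breed1"] - m["Breed2"]) /
  sqrt(var_pooled * (1 / n["Breed1"] + 1 / n["Breed2"])))

# Same test via t.test(); the two t values should agree
fit_sim <- t.test(weight ~ breed, data = sim, var.equal = TRUE)
c(manual = t_manual, t.test = unname(fit_sim$statistic))
```

The manual value and the one reported by `t.test()` match because `var.equal = TRUE` uses the same pooled variance.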
Model assumptions (2-sample t-test)
- equal variances
- normality
- independence of observations
Model assumptions (Normality) code
- ggplot(cattle, aes(breed, weight)) + geom_boxplot()
- hist(cattle$weight)
- shapiro.test(cattle$weight)
## Shapiro-Wilk normality test
##
## data: cattle$weight
## W = 0.93704, p-value = 0.103
If p > 0.05 (here p = 0.103), the distribution of the cattle weights is not significantly different from a normal distribution, i.e. we can assume normality.
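Pooled weights mix two breeds with different means, so the Shapiro-Wilk test is often applied within each group, or to the residuals, rather than to the raw column. Below is a minimal sketch of both variants in base R, assuming simulated data (the `sim` data frame and its parameters are illustrative, not the original cattle values).

```r
# Simulated stand-in for the cattle data (NOT the original values)
set.seed(2)
sim <- data.frame(
  breed  = rep(c("Breed1", "Breed2"), each = 15),
  weight = c(rnorm(15, mean = 196, sd = 11), rnorm(15, mean = 154, sd = 12))
)

# Shapiro-Wilk p-value within each breed
p_by_breed <- tapply(sim$weight, sim$breed,
                     function(w) shapiro.test(w)$p.value)
p_by_breed

# Alternative: test the residuals after removing each breed's own mean
res <- sim$weight - ave(sim$weight, sim$breed)
p_resid <- shapiro.test(res)$p.value
p_resid
```

A p-value above 0.05 is read the same way as before: no significant departure from normality was detected.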
What happens if assumptions are not met?
- Equal variances –> perform the test with unequal variances (Welch t-test)
- Normality –> if N > 30, assume normality anyway (Central Limit Theorem)
- Independence of observations –> if not independent because the observations are paired, use a paired t-test
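The two alternative tests above can be sketched with `t.test()` in base R. This is an illustration on simulated data: the vectors `g1`, `g2`, `before` and `after` are hypothetical and not part of the original cards.

```r
set.seed(3)

# Two independent groups with clearly different spreads (simulated)
g1 <- rnorm(15, mean = 196, sd = 5)
g2 <- rnorm(15, mean = 154, sd = 20)

# Welch t-test: var.equal = FALSE is the default in t.test()
welch <- t.test(g1, g2, var.equal = FALSE)
welch$parameter # Welch-Satterthwaite df, never larger than n1 + n2 - 2

# Paired measurements on the same 15 animals (simulated before/after weights)
before <- rnorm(15, mean = 150, sd = 10)
after  <- before + rnorm(15, mean = 5, sd = 3)

# Paired t-test; equivalent to a one-sample t-test on the differences
paired  <- t.test(after, before, paired = TRUE)
one_smp <- t.test(after - before, mu = 0)
c(paired = unname(paired$statistic), one_sample = unname(one_smp$statistic))
```

The paired version only makes sense when each observation in one group has a natural partner in the other, such as the same animal measured twice.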
What do we need to consider before running an ANOVA?
- differences between the treatments (e.g. the differences between 4 chicken diets)
- differences within each treatment (e.g. the differences among chickens on the same diet)
When do you use a 1-way ANOVA?
When there is only 1 factor (one categorical explanatory variable)
e.g. chicken diet: there are 4 diet options (levels), but the only “factor” is the diet
Code for normality (model assumption)
chicks <- chicks_wide %>% # chicks_wide: assumed name for the wide table (one column per diet)
  pivot_longer(cols = starts_with("Diet"), names_to = "diet", values_to = "weight") %>%
  mutate(diet = as.factor(diet))
ggplot(chicks, aes(diet, weight)) +
geom_boxplot()
hist(chicks$weight)
shapiro.test(chicks$weight)
Code for equal variances (model assumptions)
bartlett.test(weight ~ diet, data = chicks)
How is the total variation in ANOVA divided into components?
Total Sum of Squares (SS) = Treatment SS + Residual SS
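The decomposition can be checked numerically. Below is a minimal sketch in base R, assuming simulated data: the `sim` data frame, with 4 diets and 10 chicks per diet, is an illustrative stand-in and not the original chick data.

```r
set.seed(4)

# Simulated stand-in for the chick data: 4 diets, 10 chicks each (assumed)
sim <- data.frame(
  diet   = factor(rep(paste0("Diet", 1:4), each = 10)),
  weight = rnorm(40, mean = rep(c(100, 110, 120, 105), each = 10), sd = 8)
)

grand_mean <- mean(sim$weight)
group_mean <- ave(sim$weight, sim$diet) # each chick's own diet mean

ss_total     <- sum((sim$weight - grand_mean)^2)
ss_treatment <- sum((group_mean - grand_mean)^2) # between diets
ss_residual  <- sum((sim$weight - group_mean)^2) # within diets

c(total = ss_total, treatment_plus_residual = ss_treatment + ss_residual)
```

The two printed values agree up to floating-point rounding, whatever data are used.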
ANOVA code in R
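model <- lm(weight ~ diet, data = chicks) # assumed call; aov(weight ~ diet, data = chicks) gives the same F test
anova(model)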
If the null hypothesis is true, what does that mean for the F statistic?
It should have a value around 1
What does a large F value suggest?
The null hypothesis is probably false, i.e. at least one group mean differs from the others
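F is the ratio of the treatment mean square to the residual mean square. Under the null hypothesis both estimate the same error variance, which is why F is expected to be near 1. Below is a minimal sketch in base R that computes F by hand and compares it with `anova()`, assuming simulated data (the `sim` data frame and its group means are illustrative, not the original chick data).

```r
set.seed(5)
k <- 4      # number of diets (assumed)
n_per <- 10 # chicks per diet (assumed)

sim <- data.frame(
  diet   = factor(rep(paste0("Diet", 1:k), each = n_per)),
  weight = rnorm(k * n_per,
                 mean = rep(c(100, 110, 120, 105), each = n_per), sd = 8)
)

group_mean   <- ave(sim$weight, sim$diet)
ss_treatment <- sum((group_mean - mean(sim$weight))^2)
ss_residual  <- sum((sim$weight - group_mean)^2)

# Mean squares = sums of squares divided by their degrees of freedom
df_treatment <- k - 1
df_residual  <- nrow(sim) - k
f_manual <- (ss_treatment / df_treatment) / (ss_residual / df_residual)
p_manual <- pf(f_manual, df_treatment, df_residual, lower.tail = FALSE)

# Same quantities from the ANOVA table of a linear model
tab <- anova(lm(weight ~ diet, data = sim))
c(manual = f_manual, anova = tab[["F value"]][1])
```

A large F means the variation between diets is large relative to the variation within diets, and the upper-tail probability from `pf()` gives the corresponding p-value.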
emmeans plots (ANOVA)
library(emmeans)
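emm <- emmeans(model, ~ diet) # assumed call: estimated marginal mean for each diet
plot(emm) # plots each estimated mean with its confidence interval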