Revision - Everything after lecture 5! Flashcards
(62 cards)
What does ANOVA stand for?
analysis of variance
What can you do if data is not normal and you still want to use a parametric test?
Transform the data, e.g. log10(x)
If any values are zero, use
log10(x+1)
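A minimal sketch of the transformation, assuming the data are non-negative. The function name `log10_transform` and the example counts are made up for illustration.

```python
import math

def log10_transform(values):
    """Return log10(x + 1) if any value is zero, otherwise log10(x)."""
    if any(v < 0 for v in values):
        raise ValueError("log transform needs non-negative values")
    # Add 1 to every value only when a zero is present, as on the card.
    offset = 1 if any(v == 0 for v in values) else 0
    return [math.log10(v + offset) for v in values]

counts = [0, 9, 99, 999]      # made-up, right-skewed data containing a zero
transformed = log10_transform(counts)
print(transformed)
```

Normality should be re-checked on the transformed values before running the parametric test.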
What is the convenient form of variance?
Sum of squares (SS)
What are sums of squares?
The sum of squared deviations from the mean. (The more values, the bigger the SS.)
e.g. 2, 5, 11
Mean is (2+5+11)/3 = 6
Deviations from 6 are -4, -1, +5
Squared deviations are 16, 1, 25
Sum of squares is 16+1+25 = 42
The SS for 2, 5, 11 is 42
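The same calculation as a short sketch. The function name `sum_of_squares` is just an illustrative choice.

```python
def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

ss = sum_of_squares([2, 5, 11])
print(ss)  # 42.0
```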
How do we account for the number of x values in the sums of squares? (standardise)
The mean square:
The sum of squares divided by the degrees of freedom
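As a worked line, assuming the three values 2, 5, 11 (SS = 42) and the usual n - 1 degrees of freedom for a single sample:

```latex
\[
\text{MS} = \frac{\text{SS}}{\text{df}} = \frac{42}{3 - 1} = 21
\]
```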
Taking into account the sums of squares, how do we calculate analysis of variance (ANOVA)?
SS of all numbers =
SS within samples + SS between samples
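A small sketch that checks this partition on three made-up samples. The data and variable names are assumptions for illustration only.

```python
def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

samples = [[2, 5, 11], [8, 10, 12], [1, 3, 5]]   # made-up data

all_values = [v for s in samples for v in s]
grand_mean = sum(all_values) / len(all_values)

# SS of all numbers, ignoring which sample each value came from
ss_total = sum_of_squares(all_values)
# SS within samples: each sample's SS around its own mean, added up
ss_within = sum(sum_of_squares(s) for s in samples)
# SS between samples: each sample mean's squared deviation from the
# grand mean, weighted by the sample size
ss_between = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples)

print(ss_total, ss_within + ss_between)
```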
What does the F statistic test?
To find out if the variance between samples is greater than we would expect from the variance within samples.
If the two variances are equal, F = 1
Reported as F(sample df, error df) = __
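A hedged sketch of how F could be computed by hand for a one-way design: the mean square between samples divided by the mean square within samples (error). The function name `one_way_f` and the data are made up; looking up the P value still needs an F table or a statistics package.

```python
def one_way_f(samples):
    """Return (F, df between samples, df within samples) for a one-way ANOVA."""
    all_values = [v for s in samples for v in s]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(s) / len(s) for s in samples]

    ss_between = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    ss_within = sum((v - m) ** 2 for s, m in zip(samples, means) for v in s)

    df_between = len(samples) - 1                 # number of samples - 1
    df_within = len(all_values) - len(samples)    # total n - number of samples

    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f, df_b, df_w = one_way_f([[2, 5, 11], [8, 10, 12], [1, 3, 5]])  # made-up data
print(f"F({df_b},{df_w}) = {f:.2f}")  # F(2,6) = 3.83
```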
Within the one-way ANOVA, how can you test for differences between samples?
Use the Tukey test
What is the correlation coefficient r?
the degree to which 2 variables are correlated
Varies between 1 (perfect positive) and -1 (perfect negative)
What is the range for
a) Very weak correlation
b) modest correlation
c) very strong correlation
a) 0 - 0.2
b) 0.4-0.7
c) 0.9-1
What is covariance and how is it calculated?
A measure of how two variables vary together (the basis of the correlation coefficient)
Sum of products of the deviations from each mean / degrees of freedom (n-1)
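A minimal sketch of the calculation. It also shows r as the covariance divided by the product of the two standard deviations, which is the standard definition of Pearson's r rather than something stated on the card. The data and function names are made up.

```python
import math

def covariance(xs, ys):
    """Sum of products of deviations divided by the degrees of freedom (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_of_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum_of_products / (n - 1)

def pearson_r(xs, ys):
    """Covariance scaled by both standard deviations, so r lies in [-1, 1]."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

xs = [1, 2, 3, 4, 5]      # made-up data
ys = [2, 4, 5, 4, 5]
print(covariance(xs, ys))           # 1.5
print(round(pearson_r(xs, ys), 2))  # 0.77
```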
How is statistical significance of covariance checked?
By looking up the value of r for a given number of degrees of freedom in a table of critical values for r
- In Minitab this is a Pearson correlation
What are the requirements for using r as a measure of correlation? (6)
Data should be continuous or interval variables
The distribution of each variable needs to be normal
Check for normality, e.g. with an Anderson-Darling test and a probability plot
The relationship between x and y must be linear
Check linearity using a plot
If not linear, data transformations can be attempted
What is the Rsquared value?
The coefficient of determination.
Tells us how much of the variation in the dependent variable is explained by the independent variable(s) fitted in the model, i.e. whether the model explains the data satisfactorily
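A sketch of one common way to compute it for a single x variable, assuming a straight line fitted by least squares: R squared = 1 - (SS of the residuals / SS of y). The function name and data are made up.

```python
def r_squared(xs, ys):
    """Proportion of the variation in y explained by a least-squares line on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x

    # Variation left over after fitting the line
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation in y around its mean
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total

r2 = r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])   # made-up data
print(round(r2, 2))  # 0.6
```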
What is the regression line equation and what does each part mean?
y = a + bx
y = constant (intercept) + (slope × number of x units)
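A minimal sketch of fitting a and b by least squares, using slope = sum of products / SS of x. The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_of_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sum_of_products / ss_x      # slope
    a = mean_y - b * mean_x         # constant: the line passes through both means
    return a, b

a, b = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])   # made-up data
print(round(a, 2), round(b, 2))   # 2.2 0.6
print(round(a + b * 6, 2))        # predicted y when x = 6: 5.8
```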
What does it mean when a horizontal line (bar) is above x or y?
The mean of that variable (x-bar is the mean of x, y-bar is the mean of y)
What are the main differences between regression and correlation?
Regression establishes an equation that assumes x affects y.
Correlation establishes how they co-vary.
Regression can be used for prediction, e.g. x is __ so y is __
Regression uses F statistic and t test to give P values
Correlation uses a correlation coefficient to indicate p values
What is the power of a statistical test?
The probability that it will yield a statistically significant result when a real effect exists
The power of an analysis can vary from 0 to 1
What things affect power?
- Sample size
- Strength of the effect under study (e.g. strong relationship etc.)
- The variability of the data
What is the power effect size? (d)
A single measure that combines the strength of the biological effect and its variability
e.g. for the difference between 2 means:
d = (m1-m2)/SD
Usually falls in the range 0-1, but can be larger for very strong effects
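A hedged sketch of d for two made-up samples, followed by a rough look at how power rises with sample size. Two assumptions not on the card: the SD is taken as the pooled SD of the two samples, and power is approximated with a normal distribution for a two-sided test at alpha = 0.05. Dedicated power software should be preferred for real planning.

```python
import math
from statistics import NormalDist, mean, stdev

def cohens_d(sample_1, sample_2):
    """d = (m1 - m2) / SD, with SD taken as the pooled SD (an assumption)."""
    n1, n2 = len(sample_1), len(sample_2)
    pooled_var = ((n1 - 1) * stdev(sample_1) ** 2
                  + (n2 - 1) * stdev(sample_2) ** 2) / (n1 + n2 - 2)
    return (mean(sample_1) - mean(sample_2)) / math.sqrt(pooled_var)

def approx_power(d, n_per_group, alpha=0.05):
    """Normal-approximation power for comparing two equal-sized groups."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(d) * math.sqrt(n_per_group / 2) - z_crit)

group_1 = [10, 12, 14, 16, 18]   # made-up data
group_2 = [8, 10, 12, 14, 16]
d = cohens_d(group_1, group_2)
print(round(d, 2))  # 0.63

# Same effect size, larger samples: power increases
for n in (5, 20, 80):
    print(n, round(approx_power(d, n), 2))
```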
What is the Mann-Whitney test used for?
To compare the medians of two unpaired non-parametric samples
What is the Wilcoxon test used for?
To compare the medians of two paired non-parametric samples
What is Spearman's rank used for?
A non-parametric measure of correlation, used with variables that are proportions/counts
All observations are converted to ranks
Significance is checked by looking up the value of Spearman's rank for a given number of observations in a table of critical values
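A sketch of the ranking step and the resulting coefficient, assuming tied values share their average rank and that the coefficient is Pearson's r calculated on the ranks (the standard approach). Function names and data are made up.

```python
import math

def ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_of_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_of_products / math.sqrt(ss_x * ss_y)

def spearman_rank(xs, ys):
    """Convert both variables to ranks, then correlate the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

rho = spearman_rank([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])   # made-up data
print(round(rho, 2))  # 0.8
```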
Is my data parametric?
If it is not continuous, it is usually: parametric / non-parametric?
If it is non-normal, it is: parametric / non-parametric?
NOT parametric
NOT parametric