Revision - Everything after lecture 5! Flashcards

1
Q

What does ANOVA stand for?

A

analysis of variance

2
Q

What can you do if data is not normal and you still want to use a parametric test?

A

Apply a log transformation: log10(x)

If any values are zero, use
log10(x + 1)
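
A minimal Python sketch of this transformation. The sample values are made up for illustration and are not from the lecture.

```python
import math

# Hypothetical right-skewed counts that include a zero
values = [0, 9, 99]

# log10(0) is undefined, so add 1 to every value before transforming
transformed = [math.log10(x + 1) for x in values]
print(transformed)
```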

3
Q

What is the convenient form of variance?

A

Sum of squares (SS)

4
Q

What are sums of squares?

A
The sum of squared deviations from the mean. (The more values the bigger the SS)
e.g.
2, 5, 11 
Mean is (2+5+11)/3 = 6
Deviations from 6 are -4, -1, +5 
Squared deviations are 16, 1, 25 
Sum of squares is 16+1+25=42 
The SS for 2, 5, 11 is 42
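
The same worked example as a short Python sketch (the function name is just an illustrative choice):

```python
def sum_of_squares(values):
    """Return the sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

print(sum_of_squares([2, 5, 11]))  # 42.0
```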
5
Q

How do we account for the number of x values in the sums of squares? (standardise)

A

The mean square:

The sum of squares divided by the degrees of freedom

6
Q

Taking into account the sums of squares, how do we calculate analysis of variance (ANOVA)?

A

SS of all numbers =

SS within samples + SS between samples
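
A small Python sketch of this partition, using three made-up samples (not lecture data). It assumes SS between is computed from each sample mean's deviation from the grand mean, weighted by sample size.

```python
def ss(values, centre):
    """Sum of squared deviations of values from a given centre."""
    return sum((x - centre) ** 2 for x in values)

# Hypothetical samples, chosen so the arithmetic is easy to follow
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)

# SS of all numbers: deviations from the grand mean
ss_total = ss(all_values, grand_mean)
# SS within samples: deviations from each sample's own mean
ss_within = sum(ss(g, sum(g) / len(g)) for g in groups)
# SS between samples: sample means against the grand mean, weighted by n
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

print(ss_total, ss_within, ss_between)  # 60.0 6.0 54.0
```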

7
Q

What does the F statistic test?

A

Whether the variance between samples is greater than we would expect from the variance within samples.
If the variances are equal, F = 1
Reported as F(sample df, error df) = __
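
As a rough sketch, F for a one-way ANOVA can be computed as the between-samples mean square divided by the within-samples (error) mean square. The function name and data below are hypothetical; in practice a statistics package would also supply the p-value.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1               # number of samples - 1
    df_within = len(all_values) - len(groups)  # total observations - number of samples

    # Mean square = sum of squares / degrees of freedom
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f, df_b, df_w = one_way_f([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"F({df_b},{df_w}) = {f:.2f}")  # F(2,6) = 27.00
```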

8
Q

Within the one-way anova, how can you test for differences between samples?

A

Use the Tukey test

9
Q

What is the correlation coefficient r?

A

the degree to which 2 variables are correlated

Varies between 1 (perfect positive) and -1 (perfect negative)

10
Q

What is the range for

a) Very weak correlation
b) modest correlation
c) very strong correlation

A

a) 0 - 0.2
b) 0.4-0.7
c) 0.9-1

11
Q

What is covariance and how is it calculated?

A

A measure of how two variables vary together

sum of products of deviations / degrees of freedom (n-1)
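
A minimal Python sketch of the calculation, assuming "sum of products" means the sum of (x deviation × y deviation) for each pair. The data are invented for illustration.

```python
def covariance(xs, ys):
    """Sample covariance: sum of products of deviations / (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_of_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum_of_products / (n - 1)

print(covariance([1, 2, 3], [2, 4, 6]))  # 2.0
```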

12
Q

How is statistical significance of covariance checked?

A

Looking up the value of r for a given number of degrees of freedom in a table for critical values for r
- In minitab it is a Pearson correlation

13
Q

What are the requirements for using r as a measure of correlation? (6)

A

Data should be continuous or interval variables
The distribution of each variable needs to be normal
Check for normality, e.g. Anderson-Darling test and probability plot
The relationship between x and y must be linear
Check linearity using a plot
If not linear, data transformations can be attempted

14
Q

What is the Rsquared value?

A

The coefficient of determination.
Tells us how well the independent variable(s) fitted in our analysis or model explain the variation in our dependent variable

15
Q

What is the regression line equation and what does each part mean?

A
y = a + bx
y = constant + (slope X number of x units)
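
A sketch of how a and b could be estimated by ordinary least squares, assuming the usual formulas b = (sum of products) / (SS of x) and a = mean of y - b × mean of x. The function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_of_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sum_of_products / ss_x   # slope
    a = mean_y - b * mean_x      # constant (intercept)
    return a, b

a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)        # 1.0 2.0
print(a + b * 5)   # predicted y when x = 5: 11.0
```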
16
Q

what does it mean when a horizontal line is above x or y?

A

The mean of that variable (x̄ is the mean of x, ȳ is the mean of y)

17
Q

What are the main differences between regression and correlation?

A

Regression establishes an equation that assumes x affects y.
Correlation establishes how they co-vary.

Regression can be used for prediction, e.g. x is __ so y is __

Regression uses F statistic and t test to give P values
Correlation uses a correlation coefficient to indicate p values

18
Q

What is the power of a statistical test?

A

Probability that it will yield a statistically significant result when a real effect exists
The power of an analysis can vary from 0 to 1

19
Q

What things affect power?

A
  • Sample size
  • Strength of the effect under study (e.g. strong relationship etc.)
  • The variability of the data
20
Q

What is the power effect size? (d)

A

The strength of the biological effect and its variability are combined into a single measure
e.g. for the difference between 2 means:
d = (m1 - m2)/SD
Usually between 0 and 1, though larger values are possible
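
A tiny Python sketch of the effect size for a difference between two means: the difference in means divided by the standard deviation. The numbers are invented, and the SD is assumed to be a single (e.g. pooled) value supplied by the user.

```python
def effect_size_d(mean1, mean2, sd):
    """Effect size d: difference between two means in units of SD."""
    return (mean1 - mean2) / sd

# Hypothetical means of 10 and 8 with an SD of 4
print(effect_size_d(10.0, 8.0, 4.0))  # 0.5
```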

21
Q

What is the Mann-Whitney test used for?

A

To compare the medians of two unpaired non-parametric samples

22
Q

What is the Wilcoxon test used for?

A

To compare the medians of two paired non-parametric samples

23
Q

What is Spearman's rank used for?

A

Non-parametric, used with variables that are proportions/counts
All observations are converted to ranks
Significance is checked by looking up the value of Spearman's rank coefficient for a given number of observations in a table of critical values

24
Q

Is my data parametric?
If it is not continuous, it usually is parametric/non parametric
If it is non-normal it is parametric/non parametric

A

NOT parametric

NOT parametric

25
Q

When is a chi square test used?

A

To test on counts or frequencies of things on nominal scales (rather than difference or relationship)

26
Q

What are the types of data (e.g. Ordinal) and what do they mean?

A

Categorical/nominal: Non numerical
Ordinal: Obvious order
Quantitative: Continuous numerical
Discrete: Discontinuous (always a whole number)

27
Q

What are the 4 types of distribution?

A

Normal
Binomial
Negative binomial
Poisson

28
Q

What is also referred to as a homogeneity, randomness, association, independence and goodness of fit test?

A

Chi-square test

29
Q

For the chi-square test, If there is a big discrepancy between the frequency we expect from the null hypothesis and the frequency we observe, the value of the calculated test statistic will (be more/less) than the critical value at the appropriate ______.

A

The value of the calculated test statistic will exceed the critical value at the appropriate number of degrees of freedom.
We will have to reject the null hypothesis

30
Q

What are the assumptions of a chi-square test?

A

1) Random sampling

2) Independent observations

31
Q

How would you test homogeneity (similarity) of data in a program that generates random integers from 0 to 9?

A

1) Generate 100 integers
2) Compare generated frequencies to expectation
3) calculate chi-squared
4) Determine the degrees of freedom (number of categories - 1) = 10 - 1 = 9
5) Check the critical value in a table at the 0.05 significance level.
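
The steps above can be sketched in Python as follows. This is only an illustration: the seed is arbitrary, the conventional 0.05 significance level is assumed, and 16.919 is the tabulated chi-squared critical value for 9 degrees of freedom at that level.

```python
import random

CRITICAL_VALUE = 16.919  # chi-squared critical value, 9 df, 0.05 level (from tables)

# 1) Generate 100 integers from 0 to 9 (fixed, arbitrary seed for repeatability)
rng = random.Random(1)
draws = [rng.randint(0, 9) for _ in range(100)]

# 2) Compare generated frequencies to expectation (10 of each digit)
observed = [draws.count(digit) for digit in range(10)]
expected = len(draws) / 10

# 3) Calculate chi-squared: sum of (observed - expected)^2 / expected
chi_squared = sum((o - expected) ** 2 / expected for o in observed)

# 4) Degrees of freedom = number of categories - 1
df = len(observed) - 1

# 5) Compare with the critical value
print(f"chi-squared = {chi_squared:.2f}, df = {df}")
if chi_squared > CRITICAL_VALUE:
    print("Reject H0: frequencies differ from a uniform expectation")
else:
    print("Do not reject H0: no evidence against homogeneity")
```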

32
Q

Pearson's: What does p = 0.008 actually mean?

A

The probability of getting the results obtained (or results more extreme) if the null hypothesis is true is 0.008

33
Q

What was the first immortal human cell line?

Where were the cells derived from?

A

HeLa

Derived from cervical cancer cells

34
Q

What are the 3 R’s in animal experimentation approval?

A

Replacement
Reduction
Refinement

35
Q

When is ethical approval not required?

A
  • For immortalised cell lines e.g. HeLa
  • Using hair and nails from living persons
  • On data that is freely available to the public and doesn’t use personal data
36
Q

Risk =

A

Risk = Likelihood X Hazard

37
Q

The funding body ____ wants all UK Universities to have a data management policy in place by May 2015

A

EPSRC

38
Q

Standard deviation and variance are very similar. Both are used to find the typical or average distance a value is to the mean…
but what is the difference between them?

A

The standard deviation is the square root of the variance. The variance is the average squared deviation from the mean, so it stays in squared units because the square root is not taken.

39
Q

If your data is more spread out (has more variability) then you will have a higher/lower standard deviation

A

higher

40
Q

What is the coefficient of variation?

A

Standard deviation / mean
(helps interpret the magnitude of the standard deviation)

e.g. If the standard deviation is .20 and the mean is .50, then the cv = .20/.50 = .4 or 40%
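
The same example as a short Python sketch (the function name is an illustrative choice):

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean

cv = coefficient_of_variation(0.20, 0.50)
print(f"cv = {cv:.2f} or {cv:.0%}")  # cv = 0.40 or 40%
```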

41
Q

What is the empirical rule of standard deviation?

A

that the bulk of the data cluster around the mean in a normal distribution
68% of values fall within ±1 standard deviation of the mean
95% fall within ± 2 standard deviations of the mean
99.7% fall within ± 3 standard deviations of the mean
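
These percentages can be checked against the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the exact figures come out close to 68.3%, 95.4% and 99.7%.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def proportion_within(k):
    """Proportion of a normal distribution within ±k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {proportion_within(k):.1%}")
```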

42
Q

What is a popular way to show Q1, Q3, median and IQR

A

Box plot or box and whisker graph

43
Q

If the resulting P-value of Levene’s test is less than some critical value (typically 0.05), the obtained differences in sample variances are unlikely to have occurred based on random sampling from a population with equal variances. Thus, the null hypothesis of equal variances is ________

A

rejected

44
Q

In the Levene’s test, what does
= 0.56, p=0.651
suggest about the homogeneity of the variances?

A

There IS homogeneity (p > 0.05) because the null hypothesis that the variances are equal is not rejected

45
Q

What does 0.45 power mean

A

45% chance of getting a significant result

46
Q
Difference = mu (1) - mu (2) 
Estimate for difference: -2.887 
95% CI for difference: (-4.900, -0.874) 
T-Test of difference = 0 (vs not =): T-Value = -2.91 P-Value = 0.006 DF = 36 
Both use Pooled StDev = 3.0548 

With this information, complete the following:
t__ = ___, p=____. H0 is _____ and HA ______

A

t36 = -2.91, p=0.006. H0 is rejected and HA accepted

47
Q

Pearson’s r = 0.940, p < 0.001
The regression equation is
Hg in blood (ng/g) = - 20.6 + 0.641 Methyl Hg intake (mu g/day)
What is parameter b?

A

0.641

48
Q

What would you conclude about the correlation and significance of the following:
Pearson’s r = -0.814, p < 0.001

A

there is a strong negative correlation that is highly significant

49
Q

In a 2-way chi-square what is the rule for calculating the number of degrees of freedom?

A

2 variables: (number of columns - 1) × (number of rows - 1)

50
Q

What is the z value and how do you calculate it

A

(Z is otherwise known as a standard score)
It indicates how many standard deviations an element is away from the mean.
z = (sample proportion (p) - hypothesised proportion (p0) ) / Standard error
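
A sketch of the z calculation for a sample proportion. The card does not say how the standard error is obtained; this example assumes the usual one-sample form, sqrt(p0 × (1 - p0) / n), and uses invented numbers.

```python
import math

def z_for_proportion(p, p0, n):
    """z score of a sample proportion p against a hypothesised proportion p0."""
    # Assumption: standard error under H0 for a sample of size n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p - p0) / standard_error

# Hypothetical example: 60 successes out of 100 against a hypothesised 0.5
z = z_for_proportion(0.6, 0.5, 100)
print(round(z, 2))  # 2.0
```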

51
Q

What is accuracy?

A

the closeness to the real value.

e.g. the units of measurement 5g vs 5.1g

52
Q

What is precision?

A

the closeness of repeated measures to the same value…

e.g. using the same balance to weigh something

53
Q

What are derived variables?

A

Usually calculated from two or more other variables, for example ratios or percentages

54
Q

What is a distribution in stats?

A

An assumption of where the data will lie

55
Q

If the variance is greater than the mean then the population is more ______ than random distribution

A

clumped/aggregated

56
Q

If the variance is less than the mean then it is more ____ than random

A

ordered/uniform

57
Q

In binomial distribution, there is a ____ distribution of number of events. When there are 2 possible outcomes for an event the probability of each is ___.

A

In binomial distribution, there is a discrete distribution of number of events. When there are 2 possible outcomes for an event the probability of each is constant/equal.

58
Q

If the distribution of individuals were highly clumped or aggregated, quadrats used to sample variance from this population would show that variance in number of individuals per quadrat would be greater/less than the mean

A

greater variance than the mean

59
Q

Negative binomial distribution is a discrete distribution that can be used to describe _____ data, and therefore variance is ____ than the mean.

A

Negative binomial distribution is a discrete distribution that can be used to describe clumped/aggregated data, and therefore variance is greater than the mean.

60
Q

What methods can you use to test for a normal distribution? (4)

A

Kolmogorov-Smirnov
Anderson-Darling
Shapiro-Wilk
Chi-square goodness of fit

61
Q

In a normal distribution ____% of the observations fall within 1 standard deviation of the mean
____% fall within 2 SD
____% fall within 3 SD

A
  1. 68%
  2. 95%
  3. 99.7%
62
Q

What is Kurtosis

A

The measure of shape/flatness of distribution