Part 1 Flashcards

(83 cards)

1
Q

Observation –> … –> … –> …

A

question, hypothesis, prediction

2
Q

Observation: Gammarus occurs almost entirely under stones (rather than open streams)

Question: … … Gammarus spend most of its time under stones?

A

why does

3
Q

Hypothesis - an … proposed to account for observed facts - there is often more than one hypothesis generated
e.g.

Gammarus occurs under stones because:

  • they need to shelter from the current
  • their food gets trapped and accumulates under stones
  • they are subject to predation by visually hunting fish and need to remain out of sight
A

explanation

4
Q

Predictions - what you would … … … if the hypothesis was true - should be testable and ideally unique to hypothesis it is based on

e.g. shelter hypothesis - a greater proportion of Gammarus should be found in the open in streams with slow flow (or in slower-flowing areas of a stream)

predation hypothesis - Gammarus should aggregate under stones more in streams where fish are present than where they are not

A

expect to see

5
Q

Hypotheses are … or not …, but rarely …

A

rejected, rejected, proved

  • just because one hypothesis is supported doesn’t mean there isn’t another underlying explanation (we can’t think of all possible hypotheses), but with the right evidence we can be sure that a hypothesis cannot be true
6
Q

Cycle of proposing hypotheses and then seeking evidence potentially capable of falsifying them is the scientific process often termed …

A

falsificationism

7
Q

A variable is…

A

any characteristic that can be measured or experimentally controlled on different items or objects

  • numeric or non-numeric (e.g. colour)
8
Q

A set of related variables is known as a … …

A

data set

9
Q

Numeric variables can be categorised as belonging to … or … scales

A

interval, ratio

10
Q

Categorical variables can be characterised as … or …

A

nominal, ordinal

11
Q

Nominal variables…

A

arise when observations are recorded as categories that have no natural ordering relative to one another, e.g. marital status, sex, colour morph

12
Q

Ordinal variables…

A

occur when observations can be assigned some meaningful order, but where the exact ‘distance’ between items is not fixed, or even known, e.g. degree of aggressiveness sorted into the categories: initiates attack (3), aggressive display (2), ignores (1), retreats (0).

Rank orderings are also a type of ordinal data (e.g. place in a race - 1st 2nd 3rd etc.)

  • can say something about relationship between categories: larger score = more aggressive response, greater score = slower runner. But cannot say aggressiveness score of 2 is twice as aggressive as a score of 1
13
Q

Interval scale variables take values on a … numerical scale, but where the scale starts at an … point. e.g. … on a … scale but not on a … scale

A

consistent, arbitrary, temperature, Celsius, Kelvin

  • can say difference between 60 and 70 degrees C is the same as that between -20 and -10, but cannot say 60 degrees C is double the temperature of 30 degrees C
14
Q

Ratio scale variables have a true … and a known consistent mathematical relationship between any points on the measurement scale, e.g. … scale for temperature

A

zero, kelvin

  • on Kelvin scale 60K is double the temperature of 30K
15
Q

Can meaningfully … or … with interval scales, but cannot meaningfully …, as you can with ratio scales

A

add, subtract, multiply
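
The interval vs ratio distinction can be checked with a little arithmetic. This is a minimal Python sketch (the deck itself refers to R); the helper name `celsius_to_kelvin` is made up for illustration.

```python
def celsius_to_kelvin(temp_c):
    """Convert a Celsius reading (interval scale) to Kelvin (ratio scale)."""
    return temp_c + 273.15

# Differences are meaningful on an interval scale:
# 70 - 60 degrees C is the same step as -10 - (-20) degrees C.
diff_high = 70 - 60
diff_low = -10 - (-20)

# Ratios are not: 60 / 30 = 2 on the Celsius scale, but the
# temperatures themselves differ by a factor of only about 1.10.
naive_ratio = 60 / 30
true_ratio = celsius_to_kelvin(60) / celsius_to_kelvin(30)

print(diff_high == diff_low)
print(naive_ratio, round(true_ratio, 3))
```

Only after converting to a scale with a true zero does the ratio describe the physical quantity.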

16
Q

In general … variables are the best suited to statistical analysis

A

ratio

17
Q

Accuracy is…

A

how close a measurement is to the true value

18
Q

Precision is…

A

how repeatable a measure is, irrespective of whether it is close to the true value

19
Q

The number of … … we use suggests something about the precision of the result. A value of 12.4 actually measured with the same precision as 12.735 should properly be written …

A

significant figures, 12.400

20
Q

Usually the worst form of error is …, a … lack of accuracy

A

bias, systematic (the data are not just inaccurate but all tend to deviate from the true measurements in the same direction)
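
Accuracy, precision and bias can be told apart in a small simulation. This is a rough Python sketch, not part of the original deck; the true value, offsets and spreads are invented numbers.

```python
import random
import statistics

random.seed(1)
TRUE_VALUE = 10.0  # hypothetical true value of the quantity being measured

def measure(n, offset, spread):
    """Simulate n measurements with a systematic offset and random spread."""
    return [random.gauss(TRUE_VALUE + offset, spread) for _ in range(n)]

def summarise(values):
    bias = statistics.mean(values) - TRUE_VALUE  # systematic lack of accuracy
    spread = statistics.stdev(values)            # lack of precision (repeatability)
    return bias, spread

# Precise but biased: tightly clustered, all shifted in the same direction.
biased_bias, biased_spread = summarise(measure(1000, offset=2.0, spread=0.1))

# Imprecise but unbiased: scattered, yet centred on the true value.
unbiased_bias, unbiased_spread = summarise(measure(1000, offset=0.0, spread=2.0))

print(round(biased_bias, 2), round(biased_spread, 2))
print(round(unbiased_bias, 2), round(unbiased_spread, 2))
```

Taking more measurements shrinks the effect of random spread on the mean, but it does nothing to remove the offset, which is why bias is usually the worse problem.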

21
Q

E.g.s of bias:

  • …-… sampling
  • … of biological material
  • … by the process of investigation (e.g. adrenaline increased by process of sampling adrenaline in blood)
  • … bias
A

non-random (selective sampling techniques), conditioning, interference, investigator

22
Q

What does "population" mean in statistics?

A

Any group of items that share certain attributes or properties

23
Q

The goal of statistics is to learn something about … by … data collected from them

A

populations, analysing

24
Q

Statistical populations are defined by the …

A

investigator

25
What is a population parameter?
A numeric quantity that describes a particular aspect of the variables in the population (i.e. a feature of the distribution of the variables in the population) - e.g. population mean, variance, correlation
26
The sample chosen must be as ... as possible of the whole population
representative
27
A point estimate is useless on its own, as estimates are always derived from a ... ... of the wider population. They must be accompanied by a value of ....
limited sample, uncertainty
28
The chance variation that arises in different estimates using different random samples is known as ... ...
sampling error (or sampling variation)
29
The sampling distribution is the distribution we expect a particular estimate to follow
yes
30
Sample size is often denoted as "..."
n
31
Sampling error is ... as sample size is ...
reduced, increased
32
The standard error of an estimate is the ... ... of its ... ...
standard deviation, sampling distribution
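
The last few cards (sampling error, sample size, standard error) can be tied together by simulation. This is a Python sketch with an invented normal population (mean 50, SD 10); it compares the simulated standard error of the mean with the textbook value SD / sqrt(n).

```python
import math
import random
import statistics

random.seed(42)
POP_MEAN, POP_SD = 50.0, 10.0  # hypothetical population parameters

def simulated_standard_error(n, repeats=2000):
    """SD of the sample mean across many random samples of size n,
    i.e. the SD of the (simulated) sampling distribution."""
    means = []
    for _ in range(repeats):
        sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)

se_small = simulated_standard_error(10)
se_large = simulated_standard_error(100)
theory_small = POP_SD / math.sqrt(10)
theory_large = POP_SD / math.sqrt(100)

print(round(se_small, 2), round(theory_small, 2))
print(round(se_large, 2), round(theory_large, 2))
```

The larger sample gives the smaller standard error, which is the "sampling error is reduced as sample size is increased" card in numbers.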
33
R doesn't like ...
percentages (use decimals e.g. 0.4 to represent 40%)
34
... statistics works by asking "what would have happened if we were to repeat an experiment or collection exercise many times, assuming that the ... remains the same each time"
Frequentist, population (we then work out how likely a particular result is, based on the resulting distribution of outcomes)
35
The two most important ideas in frequentist statistics are ...-... and ... ...
p-values, statistical significance
36
Sampling with replacement: each artificial sample is called a ... ...
bootstrapped sample
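
Bootstrapping takes only a few lines to sketch. This Python example uses made-up measurements; `bootstrap_means` is an illustrative helper, not a library function.

```python
import random
import statistics

random.seed(7)
sample = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9]  # made-up measurements

def bootstrap_means(data, n_boot=1000):
    """Mean of each bootstrapped sample (same size as data, drawn with replacement)."""
    means = []
    for _ in range(n_boot):
        resample = random.choices(data, k=len(data))  # sampling with replacement
        means.append(statistics.mean(resample))
    return means

boot_means = bootstrap_means(sample)
boot_se = statistics.stdev(boot_means)  # bootstrap estimate of the standard error

print(round(statistics.mean(sample), 3), round(boot_se, 3))
```

The spread of the bootstrapped means approximates the sampling distribution of the mean without collecting any new data.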
37
If a probability (p) value is less than the chosen ... ..., we say the result is statistically significant
significance level
38
The process of assigning random labels is called ...
permutation
39
The p-value is the ... of obtaining a test statistic equal to or 'more extreme' than the ... value, assuming the ... hypothesis is true
probability, estimated, null
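
A permutation test makes the p-value definition concrete. In this Python sketch the two groups are invented data and the test statistic is taken to be the absolute difference in group means (an assumption; other statistics work too).

```python
import random
import statistics

random.seed(3)
group_a = [12.1, 13.4, 11.8, 14.0, 13.1, 12.7]  # made-up data
group_b = [10.2, 9.8, 11.0, 10.5, 9.9, 10.8]

def permutation_p_value(a, b, n_perm=5000):
    """Share of random relabellings whose difference in means is at least
    as extreme as the observed one (two-sided)."""
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = a + b
    extreme = 0
    for _ in range(n_perm):
        random.shuffle(pooled)  # assign group labels at random
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(perm_a) - statistics.mean(perm_b)) >= observed:
            extreme += 1
    return extreme / n_perm

p_value = permutation_p_value(group_a, group_b)
print(p_value)
```

Shuffling the labels simulates the null hypothesis of no difference between groups; a small p-value means the observed difference is rarely matched by chance relabelling alone.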
40
All frequentist statistical tests work by specifying a ... ... and then evaluating the observed data to see if they ... from the ... ... in a way that is inconsistent with ... variation
null hypothesis, deviate, null hypothesis, sampling
41
H0 is the ... hypothesis and H1 is the ... (or ...) hypothesis
null, test, alternative
42
The alternative hypothesis is essentially a statement of the effect we are ... ... ...
expecting to see (e.g. purple and green plants differ in their mean size)
43
... the null hypothesis is not ... the alternative hypothesis
rejecting, proving
44
Large p value means observed result is quite likely if the null hypothesis is ...
true (i.e. due to sampling variation) - we cannot reject the null hypothesis (which is not the same as accepting that the null hypothesis is true)
45
Do not confuse ... significance with ... significance
statistical, biological - a result may be statistically significant but biologically trivial, e.g. a difference in pH between open water (7.1) and beds of submerged vegetation (6.9) may be statistically significant, but it is a very small effect and almost certainly of no importance to the invertebrates.
46
The significance of a result depends on a combination of three things: 1. The size of the true effect in the ... 2. The ... of the data 3. The ... size
population, variability, sample
47
We must always evaluate the ... of an analysis to determine whether or not we trust it
assumptions
48
In conceptual terms, the statistical models we use describe data in terms of a ... component and a ... component
systematic, random (observed data = systematic component + random component)
49
The normal distribution is completely described by its ... (a measure of "central ...") and its ... ... (a measure of dispersion)
mean, tendency, standard deviation
50
If a variable is normally distributed, then about ... of its values will fall inside an interval that is ... standard deviations wide
95%, four
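
The "about 95% within four standard deviations" rule can be checked with Python's standard-library `statistics.NormalDist`. The check is done on a standard normal, but the result holds for any mean and SD.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# An interval four SDs wide, centred on the mean, runs from -2 SD to +2 SD.
within_two_sd = standard_normal.cdf(2.0) - standard_normal.cdf(-2.0)

# Half-width (in SDs) of the interval that contains exactly 95%.
exact_half_width = standard_normal.inv_cdf(0.975)

print(round(within_two_sd, 4), round(exact_half_width, 2))
```

The interval from -2 SD to +2 SD holds slightly more than 95% of values; the exact 95% interval uses about 1.96 SDs either side of the mean.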
51
The variable name on the left of the ~ must be the variable whose...
mean we want to compare. The variable on the right must be the indicator variable that says which group each observation belongs to.
52
Correlations are statistical measures that quantify an ... between two ... variables
association, numeric (by contrast, a two-sample t-test compares a numeric variable between the groups of a categorical variable)
53
A correlation quantifies, via a ... ..., the degree to which an association tends to a certain pattern
correlation coefficient
54
If there is no relationship between the variables, the correlation coefficient will be .... The closer to ... the value, the weaker the relationship. A perfect correlation will be either ... or ..., depending on the direction.
zero, zero, +1, -1
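
The range of the correlation coefficient can be seen with a hand-rolled Pearson's r. This is a Python sketch assuming Pearson's coefficient is the one meant; the data are invented to hit the three reference cases.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two numeric variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # perfect positive association
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # perfect negative association
r_none = pearson_r(x, [3, 1, 4, 1, 3])       # no linear association

print(round(r_positive, 3), round(r_negative, 3), round(r_none, 3))
```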
55
A regression (not a correlation) allows us to make...
predictions about the value of one variable from the value of a second variable - as a line is fitted through the data
56
A simple linear regression allows us to predict how one variable (... ...) responds to another (... ...), using a straight-line relationship
response variable, predictor variable
57
How do we find line of best fit?
It is the line with the lowest residual sum of squares (residuals are the vertical distances of the points from the fitted line)
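
The least-squares idea can be sketched directly. This Python example uses made-up data; `fit_line` and `residual_ss` are illustrative helper names. It uses the standard closed-form solution for the slope and intercept, then confirms that nudging the line increases the residual sum of squares.

```python
def fit_line(x, y):
    """Slope and intercept of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residual_ss(x, y, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

x = [1, 2, 3, 4, 5]               # predictor variable (made-up)
y = [2.1, 3.9, 6.2, 7.8, 10.1]    # response variable (made-up)

slope, intercept = fit_line(x, y)
best_rss = residual_ss(x, y, slope, intercept)
worse_rss = residual_ss(x, y, slope + 0.5, intercept)  # a deliberately worse line

print(round(slope, 2), round(intercept, 2), best_rss < worse_rss)
```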
58
Response variable on ... axis, predictor variable on ... axis
y, x
59
Regression model: ... variable on the left of the ~, ... variable on the right
response, predictor
60
Larger F values indicate a stronger relationship between...
x and y
61
ANOVA: (1) Measure the total variation using the sum of squares of deviations from the ... ..., the ... variation (within-group variation: sum of squares of deviations from the individual group means), and the between-group variation (sum of squares of deviations of the group ... from the ... ...). (2) Convert these to measures of variability that don't scale with sample size and number of groups (using ... ... ...); each of the three sums of squares has its own d.f. value: total d.f., treatment d.f., error d.f. (3) Calculate mean square = sum of squares / degrees of freedom.
grand mean, residual, means, grand mean, degrees of freedom
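
The ANOVA bookkeeping can be followed on a tiny data set. This is a Python sketch with three invented treatment groups; the variable names are illustrative.

```python
groups = {
    "control": [4.0, 5.0, 6.0],   # made-up data
    "low": [6.0, 7.0, 8.0],
    "high": [9.0, 10.0, 11.0],
}

def mean(values):
    return sum(values) / len(values)

all_values = [v for values in groups.values() for v in values]
grand_mean = mean(all_values)

# Sums of squares: total, within-group (error) and between-group (treatment).
ss_total = sum((v - grand_mean) ** 2 for v in all_values)
ss_error = sum((v - mean(vals)) ** 2 for vals in groups.values() for v in vals)
ss_treatment = sum(len(vals) * (mean(vals) - grand_mean) ** 2
                   for vals in groups.values())

# Degrees of freedom for each sum of squares.
df_total = len(all_values) - 1
df_treatment = len(groups) - 1
df_error = len(all_values) - len(groups)

# Mean squares and the F ratio (treatment variation vs error variation).
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_ratio = ms_treatment / ms_error

print(ss_total, ss_treatment, ss_error, f_ratio)
```

The treatment and error sums of squares add up to the total, and the F ratio is the single comparison ANOVA makes: treatment variation against error variation.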
62
Squaring negative deviations leads to...
a positive number
63
The important message is that ANOVA works by making just one comparison: the ... variation and the ... variation
treatment, error
64
One-way anova does not require ... ...
equal replication - it will work even where sample sizes differ between treatments
65
An experimental factor is a controlled variable whose levels are...
set by the experimenter
66
Anova p-value of lower than 0.05 suggests that...
at least one of the treatments is having an effect - global test of significance as it doesn't tell us anything about which means are different
67
Find standard error stuff in...
one-way anova section
68
Left skew: ... the data. Right skew: ... the data
square, log (i.e. apply a square or a log transformation)
69
Independence: value of measurement from one object is not...
affected by the values of other objects
70
Pseudoreplication is an ... increase in the ... ... (and hence d.f.) caused by using ...-... data
artificial, sample size, non-independent
71
To carry out a t-test on paired data we have to: 1. Find the mean ... of all the pairs 2. evaluate whether this is significantly different from .... This is actually an application of the ...-... ...-...
difference, zero, one-sample t-test
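
The paired t-test recipe reduces to a one-sample calculation on the differences. This Python sketch uses invented before/after measurements on the same five subjects; the p-value would then come from a t distribution with n - 1 degrees of freedom, which is not computed here.

```python
import math
import statistics

before = [10.2, 11.5, 9.8, 12.0, 10.9]  # made-up paired measurements
after = [11.0, 12.1, 10.5, 12.4, 11.8]

# Step 1: reduce each pair to a single difference, then average them.
differences = [a - b for a, b in zip(after, before)]
mean_diff = statistics.mean(differences)

# Step 2: one-sample t-test of the mean difference against zero.
se_diff = statistics.stdev(differences) / math.sqrt(len(differences))
t_statistic = (mean_diff - 0.0) / se_diff
degrees_of_freedom = len(differences) - 1

print(round(mean_diff, 2), round(t_statistic, 2), degrees_of_freedom)
```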
72
In paired t-tests there is no need for the original data to be drawn from a ... .... It is the differences between pairs that must be
normal distribution
73
What does RCBD stand for?
Randomised Complete Block Design - each block sees each treatment exactly once
74
... what you can; ... what you cannot
block, randomise
75
The only thing that distinguishes ANOVA and regression is the...
type of predictor variable they accommodate (categorical vs numerical)
76
ANCOVA: residuals are generated for: 1. Separate means vs grand mean; 2. Common slope vs separate means; 3. Separate slopes vs common slope (interaction)
yes
77
The word "treatment" should be used for ... rather than ... studies
experimental, observational
78
Chi-squared tests must be carried out on the actual ..., not ... or ..., or the ... of data
counts, percentages, proportions, means
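
The chi-squared statistic for a contingency table is built from raw counts. This Python sketch uses an invented 2 x 2 table and applies no continuity correction, so the statistic may differ from software that applies one by default.

```python
observed = [[30, 10],
            [20, 40]]  # made-up counts, not percentages or proportions

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell if rows and columns were independent.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)

print(round(chi_squared, 3), degrees_of_freedom)
```

Feeding in percentages instead of counts would change the statistic, because its size depends on how many observations lie behind each cell.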
79
A non-parametric test is just a catch-all term that applies to any test which doesn't assume the data are...
drawn from a specific distribution
80
Chi-squared tests are ...-...
non-parametric - as they make weak assumptions about the frequency data
81
Non-parametric test calculations are done using the ... ... of the data
rank order
82
Paired t-test: Distribution of ... does not need to be normal! Only distribution of ... does!
samples, differences (if the differences are not normally distributed, a Wilcoxon signed-rank test can be used instead)
83
Mann-Whitney U null hypothesis: ... are the same
medians (looking for differing central tendency) - significant p-value means medians are likely to be different
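
Rank-based tests such as the Mann-Whitney U test work from the ordering of the data. This Python sketch ranks the pooled data (ties share the average rank) and computes the U statistic; `average_ranks` and `mann_whitney_u` are illustrative helpers, and converting U to a p-value is left out.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Smaller of the two U statistics for samples a and b."""
    ranks = average_ranks(a + b)
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

print(average_ranks([10, 20, 20, 30]))
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

Because only the ranks enter the calculation, the result does not change if the data are transformed in any way that preserves their order.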