BIO 330 Flashcards

1
Q

sampling error imposes

A

imprecision (accuracy intact)

caused by chance

2
Q

sampling bias imposes

A

inaccuracy (precision intact)

3
Q

accurate sample

A

unbiased

4
Q

precise sample

A

low sampling error

5
Q

good sample

A

accurate
precise
random
large

6
Q

2 types of data

A

numerical

categorical

7
Q

numerical data

A

continuous

discrete

8
Q

categorical data

A

nominal

ordinal

9
Q

types of variable

A

response

explanatory

10
Q

response variable

A

dependent
outcome
Y

11
Q

explanatory variable

A

independent
predictor
x

12
Q

subsamples treated as true replicate

A

pseudoreplication

13
Q

subsamples are useful for

A

increasing precision of estimate for individual samples (multiple samples from same site averaged)

14
Q

contingency table

A

explanatory- columns
response- rows
totals of columns and rows

15
Q

2 data descriptions

A

central tendency

width

16
Q

central tendency

A

mean
median
mode

17
Q

width (spread)

A
range
standard deviation
variance
coefficient of variation
IQR
18
Q

effect of outliers on mean

A

shifts mean towards outliers- sensitive to extremes

median doesn’t shift

19
Q

sample variance s^2 =

A

Σ( Y_i - Ybar )^2 / (n - 1)

20
Q

coefficient of variation CV =

A

100% × ( s / Ybar )

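A minimal Python sketch of these formulas with made-up numbers (the values are hypothetical, not course data):

```python
import numpy as np

y = np.array([4.1, 5.3, 6.0, 5.5, 4.8])      # hypothetical measurements

ybar = y.mean()
s2 = ((y - ybar) ** 2).sum() / (len(y) - 1)  # sample variance: sum of squares / (n - 1)
s = np.sqrt(s2)                              # standard deviation
cv = 100 * s / ybar                          # coefficient of variation (%)

print(s2, np.var(y, ddof=1))                 # same value: ddof=1 gives the n - 1 denominator
print(s, cv)
```
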
21
Q

high CV

A

more variability

22
Q

skewed box plot

A

left skewed- long tail toward low values; bulk of the data sits in the upper quartiles

right skewed- long tail toward high values; bulk of the data sits in the lower quartiles

23
Q

when/why random sample

A

uniform study area

removes bias in sample selection

24
Q

when/why systematic sample

A

detect patterns along gradient- fixed intervals along transect/belt

25
Q

using quadrats

A

more is generally better

stop when mean/variance stabilize (asymptote)

26
Q

what does changing n do to sampling distribution

A

reduces spread (narrows the distribution)- increases precision

27
Q

standard error of estimate SE_Ybar =

A

s / √n

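A short sketch (hypothetical numbers) of SE_Ybar = s/√n and the rough ±2SE interval used later for the 95% CI:

```python
import numpy as np

y = np.array([4.1, 5.3, 6.0, 5.5, 4.8])              # hypothetical sample

n = len(y)
se = y.std(ddof=1) / np.sqrt(n)                      # SE of the mean = s / sqrt(n)
approx_ci = (y.mean() - 2 * se, y.mean() + 2 * se)   # rough 95% CI ~ Ybar +/- 2SE

print(se, approx_ci)
```
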
28
Q

SD vs. SE

A

SD- spread of distribution/deviation from mean

SE- precisions of an estimate (ex. mean)

29
Q

95% CI ~=

A

+/- 2SE

30
Q

kurtosis

A

leptokurtic- sharper peak (+)
platykurtic- rounder peak (-)
mesokurtic- normal (0)

31
Q

Normal distribution, 1SD

A

~2/3 of the area under the curve (2SD = 95%)

32
Q

random trial

A

process/experiment with ≥2 possible outcomes whose occurrence cannot be predicted

33
Q

sample space

A

all possible outcomes

34
Q

event

A

any subset of the sample space (≥1 outcome)

35
Q

mutually exclusive events

A

P[A and B] = 0

36
Q

mutually exclusive addition rule

A

P[7 ∪ 11] = P[7] + P[11]

37
Q

general addition rule

A

P[AUB] = P[A] + P[B] - P[A and B]

38
Q

multiplication rule

A

independent events

P[A and B] = P[A] x P[B]

39
Q

conditional probability

A

P[A | B] = P[A and B] / P[B]

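A quick numeric check of the addition, multiplication, and conditional-probability rules, using invented probabilities for two independent events A and B:

```python
# hypothetical probabilities for two events A and B
p_a, p_b = 0.30, 0.40
p_a_and_b = 0.12                                    # equals 0.30 * 0.40, so A and B are independent

p_a_or_b = p_a + p_b - p_a_and_b                    # general addition rule
p_a_given_b = p_a_and_b / p_b                       # conditional probability
independent = abs(p_a_and_b - p_a * p_b) < 1e-12    # multiplication rule check

print(p_a_or_b, p_a_given_b, independent)           # 0.58, 0.3, True
```
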
40
Q

collection of individuals easily available to the researcher

A

sample of convenience

41
Q

random sample

A

every unit has an equal chance of selection, units are selected independently, minimizes bias, makes it possible to measure sampling error

42
Q

problem with sample of convenience

A

have to assume it is unbiased/independent- no guarantee

43
Q

volunteer bias

A

health conscious, low income, ill, more time, angry, less prudish

44
Q

frequency distribution

A

describes # of times each value of a variable occurs in sample

45
Q

probability distribution

A

distribution of variable in whole population

46
Q

absolute frequency

A

# of times a value is observed

47
Q

relative frequency

A

proportion of individuals which have that value

48
Q

experimental studies can

A

determine cause and effect

*cause

49
Q

observational studies can

A

only point to cause

*correlations

50
Q

quantifying precision

A

smaller range of values (spread)

51
Q

determining accuracy

A

usually can’t- don’t know true value

52
Q

nominal categorical data with 2 choices

A

binomial

53
Q

why aim for numerical data

A

it can be converted to categorical if need be

54
Q

species richness

A

discrete (count)

55
Q

rates

A

continuous

56
Q

large sample

A

less affected by chance
lower sampling error
lower bias

57
Q

rounding

A

round to one decimal place more than measurement (in calculations)

58
Q

higher CV

A

more variability

59
Q

proportions

A

p^ = # of observations in category of interest/ total # of observations in all categories

60
Q

sum of squares

A

deviations are squared so that each value is +, so they don't cancel each other out
n-1 to correct for bias when estimating the population variance

61
Q

CV used for

A

relative measures- comparing data sets

62
Q

sampling distribution

A

probability distribution of all values for an estimate that we might obtain when we sample a population, centred at true µ

63
Q

values outside of CI

A

implausible

64
Q

how many quadrats to use

A

till cumulative number of observations asymptotes

65
Q

law of total probability

A

P[A] = Σ P[B_i] · P[A | B_i]

for all B_i ‘s

66
Q

null distribution

A

sampling distribution of the test statistic under Ho- the distribution you would get by repeating the trial many times and graphing the test statistic each time, assuming Ho is true

67
Q

Type I error

A

P[Reject Ho | Ho true] = alpha

68
Q

reject null

A

P-value < alpha

69
Q

Type II error

A

P[do not reject Ho | Ho false]

70
Q

Power

A

P[Reject Ho | Ho false] = 1 - P[Type II error]
increases with large n
higher power = lower P[Type II error]

71
Q

test statistic

A

used to evaluate whether data are reasonably expected under Ho

72
Q

p-value

A

probability of getting data as extreme or more, given Ho is true

73
Q

statistically significant

A

data differ from H_o

not necessarily important- depends on magnitude of difference and n

74
Q

why not reduce alpha

A

would decrease P[Type I] but increase P[Type II]

75
Q

continuous probability

P[Y = y] =

A

0

76
Q

sampling without replacement

A

ex. drawing cards

(1/52) · (1/51) · (1/50)

77
Q

Bayes Theorem

A

P[A | B] = P[B | A] · P[A] / P[B]

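A worked sketch of Bayes' theorem with invented numbers (a rare condition and an imperfect test), using the law of total probability to get P[B]:

```python
# hypothetical values
p_disease = 0.01                  # P[A]
p_pos_given_disease = 0.95        # P[B | A]
p_pos_given_healthy = 0.05        # P[B | not A]

# law of total probability: P[B] = sum over A_i of P[B | A_i] * P[A_i]
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes: P[A | B] = P[B | A] * P[A] / P[B]
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))   # ~0.161
```
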
78
Q

P-value > alpha

A

do not reject Ho

data are consistent with Ho

79
Q

meaning of ‘z’ in standardization

A

how many sd’s Y is from µ

80
Q

standardization for sample mean, t =

A

( Ybar - µ ) / ( s / √n )

81
Q

CI on µ

A

Ybar ± t_crit · SE_Ybar
SE_Ybar = standard error of Ybar
t_crit = t_alpha(1 or 2), degrees of freedom

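A sketch of this CI and the matching one-sample t-test in Python (scipy), with hypothetical data and a hypothetical Ho mean:

```python
import numpy as np
from scipy import stats

y = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.4])   # hypothetical sample
mu_0 = 12.0                                           # mean proposed by Ho

n = len(y)
se = y.std(ddof=1) / np.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)                 # two-sided alpha = 0.05
ci = (y.mean() - t_crit * se, y.mean() + t_crit * se)

t_stat, p_value = stats.ttest_1samp(y, mu_0)          # one-sample t-test vs mu_0
print(ci, t_stat, p_value)
```
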
82
Q

1 sample t-test

A

compares sample mean from normal pop. to population µ proposed by Ho

83
Q

why n-1 account for sampling error

A

last value is not free to vary if mean is a specified value

84
Q

1 sample t-test assumptions

A

data are a random sample

variable is normally distributed in pop.

85
Q

paired t-test assumptions

A

pairs are a random sample from pop.

paired differences are normally distributed in the pop.

86
Q

how to tell whether to reject with t-test

A

if test statistic is further into tails than critical t then reject

87
Q

2 sample design compares

A

treatment vs. control

88
Q

2 sample t-test assumptions

A

both samples are random samples
variable is normally distributed in each group
standard deviations of the two groups are approximately equal

89
Q

degrees of freedom

A

1 sample t-test: n - 1
paired t-test: n - 1
2 sample t-test: n1 + n2 - 2

90
Q

confounding variables

A

mask/distort causal relationships btw measured variables
problem w/ observational studies
impossible to isolate the effect of a single variable

91
Q

experimental artifacts

A

bias resulting from experiment, unnatural conditions
problem w/ experimental studies
should try to mimic natural environment

92
Q

minimum study design requirements

A

knowledge of initial/natural conditions via preliminary data to ID hypotheses and confounding variables
controls to reduce bias
replication to reduce sampling error

93
Q

study design process

A

develop clear statement of research question
list possible outcomes
develop experimental plan
check for design problems

94
Q

developing a clear statement of research question

A

ID question, Ho, Ha
choose factors, response variable
what is being tested? will the experiment actually test this?

95
Q

list possible outcome of experiment

A

ID sample space
explain how each outcome supports/refutes Ho
consider external risk factors

96
Q

develop experimental plan

based on step 1

A

outline different experimental designs

check literature for existing/accepted designs

97
Q

develop experimental plan based on step 2

A

what kind of data will you have- aim for numerical

what type of statistical test will you use

98
Q

minimize bias in experimental plan

A

control group
randomization
blinding

99
Q

minimize sampling error in experimental plan

A

replication
balance
blocking

100
Q

types of controls

A

positive

negative

101
Q

positive control

A

treatment that should produce obvious, strong effect

ensures the experimental design isn't blocking/masking a real effect

102
Q

negative control

A

subjects go through all same steps but do not receive treatment- no effect

103
Q

maintaining power with controls

A

add controls w/o reducing treatment sample sizes- too many control samples using up resources will reduce power

104
Q

placebo effect

A

improvement in condition from psychological effect

105
Q

randomization

A

breaks correlation btw explanatory variable and confounding variables (averages effects of confounding variables)

106
Q

blinding

A

conceals from subjects/researchers which treatment was received
prevent conscious/unconscious changes in behaviour
single blind or double blind

107
Q

better chance of IDing treatment effect if

A

sampling error/noise is minimized

108
Q

replication =

A

smaller SE, tighter CI

109
Q

spatial autocorrelation

A
samples taken close together in space are correlated with each other
not independent (unless testing differences within that population)
110
Q

temporal autocorrelation

A

measurement at one pt in time is directly correlated w/ the one before/after it

111
Q

balance =

A

small SE, narrow CI

112
Q

blocking

A

accounts for extraneous variation by putting experimental units that are similar into ‘blocks’
only concerned w/ differences within block- differences btw blocks don’t matter
lowers noise

113
Q

factorial design

A

most powerful study design
study multiple treatments and their interactions
equal replication of all combinations of treatment

114
Q

checking for pseudoreplication

A

check degrees of freedom- if df is suspiciously large, there is a problem

overestimating df = easier to reject Ho- pretending we have more power than we do

115
Q

determining sample size, plan for

A

precision, power, data loss

116
Q

determining sample size, wanting precision

A

want a narrow CI
n ≈ 8(sigma/uncertainty)^2
uncertainty is 1/2 the CI width

117
Q

determining sample size, wanting power

A
detecting effect/difference
plan for probability of rejecting a false Ho
n~16(sigma/D)^2
D is min. effect size you want to detect
power is 0.8
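
A quick sketch of the two rule-of-thumb sample-size formulas, with made-up planning values for sigma, the desired uncertainty, and the minimum effect D:

```python
# hypothetical planning values
sigma = 4.0          # expected SD of the variable
uncertainty = 1.5    # desired half-width of the 95% CI
D = 2.0              # minimum effect size worth detecting

n_precision = 8 * (sigma / uncertainty) ** 2   # aim: CI half-width ~ uncertainty
n_power = 16 * (sigma / D) ** 2                # aim: power ~ 0.8 at alpha = 0.05

print(round(n_precision), round(n_power))      # ~57 and 64 per group
```
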
118
Q

ethics

A

avoid trivial experiment
collaborate to streamline efforts
substitute models for live animals when possible
keep encounters brief to reduce stress

119
Q

most important in experimental study design

A

check common design problems
sample size (precision,power,data loss)
get a second opinion

120
Q

most important in observational study design

A

keep track of confounding variables

121
Q

good skewness range for normality

A

[-1,1]

122
Q

normal quantile plot

A

QQ plot

compares data w/ standardized value, should follow a straight line

123
Q

right skew in QQ plot

A

above line (more positive data)

124
Q

Shapiro-Wilk test

A

works like a hypothesis test, Ho: data are normal
estimates pop mean and SD using sample data, tests the match to a normal distribution with the same mean and SD
p-value < alpha, reject Ho (don't want to reject)

125
Q

testing normality

A

Histogram
QQ plot
Shapiro-Wilk

126
Q

normality tests sensitive

A

especially to outliers, over-rejection rate
sensitive to sample size
large n = more power

127
Q

testing equal variances

A

Levene’s test

128
Q

Levene’s test

A

Ho: sigma1 = sigma2
take the difference btw each data point and its group mean, then test whether the means of these differences differ btw groups
p-value < alpha reject (don’t want to reject)

129
Q

how to handle violations of test assumptions

A

ignore it
transform data
use nonparametric test
use permutation test

130
Q

when to ignore normality

A

CLT- if n > 30, sample means are ~normally distributed
depends on data set though
can’t ignore normality and compare one set skewed left with one skewed right

131
Q

when to ignore equal variances

A

n large, n1 ~ n2

3 fold difference in SD usually ok

132
Q

if can’t ignore violation of equal variances

A

Welch’s t-test- computes SE and df differently

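A minimal scipy sketch (made-up samples) comparing the ordinary 2-sample t-test with the Welch option, which relaxes the equal-variance assumption:

```python
import numpy as np
from scipy import stats

group1 = np.array([5.1, 6.3, 5.8, 6.0, 5.5])          # hypothetical data
group2 = np.array([7.9, 9.4, 8.8, 10.1, 9.0, 8.2])

student = stats.ttest_ind(group1, group2)                  # assumes equal variances
welch = stats.ttest_ind(group1, group2, equal_var=False)   # Welch's t-test

print(student.pvalue, welch.pvalue)
```
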
133
Q

most common transformations

A

log, arcsine, square-root

log- only if all data are > 0

134
Q

nonparametrics

A

assume less about underlying distributions
usually based on rank data
Ho: ranks are same btw groups
sign test (instead of t test)

135
Q

sign test

A

compares median to median in Ho

each data pt- record whether above (+) or below (-) the Ho median

136
Q

if Ho is true in sign test

A

half data will be above Ho, half will be below

137
Q

sign test p-value

A

use binomial distribution– probability of getting your measurement if Ho true, compare to alpha

138
Q

binomial

A

P[Y ≤ y] = Σ_(k=0 to y) (n choose k) p^k (1-p)^(n-k)

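A sketch of a sign-test style binomial p-value (hypothetical counts: 9 of 12 points above the Ho median), computed with scipy under p = 0.5:

```python
from scipy import stats

n, successes, p0 = 12, 9, 0.5                              # hypothetical sign-test counts

# P[Y >= 9] under Ho, then doubled for a two-sided test
p_one_sided = 1 - stats.binom.cdf(successes - 1, n, p0)
p_two_sided = min(1.0, 2 * p_one_sided)

print(p_one_sided, p_two_sided)                            # ~0.073, ~0.146
```
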
139
Q

Mann-Whitney U-test

A

compare 2 groups using ranks
doesn’t assume normality
assumes distributions are same shape
rank all data from both groups together, sum ranks for individual groups

140
Q

Mann-Whitney U-test equation

A
U1 = n1·n2 + [n1(n1+1)/2] - R1
U2 = n1·n2 - U1
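
A sketch computing U1 and U2 from ranks with hypothetical data, then checking against scipy (which labels the U's the other way round, but gives the same p-value):

```python
import numpy as np
from scipy import stats

g1 = np.array([3.1, 4.5, 2.8, 5.0])                 # hypothetical group 1
g2 = np.array([6.2, 5.9, 7.1, 4.9, 6.8])            # hypothetical group 2

ranks = stats.rankdata(np.concatenate([g1, g2]))    # rank all data from both groups together
r1 = ranks[: len(g1)].sum()                         # rank sum for group 1

n1, n2 = len(g1), len(g2)
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 - u1                                   # note u1 + u2 = n1 * n2

print(u1, u2)
print(stats.mannwhitneyu(g1, g2, alternative="two-sided"))
```
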
141
Q

interpreting Mann-Whitney U-test

A

choose the larger of U1, U2 (test statistic)- compare to critical U from the U distribution (table E)
note that Ucrit = U_alpha(2-sided), n1, n2
uses n1, n2, not df
U < Ucrit: do not reject Ho (2 groups not statistically different)

142
Q

why Mann-Whitney doesn’t use DF

A

not looking at estimating mean/variance, just comparing the shapes

143
Q

problem with non-parametrics

A

low power- P[Type II] higher– especially with low n
ranking data = major info loss
avoid use
Type I not altered

144
Q

comparing > 2 groups

A

ANOVA - analysis of variance

Ho: µ1 = µ2 = µ3 = µ4….

145
Q

why use ANOVA

A

multiple t-tests to compare >2 groups increase Type I error- more tests = higher chance of falling within alpha

146
Q

P[Type I]

A

1 - ( 1 - alpha ) ^N
N is number of t-tests you do
ex. 5 groups- 10 unique tests- P[TI] = 0.4

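The arithmetic behind the 5-group example, as a quick sketch:

```python
from math import comb

alpha, k = 0.05, 5
n_tests = comb(k, 2)                      # 10 unique pairwise t-tests for 5 groups
p_type1 = 1 - (1 - alpha) ** n_tests      # chance of at least one false rejection

print(n_tests, round(p_type1, 2))         # 10, 0.4
```
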
147
Q

ANOVA tests

A

is there more variation btw groups than can be attributed to chance- breaks it down into: total variation, btw group variation, within group variation
maintains P[TI] = alpha

148
Q

between-group variation

A

effect of interest (signal)

149
Q

within-group variation

A

sampling error (noise)

150
Q

2x2 ANOVA design

A

take 2 different variables– look at all combinations and see if any effects between them in all directions
2 variables w/controls = 8 options

151
Q

Hypothesis test steps

A

State Ho, Ha
calculate test statistic
determine critical value of null distribution (or P-value)
compare tests statistic to critical value (or P-value to sig. level)
evaluate Ho using alpha

152
Q

why use alpha = 0.05

A

balances Type I error and Type II error

153
Q

why are Type I and II errors conceptual

A

we don’t know whether or not Ho is actually true

154
Q

paired t-test is a type of

A

blocking

155
Q

where does pseudoreplication happen/become a problem

A

data analysis stage, doesn’t happen at data collection stage (subsamples)

156
Q

ANOVA maintains

A

P[Type I Error] = alpha

157
Q

ANOVA, Y bar

A

grand mean, main horizontal line, test for differences between grand mean and group means

158
Q

ANOVA, Ho: F-ratio =

A

~1

159
Q

ANOVA, if Ho is true, MSerror

A

= MSgroups; same variation within and btw groups

160
Q

ANOVA, MSgroup > MSerror

A

more variation between groups than within

161
Q

ANOVA, test statistic

A

F-distribution, F_0.05,(1),MSgroup DF, MSerror DF = critical value
compare critical value to F-ratio
this is a one-sided distribution; we are checking whether the F-ratio is (strictly) bigger than the critical value

162
Q

ANOVA, F-ratio > F-critical

A

Reject Ho: at least one group mean is different from the others

163
Q

ANOVA, quantifying variation resulting from “treatment effect”

A

R^2 = SSgroups/SStotal

R^2 [0,1]

164
Q

ANOVA, high R^2

A

more of the variation can be explained by the treatment, usually want at least 0.5

165
Q

ANOVA, R^2 = 0.43

A

43% of total variation is explained by differences in treatment

166
Q

ANOVA, R^2 = low values

A

noisy data

167
Q

ANOVA assumptions

A

Random samples from populations
Variable is normally distributed in each k population
Equal variance in all k populations

168
Q

ANOVA unmet assumptions

A

large n, similar variances– ignore
variances very different– transform
non-parametric– Kruskal-Wallis

169
Q

ANOVA, which group(s) were different

A

Planned or Unplanned comparison of means

170
Q

Planned comparisons of means (ANOVA)

A

comparison between means planned during study design, before data is obtained; for comparing ONE group w/ control (only 2 means); not common

171
Q

Unplanned comparisons of means (ANOVA)

A

comparisons to determine differences between all pairs of means; more common; controls Type I error

172
Q

Planned comparison calculations (ANOVA)

A
like a 2-sample t-test
test statistic: t = (Ybar1 - Ybar2) / SE
SE = √[ MSerror (1/n1 + 1/n2) ]
note that we use the error mean square instead of the pooled variance (as in a normal t-test)
df = N - k
t critical = t_0.05(2), df
173
Q

Unplanned comparison of means (ANOVA)

A

Tukey-Kramer

174
Q

why do you need to know what kind of data you have

A

determines what kind of statistical test you can do

175
Q

left skew

A

mean < median

skew ‘pulls’ mean in direction of skew

176
Q

C.I. notation

A

95% CI: a < µ < b (units)

177
Q

accept null hypothesis

A

NEVER!!!

only REJECT or FAIL TO REJECT

178
Q

why do we choose alpha = 0.05

A

it balances Type I and Type II error, which are actually conceptual, since we don't know if Ho is actually true or not

179
Q

standard error of an estimate

A

standard deviation of its sampling distribution; measures precision of the estimate

180
Q

SD vs. SE

A

SD- SPREAD of a distribution, deviation from mean

SE- PRECISION of an estimate; SD of sampling distribution

181
Q

test statistics

A

used to evaluate whether the data is reasonably expected under the Ho

182
Q

P-value

A

probability of getting the data, or something more unusual, given Ho is true

183
Q

reject Ho if

A

p-value ≤ alpha
less than OR equal to
0.049, 0.05

184
Q

Steps in hypothesis testing

A
  1. State Ho and Ha
  2. Calculate test statistic
  3. Determine critical value or P-value
  4. Compare test statistic to critical value
  5. Evaluate Ho using sig. level (and interpret)
185
Q

Type I error

A

Reject Ho, given Ho true

186
Q

Type II error

A

Do not reject Ho, given Ho is false

187
Q

If we reduce alpha

A

P[Type I] decreases, P[Type II] increases

188
Q

Experimental design steps

A
  1. Develop clear statement of research question
  2. List possible outcomes
  3. Develop experimental plan
  4. Check for design problems
189
Q

How to minimize bias

A

control group, randomization, blinding

190
Q

How to minimize sampling error

A

replication- large n lowers noise
balance- lowers noise
blocking

191
Q

to avoid pseudoreplication

A

check df- if it is implausibly huge, something is wrong

192
Q

Tukey-Kramer

A

for 3 means: three Y bars, three Ho's; Q distribution; 3-row table w/ group i, group j, difference in means, SE, test statistic, critical q, outcome (reject/do not)

193
Q

Q-distribution

A

symmetrical, uses larger critical values to restrict Type I error; more difficult to reject null

194
Q

Tukey-Kramer test statistic

A
q = (Ybar_i - Ybar_j) / SE
SE = √[ MSerror (1/n_i + 1/n_j) ]
195
Q

Tukey-Kramer testing

A

test statistic, q-value
critical value, q_α,k,N-k
k = # groups
N = total # observations

196
Q

Tukey-Kramer assumptions

A

random samples
data normally distributed in each group
equal variances in all groups

197
Q

2 Factor ANOVA

A

2 Factors = 3 Ho’s: difference in 1 factor, difference in 2nd factor, difference in interaction

198
Q

If interaction is significant

A

do not conclude that a main factor has no effect just because its own term is not significant

199
Q

Interaction plots

A

y-axis: response variable
x-axis: one of 2 main factors
legend for: other of 2 main factors (different symbols or colors)
2 lines

200
Q

interpreting interaction plot, interaction

A

lines parallel: no significance in interaction

201
Q

interpreting interaction plot, b (data not on x-axis)

A

take average along each line and compare the 2 on the y-axis, if they are not close then they are significant

202
Q

interpreting interaction plot, a (data on x-axis)

A

x-axis: take average between the 2 dots (for each level of a), compare on y-axis, if they are not close they are significant

203
Q

control groups in an observational/experimental study will

A

reduce bias

will not affect sampling error

204
Q

correlation ≠

A

causation

205
Q

correlation

A

“r”- comparing 2 numerical variables, [-1,1], no units, always linear
quantify strength and direction of LINEAR relationship (+/-)

206
Q

how to calculate correlation

A
r = signal/noise
signal = Σ(Xi - Xbar)(Yi - Ybar): deviation in x and y together for every point (multiply each pair of deviations before summing)
noise = √[ Σ(Xi - Xbar)^2 Σ(Yi - Ybar)^2 ]
207
Q

correlation Ho

A

ex. no correlation between inbreeding and number of pups surviving their first winter (ρ = 0)

208
Q

determining correlation

A
test statistic: t = r / SE_r
SE_r = √[ (1 - r^2) / (n - 2) ]
df = n - 2
critical: t_α(2), df
compare statistic w/ critical
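
A sketch of the correlation t-test on hypothetical x/y data, by hand and via scipy.stats.pearsonr:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])     # hypothetical data
y = np.array([2.1, 2.9, 3.2, 4.8, 5.1, 6.3])

r, p_scipy = stats.pearsonr(x, y)

n = len(x)
se_r = np.sqrt((1 - r ** 2) / (n - 2))           # SE of r
t_stat = r / se_r                                # test statistic
p_by_hand = 2 * stats.t.sf(abs(t_stat), df=n - 2)

print(r, t_stat, p_by_hand, p_scipy)             # the two p-values agree
```
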
209
Q

df

A

n - number of parameters you estimate
correlation- you estimate 2
mann whitney- 0 parameters

210
Q

stating correlation results

A

be careful not to interpret– no causation!

211
Q

understanding r

A

easy to understand because of lack of units, however, can trick you into thinking comparable across studies- across studies need to limit ranges

212
Q

Attenuation bias

A

if x or y are measured with error, r will be lower; with increasing error, r is underestimated; avoided by taking means of subsamples

213
Q

correlation and significance

A

statistically sig. relationships can be weak, moderate, strong
sig.– probability, if Ho is true
correlation– direction, strength of linear relationship

214
Q

weak, moderate, strong correlation

A
r = ±0.2 –weak
r = ±0.5 – moderate
r = ±0.8 – strong
215
Q

correlation assumptions

A

bivariate normality- x and y are normal

relationship is linear

216
Q

dealing with assumption violations (correlation)

A

histograms
transformations in one or both variables
remove outlier

217
Q

outlier removal

A

–need justification (i.e. data error)
–carefully consider if variation is natural
–conduct analyses w/ and w/o outlier to assess effect of removal

218
Q

natural variation, outliers

A

is your n big enough to detect if that is natural variation in the data

219
Q

if outlier removal has no effect

A

may as well leave it in!

220
Q

non-parametric Correlation

A

Spearman’s rank correlation; strength and direction of linear association btw ranks of 2 variables; useful for outlier data

221
Q

Spearman’s rank correlation assumptions

A

random sampling

linear relationship between ranks

222
Q

Spearman’s rank correlation

A

r_s: same structure as Pearson’s correlation but based on ranks
r_s = [Σ(Ri - Rbar)(Si - Sbar)] / √[ Σ(Ri - Rbar)^2 Σ(Si - Sbar)^2 ]

223
Q

conducting Spearmans

A

rank x and y values separately; each data point will have 2 ranks; sum ranks for each variable; n = # data pts.; divide each rank sum by n to get Rbar and Sbar; calculate r_s (statistic); calculate critical r_s(0.05,df)

224
Q

if 2 points have same rank (Spearman)

A

average of that rank and skip rank before/after; w/o any ties, the 2 values on the bottom of r_s equation will be the same

225
Q

Spearman hypothesis

A

ρ_s = 0, correlation = 0

226
Q

Spearman df

A

df = n because no estimations are being made in ranking

227
Q

linear regression

A

–relationship between x and y described by a line
–line can predict y from x
–line indicates rate of change of y with x
Y = a + bX

228
Q

correlation vs. regression

A

regression assumes x,y relationship can be described by a line that predicts y from x

corr. - is there a relationship
reg. - can we predict y from x

229
Q

perfect correlation

A

r = 1, all points are exactly on the line– regression line fitted to that ‘line’ could be the exact same line for a non-perfect correlation

230
Q

rounding mean results

A

DO NOT; 4.5 puppies is a valid answer

231
Q

best line of fit

A

minimizes SS = least squares regression; smaller sum of square deviations

232
Q

used for evaluating fit of the line to the data

A

residuals

233
Q

residuals

A

difference between actual Y value and predicted values for Y (the line); measure scatter above/below the line

234
Q

calculating linear regression

A

calculate slope b = Σ(Xi - Xbar)(Yi - Ybar) / Σ(Xi - Xbar)^2; find a: a = Ybar - bXbar; plug in to Ybar = a + bXbar; rewrite as Y = a + bX; rewrite using words

235
Q

Yhat

A

predicted value- if you are trying to predict a y value after equation has been solved

236
Q

why do we solve linear regression with Xbar, Ybar

A

line of fit always goes through Xbar, Ybar

237
Q

how good is line of fit

A

MSresidual = Σ(Yi - Yhat_i)^2 / (n - 2)
which is SSresidual / (n - 2)
quantifies fit of line- smaller is better

238
Q

Prediction confidence, linear regression

A

precision of predicted mean Y for a given X

precision of predicted single Y for a given X

239
Q

Precision of predicted mean Y for a given X, linear regression

A

narrowest near mean of X, and flare outward from there; confidence band– most confident in prediction about the mean

240
Q

precision of predicted single Y for a given X, linear regression

A

much wider because predicting a single Y from X is more uncertain than predicting the mean Y for that X

241
Q

extrapolating linear regression

A

DO NOT extrapolate beyond data, can’t assume relationship continues to be linear

242
Q

linear regression Ho

A

Slope is zero (β = 0), number of dees cannot be predicted from predator mass

243
Q

linear regression Ha

A

slope is not zero (β ≠ 0), number of dees can be predicted from predator mass (2 sided)

244
Q

Hypothesis testing of linear regression

A

testing about the slope:
–t-test approach
–ANOVA approac

245
Q

Putting linear regression into words

A

Dee rate = 3.4 - 1.04(predator mass)

Number of dees decreases by about 1 per kilogram of predator mass increase

246
Q

testing about the slope, t-test approach

A
test statistic: t = (b - β_o) / SE_b
SE_b = √[ MSresidual / Σ(Xi - Xbar)^2 ]
MSresidual = Σ(Yi - Yhat_i)^2 / (n - 2)
critical t = t_α(2), df
df = n - 2
compare statistic, critical
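
A sketch of the t-test approach to the slope using scipy.stats.linregress on hypothetical data (stderr is SE_b; the reported p-value tests Ho: β = 0):

```python
import numpy as np
from scipy import stats

x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])    # hypothetical predictor
y = np.array([3.2, 2.9, 2.1, 1.8, 1.2, 0.7])    # hypothetical response

fit = stats.linregress(x, y)

t_stat = (fit.slope - 0) / fit.stderr           # t = (b - beta_0) / SE_b, with beta_0 = 0
p_by_hand = 2 * stats.t.sf(abs(t_stat), df=len(x) - 2)

print(fit.slope, fit.intercept, fit.stderr)
print(t_stat, p_by_hand, fit.pvalue)            # p_by_hand matches fit.pvalue
```
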
247
Q

testing about the slope, ANOVA approach

A

source of variation: regression, residual, total

sum of squares, df, mean squares, F-ratio

248
Q

calculating testing about the slope, ANOVA approach

A
SSregression = Σ(Yhat_i - Ybar)^2
SSresidual = Σ(Yi - Yhat_i)^2
MSregression = SSregression/df, df = 1
MSresidual = SSresidual/df, df = n - 2
F-ratio = MSregression/MSresidual
SStotal = Σ(Yi - Ybar)^2
df total = n - 1
249
Q

interpreting ANOVA approach to linear regression

A

If Ho is true, MSreg. = MSres

250
Q

% of variation in Y explained by X

A

R^2 = SSreg/SStotal

a% of variation in Y can be predicted by X

251
Q

Outliers, linear regression

A

create non-normal Y-value distribution, violate assumption of equal variance in Y, strong effect on slope and intercept; try not to transform data

252
Q

linear regression assumptions

A

linear relationship
normality of Y at each X
variance of Y same for every X
random sampling of Y’s

253
Q

detecting non-linearity

A

look at the scatter plot, look at residual plot

254
Q

checking residuals

A

should be symmetric above/below zero
should be more points close line (0) than far
equal variance at all values of x

255
Q

non-linear regression

A

when relationship is not linear, transformations don’t work, many options- aim for simplicity

256
Q

quadratic curves

A

Y = a + bX + cX^2
when c is negative, curve is humped
when c is positive, curve is u shaped

257
Q

multiple explanatory variables

A

improve detection of treatment effects
investigate effects of ≥2 treatments + interactions
adjust for confounding variables when comparing ≥2 groups

258
Q

GLM

A

general linear model; multiple explanatory variables can be included (even categorical); response variable (Y) = linear model + error

259
Q

least-squares regression GLM

A
Y = a + bX 
error = residuals
260
Q

single-factor ANOVA GLM

A
Y = µ + A
error = variability within groups
µ = grand mean
261
Q

GLM hypotheses

A

Ho: response = constant; response is same among treatments
Ha: response = constant + explanatory variable

262
Q

constant

A

constant = intercept or grand mean

263
Q

variable

A

variable = variable x coefficient

264
Q

ANOVA results, GLM

A

source of variation: Companion, Residual, Total

SS, df, MS, F, P

265
Q

ANOVA, GLM F-ratio

A

MScomp. / MSres.

266
Q

ANOVA, GLM R^2

A

R^2 = SScom. / SStot.

% of variation that is explained

267
Q

ANOVA, GLM, reject Ho

A

Model with treatment variable fits the data better than the null model but only 25% of the variation is explained

268
Q

Multiple explanatory variables, goals

A

improve detection of treatment effects
adjust for effects of confounding variables
investigate multiple variables and their interaction

269
Q

design feature for improving detection of treatment effects

A

blocking

270
Q

design feature for adjusting for effects of confounding variables

A

covariates

271
Q

design feature for investigating multiple variables and their interaction

A

factorial design

272
Q

experiments with blocking

A

account for extraneous variation by putting experimental units into blocks that share common features
ex. instead of comparing randomly dispersed diversity, look at response variable within a block

273
Q

GLM, blocking

A

Ho: mean prey diversity is same in every fish abundance treatment
Ho: Diversity = grand mean + block
Ha: mean prey diversity is not the same in every fish abundance treatment
Ha: diversity = grand mean + block + fish abundance

274
Q

ANOVA, GLM, blocking

A

source of var.: block, abundance, residual, total

SS, df, MS, F, P

275
Q

Blocking Ho

A

Ho: mean prey diversity is the same in each block
Ha: mean prey diversity is not the same in each block
Block R^2 = SSblock / SStotal
Abundance + block R^2 = (SSabundance + SSblock) / SStotal

276
Q

block as a variable

A

block is an explanatory variable even if we are not inherently interested in its effect b/c it contributes to variation

277
Q

covariates

A

reduce confounding variables, reduce bias

278
Q

ANCOVA, GLM

A

Response = constant + explanatory + covariate

279
Q

ANOCVA hypotheses

A

Ho:No interaction between caste and body mass
Response = constant + exp. + covariate
Ha: Interaction between caste and body mass
Response = cons. + exp + cov. + explanatory*covariate

280
Q

ANCOVA hypotheses graphs

A

Ho: parallel
Ha: not parallel
effect is measured as the vertical difference between the two lines

281
Q

Testing ANCOVA

A

are the slopes equal

if not significant, drop interaction term and run model again

282
Q

df of interaction =

A

df_covariant * df_explanatory

283
Q

Factorial design

A

multiple explanatory variables

fully factorial- every level of every variable and interaction is studied

284
Q

Factorial GLM statements

A

Ha: algal cover = grand mean + herbivory + height + herbivory*height
Ho: a.c. = G.M. + Herb. + Height

285
Q

GLM null hypotheses

A

do not include interaction statements

always one term different from alternative

286
Q

GLM degrees of freedom

A
explanatory:
 df = levels of treatment - 1
interaction:
 df = df_exp.1 * df_exp.2
df always total to grand n - 1
287
Q

Factorial GLM hypotheses graphs

A

Ho: no interaction = parallel lines
Ha: interaction = non parallel, maybe crossing lines

288
Q

Probability of independent events

A

P[X] = P[A] · P[B] · P[C] · …

if multiple ways to arrive at P[X] then add them up, or use Binomial (if conditions met)

289
Q

Binomial distribution

A

probability distribution for # of successes in a fixed n of independent trials

290
Q

Binomial conditions

A

independent
probability of success is same for each trial
2 possible outcomes- success/failure

291
Q

proportion equations

A
p^ = X/n
SE_p^ = √[ p^(1 - p^) / (n - 1) ]
292
Q

Binomial test, testing proportions

A

whether relative frequency of successes in a population matches null expectation
Ho: p = p_o

293
Q

law of large numbers

A

higher n = better estimate of p (or any estimate for that matter), lower SE

294
Q

binomial testing proportions calculations

A

test statistic = observed number of successes

null expectation = null ‘p’ * number of ‘trials’ (weighted by trials)

295
Q

steps in finding binomial p-value

A

use null ‘p’ in binomial to calculate observed successes + anything more extreme; multiply by 2 (2 sided test)- this is the p-value; not comparing to critical value; compare to alpha

296
Q

binomial, p < 0.001

A

reject Ho, p^ is significantly different from the p proposed by Ho (the proportional model)

297
Q

95% CI for a population parameter

A

p' = (X + 2) / (n + 4)
p' ± Z √[ p'(1 - p') / (n + 4) ]
Z = 1.96 for 95% CI

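A sketch of this adjusted-proportion CI with hypothetical counts:

```python
import numpy as np

X, n = 18, 50            # hypothetical: 18 successes in 50 trials
Z = 1.96                 # for a 95% CI

p_prime = (X + 2) / (n + 4)
half_width = Z * np.sqrt(p_prime * (1 - p_prime) / (n + 4))

print(p_prime - half_width, p_prime + half_width)
```
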
298
Q

> 2 possible categories

A

X^2 goodness-of-fit test

compare frequency data w/ >2 possible outcomes to frequencies expected from probability model in Ho

299
Q

Bar graphs

A

categorical data

space between bars

300
Q

X^2 example (days)

A

Ho: # of births is the same on each day

births on Monday is proportional to # of Mondays in the year

301
Q

X^2

A

test statistic measures discrepancy btw observed (data) and expected (Ho) frequencies

302
Q

X^2 calculations

A
find E for each group, then X^2 for each group, sum X^2 = test statistic, compare to critical value
E = n*p
X^2 = Σ (O – E)^2 / E
df = # categories – 1
critical X^2_α,df
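
A sketch of the goodness-of-fit calculation with hypothetical counts of births across 7 days, by hand and with scipy.stats.chisquare:

```python
import numpy as np
from scipy import stats

observed = np.array([33, 41, 63, 63, 47, 56, 47])   # hypothetical counts per day
expected = np.full(7, observed.sum() / 7)           # Ho: equal proportion each day

chi2 = ((observed - expected) ** 2 / expected).sum()
df = len(observed) - 1
p_value = stats.chi2.sf(chi2, df)

print(chi2, p_value)
print(stats.chisquare(observed, expected))          # same result
```
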
303
Q

Sampling distribution for Ho, binomial

A

Histogram- sampling distribution for all possible values for X^2
black line- theoretical X^2 probability distribution

304
Q

higher X^2 values

A

observed farther from expected

305
Q

X^2, why -1 in df

A

using n to calculate expected value- restricts data

306
Q

X^2 reject Ho

A

data do not fit a proportional model, births are not equally distributed through the week

307
Q

X^2 goodness-of-fit assumptions

A

random sample
no category has expected frequency < 1
no more than 20% of the categories have expected frequencies < 5

308
Q

Poisson distribution

A

describes probability of success in a block of time or space, when successes happen independently and with equal probability

309
Q

distribution of points in space

A

clumped
random
dispersed

310
Q

Poisson, P[X successes] =

A
P[X successes] = e^(-µ) · µ^X / X!
µ = mean # of independent successes
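
A small sketch of the Poisson probability formula, by hand and with scipy.stats.poisson (µ is a made-up mean):

```python
import math
from scipy import stats

mu = 2.3          # hypothetical mean number of successes per block
x = 4             # number of successes we want the probability of

p_by_hand = math.exp(-mu) * mu ** x / math.factorial(x)
p_scipy = stats.poisson.pmf(x, mu)

print(p_by_hand, p_scipy)   # identical
```
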
311
Q

Poisson hypotheses

A

Ho: number of extinctions per time interval has a Poisson distribution
Ha: number of extinctions do not follow a Poisson distribution

312
Q

calculate a mean from a frequency table

A

µ = [ (n1·f1) + (n2·f2) + (n3·f3) + … ] / n

313
Q

hypothesis testing, poisson

A

calculate probability of success (expected value) for each level; calculate X^2 for each level, sum them; compare to critical value
df = # categories - 1

314
Q

determining if data are clumped or dispersed

A

s^2 =
[ Σ (Xi - µ)^2 * (obs. frequency)] / (n–1)
clumped: s^2 > µ
dispersed: s^2 < µ

315
Q

X^2 used for

A

proportional
binomial
poisson

316
Q

rejecting Ho, binomial

A

probability of success is not same in all trials or trials are not independent

317
Q

rejecting Ho, poisson

A

successes are not independent, probability of success is not constant over time or space

318
Q

contingency analysis

A
whether one variable depends on the other (is contingent on)
in a contingency table
explanatory variable in columns
response variable in row
each subject appears in table once
319
Q

contingency Ho

A

no relationship between variables, variables independent

320
Q

associating categorical variables

A

test for association between ≥2 categorical variables
are categorical variables independent
odds ratio
X^2 contingency test

321
Q

odds ratio

A

to measure magnitude of association between 2 variables when each has only 2 categories
odds: O^ = p^ / (1 - p^)
odds ratio: OR = O1^ / O2^

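A sketch with a hypothetical 2x2 table (explanatory groups in columns, response in rows): odds ratio by hand, plus scipy's X^2 contingency test on the same table:

```python
import numpy as np
from scipy import stats

# hypothetical 2x2 table: rows = response (yes/no), columns = groups 1 and 2
table = np.array([[30, 15],
                  [20, 35]])

p1_hat = table[0, 0] / table[:, 0].sum()   # proportion 'yes' in group 1
p2_hat = table[0, 1] / table[:, 1].sum()   # proportion 'yes' in group 2
odds_ratio = (p1_hat / (1 - p1_hat)) / (p2_hat / (1 - p2_hat))

chi2, p_value, df, expected = stats.chi2_contingency(table)

print(odds_ratio, chi2, df, p_value)
```
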
322
Q

X^2 contingency test

A

to test whether the 2 variables are independent; to test association between 2 categorical variables; need expected frequencies for each cell under Ho

323
Q

OR =

A

OR=1 : odds same for both groups

OR>1 : odds higher in 1st group- associated with increased risk

324
Q

expected frequencies, X^2 contingency

A

P[A ∩ B] =
(row total / grand total)(column total / grand total)
E = P[A ∩ B] * grand total

325
Q

calculating X^2 contingency

A

X^2 = Σ (O–E)^2 / E = test stat
df = (#rows–1)(#columns–1)
compare to critical value

326
Q

rejecting Ho, contingency

A

Reject Ho that A and B are independent; P[A] is contingent upon B

327
Q

X^2 contingency test assumptions

A

random sample

no cells can have expected frequency <5

328
Q

if X^2 contingency test assumptions not met

A

≥2 rows/columns can be combined for larger expected frequencies

329
Q

to test independence of 2 categorical variables when expected frequencies are low

A

Fisher’s exact test

330
Q

Fisher’s exact test

A

gives exact p-value for a test of association in a 2x2 table

331
Q

Fisher’s exact test assumptions

A

random samples

332
Q

Fisher’s Ho

A

states of A and B are independent

333
Q

conduct Fisher’s

A

–list all possible 2x2 tables w/ results as or more extreme than observed table
–p-value is sum of the Pr of all extreme tables under Ho of independence
–assess null

334
Q

Computer-Intensive methods

A

cheap speed
hypothesis testing- simulation, permutation (randomization)
standard errors, CI- bootstrapping

335
Q

hypothesis testing, simulation

A

–simulates sampling process many times- generate null distribution from simulated data
–creates a ‘population’ w/ parameter values specified by Ho
–used commonly when null distr. unknown

336
Q

simulation to generate null distribution

A
  1. create and sample imaginary population w/ parameter values as specified by Ho
  2. calculate test statistic on simulated sample
  3. repeat 1&2 large number of times
  4. gather all simulated test statistic values to form null distr.
  5. compare test statistic from data to null distr. to approx. p-value and assess Ho
337
Q

generated null distribution

A

P-value ~ fraction of simulated X^2 values ≥ observed X^2

none ≥ observed, P < 0.0001

338
Q

Permutation tests (Randomization test)

A

test hypotheses of association between 2 variables; randomization done w/o replacement; needs ‘parameter’ for association btw 2 variables

339
Q

Permutation test used when

A

assumptions of other methods are not met or the null distribution is unknown

340
Q

Permutation steps

A
  1. Create permuted data set w/ response variable randomly shuffled w/o replacement
  2. calculate measure of association for permuted sample
  3. repeat 1&2 large number of times
  4. Gather all permuted values of test statistic to form null distribution
  5. Determine approximate P-value and assess Ho
341
Q

Bootstrapping

A

calculate SE or CI for parameter estimate
useful if no formula or if distribution unknown
randomly ‘resamples’ from the data with replacement to estimate SE or CI
ex. median

342
Q

bootstrapping steps

A
  1. random sample w/ replacement- 1st bootstrap sample
  2. calculate estimate using bootstrap sample
  3. repeat many times
  4. calculate bootstrap SE
    * only sampling from original sample values
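
A minimal bootstrap sketch (hypothetical data), resampling with replacement to get a bootstrap SE and percentile CI for the median:

```python
import numpy as np

rng = np.random.default_rng(1)
data = np.array([2.3, 3.1, 4.8, 2.9, 5.6, 3.3, 4.1, 2.7])  # hypothetical sample

boot_medians = []
for _ in range(10_000):
    resample = rng.choice(data, size=len(data), replace=True)  # sample w/ replacement
    boot_medians.append(np.median(resample))

boot_se = np.std(boot_medians, ddof=1)             # bootstrap SE of the median
ci_95 = np.percentile(boot_medians, [2.5, 97.5])   # percentile-based 95% CI

print(boot_se, ci_95)
```
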
343
Q

simulation

A

mimics repeated sampling under Ho

344
Q

permutation

A

randomly reassigns observed values for one of two variables

345
Q

bootstrapping

A

used to calculate SE by resampling from the data set

346
Q

Jack-knifing

A

leave-one-out method for calculating SE

347
Q

Jack-knifing

A

gives the same result every time (unlike bootstrapping)

recalculates the estimate n times, leaving out a different single observation each time (each subsample has n - 1 values)

348
Q

statistical significance

A

observed difference (effect) are not likely due to random chance

349
Q

practical significance

A

is the difference (effect) large enough to be important or of value in a practical sense

350
Q

effect size

A

ES– degree or strength of effect
ex. magnitude of relationship btw 2 variables
3 ways to quantify

351
Q

3 ways to quantify ES

A

standardized mean difference
correlation
odds-ratio

352
Q

standardized mean difference

A

Cohen’s d

353
Q

can find statistical significance

A

with a large n, even when the effect size is small- the same effect may not be significant at a lower n

354
Q

Quantifying ES

A

2% difference btw population and sample means
difficult to interpret mean differences w/o accounting for variance (s^2)
Cohen standardized ES w/ variance

355
Q

Cohen’s d

A

simplest measure of ES
difference btw means / Sp
standardizes, puts all results on same scale (makes meta-analysis possible)

356
Q

Meta-analysis

A

analysis of analyses
synthesis of multiple studies on a topic that gives an overall conclusion; increases sig. of individual studies (larger n)
black line = 1-1 line - no difference, no more, no less

357
Q

steps in meta-analysis

A

define question to create one large study- general or specific; review literature to collect all studies- exhaustively; compute effect sizes and mean ES across all studies; look for effects of study quality

358
Q

literature search

A

beware of ‘garbage in, garbage out’, publication bias, file-drawer problem

359
Q

publication bias

A

bias- studies that weren’t published- lower n, insignificant, low effect

360
Q

garbage in, garbage out

A

justify why studies are not included, what is considered poor science?

361
Q

file-drawer problem

A

studies that are not published- grad thesis, government research

362
Q

look for effects of study quality, Meta-analysis

A

do differences in n or methodology matter

  • correlation btw n and ES?
  • difference in observ. and exp. studies?
  • base meta-analysis on higher quality studies
363
Q

pros of Meta-analysis

A

tells overall strength & variability of effect
can increase statistical power, reduce Type II error
can reveal publication bias
can reveal associations btw study type and study outcome

364
Q

cons/challenges of meta-analysis

A

assumes studies are directly comparable and unbiased samples
limited to accessible studies including necessary summary data
may have higher Type I error if publication bias is present

365
Q

what do we get out of the statistical process

A

a probability statement

this process is called Frequentist statistics, most commonly used

366
Q

What does frequentist statistics do

A
  • answer probability statements if/given the null is true
  • infer properties of a population using samples
  • doesn’t tell if null is true, not proof of anything
  • useful, but must understand so not overinterpreted
367
Q

frequentists statistics developed

A

Cohen, 1994; Null Hypothesis Significance Testing

368
Q

why use frequentist statistics

A
appears to be objective and exact
readily available and easily used
everyone else uses it
scientists are taught to use it
supervisors & journals require it
369
Q

limits of frequentist statistics

A

–provides binary info only: significant or not
–does not provide means for assessing relative strength of support for alternate hypotheses
–failing to reject Ho does not mean Ho is true
–does not answer real question

370
Q

does not provide means for assessing relative strength of support for alternate hypotheses

A

ex. conclude the slope of the line is not 0, how strong is the evidence that the slope is 0.4 vs 0.5

371
Q

real question

A

whether scientific hypothesis is true or false

  • treatment has an effect (however small)
  • if so, then Ho of no effect is false, but we are unable to show that Ho is false (or true)
  • we can only show the probability of getting the data, if Ho is true
372
Q

question we CAN answer

A

about the data, not the hypothesis- given that Ho is true, how likely are data like ours

373
Q

more limitations for frequentist stats

A

whether a result is significant depends on n, ES, alpha

significant does not always mean important

374
Q

larger n, ES, alpha

A

increase likelihood of rejecting Ho- getting significant result

375
Q

significant does not necessarily mean important

A

effects can be tiny and still statistically significant

376
Q

focus on p-values and Ho rejection

A

distracts from the real goal- deciding whether data support scientific hypotheses and are practically/biologically important

377
Q

mostly we should be interested in

A

size/strength/direction of an effect

378
Q

Bayesian statistics

A

incorporate prior beliefs or knowledge of parameter values into analyses to constrain the population estimate

379
Q

frequentists vs. bayesian example

A

100 coin flips all give 95 heads, what is the probability that the next flip will be a head?

freq. - 50%
bay. - 95%