BIO 330 Flashcards

Question

using quadrats

Answer 1

more is better | stop when mean/variance stabilize (asymptote)

Answer 2

reduces spread (narrows graph) - increases precision

Answer 3

s / √n

Answer 4

SD- spread of distribution/deviation from mean | SE- precision of an estimate (ex. mean)
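
A minimal sketch of the SD vs. SE distinction on the cards above, assuming a small made-up sample; the function name `sd_and_se` is hypothetical.

```python
import math
import statistics

def sd_and_se(sample):
    """Return (s, SE): s is the sample SD, SE = s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)      # sample SD, n - 1 in the denominator
    return s, s / math.sqrt(n)

# Hypothetical data, for illustration only
s, se = sd_and_se([2, 4, 4, 4, 5, 5, 7, 9])
print(round(s, 4), round(se, 4))
```

SD describes the spread of the data; SE shrinks as n grows because it divides by √n.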

Answer 5

leptokurtic- sharper peak (+) | platykurtic- rounder peak (-) | mesokurtic- normal (0)

Answer 6

~2/3 of the area under the curve (2SD = 95%)

Answer 7

process/experiment with ≥2 possible outcomes whose occurrence cannot be predicted

Answer 8

all possible outcomes

Answer 9

any subset of the sample space (≥1 outcome)

Answer 10

P[A and B] = 0

Answer 11

P[7 U 11] = P[7] + P[11]

Answer 12

P[AUB] = P[A] + P[B] - P[A and B]

Answer 13

independent events | P[A and B] = P[A] x P[B]

Answer 14

P[A | B] = P[A and B] / P[B]
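
The probability rules on the cards above can be checked by enumerating a sample space. This sketch assumes the 7/11 example refers to the sum of two fair dice (the cards do not say so); all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice (assumed setup)
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """P[event] = (# outcomes in the event) / (# outcomes in the sample space)."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def is7(r):
    return sum(r) == 7

def is11(r):
    return sum(r) == 11

def first_is_1(r):
    return r[0] == 1

# Mutually exclusive events: P[7 U 11] = P[7] + P[11]
p_7_or_11 = prob(lambda r: is7(r) or is11(r))

# Conditional probability: P[A | B] = P[A and B] / P[B]
p_cond = prob(lambda r: is7(r) and first_is_1(r)) / prob(first_is_1)

print(p_7_or_11, prob(is7) + prob(is11), p_cond)
```

Here P[sum = 7 | first die = 1] equals P[sum = 7], so those two events are independent.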

Answer 15

sample of convenience

Answer 16

every unit has equal opportunity, selection of units is independent, minimizes bias, possible to measure sampling error

Answer 17

assume unbiased/independent- no guarantee

Answer 18

health conscious, low income, ill, more time, angry, less prudish

Answer 19

describes # of times each value of a variable occurs in sample

Answer 20

distribution of variable in whole population

Answer 21

# of times value is observed

Answer 22

proportion of individuals which have that value

Answer 23

determine cause and effect | *cause

Answer 24

only point to cause | *correlations

Answer 25

smaller range of values (spread)

Answer 26

usually can't- don't know true value

Answer 27

it can be converted to categorical if need be

Answer 28

discrete (count)

Answer 29

continuous

Answer 30

less affected by chance | lower sampling error | lower bias

Answer 31

round to one decimal place more than measurement (in calculations)

Answer 32

more variability

Answer 33

p^ = # of observations in category of interest/ total # of observations in all categories

Answer 34

deviations are squared so that each value is +, so they don't cancel each other out | n-1 corrects for bias when estimating the population variance from a sample

Answer 35

relative measures- comparing data sets

Answer 36

probability distribution of all values for an estimate that we might obtain when we sample a population, centred at true µ

Answer 37

implausible

Answer 38

till cumulative number of observations asymptotes

Answer 39

P[A] = Σ P[B_i].P[A | B_i] | for all B_i's

Answer 40

sampling distribution of the test statistic under H_o: what you would get if you repeated the trial many times and graphed the test statistics

Answer 41

P[Reject Ho | Ho true] = alpha

Answer 42

P-value < alpha

Answer 43

P[do not reject Ho | Ho false]

Answer 44

P[Reject Ho | Ho false] | increases with large n, which decreases P[Type II E]

Answer 45

used to evaluate whether data are reasonably expected under Ho

Answer 46

probability of getting data as extreme or more, given Ho is true

Answer 47

data differ from H_o | not necessarily important- depends on magnitude of difference and n

Answer 48

would decrease P[Type I] but increase P[Type II]

Answer 49

ex. drawing cards | (1/52).(1/51).(1/50)

Answer 50

P[A | B] = P[B | A].P[A] / P[B]

Answer 51

do not reject Ho | data are consistent with Ho

Answer 52

how many sd's Y is from µ

Answer 53

(Ybar - µ) / (s / √n)

Answer 54

Ybar ± SE.tcrit | SE of Ybar | t of alpha(1 or 2), degrees of freedom
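
A rough sketch of the one-sample t statistic and the confidence interval from the two cards above. The data and the null mean are made up, and the critical t is passed in from a table (2.365 is the two-sided 0.05 value for df = 7) rather than computed; function names are hypothetical.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t = (Ybar - mu0) / (s / sqrt(n)), with df = n - 1."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / se, n - 1

def confidence_interval(sample, t_crit):
    """Ybar ± t_crit * SE, where t_crit comes from a t table."""
    n = len(sample)
    ybar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return ybar - t_crit * se, ybar + t_crit * se

data = [2, 4, 4, 4, 5, 5, 7, 9]                         # hypothetical sample
t, df = one_sample_t(data, mu0=4)
lower, upper = confidence_interval(data, t_crit=2.365)  # t_0.05(2),7 from a table
print(round(t, 3), df, round(lower, 3), round(upper, 3))
```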

Answer 55

compares sample mean from normal pop. to population µ proposed by Ho

Answer 56

last value is not free to vary if mean is a specified value

Answer 57

data are a random sample | variable is normally distributed in pop.

Answer 58

pairs are a random sample from pop. | paired differences are normally distributed in the pop.

Answer 59

if test statistic is further into tails than critical t then reject

Answer 60

treatment vs. control

Answer 61

both samples are random samples | variable is normally distributed in each group | standard deviation in two groups ± equal

Answer 62

1 sample t-test: n - 1 | paired t-test: n - 1 | 2 sample t-test: n1 + n2 - 2

Answer 63

mask/distort causal relationships btw measured variables | problem w/ observational studies | impossible to differentiate 1 variable

Answer 64

bias resulting from experiment, unnatural conditions problem w/ experimental studies should try to mimic natural environment

Answer 65

knowledge of initial/natural conditions via preliminary data to ID hypotheses and confounding variables controls to reduce bias replication to reduce sampling error

Answer 66

develop clear statement of research question list possible outcomes develop experimental plan check for design problems

Answer 67

ID question, Ho, Ha | choose factors, response variable | what is being tested? | will the experiment actually test this?

Answer 68

ID sample space explain how each outcome supports/refutes Ho consider external risk factors

Answer 69

outline different experimental designs | check literature for existing/accepted designs

Answer 70

what kind of data will you have- aim for numerical | what type of statistical test will you use

Answer 71

control group randomization blinding

Answer 72

replication balance blocking

Answer 73

positive | negative

Answer 74

treatment that should produce obvious, strong effect | ensuring experiment design doesn't block effect

Answer 75

subjects go through all same steps but do not receive treatment- no effect

Answer 76

add controls w/o reducing sample size- too many control samples use up resources and will reduce power

Answer 77

improvement in condition from psychological effect

Answer 78

breaks correlation btw explanatory variable and confounding variables (averages effects of confounding variables)

Answer 79

conceals from subjects/researchers which treatment was received prevent conscious/unconscious changes in behaviour single blind or double blind

Answer 80

sample error/noise is minimized

Answer 81

smaller SE, tighter CI

Answer 82

```
each sample is correlated w/ sample area
not independent (unless testing differences in that population)
```

Answer 83

measurement at one pt in time is directly correlated w/ the one before/after it

Answer 84

small SE, narrow CI

Answer 85

accounts for extraneous variation by putting experimental units that are similar into 'blocks' only concerned w/ differences within block- differences btw blocks don't matter lowers noise

Answer 86

most powerful study design study multiple treatments and their interactions equal replication of all combinations of treatment

Answer 87

check degrees of freedom, very large- problem | overestimate = easier to reject Ho- pretending we have more power than we do

Answer 88

precision, power, data loss

Answer 89

want narrow CI | n ~ 8(sigma/uncertainty)^2 | uncertainty is 1/2 the width of the CI

Answer 90

```
detecting effect/difference
plan for probability of rejecting a false Ho
n ~ 16(sigma/D)^2
D is min. effect size you want to detect
power is 0.8
```
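
The two rules of thumb above can be sketched as follows. The function names are hypothetical, and treating the result as a per-group sample size for a two-group comparison is an assumption not stated on the cards.

```python
import math

def n_for_precision(sigma, uncertainty):
    """n ~ 8 (sigma / uncertainty)^2, where uncertainty is half the CI width."""
    return math.ceil(8 * (sigma / uncertainty) ** 2)

def n_for_power(sigma, d):
    """n ~ 16 (sigma / D)^2 for power of about 0.8, D = smallest effect to detect."""
    return math.ceil(16 * (sigma / d) ** 2)

# Hypothetical planning values: sigma = 10, target uncertainty or effect size = 5
print(n_for_precision(10, 5), n_for_power(10, 5))
```

Halving the effect size you want to detect quadruples the required n, since n scales with (sigma/D)^2.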

Answer 91

avoid trivial experiment collaborate to streamline efforts substitute models for live animals when possible keep encounters brief to reduce stress

Answer 92

check common design problems sample size (precision,power,data loss) get a second opinion

Answer 93

keep track of confounding variables

Answer 94

QQ plot | compares data w/ standardized value, should follow a straight line

Answer 95

above line (more positive data)

Answer 96

works like Hypothesis test, Ho: data normal estimate pop mean and SD using sample data, tests match to normal distribution with same mean and SD p-value < alpha, reject Ho (don't want to reject)

Answer 97

Histogram QQ plot Shapiro-Wilk

Answer 98

especially to outliers, over-rejection rate sensitive to sample size large n = more power

Answer 99

Levene's test

Answer 100

Ho: sigma1 = sigma2 difference btw each data point and mean, test difference btw groups in the means of these differences p-value < alpha reject (don't want to reject)

Answer 101

ignore it transform data use nonparametric test use permutation test

Answer 102

CLT- if n > 30, means are ~normally distributed | depends on data set though | can't ignore normality and compare one set skewed left with one skewed right

Answer 103

n large, n1 ~ n2 | 3 fold difference in SD usually ok

Answer 104

Welch's t-test- computes SE and df differently

Answer 105

log, arcsine, square-root | log- only if data are all > 0

Answer 106

assume less about underlying distributions usually based on rank data Ho: ranks are same btw groups sign test (instead of t test)

Answer 107

compares median to median in Ho | each data pt- record whether above (+) or below (-) the Ho median

Answer 108

half data will be above Ho, half will be below

Answer 109

use binomial distribution-- probability of getting your measurement if Ho true, compare to alpha

Answer 110

P[Y ≤ y] = Σ (n choose i)(p)^i(1-p)^(n-i), summed over i = 0 to y
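
The sign test cards above can be sketched like this. The data and the null median are made up, dropping values equal to the null median and doubling the smaller tail are common conventions assumed here, and the function names are hypothetical.

```python
from math import comb

def binom_cdf(y, n, p):
    """P[Y <= y] for a binomial(n, p) variable."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(y + 1))

def sign_test(data, median0):
    """Two-sided sign test of Ho: median = median0."""
    plus = sum(1 for v in data if v > median0)    # values above the Ho median
    minus = sum(1 for v in data if v < median0)   # values below the Ho median
    n = plus + minus                              # ties with median0 are dropped
    tail = binom_cdf(min(plus, minus), n, 0.5)    # under Ho, half fall on each side
    return min(1.0, 2 * tail)

p_value = sign_test([1, 2, 3, 4, 5, 6, 7, 8, 9, 12], median0=10)
print(p_value)
```

The resulting P-value is compared to alpha directly; no critical value is needed.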

Answer 111

compare 2 groups using ranks doesn't assume normality assumes distributions are same shape rank all data from both groups together, sum ranks for individual groups

Answer 112

```
U1 = n1n2 + [n1(n1+1)/2] - R1
U2 = n1n2 - U1
```

Answer 113

choose larger of U1, U2 (test statistic)- compare to critical U from U distribution (table E) | note that Ucrit = U_alpha(2),n1,n2 uses n1, n2, not df | if U < Ucrit, do not reject Ho (2 groups not statistically different)
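
A minimal sketch of the Mann-Whitney U calculation described above, assuming tied values share the average of their ranks. The helper names and the data are hypothetical, and the critical U still has to come from a table.

```python
def average_ranks(values):
    """Rank values 1..n; tied values get the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """Return (U, U1, U2), where U is the larger of U1 and U2."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))  # rank both groups together
    r1 = sum(ranks[:n1])                                # rank sum for group 1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return max(u1, u2), u1, u2

u, u1, u2 = mann_whitney_u([1, 2, 3], [4, 5, 6, 7])
print(u, u1, u2)
```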

Answer 114

not looking at estimating mean/variance, just comparing the shapes

Answer 115

low power- P[Type II] higher, especially with low n | ranking data = major info loss | avoid use | Type I not altered

Answer 116

ANOVA - analysis of variance | Ho: µ1 = µ2 = µ3 = µ4....

Answer 117

multiple t-tests to compare >2 groups increase Type I error- more tests = higher chance of falling within alpha

Answer 118

1 - (1 - alpha)^N | N is number of t-tests you do | ex. 5 groups- 10 unique tests- P[TI] = 0.4
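
The inflated Type I error from the card above, worked through for 5 groups; the function name is hypothetical and the tests are assumed independent.

```python
from math import comb

def familywise_error(alpha, k_groups):
    """P[at least one Type I error] across all pairwise t-tests among k groups."""
    n_tests = comb(k_groups, 2)             # number of unique pairs
    return n_tests, 1 - (1 - alpha) ** n_tests

n_tests, p_type1 = familywise_error(0.05, 5)
print(n_tests, round(p_type1, 3))
```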

Answer 119

is there more variation btw groups than can be attributed to chance- breaks it down into: total variation, btw group variation, within group variation maintains P[TI] = alpha

Answer 120

effect of interest (signal)

Answer 121

sampling error (noise)

Answer 122

take 2 different variables-- look at all combinations and see if any effects between them in all directions 2 variables w/controls = 8 options

Answer 123

State Ho, Ha calculate test statistic determine critical value of null distribution (or P-value) compare tests statistic to critical value (or P-value to sig. level) evaluate Ho using alpha

Answer 124

balances Type I error and Type II error

Answer 125

we don't know whether or not Ho is actually true

Answer 126

data analysis stage, doesn't happen at data collection stage (subsamples)

Answer 127

P[Type I Error] = alpha

Answer 128

grand mean, main horizontal line, test for differences between grand mean and group means

Answer 129

= MS groups; same variation within and btw groups

Answer 130

more variation between groups than within

Answer 131

F-distribution | F_0.05(1),MSgroup df,MSerror df = critical value | compare critical value to F-ratio | this is a one sided distribution: we are looking for whether F-ratio is (strictly) bigger than critical value

Answer 132

Reject Ho: at least one group mean is different from the others

Answer 133

R^2 = SSgroups/SStotal | R^2 [0,1]
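
A small sketch of the one-way ANOVA quantities on the cards above (MSgroups, MSerror, F-ratio, R^2), using made-up groups; the function name is hypothetical and the critical F would still be looked up in a table.

```python
def one_way_anova(groups):
    """Return (F-ratio, R^2, (df_groups, df_error)) for a list of groups."""
    all_values = [y for g in groups for y in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    # Between-group variation (signal)
    ss_groups = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation (noise)
    ss_error = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    ms_groups = ss_groups / (k - 1)
    ms_error = ss_error / (n_total - k)
    r2 = ss_groups / (ss_groups + ss_error)   # SSgroups / SStotal
    return ms_groups / ms_error, r2, (k - 1, n_total - k)

f_ratio, r2, df = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_ratio, 3), round(r2, 4), df)
```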

Answer 134

more of the variation can be explained by the treatment, usually want at least 0.5

Answer 135

43% of total variation is explained by differences in treatment

Answer 136

noisy data

Answer 137

Random samples from populations Variable is normally distributed in each k population Equal variance in all k populations

Answer 138

large n, similar variances-- ignore variances very different-- transform non-parametric-- Kruskal-Wallis

Answer 139

Planned or Unplanned comparison of means

Answer 140

comparison between means planned during study design, before data is obtained; for comparing ONE group w/ control (only 2 means); not common

Answer 141

comparisons to determine differences between all pairs of mean; more common; controls Type I error

Answer 142

```
like a 2-sample t-test
test statistic: t = (Ybar1 - Ybar2) / SE
SE = √[MSerror (1/n1 + 1/n2)]
note that we use error mean square instead of pooled variance (as in a normal t-test)
df = N - k
t critical = t0.05(2),df
```

Answer 143

Tukey-Kramer

Answer 144

determines what kind of statistical test you can do

Answer 145

mean < median | skew 'pulls' mean in direction of skew

Answer 146

95% CI: a < µ < b (units)

Answer 147

NEVER!!! | only REJECT or FAIL TO REJECT

Answer 148

it balances TIE and TIIE which are actually conceptual, since we don't know if Ho is actually true or not

Answer 149

standard deviation of its sampling distribution; measures precision of the estimate

Answer 150

SD- SPREAD of a distribution, deviation from mean | SE- PRECISION of an estimate; SD of sampling distribution

Answer 151

used to evaluate whether the data is reasonably expected under the Ho

Answer 152

probability of getting the data, or something more unusual, given Ho is true

Answer 153

p-value ≤ alpha | less than OR equal to (ex. 0.049, 0.05)

Answer 154

1. State Ho and Ha 2. Calculate test statistic 3. Determine critical value or P-value 4. Compare test statistic to critical value 5. Evaluate Ho using sig. level (and interpret)

Answer 155

Reject Ho, given Ho true

Answer 156

Do not reject Ho, given Ho is false

Answer 157

P[Type I] decreases, P[Type II] increases

Answer 158

1. Develop clear statement of research question 2. List possible outcomes 3. Develop experimental plan 4. Check for design problems

Answer 159

control group, randomization, blinding

Answer 160

replication- large n lowers noise | balance- lowers noise | blocking

Answer 161

check df- obviously if it's huge something is wrong

Answer 162

for 3 means: three Y bars, three Ho's; Q distribution; 3 row table w/ group i, group j, difference in means, SE, test statistic, critical q, outcome (reject/do not)

Answer 163

symmetrical, uses larger critical values to restrict Type I error; more difficult to reject null

Answer 164

```
q = (Ybar_i - Ybar_j) / SE
SE = √[MSerror (1/n_i + 1/n_j)]
```

Answer 165

test statistic: q-value | critical value: q_α,k,N-k | k = # groups | N = total # observations

Answer 166

random samples data normally distributed in each group equal variances in all groups

Answer 167

2 Factors = 3 Ho's: difference in 1 factor, difference in 2nd factor, difference in interaction

Answer 168

do not conclude that factor is not

Answer 169

y-axis: response variable x-axis: one of 2 main factors legend for: other of 2 main factors (different symbols or colors) 2 lines

Answer 170

lines parallel: no significance in interaction

Answer 171

take average along each line and compare the 2 on the y-axis, if they are not close then they are significant

Answer 172

x-axis: take average between the 2 dots (for each level of a), compare on y-axis, if they are not close they are significant

Answer 173

reduce bias | will not affect sampling error

Answer 174

"r"- comparing 2 numerical variables, [-1,1], no units, always linear quantify strength and direction of LINEAR relationship (+/-)

Answer 175

```
r = signal/noise
signal = deviation in x and y together for every point (multiply each deviation before summing)
```

Answer 176

no correlation between interbreeding and number of pups surviving their first winter (ρ = 0)

Answer 177

```
test statistic: t = r / SE_r
SE_r = √[(1 - r^2) / (n - 2)]
df = n - 2
critical: t_α(2),df
compare statistic w/ critical
```
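
A minimal sketch of Pearson's r and its t-test as laid out above, on made-up data; the function names are hypothetical and the critical t comes from a table.

```python
import math

def pearson_r(x, y):
    """r = sum of cross-deviations / sqrt(SSx * SSy)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))   # signal
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t = r / SE_r with SE_r = sqrt((1 - r^2) / (n - 2)), df = n - 2."""
    se_r = math.sqrt((1 - r ** 2) / (n - 2))
    return r / se_r, n - 2

r = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
t_r, df_r = correlation_t(r, 5)
print(round(r, 3), round(t_r, 3), df_r)
```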

Answer 178

n - number of parameters you estimate | correlation- you estimate 2 | Mann-Whitney- 0 parameters

Answer 179

be careful not to interpret-- no causation!

Answer 180

easy to understand because of lack of units, however, can trick you into thinking comparable across studies- across studies need to limit ranges

Answer 181

if x or y are measured with error, r will be lower; with increasing error, r is underestimated; avoided by taking means of subsamples

Answer 182

statistically sig. relationships can be weak, moderate, strong sig.– probability, if Ho is true correlation– direction, strength of linear relationship

Answer 183

```
r = ±0.2 – weak
r = ±0.5 – moderate
r = ±0.8 – strong
```

Answer 184

bivariate normality- x and y are normal | relationship is linear

Answer 185

histograms transformations in one or both variables remove outlier

Answer 186

–need justification (i.e. data error) –carefully consider if variation is natural –conduct analyses w/ and w/o outlier to assess effect of removal

Answer 187

is your n big enough to detect if that is natural variation in the data

Answer 188

may as well leave it in!

Answer 189

Spearman's rank correlation; strength and direction of linear association btw ranks of 2 variables; useful for outlier data

Answer 190

random sampling | linear relationship between ranks

Answer 191

r_s: same structure as Pearson's correlation but based on ranks | r_s = [Σ(Ri-Rbar)(Si-Sbar)] / √[Σ(Ri-Rbar)^2 Σ(Si-Sbar)^2]

Answer 192

rank x and y values separately; each data point will have 2 ranks; sum ranks for each variable; n = # data pts.; divide each rank sum by n to get Rbar and Sbar; calculate r_s (statistic); calculate critical r_s(0.05,df)

Answer 193

average of that rank and skip rank before/after; w/o any ties, the 2 values on the bottom of r_s equation will be the same
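
The Spearman procedure on the cards above, sketched with average ranks for ties. The data and helper names are hypothetical, and the critical r_s would come from a table.

```python
import math

def average_ranks(values):
    """Rank values 1..n; tied values get the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rs(x, y):
    """Pearson's formula applied to the ranks of x (R) and y (S)."""
    r, s = average_ranks(x), average_ranks(y)   # rank each variable separately
    n = len(x)
    rbar, sbar = sum(r) / n, sum(s) / n
    num = sum((ri - rbar) * (si - sbar) for ri, si in zip(r, s))
    den = math.sqrt(sum((ri - rbar) ** 2 for ri in r) *
                    sum((si - sbar) ** 2 for si in s))
    return num / den

rs = spearman_rs([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(round(rs, 3), average_ranks([5, 5, 7]))
```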

Answer 194

ρ_s = 0, correlation = 0

Answer 195

df = n because no estimations are being made in ranking

Answer 196

–relationship between x and y described by a line –line can predict y from x –line indicates rate of change of y with x | Y = a + bX

Answer 197

regression assumes x,y relationship can be described by a line that predicts y from x corr. - is there a relationship reg. - can we predict y from x

Answer 198

r = 1, all points are exactly on the line– regression line fitted to that 'line' could be the exact same line for a non-perfect correlation

Answer 199

DO NOT; 4.5 puppies is a valid answer

Answer 200

minimizes SS = least squares regression; smaller sum of square deviations

Answer 201

difference between actual Y value and predicted values for Y (the line); measure scatter above/below the line

Answer 202

calculate slope using b = formula; find a– a = Ybar - bXbar; plug in to Ybar = a + bXbar; rewrite as Y = a + bX; rewrite using words

Answer 203

predicted value- if you are trying to predict a y value after equation has been solved

Answer 204

line of fit always goes through Xbar, Ybar

Answer 205

MSresiduals = Σ(Yi - Yhat)^2 / (n-2), which is SSresidual / (n-2) | quantifies fit of line- smaller is better
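
A short sketch of least-squares regression as described on the cards above. The cards only say "b = formula"; the standard slope formula b = Σ(Xi-Xbar)(Yi-Ybar) / Σ(Xi-Xbar)^2 is assumed here, and the data and function name are hypothetical.

```python
def least_squares(x, y):
    """Return (a, b, MSresidual) for the fitted line Y = a + bX."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) /
         sum((xi - xbar) ** 2 for xi in x))
    a = ybar - b * xbar                      # line passes through (Xbar, Ybar)
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    ms_residual = sum(e ** 2 for e in residuals) / (n - 2)
    return a, b, ms_residual

a, b, ms_res = least_squares([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(a, 3), round(b, 3), round(ms_res, 3))
```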

Answer 206

precision of predicted mean Y for a given X | precision of predicted single Y for a given X

Answer 207

narrowest near mean of X, and flare outward from there; confidence band– most confident in prediction about the mean

Answer 208

much wider because predicting a single Y from X is more uncertain than predicting the mean Y for that X

Answer 209

DO NOT extrapolate beyond data, can't assume relationship continues to be linear

Answer 210

Slope is zero (β = 0), number of dees cannot be predicted from predator mass

Answer 211

slope is not zero (β ≠ 0), number of dees can be predicted from predator mass (2 sided)

Answer 212

testing about the slope: –t-test approach –ANOVA approach

Answer 213

Dee rate = 3.4 - 1.04(predator mass) | Number of dees decreases by about 1 per kilo of predator mass increase

Answer 214

```
test statistic: t = (b – β_o) / SE_b
SE_b = √[MSresidual / Σ(Xi-Xbar)^2]
MSresidual = Σ(Yi-Yhat)^2 / (n-2)
critical t = t_α(2),df
df = n - 2
compare statistic w/ critical
```

Answer 215

source of variation: regression, residual, total | sum of squares, df, mean squares, F-ratio

Answer 216

```
SSregression = Σ(Yhat_i - Ybar)^2
SSresidual = Σ(Yi - Yhat_i)^2
SStotal = Σ(Yi - Ybar)^2
MSregression = SSregression / df, df = 1
MSresidual = SSresidual / df, df = n - 2
df total = n - 1
F-ratio = MSregression / MSresidual
```
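
The t-test and ANOVA approaches to testing the slope give matching answers in simple linear regression (F = t^2). A sketch on made-up data, with hypothetical names and Ho: β = 0:

```python
import math

def regression_tests(x, y):
    """Return t (for Ho: slope = 0), F-ratio, R^2 and residual df."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    yhat = [a + b * xi for xi in x]
    ss_reg = sum((yh - ybar) ** 2 for yh in yhat)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    ms_res = ss_res / (n - 2)
    t = (b - 0) / math.sqrt(ms_res / sxx)    # t-test approach
    f_ratio = (ss_reg / 1) / ms_res          # ANOVA approach, df regression = 1
    return {"t": t, "F": f_ratio, "R2": ss_reg / (ss_reg + ss_res), "df": n - 2}

out = regression_tests([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print({k: round(v, 3) for k, v in out.items()})
```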

Answer 217

If Ho is true, MSreg. = MSres

Answer 218

R^2 = SSreg/SStotal | a% of variation in Y can be predicted by X

Answer 219

create non-normal Y-value distribution, violate assumption of equal variance in Y, strong effect on slope and intercept; try not to transform data

Answer 220

linear relationship normality of Y at each X variance of Y same for every X random sampling of Y's

Answer 221

look at the scatter plot, look at residual plot

Answer 222

should be symmetric above/below zero should be more points close line (0) than far equal variance at all values of x

Answer 223

when relationship is not linear, transformations don't work, many options- aim for simplicity

Answer 224

Y = a + bX + cX^2 when c is negative, curve is humped when c is positive, curve is u shaped

Answer 225

improve detection of treatment effects investigate effects of ≥2 treatments + interactions adjust for confounding variables when comparing ≥2 groups

Answer 226

general linear model; multiple explanatory variables can be included (even categorical); response variable (Y) = linear model + error

Answer 227

```
Y = a + bX
error = residuals
```

Answer 228

```
Y = µ + A
error = variability within groups
µ = grand mean
```

Answer 229

Ho: response = constant; response is same among treatments Ha: response = constant + explanatory variable

Answer 230

constant = intercept or grand mean

Answer 231

variable = variable x coefficient

Answer 232

source of variation: Companion, Residual, Total | SS, df, MS, F, P

Answer 233

MScomp. / MSres.

Answer 234

R^2 = SScom. / SStot. | % of variation that is explained

Answer 235

Model with treatment variable fits the data better than the null model but only 25% of the variation is explained

Answer 236

improve detection of treatment effects adjust for effects of confounding variables investigate multiple variables and their interaction

Answer 237

covariates

Answer 238

factorial design

Answer 239

account for extraneous variation by putting experimental units into blocks that share common features ex. instead of comparing randomly dispersed diversity, look at response variable within a block

Answer 240

Ho: mean prey diversity is same in every fish abundance treatment Ho: Diversity = grand mean + block Ha: mean prey diversity is not the same in every fish abundance treatment Ha: diversity = grand mean + block + fish abundance

Answer 241

source of var.: block, abundance, residual, total | SS, df, MS, F, P

Answer 242

Ho: mean prey diversity is the same in each block | Ha: mean prey diversity is not the same in each block | Block R^2 = SSblock / SStotal | Abundance + block R^2 = (SSabun. + SSblock) / SStotal

Answer 243

block is an explanatory variable even if we are not inherently interested in its effect b/c it contributes to variation

Answer 244

reduce confounding variables, reduce bias

Answer 245

Response = constant + explanatory + covariate

Answer 246

Ho:No interaction between caste and body mass Response = constant + exp. + covariate Ha: Interaction between caste and body mass Response = cons. + exp + cov. + explanatory*covariate

Answer 247

Ho: parallel | Ha: not parallel | effect is measured as the vertical difference between the two lines

Answer 248

are the slopes equal | if not significant, drop interaction term and run model again

Answer 249

df_covariate * df_explanatory

Answer 250

multiple explanatory variables | fully factorial- every level of every variable and interaction is studied

Answer 251

Ha: algal cover = grand mean + herbivory + height + herbivory*height Ho: a.c. = G.M. + Herb. + Height

Answer 252

do not include interaction statements | always one term different from alternative

Answer 253

```
explanatory: df = levels of treatment - 1
interaction: df = df_exp.1 * df_exp.2
df always total to grand n - 1
```

Answer 254

Ho: no interaction = parallel lines Ha: interaction = non parallel, maybe crossing lines

Answer 255

P[X] = P[A]*P[B]*P[C]*.... | if multiple ways to arrive at P[X] then add them up, or use Binomial (if conditions met)

Answer 256

probability distribution for # of successes in a fixed n of independent trials

Answer 257

independent probability of success is same for each trial 2 possible outcomes- success/failure

Answer 258

```
p^ = X/n
SE_p^ = √( [p^ (1-p^)] / [n–1] )
```

Answer 259

whether relative frequency of successes in a population matches null expectation Ho: p = p_o

Answer 260

higher n = better estimate of p (or any estimate for that matter), lower SE

Answer 261

test statistic = observed number of successes | null expectation = null 'p' * number of 'trials' (weighted by trials)

Answer 262

use null 'p' in binomial to calculate observed successes + anything more extreme; multiply by 2 (2 sided test)- this is the p-value; not comparing to critical value; compare to alpha
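
The binomial test described above, sketched with the "double the tail" approach from the card. The counts are made up and the function name is hypothetical.

```python
from math import comb

def binomial_test(successes, n, p0):
    """Two-sided P-value: tail as or more extreme than observed, times 2."""
    def pmf(k):
        return comb(n, k) * p0**k * (1 - p0)**(n - k)
    expected = n * p0                       # null expectation
    if successes >= expected:
        tail = sum(pmf(k) for k in range(successes, n + 1))
    else:
        tail = sum(pmf(k) for k in range(0, successes + 1))
    return min(1.0, 2 * tail)               # a probability cannot exceed 1

# Hypothetical result: 14 successes in 18 trials, Ho: p = 0.5
p_binom = binomial_test(14, 18, 0.5)
print(round(p_binom, 4))
```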

Answer 263

reject Ho, p^ is significantly different than Ho: p = under a proportional model

Answer 264

p' = (X + 2) / (n + 4) | p' ± Z √( [p' (1–p')] / [n+4] ) | Z = 1.96 for 95% CI
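
The adjusted confidence interval for a proportion above (commonly called the Agresti-Coull method, a name the card does not use), as a small sketch with made-up counts and a hypothetical function name.

```python
import math

def adjusted_proportion_ci(x, n, z=1.96):
    """CI for a proportion using p' = (X + 2) / (n + 4)."""
    p_adj = (x + 2) / (n + 4)
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / (n + 4))
    return p_adj - half_width, p_adj + half_width

ci_low, ci_high = adjusted_proportion_ci(14, 18)   # hypothetical: 14 of 18
print(round(ci_low, 3), round(ci_high, 3))
```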

Answer 265

X^2 goodness-of-fit test | compare frequency data w/ >2 possible outcomes to frequencies expected from probability model in Ho

Answer 266

categorical data | space between bars

Answer 267

Ho: # of births is the same on each day | births on Monday is proportional to # of Mondays in the year

Answer 268

test statistic measures discrepancy btw observed (data) and expected (Ho) frequencies

Answer 269

```
find E for each group, then X^2 for each group
sum X^2 = test statistic, compare to critical value
E = n*p
X^2 = Σ (O – E)^2 / E
df = # categories – 1
critical X^2_α,df
```
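
The goodness-of-fit steps above can be sketched as follows; the observed counts and null probabilities are made up, and the function name is hypothetical.

```python
def chi_square_gof(observed, probabilities):
    """Return (X^2, df) for observed counts vs. probabilities given by Ho."""
    n = sum(observed)
    expected = [n * p for p in probabilities]            # E = n * p
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

chi2_gof, df_gof = chi_square_gof([30, 20, 50], [0.25, 0.25, 0.5])
print(chi2_gof, df_gof)
```

The statistic is then compared to the critical X^2 for that df from a table.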

Answer 270

Histogram- sampling distribution for all possible values for X^2 black line- theoretical X^2 probability distribution

Answer 271

observed farther from expected

Answer 272

using n to calculate expected value- restricts data

Answer 273

data do not fit a proportional model, births are not equally distributed through the week

Answer 274

random sample | no category has expected frequency < 1 | no more than 20% of the categories have expected frequencies < 5

Answer 275

describes probability of success in a block of time or space, when successes happen independently and with equal probability

Answer 276

clumped random dispersed

Answer 277

```
E = (e^-µ . µ^X) / X!
µ = mean # of independent successes
```
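
A minimal sketch of the Poisson formula above, turning probabilities into expected frequencies for n blocks; function names and inputs are hypothetical.

```python
import math

def poisson_pmf(x, mu):
    """P[X = x] = e^-mu * mu^x / x!"""
    return math.exp(-mu) * mu ** x / math.factorial(x)

def poisson_expected(mu, n, max_x):
    """Expected number of blocks (out of n) with 0, 1, ..., max_x successes."""
    return [n * poisson_pmf(x, mu) for x in range(max_x + 1)]

print([round(e, 2) for e in poisson_expected(mu=2.0, n=100, max_x=4)])
```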

Answer 278

Ho: number of extinctions per time interval has a Poisson distribution Ha: number of extinctions do not follow a Poisson distribution

Answer 279

µ = [(n1*f1)+(n2*f2)+(n3*f3)+....] / n

Answer 280

calculate probability of success (expected value) for each level; calculate X^2 for each level, sum them; compare to critical value | df = # categories - 1 - # parameters estimated from the data (1 if µ is estimated from the sample)

Answer 281

s^2 = [Σ (Xi - µ)^2 * (obs. frequency)] / (n–1) | clumped: s^2 > µ | dispersed: s^2 < µ
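
The mean and variance from a frequency table, and the clumped/dispersed comparison above, sketched on made-up counts. The function name is hypothetical, and in practice "random" means s^2 is close to µ rather than exactly equal.

```python
def mean_and_variance(freq_table):
    """freq_table maps a count X to the number of blocks observed with that count."""
    n = sum(freq_table.values())
    mu = sum(x * f for x, f in freq_table.items()) / n
    s2 = sum((x - mu) ** 2 * f for x, f in freq_table.items()) / (n - 1)
    return mu, s2

# Hypothetical table: 5 blocks with 0 successes, 5 blocks with 4 successes
mu, s2 = mean_and_variance({0: 5, 1: 0, 2: 0, 3: 0, 4: 5})
pattern = "clumped" if s2 > mu else "dispersed" if s2 < mu else "random"
print(mu, round(s2, 3), pattern)
```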

Answer 282

proportional binomial poisson

Answer 283

probability of success is not same in all trials or trials are not independent

Answer 284

successes are not independent, probability of success is not constant over time or space

Answer 285

```
whether one variable depends on the other (is contingent on)
in a contingency table:
explanatory variable in columns
response variable in rows
each subject appears in table once
```

Answer 286

no relationship between variables, variables independent

Answer 287

test for association between ≥2 categorical variables are categorical variables independent odds ratio X^2 contingency test

Answer 288

to measure magnitude of association between 2 variables when each has only 2 categories | odds: O^ = p^ / (1–p^) | odds ratio: OR = O1^ / O2^
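
The odds and odds ratio formulas above, as a quick sketch with made-up counts; function names are hypothetical.

```python
def odds(p):
    """O = p / (1 - p)."""
    return p / (1 - p)

def odds_ratio(success1, n1, success2, n2):
    """OR = odds in group 1 / odds in group 2."""
    return odds(success1 / n1) / odds(success2 / n2)

# Hypothetical: 30 of 100 in group 1 vs. 10 of 100 in group 2
or_hat = odds_ratio(30, 100, 10, 100)
print(round(or_hat, 3))
```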

Answer 289

to test whether the 2 variables are independent; to test association between 2 categorical variables; need expected frequencies for each cell under Ho

Answer 290

OR=1 : odds same for both groups | OR>1 : odds higher in 1st group- associated with increased risk

Answer 291

P[A ∩ B] = (row total / grand total)(column total / grand total) E = P[A ∩ B] * grand total

Answer 292

X^2 = Σ (O–E)^2 / E = test stat df = (#rows–1)(#columns–1) compare to critical value
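
The contingency test from the two cards above (expected counts from row and column totals, then X^2), sketched for a made-up 2x2 table; the function name is hypothetical.

```python
def chi_square_contingency(table):
    """Return (X^2, df) for a table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # E = (row total / grand total)(column total / grand total) * grand total
            exp = row_totals[i] * col_totals[j] / grand
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

chi2_ct, df_ct = chi_square_contingency([[10, 20], [20, 10]])
print(round(chi2_ct, 3), df_ct)
```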

Answer 293

Reject Ho that A and B are independent; P[A] is contingent upon B

Answer 294

random sample | no cells can have expected frequency <5

Answer 295

≥2 rows/columns can be combined for larger expected frequencies

Answer 296

Fisher's exact test

Answer 297

gives exact p-value for a test of association in a 2x2 table

Answer 298

random samples

Answer 299

state of A and B are independent

Answer 300

–list all possible 2x2 tables w/ results as or more extreme than observed table –p-value is sum of the Pr of all extreme tables under Ho of independence –assess null

Answer 301

cheap speed hypothesis testing- simulation, permutation (randomization) standard errors, CI- bootstrapping

Answer 302

–simulates sampling process many times- generate null distribution from simulated data –creates a 'population' w/ parameter values specified by Ho –used commonly when null distr. unknown

Answer 303

1. create and sample imaginary population w/ parameter values as specified by Ho 2. calculate test statistic on simulated sample 3. repeat 1&2 large number of times 4. gather all simulated test statistic values to form null distr. 5. compare test statistic from data to null distr. to approx. p-value and assess Ho

Answer 304

P-value ~ fraction of simulated X^2 values ≥ observed X^2 | none ≥ observed, P < 0.0001

Answer 305

test hypotheses of association between 2 variables; randomization done w/o replacement; needs 'parameter' for association btw 2 variables

Answer 306

assumption of other methods are not met or null distribution is unknown

Answer 307

1. Create permuted data set w/ response variable randomly shuffled w/o replacement 2. calculate measure of association for permuted sample 3. repeat 1&2 large number of times 4. Gather all permuted values of test statistic to form null distribution 5. Determine approximate P-value and assess Ho
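
The permutation steps above can be sketched as follows. Using the absolute difference between two group means as the measure of association is an assumption for illustration, as are the data, the seed, and the function name.

```python
import random

def permutation_test(group1, group2, n_perm=10000, seed=0):
    """Approximate two-sided P-value for a difference in means."""
    rng = random.Random(seed)               # fixed seed so the result is repeatable
    def mean(v):
        return sum(v) / len(v)
    observed = abs(mean(group1) - mean(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                 # shuffle = reassign w/o replacement
        diff = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        if diff >= observed:                # as or more extreme than the data
            extreme += 1
    return extreme / n_perm                 # fraction of the null distribution

p_perm = permutation_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15], n_perm=2000)
print(p_perm)
```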

Answer 308

calculate SE or CI for parameter estimate useful if no formula or if distribution unknown randomly 'resamples' from the data with replacement to estimate SE or CI ex. median

Answer 309

1. random sample w/ replacement- 1st bootstrap sample 2. calculate estimate using bootstrap sample 3. repeat many times 4. calculate bootstrap SE * only sampling from original sample values
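
A rough sketch of the bootstrap steps above for the SE of a median; the data, seed, number of resamples, and function name are all assumptions.

```python
import random
import statistics

def bootstrap_se(sample, estimator=statistics.median, n_boot=2000, seed=0):
    """SE of an estimate = SD of the estimates from many bootstrap samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=len(sample))   # sample w/ replacement
        estimates.append(estimator(resample))
    return statistics.stdev(estimates)

se_median = bootstrap_se([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(round(se_median, 3))
```

Unlike the jackknife, a different seed gives a slightly different SE each time.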

Answer 310

mimics repeated sampling under Ho

Answer 311

randomly reassigns observed values for one of two variables

Answer 312

used to calculate SE by resampling from the data set

Answer 313

leave-one-out method for calculating SE

Answer 314

gives same result every time (unlike boot strapping) | calculates mean from n-1, then n-2, then n-3

Answer 315

observed difference (effect) are not likely due to random chance

Answer 316

is the difference (effect) large enough to be important or of value in a practical sense

Answer 317

ES– degree or strength of effect ex. magnitude of relationship btw 2 variables 3 ways to quantify

Answer 318

standardized mean difference correlation odds-ratio

Answer 319

with a large n, which may not be large effect size, and may not be significant at lower n

Answer 320

2% difference btw population and sample means difficult to interpret mean differences w/o accounting for variance (s^2) Cohen standardized ES w/ variance

Answer 321

simplest measure of ES difference btw means / Sp standardizes, puts all results on same scale (makes meta-analysis possible)
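
The standardized mean difference above (difference btw means / Sp), sketched with the pooled SD from two made-up groups; the function name is hypothetical.

```python
import math
import statistics

def cohens_d(group1, group2):
    """d = (mean1 - mean2) / Sp, where Sp is the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1) +
                  (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    return (statistics.mean(group1) - statistics.mean(group2)) / math.sqrt(pooled_var)

d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)
```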

Answer 322

analysis of analysis synthesis of multiple studies on a topic that gives an overall conclusion; increases sig. of individual studies (larger n) black line = 1-1 line - no difference, no more, no less

Answer 323

define question to create one large study- general or specific; review literature to collect all studies- exhaustively; compute effect sizes and mean ES across all studies; look for effects of study quality

Answer 324

beware of 'garbage in, garbage out', publication bias, file-drawer problem

Answer 325

bias- studies that weren't published- lower n, insignificant, low effect

Answer 326

justify why studies are not included, what is considered poor science?

Answer 327

studies that are not published- grad thesis, government research

Answer 328

do differences in n or methodology matter - correlation btw n and ES? - difference in observ. and exp. studies? - base meta-analysis on higher quality studies

Answer 329

tells overall strength & variability of effect can increase statistical power, reduce Type II error can reveal publication bias can reveal associations btw study type and study outcome

Answer 330

assumes studies are directly comparable and unbiased samples limited to accessible studies including necessary summary data may have higher Type I error if publication bias is present

Answer 331

a probability statement | this process is called Frequentist statistics, most commonly used

Answer 332

- answer probability statements if/given the null is true - infer properties of a population using samples - doesn't tell if null is true, not proof of anything - useful, but must understand so not overinterpreted

Answer 333

Cohen, 1994; Null Hypothesis Significance Testing

Answer 334

```
appears to be objective and exact
readily available and easily used
everyone else uses it
scientists are taught to use it
supervisors & journals require it
```

Answer 335

–provides binary info only: significant or not –does not provide means for assessing relative strength of support for alternate hypotheses –failing to reject Ho does not mean Ho is true –does not answer real question

Answer 336

ex. conclude the slope of the line is not 0, how strong is the evidence that the slope is 0.4 vs 0.5

Answer 337

whether scientific hypothesis is true or false - treatment has an effect (however small) - if so, then Ho of no effect is false, but we are unable to show that Ho is false (or true) - we can only show the probability of getting the data, if Ho is true

Answer 338

about the data, not the hypothesis- given the data, how likely is Ho to be true

Answer 339

whether a result is significant depends on n, ES, alpha | significant does not always mean important

Answer 340

increase likelihood of rejecting Ho- getting significant result

Answer 341

effects can be tiny and still statistically significant

Answer 342

distracts from the real goal- deciding whether data support scientific hypotheses and are practically/biologically important

Answer 343

size/strength/direction of an effect

Answer 344

incorporate beliefs or knowledge of parameter values into analyses to contain population estimate

Answer 345

100 coin flips give 95 heads, what is the probability that the next flip will be a head? | freq. - 50% | bay. - 95%

(379 cards)