Flashcards in BIO 330 Deck (379)

211

## understanding r

### easy to interpret because r has no units; however, this can trick you into thinking r is comparable across studies. r depends on the range of x sampled, so comparisons across studies are only fair when the ranges are similar

212

## Attenuation bias

### if x or y is measured with error, r is biased toward zero; the more measurement error, the more the strength of the correlation is underestimated; reduced by measuring each unit several times and using the means of the subsamples
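
A small simulation can make the attenuation effect concrete. This is only a sketch under made-up assumptions: the "true" values, the error sizes, the seed, and the helper names `pearson_r` and `measure` are all hypothetical and not from the course.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(330)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical "true" values: y depends linearly on x plus natural scatter.
true_x = [rng.gauss(0, 1) for _ in range(n)]
true_y = [x + rng.gauss(0, 0.5) for x in true_x]


def measure(values, error_sd, repeats=1):
    """Return the mean of `repeats` noisy measurements of each value."""
    return [
        statistics.fmean(v + rng.gauss(0, error_sd) for _ in range(repeats))
        for v in values
    ]


r_true = pearson_r(true_x, true_y)                                 # no measurement error
r_single = pearson_r(measure(true_x, 1.0), true_y)                 # one noisy measurement of x
r_averaged = pearson_r(measure(true_x, 1.0, repeats=10), true_y)   # mean of 10 subsamples

print(r_true, r_single, r_averaged)
```

Under these assumptions the single noisy measurement gives the lowest r, and averaging subsamples moves r back toward the error-free value without fully reaching it.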

213

## correlation and significance

###
statistically sig. relationships can be weak, moderate, strong

sig.– the P-value: probability of a result at least as extreme as the one observed, if Ho is true

correlation– direction, strength of linear relationship

214

## weak, moderate, strong correlation

###
r = ±0.2 – weak

r = ±0.5 – moderate

r = ±0.8 – strong

215

## correlation assumptions

###
bivariate normality- x and y are normal

relationship is linear

216

## dealing with assumption violations (correlation)

###
histograms

transformations in one or both variables

remove outlier

217

## outlier removal

###
–need justification (i.e. data error)

–carefully consider if variation is natural

–conduct analyses w/ and w/o outlier to assess effect of removal
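
The with/without comparison could look like the sketch below. The data are invented for illustration (the last point plays the suspect outlier) and `pearson_r` is a hypothetical helper.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Made-up data; the last point (6, 1.0) is the suspect outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 1.0]

r_with = pearson_r(x, y)                 # analysis including the outlier
r_without = pearson_r(x[:-1], y[:-1])    # analysis with the outlier dropped

print(f"r with outlier:    {r_with:.3f}")
print(f"r without outlier: {r_without:.3f}")
```

Here the two results differ a lot, so removal would need real justification; if they were nearly identical, the point could simply stay in.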

218

## natural variation, outliers

### is your n big enough to detect if that is natural variation in the data

219

## if outlier removal has no effect

### may as well leave it in!

220

## non-parametric Correlation

### Spearman's rank correlation; strength and direction of linear association between the ranks of 2 variables; useful when the data contain outliers

221

## Spearman's rank correlation assumptions

###
random sampling

linear relationship between ranks

222

## Spearman's rank correlation

###
r_s: same structure as Pearson's correlation but based on ranks

r_s = [Σ(Ri-Rbar)(Si-Sbar)] / √[ Σ(Ri-Rbar)^2 Σ(Si-Sbar)^2 ]

223

## conducting Spearmans

### rank x and y values separately; each data point will have 2 ranks; sum ranks for each variable; n = # data pts.; divide each rank sum by n to get Rbar and Sbar; calculate r_s (statistic); calculate critical r_s(0.05,df)
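
The steps up to the test statistic can be sketched as follows. The function names `average_ranks` and `spearman_rs` and the sample data are hypothetical; ties get the average of the ranks they span, and the denominator is the square root of the product of the two sums of squares. Looking up the critical value in a table is not covered here.

```python
def average_ranks(values):
    """Rank values 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(xs, ys):
    """Spearman's r_s: Pearson's formula applied to the ranks of x and y."""
    r = average_ranks(xs)  # ranks of x (R)
    s = average_ranks(ys)  # ranks of y (S)
    n = len(xs)
    r_bar = sum(r) / n     # rank sum divided by n
    s_bar = sum(s) / n
    numerator = sum((ri - r_bar) * (si - s_bar) for ri, si in zip(r, s))
    ss_r = sum((ri - r_bar) ** 2 for ri in r)
    ss_s = sum((si - s_bar) ** 2 for si in s)
    return numerator / (ss_r * ss_s) ** 0.5


# Made-up data: y rises with x but the last y value is an extreme outlier.
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 1000]
rs = spearman_rs(x, y)
tied = average_ranks([5, 3, 3, 8])
print(rs, tied)
```

Because only the ranks enter the calculation, the extreme value 1000 counts the same as any other largest value.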

224

## if 2 points have same rank (Spearman)

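### tied points each get the average of the ranks they would have occupied, and those ranks are then skipped (e.g. two values tied for ranks 3 and 4 both get 3.5, and the next value gets rank 5); w/o any ties, the 2 sums of squares on the bottom of the r_s equation will be the same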

225

## Spearman hypothesis

### ρ_s = 0, correlation = 0

226

## Spearman df

### df = n because no estimations are being made in ranking

227

## linear regression

###
–relationship between x and y described by a line

–line can predict y from x

–line indicates rate of change of y with x

Y = a + bX

228

## correlation vs. regression

###
regression assumes x,y relationship can be described by a line that predicts y from x

corr.- is there a relationship

reg.- can we predict y from x

229

## perfect correlation

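### r = ±1, all points fall exactly on a line; r measures scatter around the line, not the line itself, so the regression line fitted to perfectly correlated data could be the exact same line as one fitted to data with a non-perfect correlation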
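
A tiny made-up example of two data sets that share one regression line but differ in r. The data and the helpers `fit_line` and `pearson_r` are illustrative assumptions only.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


x = [1, 2, 3, 4]
y_exact = [2, 4, 6, 8]    # every point sits on Y = 2X
y_scatter = [3, 3, 5, 9]  # same points shifted by +1, -1, -1, +1

line_exact = fit_line(x, y_exact)
line_scatter = fit_line(x, y_scatter)
r_exact = pearson_r(x, y_exact)
r_scatter = pearson_r(x, y_scatter)

print(line_exact, r_exact)
print(line_scatter, r_scatter)
```

Both fits give the same intercept and slope, while r drops below 1 for the scattered data.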

230

## rounding mean results

### DO NOT; 4.5 puppies is a valid answer

231

## best line of fit

### the line that minimizes SS of the residuals (least squares regression); the smaller the sum of squared deviations of Y from the line, the better the fit

232

## used for evaluating fit of the line to the data

### residuals

233

## residuals

### difference between actual Y value and predicted values for Y (the line); measure scatter above/below the line

234

## calculating linear regression

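### calculate the slope with b = Σ(Xi-Xbar)(Yi-Ybar) / Σ(Xi-Xbar)^2; find the intercept with a = Ybar - bXbar (from plugging the means into Ybar = a + bXbar); write the line as Y = a + bX; rewrite using words (the names of the variables)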
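
The calculation can be sketched in a few lines. The data are invented and `fit_line` is a hypothetical helper; the slope uses the standard least-squares formula b = Σ(Xi-Xbar)(Yi-Ybar) / Σ(Xi-Xbar)^2.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-products over sum of squares of X.
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    # Intercept: the line must pass through (Xbar, Ybar).
    a = y_bar - b * x_bar
    return a, b


# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
y_hat_at_mean = a + b * x_bar  # predicted Y at Xbar

print(f"Y = {a:.1f} + {b:.1f}X")
print(y_hat_at_mean, y_bar)
```

The predicted value at Xbar equals Ybar, which is why the intercept can be solved from the two means.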

235

## Yhat

### predicted value- if you are trying to predict a y value after equation has been solved

236

## why do we solve linear regression with Xbar, Ybar

### line of fit always goes through Xbar, Ybar

237

## how good is line of fit

###
MSresiduals = Σ(Yi - Yhat_i)^2 / (n - 2)

which is SSresidual / (n - 2)

quantifies fit of line- smaller is better
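
As a rough sketch, the residual mean square can be computed like this. The data and the function name `ms_residual` are made up for illustration.

```python
def ms_residual(xs, ys):
    """Fit the least-squares line, then return SSresidual / (n - 2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    a = y_bar - b * x_bar
    # Residual = observed Y minus predicted Y (Yhat) on the line.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    ss_residual = sum(e ** 2 for e in residuals)
    return ss_residual / (n - 2)


# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

ms_res = ms_residual(x, y)
print(ms_res)
```

The divisor is n - 2 because two quantities, a and b, are estimated from the data before the residuals are taken.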

238

## Prediction confidence, linear regression

###
precision of predicted mean Y for a given X

precision of predicted single Y for a given X

239

## Precision of predicted mean Y for a given X, linear regression

### narrowest near the mean of X, and flares outward from there; confidence band– most confident in the prediction near the mean
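
The flaring can be seen from the standard error of the prediction. The sketch below assumes the usual textbook formulas, SE = √[MSresidual (1/n + (X0-Xbar)^2 / Σ(Xi-Xbar)^2)] for a predicted mean Y, with an extra 1 inside the bracket for a predicted single Y. The data and the function name `se_predicted` are hypothetical, and the critical t value needed to turn an SE into an interval is left out.

```python
def se_predicted(xs, ys, x0, single=False):
    """Standard error of the predicted Y at x0 (mean Y, or single Y if single=True)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    ms_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    extra = 1.0 if single else 0.0  # a single Y adds individual scatter
    return (ms_res * (extra + 1 / n + (x0 - x_bar) ** 2 / sxx)) ** 0.5


# Made-up data for illustration; Xbar is 3.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

se_at_mean = se_predicted(x, y, 3)               # smallest SE, at Xbar
se_at_edge = se_predicted(x, y, 5)               # larger SE away from Xbar
se_single = se_predicted(x, y, 3, single=True)   # single Y is always less precise

print(se_at_mean, se_at_edge, se_single)
```

The (X0-Xbar)^2 term is zero at the mean of X and grows in both directions, which produces the flared band.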

240