Flashcards in BIO 330 Deck (379)
211
understanding r
easy to interpret because it is unitless; however, this can trick you into thinking r is comparable across studies; to compare r across studies, the variables must span similar ranges
212
Attenuation bias
if X or Y is measured with error, r is biased toward zero; the greater the measurement error, the more r is underestimated; can be reduced by taking means of repeated subsample measurements
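A minimal simulation of this card's claim (made-up data and effect sizes): adding measurement noise to X shrinks r toward zero, and averaging repeated measurements per subject recovers most of the true correlation.

```python
# Sketch of attenuation bias (all values are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x_true = rng.normal(0, 1, n)
y = 2 * x_true + rng.normal(0, 1, n)       # strong true relationship

r_clean = np.corrcoef(x_true, y)[0, 1]

x_noisy = x_true + rng.normal(0, 2, n)     # large measurement error in X
r_noisy = np.corrcoef(x_noisy, y)[0, 1]    # attenuated toward 0

# mean of 25 error-laden measurements: error SD shrinks by 1/sqrt(25)
x_mean = x_true + rng.normal(0, 2, (25, n)).mean(axis=0)
r_mean = np.corrcoef(x_mean, y)[0, 1]      # close to r_clean again

print(r_clean, r_noisy, r_mean)
```

With these settings r_noisy is far below r_clean, while r_mean sits near it, which is why the card recommends taking means of subsamples.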
213
correlation and significance
statistically sig. relationships can be weak, moderate, or strong
sig.– probability of a result this extreme, if Ho is true
correlation– direction and strength of the linear relationship
214
weak, moderate, strong correlation
r = ±0.2 – weak
r = ±0.5 – moderate
r = ±0.8 – strong
215
correlation assumptions
bivariate normality– X and Y jointly follow a normal distribution
relationship is linear
216
dealing with assumption violations (correlation)
histograms
transformations of one or both variables
remove outliers
217
outlier removal
–need justification (e.g. a known data error)
–carefully consider if variation is natural
–conduct analyses w/ and w/o outlier to assess effect of removal
218
natural variation, outliers
is your n large enough to tell whether the outlier reflects natural variation in the data?
219
if outlier removal has no effect
may as well leave it in!
220
non-parametric Correlation
Spearman's rank correlation; measures strength and direction of the linear association btw the ranks of 2 variables; robust to outliers
221
Spearman's rank correlation assumptions
random sampling
linear relationship between ranks
222
Spearman's rank correlation
r_s: same structure as Pearson's correlation but based on ranks
r_s = [Σ(Ri-Rbar)(Si-Sbar)] / sqrt[ Σ(Ri-Rbar)^2 Σ(Si-Sbar)^2 ]
223
conducting Spearmans
rank x and y values separately; each data point will have 2 ranks; sum ranks for each variable; n = # data pts.; divide each rank sum by n to get Rbar and Sbar; calculate r_s (statistic); calculate critical r_s(0.05,df)
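The steps above can be sketched with made-up data: rank X and Y separately, then apply the Pearson-style formula to the ranks. The result should match SciPy's `spearmanr` (assumed available).

```python
# Sketch: Spearman's r_s computed from ranks, then checked against scipy.
import numpy as np
from scipy.stats import rankdata, spearmanr

x = np.array([3.1, 5.0, 1.2, 4.4, 2.7])   # illustrative data
y = np.array([9.0, 12.5, 4.1, 15.0, 6.3])

R = rankdata(x)                 # ranks of x
S = rankdata(y)                 # ranks of y
n = len(x)
Rbar, Sbar = R.sum() / n, S.sum() / n      # divide each rank sum by n

r_s = ((R - Rbar) * (S - Sbar)).sum() / np.sqrt(
    ((R - Rbar) ** 2).sum() * ((S - Sbar) ** 2).sum())

print(r_s)                  # hand calculation
print(spearmanr(x, y)[0])   # scipy agrees
```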
224
if 2 points have same rank (Spearman)
assign each tied value the average of the ranks the tied group would occupy, and skip the next rank(s); w/o any ties, the 2 sums of squares on the bottom of the r_s equation are equal
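The tie rule is exactly what SciPy's `rankdata` does by default (a sketch with made-up values): tied observations share the average of the ranks they span, and that many ranks are skipped afterward.

```python
# Sketch of average ranks for ties (illustrative values).
from scipy.stats import rankdata

vals = [7, 3, 3, 9, 1]
# the two 3s would occupy ranks 2 and 3, so each gets (2+3)/2 = 2.5,
# and rank 3 is skipped; 7 takes rank 4
print(rankdata(vals))
```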
225
Spearman hypothesis
Ho: ρ_s = 0, i.e. no correlation between the ranks
226
Spearman df
df = n because no parameters are estimated when ranking
227
linear regression
–relationship between x and y described by a line
–line can predict Y from X
–line indicates rate of change of y with x
Y = a + bX
228
correlation vs. regression
regression assumes x,y relationship can be described by a line that predicts y from x
corr.- is there a relationship
reg.- can we predict y from x
229
perfect correlation
r = 1, all points fall exactly on the line– a regression line fitted to a non-perfect correlation could be that exact same line; the line alone does not tell you r
230
rounding mean results
DO NOT round; a mean of 4.5 puppies is a valid answer
231
best line of fit
the best-fit line minimizes the sum of squared deviations (SS) from the line = least squares regression
232
used for evaluating fit of the line to the data
residuals
233
residuals
difference between the observed Y value and the value predicted by the line (Yhat); residuals measure scatter above/below the line
234
calculating linear regression
calculate slope: b = Σ(Xi-Xbar)(Yi-Ybar) / Σ(Xi-Xbar)^2; find a: a = Ybar - bXbar (the line passes through Xbar, Ybar); rewrite as Y = a + bX; rewrite using words
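The recipe above, sketched with made-up data: compute b from the standard least-squares formula, get a from the means, and check against NumPy's polynomial fit.

```python
# Sketch of least-squares slope and intercept (illustrative data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

Xbar, Ybar = X.mean(), Y.mean()
b = ((X - Xbar) * (Y - Ybar)).sum() / ((X - Xbar) ** 2).sum()  # slope
a = Ybar - b * Xbar                                            # intercept
print(b, a)

# numpy's degree-1 fit returns (slope, intercept) and should agree
b2, a2 = np.polyfit(X, Y, 1)
```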
235
Yhat
predicted value– the Y value the fitted equation predicts for a given X
236
why do we solve linear regression with Xbar, Ybar
the least-squares line always passes through (Xbar, Ybar)
237
how good is line of fit
MSresidual = Σ(Yi - Yhat)^2 / (n-2)
which is SSresidual / (n-2)
quantifies fit of line- smaller is better
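A short sketch of this card with made-up data: fit the line, get Yhat, and divide the residual sum of squares by n - 2 (two parameters, a and b, are estimated).

```python
# Sketch of the residual mean square (illustrative data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

b, a = np.polyfit(X, Y, 1)           # least-squares slope and intercept
Yhat = a + b * X                     # predicted Y values
ss_resid = ((Y - Yhat) ** 2).sum()   # SSresidual
ms_resid = ss_resid / (len(X) - 2)   # two params estimated -> n - 2 df
print(ms_resid)                      # smaller = better fit
```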
238
Prediction confidence, linear regression
precision of predicted mean Y for a given X
precision of predicted single Y for a given X
239
Precision of predicted mean Y for a given X, linear regression
narrowest near the mean of X, flaring outward from there; confidence band– we are most confident in predictions near the mean
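The flaring shape follows from the standard textbook formula for the standard error of the predicted mean Y at a given X0: sqrt(MSresidual * (1/n + (X0 - Xbar)^2 / SSx)). A sketch with made-up data shows the band is tightest at Xbar and symmetric around it.

```python
# Sketch: SE of the predicted mean Y widens away from Xbar (illustrative data).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

n = len(X)
Xbar = X.mean()
b, a = np.polyfit(X, Y, 1)
ms_resid = ((Y - (a + b * X)) ** 2).sum() / (n - 2)
ssx = ((X - Xbar) ** 2).sum()

def se_mean_y(x0):
    # standard error of the predicted mean Y at x0
    return np.sqrt(ms_resid * (1 / n + (x0 - Xbar) ** 2 / ssx))

print(se_mean_y(Xbar), se_mean_y(Xbar + 2.0))  # wider away from the mean
```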
240