BIO 300 Lab Quiz 3 Flashcards Preview

Flashcards in BIO 300 Lab Quiz 3 Deck (38):
1

correlation

strength of a linear association between 2 numerical variables

2

correlation uses

correlation coefficient

3

correlation coefficient

r
[-1,1]
unitless
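
A minimal sketch of how r could be computed by hand, assuming made-up illustration data (not from the lab) and a hypothetical helper named `pearson_r`. It also shows why r is unitless: rescaling a variable leaves r unchanged.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # rises with x
y_down = [10, 8, 6, 4, 2]   # falls as x rises

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)

# Changing the units of x (multiplying by 1000) should not change r
r_rescaled = pearson_r([1000 * v for v in x], y_up)
```

A perfectly increasing line gives r at the top of the [-1,1] range and a perfectly decreasing line gives r at the bottom.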

4

negative r (r < 0)

as one variable increases the other decreases

5

inferences from correlation

cannot infer causality

6

regression

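treats one variable (response) as depending on the other (explanatory); the fitted line by itself does not prove causality
used to predict value of response variable from explanatory variable
can determine how much of the variability in the response is explained by its relationship with the explanatory variable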

7

regression statistic

R^2 = SS_regression / SS_total
[0,1]
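
A rough sketch of the R^2 calculation for a simple linear regression, assuming made-up data and hypothetical helper names (`fit_line`, `r_squared`). It fits the least-squares line, then divides SS_regression by SS_total as in the formula above.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y):
    """R^2 = SS_regression / SS_total for a simple linear regression."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    fitted = [intercept + slope * a for a in x]
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_regression = sum((f - mean_y) ** 2 for f in fitted)
    return ss_regression / ss_total

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
r2 = r_squared(x, y)
```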

8

linear regression assumptions

-relationship between response (Y) and explanatory (X) is linear
-Y values at each value of X are normally distributed
-variance of Y values is same at all values of X
-Y measurements are sampled randomly from the population at each value of X

9

are there any outliers

yes, can be seen by boxplot

10

how to do regression with multiple groups (N/S)

"regression with groups"
then add total regression line by right clicking then going to regression fit

11

do the data need to be transformed

are the data clumped in one corner of the scatterplot
is there greater spread in one section of the scatterplot
are there different orders of magnitude spanned in the variables

12

how to SLR

stat-- regression-- regression-- fit regression model

13

options to check for regression

responses- Chl-a
continuous predictor- Log P
graphs- residuals vs. fits
results- everything but Durbin-Watson
storage- residuals

14

SSregression

amount of variation (sum of squares) in the response variable accounted for by the regression; dividing it by SStotal gives the proportion (R^2)

15

SSresidual

amount of variation (sum of squares) left unexplained by the regression

16

MS

mean square: a measure of variance, the sum of squares divided by its degrees of freedom (SS/df)

17

F

F-ratio
MSregression/MSresidual
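
The SS, MS, and F cards fit together as one ANOVA table. The sketch below builds that table for a simple linear regression, assuming made-up data and a hypothetical function name `regression_anova`; with one predictor, df for the regression is 1 and df for the residual is n - 2.

```python
def regression_anova(x, y):
    """ANOVA table (SS, df, MS, F) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * a for a in x]

    # Partition the total variation into explained and unexplained parts
    ss_regression = sum((f - mean_y) ** 2 for f in fitted)
    ss_residual = sum((b - f) ** 2 for b, f in zip(y, fitted))

    df_regression = 1       # one explanatory variable
    df_residual = n - 2     # n points minus two estimated parameters

    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f_ratio": ms_regression / ms_residual,
    }

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
table = regression_anova(x, y)
```

A large F-ratio means the regression explains much more variation per degree of freedom than is left in the residuals.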

18

constant

y-intercept
note that this value will appear in the equation

19

log P coefficient

the slope
will appear in equation

20

better predictor of response variable

higher R^2

21

multiple regression

stat-- regression-- regression-- fit regression model
responses- chl-a
continuous predictors- Log P, Log N

22

don't forget

to label residual columns
check if assumptions are met:
test residuals for normality
check residuals vs. fits for equal variance

23

when R^2 SLR ~ R^2 MR

possibly 2 explanatory variables are correlated

24

if two explanatory variables are correlated

collinearity (multicollinearity)

25

why did Log N lose its significance in MR

the variation explained by Log N is already accounted for by Log P; there is not much variation left for Log N to describe
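
A small sketch of this effect, assuming made-up values for Log P, Log N, and Chl-a (not the lab data) and hypothetical helper names. For exactly two predictors, the multiple-regression R^2 can be written in terms of the three pairwise correlations, which avoids any matrix algebra.

```python
import math

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def r2_two_predictors(y, x1, x2):
    """R^2 of y regressed on x1 and x2, from the pairwise correlations."""
    r_y1, r_y2, r_12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Made-up data, for illustration only: log_n closely tracks log_p
log_p = [1, 2, 3, 4, 5, 6]
log_n = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
chl_a = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

r_predictors = corr(log_p, log_n)               # correlation between predictors
r2_slr = corr(chl_a, log_p) ** 2                # SLR with Log P only
r2_mr = r2_two_predictors(chl_a, log_p, log_n)  # MR with Log P and Log N
```

When the two predictors are this strongly correlated, adding the second one raises R^2 only slightly over the single-predictor model.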

26

test for correlation

stat-- basic statistics-- correlation-- variables: Log P, Log N-- OK

27

stepwise multiple regression

adds or removes explanatory variables one step at a time, retaining the ones that explain the most variation
eliminates explanatory variables that do not add any new explanatory power

28

do you need different predictive equations for the 2 sampling locations

are the intercepts and slopes of the regression equations significantly different?

29

test for significant differences in y-intercept of regression lines

ANCOVA

30

ANCOVA

analysis of covariance

31

ANCOVA assumes

equal slopes (parallel lines)

32

testing for equal slopes

are the lines parallel? if they are, the interaction term (location*log P) is not significant
for this lab, we'll assume that they are
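
One way the equal-slopes check could be sketched by hand, assuming made-up data for two locations and a hypothetical function name `equal_slopes_f`. It compares a model with a separate slope for each group against a model with one common slope and separate intercepts; the resulting F corresponds to the interaction term in the ANCOVA model.

```python
def sums_of_squares(x, y):
    """Return (Sxx, Sxy, Syy) about the group means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxx, sxy, syy

def equal_slopes_f(groups):
    """F statistic for equal slopes; groups is a list of (x, y) pairs."""
    stats = [sums_of_squares(x, y) for x, y in groups]
    k = len(groups)
    n_total = sum(len(x) for x, _ in groups)

    # Full model: each group has its own slope and intercept
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in stats)

    # Reduced model: one common slope, separate intercepts (parallel lines)
    sxx_w = sum(s[0] for s in stats)
    sxy_w = sum(s[1] for s in stats)
    syy_w = sum(s[2] for s in stats)
    sse_reduced = syy_w - sxy_w ** 2 / sxx_w

    df_num = k - 1              # extra slopes in the full model
    df_den = n_total - 2 * k    # residual df of the full model
    f = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f, df_num, df_den

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
north = (x, [2.1, 3.9, 6.2, 7.8, 10.1])
south_parallel = (x, [5.2, 6.8, 9.1, 11.1, 12.9])  # similar slope, higher intercept
south_steeper = (x, [1.0, 4.1, 6.9, 10.2, 12.8])   # clearly steeper slope

f_parallel, df_num, df_den = equal_slopes_f([north, south_parallel])
f_diverging, _, _ = equal_slopes_f([north, south_steeper])
```

A small F (nearly parallel lines) supports dropping the interaction; a large F suggests the slopes differ and the equal-slopes assumption does not hold.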

33

if y-intercepts are not significantly different (p-value ≥ alpha)

free to use one regression equation for both locations

34

running an ANCOVA

stat-- anova-- GLM-- fit general linear model

35

options in ANCOVA

responses- chl-a
factors- location
covariates- log P
model- location, log P, location*log P
results- only ANOVA
storage- residuals

36

ANCOVA output

-if interaction is not significant re-build model without interaction
-determine whether the effect of sampling location is significant, then decide whether you need 2 equations

37

if sampling location has significant effect

need 2 equations (one per location)

38

how to get separate equations for sampling location

separate data by sampling location and start again from beginning for each location separately