Flashcards in BIO 300 Lab Quiz 3 Deck (38)

1

## correlation

### strength of a linear association between 2 numerical variables

2

## correlation uses

### correlation coefficient

3

## correlation coefficient

###
r

[-1,1]

unitless
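
A minimal sketch of how r can be computed by hand, assuming the usual Pearson formula (the lab itself uses Minitab's menus). The function name `pearson_r` and the data values are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # The units of x and y cancel in this ratio, which is why r is unitless
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]    # rises with x, so r is at the +1 end
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]  # falls as x rises, so r is at the -1 end

print(pearson_r(x, y_up))
print(pearson_r(x, y_down))
```

Any data set gives a value inside [-1, 1]; the two perfectly linear examples sit at the ends of that range.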

4

## -r

### as one variable increases the other decreases

5

## inferences from correlation

### cannot infer causality

6

## regression

###
assumes a directional (cause-and-effect) relationship between 2 variables; the fit itself does not prove causality

used to predict value of response variable from explanatory variable

can determine how much of variability is due to relationship w/ explanatory variable

7

## regression statistic

###
R^2 = SS_regression / SS_total

[0,1]
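
The ratio above can be worked through as a small sketch, assuming an ordinary least-squares line with one explanatory variable. The helper names `fit_line` and `r_squared` and the data values are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y):
    """R^2 = SS_regression / SS_total for a simple linear regression."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    fitted = [intercept + slope * a for a in x]
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_regression = sum((f - mean_y) ** 2 for f in fitted)
    return ss_regression / ss_total

# Hypothetical values, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

print(fit_line(x, y))   # intercept and slope of the fitted line
print(r_squared(x, y))  # share of the variation in y explained by x
```

For these values SS_total is 6 and SS_regression is 3.6, so R^2 comes out at 0.6 (up to floating-point rounding).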

8

## linear regression assumptions

###
-relationship between response (Y) and explanatory (X) is linear

-Y values at each value of X are normally distributed

-variance of Y values is same at all values of X

-Y measurements are sampled randomly from the population at each value of X

9

## are there any outliers

### yes, can be seen by boxplot

10

## how to do regression with multiple groups (N/S)

###
"regression with groups"

then add total regression line by right clicking then going to regression fit

11

## do the data need to be transformed

###
are the data clumped in one corner of the scatterplot

is there greater spread in one section of the scatterplot

are there different orders of magnitude spanned in the variables
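
The orders-of-magnitude question can be checked numerically before plotting. This is a rough sketch, assuming strictly positive measurements and a base-10 log transform (matching the Log P and Log N columns used later in the deck). The function names and phosphorus values are hypothetical.

```python
import math

def orders_of_magnitude(values):
    """Number of powers of ten between the smallest and largest positive value."""
    return math.log10(max(values) / min(values))

def log10_transform(values):
    """Base-10 log of each value; only defined for values greater than zero."""
    return [math.log10(v) for v in values]

# Hypothetical phosphorus concentrations, for illustration only
phosphorus = [2.0, 5.0, 20.0, 150.0, 2000.0]

print(orders_of_magnitude(phosphorus))  # span of the raw data in powers of ten
print(log10_transform(phosphorus))      # values to store in a new Log P column
```

A span of several powers of ten usually goes along with points clumped in one corner of the scatterplot, which is the visual cue on this card.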

12

## how to SLR

### stat-- regression-- regression-- fit regression model

13

## options to check for regression

###
responses- Chl-a

continuous predictor- Log P

graphs- residuals vs. fits

results- everything but Durbin-Watson

storage- residuals

14

## SSregression

### amount of variation in the response variable accounted for by the regression (a sum of squares; the proportion is R^2)

15

## SSresidual

### amount of variation left unexplained by the regression (a sum of squares)

16

## MS

### mean square: a measure of variance, the sum of squares divided by its degrees of freedom (SS/df)

17

## F

###
F-ratio

MSregression/MSresidual
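
The SS, MS, and F cards fit together as one ANOVA table. Below is a sketch of that arithmetic for a simple linear regression, assuming 1 degree of freedom for the regression and n - 2 for the residual. The function name `regression_anova` and the data are hypothetical, and no p-value is computed because the standard library has no F distribution.

```python
def regression_anova(x, y):
    """SS, df, MS and F for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * a for a in x]

    ss_regression = sum((f - mean_y) ** 2 for f in fitted)
    ss_residual = sum((b - f) ** 2 for b, f in zip(y, fitted))
    df_regression = 1      # one explanatory variable
    df_residual = n - 2    # n points minus two estimated coefficients
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "SS_regression": ss_regression,
        "SS_residual": ss_residual,
        "MS_regression": ms_regression,
        "MS_residual": ms_residual,
        "F": ms_regression / ms_residual,
    }

# Hypothetical values, for illustration only
table = regression_anova([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 5.0, 4.0, 5.0])
for name, value in table.items():
    print(name, round(value, 4))
```

Here SS_regression is 3.6 and SS_residual is 2.4, so MS_residual is 2.4 / 3 = 0.8 and F is 3.6 / 0.8 = 4.5.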

18

## constant

###
y-intercept

note that this value will appear in the equation

19

## Log P coefficient

###
the slope

will appear in equation

20

## better predictor of response variable

### higher R^2

21

## multiple regression

###
stat-- regression-- regression-- fit regression model

responses- chl-a

continuous predictors- Log P, Log N

22

## don't forget

###
to label residual columns

check for normality

check for equal variance

check if assumptions are met

test residuals for normality

23

## when R^2 SLR ~ R^2 MR

### possibly 2 explanatory variables are correlated

24

## if two explanatory variables are correlated

### collinearity (also called multicollinearity)

25

## why did Log N lose its significance in MR

### the variation explained by Log N is already accounted for by Log P, so little variation is left for Log N to explain
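
A worked sketch of why a correlated second predictor adds so little. With exactly two predictors, R^2 of the multiple regression can be written from the three pairwise correlations. The function name and the correlation values below are hypothetical, chosen only to mimic two strongly correlated predictors such as Log P and Log N.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 of a regression of y on two predictors, from pairwise correlations.

    r_y1, r_y2: correlation of the response with predictor 1 and predictor 2
    r_12:       correlation between the two predictors (must not be +1 or -1)
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlations, for illustration only
r_chl_logp = 0.90   # response vs. Log P
r_chl_logn = 0.80   # response vs. Log N
r_logp_logn = 0.85  # the two predictors are strongly correlated

r2_slr = r_chl_logp ** 2  # simple regression on Log P alone
r2_mr = r_squared_two_predictors(r_chl_logp, r_chl_logn, r_logp_logn)

print(round(r2_slr, 4))  # R^2 with Log P only
print(round(r2_mr, 4))   # R^2 with Log P and Log N, barely higher
```

With these numbers R^2 moves from 0.81 to roughly 0.814, so Log N contributes almost nothing once Log P is in the model. If the predictors were uncorrelated (r_12 = 0), the formula reduces to the sum of the two squared correlations.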

26

## test for correlation

### stat-- basic statistics-- correlation-- variables: Log P, Log N-- OK

27

## stepwise multiple regression

###
adds or removes explanatory variables one step at a time, retaining the ones that explain the most variation

eliminates explanatory variables that do not add any new explanatory power

28

## do you need different predictive equations for the 2 sampling locations

### are the intercepts and slopes of the regression equations significantly different?

29

## test for significant differences in y-intercept of regression lines

### ANCOVA

30

## ANCOVA

### analysis of covariance

31

## ANCOVA assumes

### equal slopes (parallel lines)

32

## testing for equal slopes

###
are the lines parallel? if they are, then the interaction (location*log P) is not significant

we'll assume that they are

33

## if y-intercepts are not significantly different (p-value ≥ alpha)

### free to use one regression equation for both locations

34

## running an ANCOVA

### stat-- anova-- GLM-- fit general linear model

35

## options in ANCOVA

###
responses-- chl-a

factors- location

covariates- log P

model- location, log P, location*log P

results: only ANOVA

storage: residuals

36

## ANCOVA output

###
-if interaction is not significant re-build model without interaction

-determine if effect of sampling location is important- decide if you need 2 equations or not
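
The interaction check can be sketched as a comparison of two nested models: separate lines per location (full model) against parallel lines with a common slope (reduced model). This is an illustration of the idea, not Minitab's procedure. The function names, group labels, and data are hypothetical, and only the F statistic is computed, since the standard library has no F distribution for a p-value.

```python
def _deviation_sums(x, y):
    """n, Sxx, Sxy, Syy for one group, using deviations from the group means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    return n, sxx, sxy, syy

def equal_slopes_f(groups):
    """F statistic for the location*covariate interaction.

    groups maps a location label to a pair (x_values, y_values).
    """
    stats = [_deviation_sums(x, y) for x, y in groups.values()]
    k = len(stats)
    n_total = sum(s[0] for s in stats)

    # Full model: each location gets its own slope and intercept
    rss_full = sum(syy - sxy ** 2 / sxx for _, sxx, sxy, syy in stats)

    # Reduced model: one common slope, separate intercepts (parallel lines)
    pooled_sxx = sum(s[1] for s in stats)
    pooled_sxy = sum(s[2] for s in stats)
    pooled_syy = sum(s[3] for s in stats)
    rss_reduced = pooled_syy - pooled_sxy ** 2 / pooled_sxx

    df_interaction = k - 1
    df_residual = n_total - 2 * k
    return ((rss_reduced - rss_full) / df_interaction) / (rss_full / df_residual)

# Hypothetical (log P, chl-a) measurements, for illustration only
log_p = [1.0, 2.0, 3.0, 4.0]
nearly_parallel = {
    "N": (log_p, [2.1, 2.9, 4.2, 4.8]),
    "S": (log_p, [3.0, 4.1, 4.9, 6.0]),
}
different_slopes = {
    "N": (log_p, [2.1, 2.9, 4.2, 4.8]),
    "S": (log_p, [1.0, 3.1, 4.9, 7.0]),
}

print(equal_slopes_f(nearly_parallel))   # small F: interaction unlikely to matter
print(equal_slopes_f(different_slopes))  # large F: slopes clearly differ
```

A small F corresponds to the "interaction is not significant" case, where the model is rebuilt without the interaction term and the location effect then tests the intercepts.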

37

## if sampling location has significant effect

### need 2 equations

38