Unit 3 Flashcards

(13 cards)

1
Q

What is covariance?

A

a measure of the extent to which two random variables change together; its sign gives the direction of the relationship, but its magnitude depends on the units of the variables

2
Q

What is correlation?

A

quantifies the strength and direction of the association between two continuous variables; it is a standardized version of covariance, ranging from -1 to 1
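
To make "standardized version of covariance" concrete, here is a minimal sketch in plain Python. The height and weight values are made up for illustration, and the function names are hypothetical helpers, not part of the course material. It shows that changing the units changes the covariance but not the correlation.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    sd_x = sqrt(covariance(xs, xs))  # variance is the covariance of x with itself
    sd_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical data: the same heights expressed in metres and in centimetres.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
height_cm = [h * 100 for h in height_m]
weight_kg = [52.0, 58.0, 67.0, 75.0, 80.0]

cov_m = covariance(height_m, weight_kg)    # depends on the units of height
cov_cm = covariance(height_cm, weight_kg)  # 100 times larger
r_m = correlation(height_m, weight_kg)     # unit-free
r_cm = correlation(height_cm, weight_kg)   # same value as r_m
```

Rescaling height multiplies the covariance by 100, while the correlation stays the same, which is why correlation is easier to compare across pairs of variables.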

3
Q

What is collinearity?

A

when the independent variables in a regression are correlated with one another, so that each variable contributes little additional information

4
Q

What three things does collinearity do?

A

inflates the standard errors (SE) of the coefficients, indicates redundancy among the predictors, and makes the associated p-values too high, which can affect conclusions
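
The standard-error inflation can be seen in a small simulation. This is a sketch under stated assumptions: the data are simulated, NumPy is assumed to be available, and `ols_standard_errors` is a hypothetical helper written for this card. It fits the same model twice, once with an unrelated second predictor and once with a second predictor that is nearly a copy of the first.

```python
import numpy as np

def ols_standard_errors(X, y):
    """Standard errors of the slope coefficients from an ordinary least squares fit."""
    X = np.asarray(X, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])        # add an intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    dof = n - design.shape[1]                         # residual degrees of freedom
    sigma2 = residuals @ residuals / dof              # estimated error variance
    cov_beta = sigma2 * np.linalg.inv(design.T @ design)
    return np.sqrt(np.diag(cov_beta))[1:]             # drop the intercept's SE

# Simulated data (assumption: made up purely for illustration).
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2_independent = rng.normal(size=n)                   # unrelated to x1
x2_collinear = x1 + 0.1 * rng.normal(size=n)          # nearly a copy of x1
noise = rng.normal(size=n)

y_a = 1.0 + 2.0 * x1 + 2.0 * x2_independent + noise
y_b = 1.0 + 2.0 * x1 + 2.0 * x2_collinear + noise

se_independent = ols_standard_errors(np.column_stack([x1, x2_independent]), y_a)
se_collinear = ols_standard_errors(np.column_stack([x1, x2_collinear]), y_b)
```

Comparing `se_collinear[0]` with `se_independent[0]` shows the standard error of the `x1` coefficient growing several-fold when its near-duplicate is in the model, which is what drives the larger p-values.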

5
Q

What is multicollinearity?

A

when more than two independent variables in a regression are highly correlated with one another

6
Q

What are 7 assumptions that go with correlation?

A

random sample (or one representative of the population), independent observations, x values are not used to compute y values, x values are not experimentally controlled, both x and y follow a normal (Gaussian) distribution, all covariation is linear, and there are no outliers

7
Q

When would we use Pearson correlation?

A

with continuous data, when the relationship between the two variables is linear

8
Q

How does the Spearman rank correlation coefficient work?

A

it ranks the X and Y values separately and then computes the correlation between the two sets of ranks, so it measures the monotonic relationship rather than the linear one
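
The rank-then-correlate idea can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names are hypothetical, ties are given the average of their ranks (a common convention, assumed here), and the data are made up.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Made-up data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

rho = spearman(x, y)  # perfect monotonic relationship
r = pearson(x, y)     # strong but imperfect linear relationship
```

Because `y` rises every time `x` rises, the two rank lists are identical and `rho` is 1, while `r` falls short of 1 because the relationship is curved.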

9
Q

What is a monotonic relationship?

A

one in which, as one variable increases, the other consistently increases or consistently decreases across the whole range, though not necessarily at a constant rate

10
Q

What does the variance inflation factor show?

A

how well each independent variable is predicted by a combination of the other independent variables; 1 = no multicollinearity, and the cutoff for concern is typically set somewhere between 2 and 10
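
A minimal sketch of the calculation, assuming the standard definition VIF = 1 / (1 - R²), where R² comes from regressing one predictor on all the others. The `vif` helper is hypothetical, the data are simulated, and NumPy is assumed to be available.

```python
import numpy as np

def vif(X):
    """Return one VIF per column of X (rows are observations, columns are predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])   # intercept plus other predictors
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        r_squared = 1.0 - residuals.var() / target.var()  # share of variance explained
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)

# Simulated predictors (assumption: made up for illustration).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)               # unrelated to x1
x3 = x1 + 0.1 * rng.normal(size=200)    # nearly a copy of x1

vif_independent = vif(np.column_stack([x1, x2]))
vif_collinear = vif(np.column_stack([x1, x2, x3]))
```

With only the unrelated predictors, both VIFs sit close to 1. Once `x3` is added, the VIFs for `x1` and `x3` become very large while the one for `x2` stays near 1.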

11
Q

What are the 5 steps of principal components analysis?

A
  1. standardize the range of the continuous variables
  2. compute the covariance matrix
  3. compute the eigenvectors and eigenvalues of the covariance matrix
  4. create a feature vector to decide which components to keep
  5. recast the data along the principal component axes
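
The five steps above can be sketched with NumPy. This is an illustration under stated assumptions rather than a definitive implementation: the `pca` function and its `n_components` parameter are hypothetical names, the data are simulated, and the feature vector is built by simply keeping the components with the largest eigenvalues.

```python
import numpy as np

def pca(X, n_components):
    """Return component scores and the share of variance explained by each component."""
    X = np.asarray(X, dtype=float)
    # Step 1: standardize each variable to mean 0 and standard deviation 1.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: covariance matrix of the standardized variables.
    cov = np.cov(Z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors (eigh is meant for symmetric matrices).
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]           # sort from largest to smallest
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Step 4: feature vector, keeping the leading components.
    feature_vector = eigenvectors[:, :n_components]
    # Step 5: recast the data onto the kept components.
    scores = Z @ feature_vector
    explained = eigenvalues / eigenvalues.sum()
    return scores, explained

# Simulated data (assumption): b is nearly a copy of a, c is unrelated.
rng = np.random.default_rng(0)
a = rng.normal(size=300)
b = a + 0.2 * rng.normal(size=300)
c = rng.normal(size=300)
X = np.column_stack([a, b, c])

scores, explained = pca(X, n_components=2)
```

Here `explained` lists the fraction of total variance carried by each component, and the columns of `scores` are uncorrelated with one another even though `a` and `b` were highly correlated.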
12
Q

What is PCA doing?

A

transforming a large set of variables into a smaller set that still contains most of the information

13
Q

How does PCA reduce multicollinearity?

A

it determines how much of the variance in the independent variables is accounted for by each component, then uses the new, uncorrelated components in place of the original independent variables
