misc test 3 prep Flashcards

(10 cards)

1
Q

Clustering and LDA both separate data into groups. What is the main difference between Clustering and LDA?

A

With clustering, the groups are not known in advance

2
Q

What question does canonical correlation analysis answer?

A

What, if any, relationships between groups of variables exist, and how strong are they?

3
Q

T or F: If the correlation between U1 and V1 is .95, this means that there is a significant relationship between these groups of variables.

A

False. The canonical correlation tells us about the strength of the relationship, not its significance. Any combination of strong/weak and significant/insignificant is possible.

4
Q

T or F: Scaling values before a cluster analysis can change the results

A

True. If variables have differing units, one may dominate distance calculation if not scaled.
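A minimal sketch of the effect, using made-up data (heights in metres, incomes in dollars) and a hypothetical helper `nearest_to`; it is an illustration, not a full cluster analysis. On the raw values the income column dominates the Euclidean distance, so the nearest neighbour of point "a" changes once both columns are standardized.

```python
import math

# Hypothetical observations: (height in metres, income in dollars).
points = {"a": (1.60, 30000.0), "b": (1.95, 30500.0), "c": (1.61, 30800.0)}

def euclid(p, q):
    # Plain Euclidean distance between two tuples.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def standardize(data):
    # Rescale every column to mean 0 and sample standard deviation 1.
    cols = list(zip(*data.values()))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return {k: tuple((x - m) / s for x, m, s in zip(v, means, sds))
            for k, v in data.items()}

def nearest_to(name, data):
    # Name of the closest other point.
    others = (k for k in data if k != name)
    return min(others, key=lambda k: euclid(data[name], data[k]))

raw_nearest = nearest_to("a", points)                  # income dominates
scaled_nearest = nearest_to("a", standardize(points))  # height now counts too
print(raw_nearest, scaled_nearest)
```

Unscaled, "a" sits closest to "b" (similar income); after scaling it sits closest to "c" (similar height and a moderate income gap).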

5
Q

T or F: Using Forgy usually results in more spread-out initial centroids than random partition.

A

True. Random partition averages a random sample of points to get each centroid, so every centroid lands near the overall data mean. Forgy uses single observed points, which are typically farther apart.
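The difference can be checked with a small simulation. This is a sketch under stated assumptions: the data are 200 uniform random points in the unit square, k = 3, and the function names (`forgy`, `random_partition`, `mean_pairwise_distance`) are illustrative, not taken from any library.

```python
import math
import random

def forgy(points, k, rng):
    # Forgy: pick k observed points as the initial centroids.
    return rng.sample(points, k)

def random_partition(points, k, rng):
    # Random partition: assign every point to a random cluster,
    # then use each cluster's mean as its initial centroid.
    while True:
        groups = [[] for _ in range(k)]
        for p in points:
            groups[rng.randrange(k)].append(p)
        if all(groups):  # retry in the rare case of an empty cluster
            break
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]

def mean_pairwise_distance(centroids):
    # Average distance over all pairs of centroids.
    pairs = [(a, b) for i, a in enumerate(centroids) for b in centroids[i + 1:]]
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

rng = random.Random(0)  # fixed seed so the run is repeatable
points = [(rng.random(), rng.random()) for _ in range(200)]
trials = 200
forgy_spread = sum(mean_pairwise_distance(forgy(points, 3, rng))
                   for _ in range(trials)) / trials
partition_spread = sum(mean_pairwise_distance(random_partition(points, 3, rng))
                       for _ in range(trials)) / trials
print(forgy_spread, partition_spread)
```

Averaged over the trials, the Forgy centroids are several times farther apart than the random-partition centroids, which all cluster around the centre of the data.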

6
Q

T or F: If the original variables are independent, there is a unique solution for the coefficients of U1 and V1

A

False. There are many solutions, but they are (usually) scales of each other. Every nonzero scalar multiple of an eigenvector is also an eigenvector for the same eigenvalue, and loadings are not normalized in CCA, so different algorithms may produce solutions that are scales of each other.
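A short sketch of why rescaled coefficients count as the same solution: the correlation between U1 and V1 does not change when either coefficient vector is multiplied by a positive constant (a negative constant only flips the sign). The data and the coefficients `a` and `b` below are made up for illustration and are not a fitted CCA solution.

```python
import math

def pearson(u, v):
    # Pearson correlation of two equal-length lists.
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    return cov / (su * sv)

def combine(coeffs, columns):
    # Linear combination of the columns, one value per observation.
    return [sum(c * x for c, x in zip(coeffs, row)) for row in zip(*columns)]

x_cols = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 1.0, 4.0, 3.0, 6.0]]
y_cols = [[1.5, 2.5, 2.0, 4.5, 5.0], [0.5, 1.0, 3.0, 2.0, 4.0]]
a, b = (0.6, 0.4), (0.7, 0.3)  # hypothetical coefficients

r_original = pearson(combine(a, x_cols), combine(b, y_cols))
r_rescaled = pearson(combine([3 * c for c in a], x_cols),
                     combine([0.5 * c for c in b], y_cols))
print(r_original, r_rescaled)
```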

7
Q

T or F: If the original variables are dependent, there are multiple solutions for the coefficients of U1 and V1 which are not scales of each other.

A

True. We can solve for one variable in terms of the others, replace that variable within U1 and obtain a new solution.
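The substitution argument can be made concrete with a toy case, assuming an exact dependence x3 = x1 + x2 (made-up integer data). The coefficient vectors (1, 1, 0) and (0, 0, 1) then produce the same variate U1 even though neither is a scale of the other. The helper `is_scalar_multiple` is an illustrative name.

```python
x1 = [1, 2, 3, 4]
x2 = [2, 0, 1, 5]
x3 = [p + q for p, q in zip(x1, x2)]  # exact linear dependence

def variate(coeffs):
    # U = c1*x1 + c2*x2 + c3*x3 for each observation.
    return [sum(c * x for c, x in zip(coeffs, row)) for row in zip(x1, x2, x3)]

def is_scalar_multiple(a, b):
    # Two vectors are proportional iff every 2x2 cross-product vanishes.
    return all(a[i] * b[j] == a[j] * b[i]
               for i in range(len(a)) for j in range(len(a)))

coeffs_a = (1, 1, 0)
coeffs_b = (0, 0, 1)
u_a = variate(coeffs_a)
u_b = variate(coeffs_b)
print(u_a == u_b, is_scalar_multiple(coeffs_a, coeffs_b))
```

Both coefficient vectors give the identical U1, so any criterion computed from U1 cannot tell them apart.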

8
Q

T or F: Variables with a VIF score over 5 are highly dependent on other variables.

A

True. Since VIF = 1 / (1 - R^2), VIF > 5 means R^2 > .8, so |R| > .89, indicating a strong linear relationship with the other variables.
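The arithmetic behind the .89 cutoff, as a small sketch. It assumes the usual definition VIF = 1 / (1 - R^2), where R^2 comes from regressing the variable on the other variables; the function names are illustrative.

```python
import math

def vif(r_squared):
    # Variance inflation factor for a given R^2.
    return 1.0 / (1.0 - r_squared)

def min_abs_r(vif_threshold):
    # Smallest |R| at which the VIF reaches the threshold.
    return math.sqrt(1.0 - 1.0 / vif_threshold)

threshold_r = min_abs_r(5)
print(round(threshold_r, 4), vif(0.8))
```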

9
Q

T or F: A higher number of observations makes Bartlett’s test more sensitive (more likely to detect a true effect).

A

True. For the same eigenvalues, a higher n makes the chi^2 statistic larger.
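A sketch of the dependence on n, assuming the card refers to Bartlett's chi-square approximation for canonical correlations, chi^2 = -(n - 1 - (p + q + 1)/2) * sum(ln(1 - lambda_i)), where the lambda_i are the eigenvalues (squared canonical correlations) and p, q are the sizes of the two variable sets. The eigenvalues and sample sizes below are made up.

```python
import math

def bartlett_chi2(n, p, q, eigenvalues):
    # Assumed form of Bartlett's statistic; eigenvalues must lie in [0, 1).
    scale = n - 1 - (p + q + 1) / 2
    return -scale * sum(math.log(1 - lam) for lam in eigenvalues)

eigenvalues = [0.30, 0.05]  # hypothetical squared canonical correlations
chi2_small_n = bartlett_chi2(30, 2, 2, eigenvalues)
chi2_large_n = bartlett_chi2(300, 2, 2, eigenvalues)
print(chi2_small_n, chi2_large_n)
```

The degrees of freedom (p * q under this form) do not depend on n, so the larger statistic at n = 300 gives a smaller p-value for the same eigenvalues.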

10
Q

Give one specific way that multicollinearity affects the various calculations we perform

A

Some potential answers:

Some variables may be overemphasized, and thus contribute more to our analysis than others.

Solutions to maximization problems are not unique up to scaling, so solutions may become unstable.

Matrices may become singular (non-invertible) or nearly singular which may prevent us from finding inverse matrices, or make solutions more sensitive to inputs.
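The last point can be illustrated with a 2x2 correlation matrix [[1, r], [r, 1]], whose determinant is 1 - r^2. This is a minimal sketch with made-up right-hand sides; `solve_2x2` and `shift` are hypothetical helpers. As r approaches 1 a tiny change in the input produces a large change in the solution, and at r = 1 the system cannot be solved at all.

```python
def solve_2x2(r, b1, b2):
    # Solve [[1, r], [r, 1]] x = [b1, b2] by Cramer's rule.
    det = 1 - r * r
    if det == 0:
        raise ZeroDivisionError("matrix is singular")
    return ((b1 - r * b2) / det, (b2 - r * b1) / det)

def shift(r):
    # Largest change in the solution when b2 moves from 1.00 to 1.01.
    before = solve_2x2(r, 1.0, 1.0)
    after = solve_2x2(r, 1.0, 1.01)
    return max(abs(x - y) for x, y in zip(before, after))

shift_weak = shift(0.1)      # nearly uncorrelated variables
shift_strong = shift(0.999)  # nearly collinear variables
print(shift_weak, shift_strong)
```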
