Trees - cluster analysis Flashcards

1
Q

What is the earliest quantitative method of tree construction?

A

cluster analysis

2
Q

What does cluster analysis look at?

A

overall similarity: how much alike the taxa are

3
Q

What is the assumption behind cluster analysis?

A

species that share a more recent common ancestor should be more similar to each other than to any other species

4
Q

How do you do a cluster analysis from a character matrix?

A

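you have to convert it to a similarity matrix or a dissimilarity (distance) matrix. Common distances are the Hamming distance (the number of characters at which two taxa differ) and the p-distance (the proportion of characters at which they differ)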
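
The conversion can be sketched in a few lines of Python. This is only an illustration: the character matrix, the taxon names, and the helper names (`hamming`, `p_distance`, `distance_matrix`) are made up for the example and do not come from the cards.

```python
from itertools import combinations

# Hypothetical character matrix: one row per taxon, one column per character
# (0 and 1 are the two states of each character). Values are invented.
characters = {
    "A": [1, 0, 0, 1, 1],
    "B": [1, 0, 1, 1, 1],
    "C": [0, 1, 1, 0, 1],
}

def hamming(x, y):
    """Number of characters at which two taxa differ."""
    return sum(a != b for a, b in zip(x, y))

def p_distance(x, y):
    """Proportion of characters at which two taxa differ."""
    return hamming(x, y) / len(x)

def distance_matrix(chars, metric):
    """Pairwise distances as a dict keyed by (taxon, taxon)."""
    return {
        (s, t): metric(chars[s], chars[t])
        for s, t in combinations(sorted(chars), 2)
    }

hamming_d = distance_matrix(characters, hamming)
p_d = distance_matrix(characters, p_distance)
print(hamming_d)
print(p_d)
```

Note that the distance matrix keeps only how many characters differ, not which ones, which is the loss of information that later cards mention.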

4
Q

What are some methods of performing a cluster analysis?

A
  • least squares method
  • NJ (neighbor joining)
  • UPGMA (unweighted pair group method with arithmetic mean)
5
Q

What criticism do cladists make of using a distance matrix for cluster analysis?

A
  • there is a loss of information: no distinction is made between shared derived characters (synapomorphies) and shared primitive characters (symplesiomorphies)
6
Q

What is the mean character difference used for cluster analysis also called?

A
  • Manhattan distance (city-block distance, from taxicab geometry): the sum of the absolute character differences, divided by the number of characters to give the mean. The same differences also give the Euclidean distance, which is the hypotenuse: sqrt(character 1 difference squared + the other character differences squared)
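
A small sketch makes the difference between the two measures concrete. The two taxa and their character values below are invented for illustration, and the function names are assumptions rather than anything defined on the card.

```python
import math

# Hypothetical measurements for two taxa on three characters.
x = [1.0, 4.0, 2.0]
y = [4.0, 0.0, 2.0]

def manhattan(a, b):
    """City-block distance: sum of absolute character differences."""
    return sum(abs(p - q) for p, q in zip(a, b))

def mean_character_difference(a, b):
    """Manhattan distance averaged over the number of characters."""
    return manhattan(a, b) / len(a)

def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

print(manhattan(x, y))                  # 3 + 4 + 0 = 7.0
print(mean_character_difference(x, y))  # 7 / 3
print(euclidean(x, y))                  # sqrt(9 + 16 + 0) = 5.0
```

Here the character differences are 3, 4 and 0, so the taxicab route has length 7 while the hypotenuse has length 5.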
7
Q

To whom is the UPGMA method for clustering usually attributed?

A

Sokal and Michener

8
Q

What is a major problem with the UPGMA method?

A
  • it assumes that all lineages evolve at the same rate (a molecular clock), which is often not true, so it does not account for unequal divergence rates
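
The problem can be seen with a minimal UPGMA sketch. Everything in it is an assumption made for illustration: the function name `upgma`, the nested-tuple tree format, and the distances, which are path lengths on a hypothetical unrooted tree ((A,B),(C,D)) where B's branch is four times longer than every other branch.

```python
from itertools import combinations

def upgma(taxa, dist):
    """Return a nested-tuple tree built by UPGMA.

    taxa: list of taxon names.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    """
    sizes = {t: 1 for t in taxa}  # cluster -> number of leaves it contains
    d = dict(dist)                # working copy of cluster-to-cluster distances
    while len(sizes) > 1:
        # Join the two clusters with the smallest average distance.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        for c in sizes:
            if c not in merged:
                # Size-weighted mean equals the plain mean over all leaf pairs.
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))

# Hypothetical distances: true grouping is A with B, and C with D,
# but B has evolved faster (branch length 4 versus 1 elsewhere).
taxa = ["A", "B", "C", "D"]
dist = {
    frozenset("AB"): 5, frozenset("AC"): 3, frozenset("AD"): 3,
    frozenset("BC"): 6, frozenset("BD"): 6, frozenset("CD"): 2,
}

tree = upgma(taxa, dist)
print(tree)
```

With these numbers UPGMA joins C and D, then attaches A to them, and leaves the fast-evolving B on its own, so A and B are not grouped even though they are each other's closest relatives on the hypothetical tree.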
9
Q

Which clustering algorithms, unlike UPGMA, try to compensate for unequal divergence rates?

A
  • least squares methods: the best tree is the one that minimizes the sum of the squared differences between the observed distances Dij and the distances dij predicted by the tree
  • neighbor joining (Saitou and Nei): this works by clustering but does not assume a clock, and it generally performs better than UPGMA when rates are unequal
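
As a rough illustration of how neighbor joining avoids the clock assumption, the sketch below computes only its first step: the criterion Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j), where r(i) is the sum of the distances from taxon i to all other taxa. The function name and the distances are assumptions; the distances are path lengths on a hypothetical tree ((A,B),(C,D)) in which B's branch is four times longer than the others.

```python
from itertools import combinations

def nj_q_matrix(taxa, dist):
    """Neighbor-joining criterion for every pair of taxa.

    Q(i, j) = (n - 2) * d(i, j) - r(i) - r(j), with r(i) the sum of the
    distances from i to all other taxa. The pair with the lowest Q is joined.
    """
    n = len(taxa)
    r = {i: sum(dist[frozenset((i, j))] for j in taxa if j != i) for i in taxa}
    return {
        (i, j): (n - 2) * dist[frozenset((i, j))] - r[i] - r[j]
        for i, j in combinations(taxa, 2)
    }

# Hypothetical distances with one fast-evolving lineage (B).
taxa = ["A", "B", "C", "D"]
dist = {
    frozenset("AB"): 5, frozenset("AC"): 3, frozenset("AD"): 3,
    frozenset("BC"): 6, frozenset("BD"): 6, frozenset("CD"): 2,
}

q = nj_q_matrix(taxa, dist)
first_join = min(q, key=q.get)
print(q)
print(first_join)
```

Subtracting r(i) and r(j) corrects for taxa that are far from everything, so the pairs (A, B) and (C, D) tie for the lowest Q even though A and B are not the closest pair in raw distance. A full implementation would repeat this step on a reduced matrix and also estimate branch lengths.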
10
Q

Describe the least squares method.

A

this is a distance method in which the best tree is the one that minimizes the sum of the squared differences between the observed distances Dij (for example, Euclidean distances) and the distances dij predicted by the tree
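
The criterion itself is short enough to write out. The sketch below assumes the unweighted form, S = sum over pairs of (Dij - dij)^2, and scores two hypothetical candidate trees against invented observed distances; the names `least_squares_score`, `tree_1` and `tree_2` are illustrative only.

```python
def least_squares_score(observed, predicted):
    """Sum over taxon pairs of (D_ij - d_ij) ** 2."""
    return sum((observed[pair] - predicted[pair]) ** 2 for pair in observed)

# Hypothetical observed distances D_ij between four taxa.
observed = {
    ("A", "B"): 5.0, ("A", "C"): 3.0, ("A", "D"): 3.0,
    ("B", "C"): 6.0, ("B", "D"): 6.0, ("C", "D"): 2.0,
}

# Path-length distances d_ij predicted by two hypothetical candidate trees.
# tree_1 groups A with B and C with D; tree_2 is a clock-like tree that
# places B outside all the other taxa.
tree_1 = dict(observed)
tree_2 = {
    ("A", "B"): 6.0, ("A", "C"): 3.0, ("A", "D"): 3.0,
    ("B", "C"): 6.0, ("B", "D"): 6.0, ("C", "D"): 2.0,
}

score_1 = least_squares_score(observed, tree_1)
score_2 = least_squares_score(observed, tree_2)
print(score_1, score_2)
```

The tree with the lower score is preferred, here `tree_1`. In practice the search also has to fit branch lengths for each candidate topology, which this sketch skips by supplying the predicted distances directly.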

11
Q

What are the main criticisms of distance-based approaches?

A

some information about the data may be lost in the conversions from character matrix to distance matrix and then to a tree, and the assumption of equal rates (made by UPGMA) is questionable

12
Q

What is the main advantage of distance-based approaches?

A

they are fast, and some methods, such as UPGMA and NJ, give a single tree as the answer
