Bioinformatics 4 Flashcards

Question 1

Q

What are the benefits of predicting the protein fold?

Answer

A

Predicting the fold benefits medicine (drug design) and biotechnology (design of novel enzymes).

Question 2

Q

What assesses these programs for protein fold prediction and how?

Answer

A

CASP (Critical Assessment of Techniques for Protein Structure Prediction).
Each program is given the sequence of a protein whose structure has been solved experimentally but not yet published, and its prediction is then compared with the real structure.

Question 3

Q

How are the structures often predicted?

Answer

A

Through homology - similar sequences tend to fold in similar ways.

Question 4

Q

BLAST can only identify homologues with >40% identity. What other programs can be used to find homology?

Answer

A

PSI-BLAST and hidden Markov models (HMMs).
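
The 40% figure refers to percent identity between aligned sequences. A minimal sketch of how that number can be computed, assuming two sequences that are already aligned to the same length with "-" marking gaps. The function name and the example sequences are illustrative, and conventions for the denominator vary between tools.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns (one common convention, assumed here)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned fragments: 7 identical residues out of 9 compared columns.
identity = percent_identity("MKT-AYIAKQ", "MKTLAYLAKE")
```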

Question 5

Q

What program was developed by the Sternberg group at Imperial?

Answer

A

Phyre.

Question 6

Q

How does Phyre work?

Answer

A

Phyre first searches the 10 million known sequences for homologues of the query using PSI-BLAST. This captures the mutational changes at each position in the protein and creates an evolutionary fingerprint.
The sequence of every known protein structure (65,000) is also run through PSI-BLAST, and an HMM is created for each, giving an HMM database of all known structures.
Finally, an HMM is created for the query sequence from its PSI-BLAST results and compared with the HMM database of known structures. When a good match is found, a 3D model is produced along with a confidence value.
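
The "evolutionary fingerprint" can be pictured as a per-position profile of which residues appear in the aligned homologues. A toy sketch of that idea, assuming a small gap-free alignment. This is not Phyre's or PSI-BLAST's actual code; the function name and the sequences are made up for illustration.

```python
from collections import Counter


def position_profile(alignment: list[str]) -> list[dict[str, float]]:
    """Return, for each alignment column, the frequency of each residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        profile.append({res: n / len(column) for res, n in counts.items()})
    return profile


# Hypothetical homologues: position 0 and 3 are conserved, 1 and 2 vary.
homologues = ["MKVL", "MRVL", "MKIL", "MKVL"]
profile = position_profile(homologues)
```

A conserved position yields a single residue with frequency 1.0, while a variable position spreads the frequency over several residues.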

Question 7

Q

What is a phylogenetic tree?

Answer

A

A prediction of the ancestry of a protein.

Question 8

Q

What are the 3 main tree building algorithms?

Answer

A

Neighbour Joining
Maximum Parsimony
Maximum Likelihood

Question 9

Q

What do these trees identify?

Answer

A

Phylogenetic trees identify the closest related protein to the one you are working with.

Question 10

Q

What is the first step to building a tree? (common to all algorithms)

Answer

A

The first step to building a tree is to produce a multiple sequence alignment (MSA).

Question 11

Q

What are the 3 major categories of tree building methods and which algorithms do they include?

Answer

A

Distance based methods - neighbour joining.
Character based methods - Maximum Parsimony and Maximum Likelihood.
Bayesian - method similar to maximum likelihood.

Question 12

Q

How does a distance based method work?

Answer

A

Distance methods use an MSA to calculate pairwise distances, i.e. the number of changes between each pair of sequences in a group.
This creates a distance matrix which can be used to produce a phylogenetic tree.
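
A minimal sketch of the distance-matrix step, assuming a gap-free MSA and using the raw number of differing positions as the distance. The sequence names and sequences are made up; real tools usually apply a correction for multiple substitutions.

```python
def count_changes(seq_a: str, seq_b: str) -> int:
    """Number of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def distance_matrix(msa: dict[str, str]) -> tuple[list[str], list[list[int]]]:
    """Return the sequence names and the symmetric pairwise distance matrix."""
    names = list(msa)
    matrix = [[count_changes(msa[a], msa[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical three-sequence alignment.
msa = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACTA"}
names, matrix = distance_matrix(msa)
```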

Question 13

Q

What are the advantages of the Neighbour Joining method?

Answer

A

Fast and can handle many sequences.
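
Part of the speed comes from how simple each step is: Neighbour Joining picks one pair to join from the distance matrix and then repeats on a smaller matrix. The sketch below shows only that first pair selection, using the standard criterion Q(i, j) = (n - 2) d(i, j) - sum of row i - sum of row j. The function name and the distance values are illustrative assumptions, not taken from the cards.

```python
def first_pair_to_join(names: list[str], dist: list[list[float]]):
    """Return the pair with the lowest Q value and that Q value."""
    n = len(names)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:
                best_pair, best_q = (names[i], names[j]), q
    return best_pair, best_q


# Made-up symmetric distances between five sequences.
names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, q_value = first_pair_to_join(names, dist)
```

In this example the pair joined first is (a, b), even though d and e have the smallest raw distance, because Q also accounts for how far each sequence is from all the others.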

Question 14

Q

Neighbour Joining does not assume an ultrametric tree. What is this?

Answer

A

An ultrametric tree is a special kind of additive tree in which the "tips" (terminal nodes) are all equidistant from the root. Ultrametric trees can thus depict evolutionary time.
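
The equidistance property can be checked directly. A small sketch, assuming a tree written as nested lists of (subtree, branch length) pairs with leaf names as strings. This representation and the function names are chosen only for the example.

```python
def tip_depths(node, depth: float = 0.0) -> list[float]:
    """Collect the root-to-tip distance of every leaf under this node."""
    if isinstance(node, str):  # a leaf is just its name
        return [depth]
    depths = []
    for child, branch_length in node:
        depths.extend(tip_depths(child, depth + branch_length))
    return depths


def is_ultrametric(tree, tol: float = 1e-9) -> bool:
    """True if all tips are the same distance from the root."""
    depths = tip_depths(tree)
    return max(depths) - min(depths) <= tol


# Hypothetical trees: in the first, every tip is 3.0 from the root.
clock_like = [([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)]
uneven = [([("A", 1.0), ("B", 2.5)], 2.0), ("C", 3.0)]
```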

Question 15

Q

What are the limitations of Neighbour Joining?

Answer

A

It lacks any tree search or optimality criterion, so there is no guarantee that the tree produced is the best fit for the data.

Question 16

Q

Explain Maximum Parsimony method.

Answer

A

Builds the tree that requires the minimum number of mutations to explain the differences between the sequences.
It begins by performing an MSA and identifying the informative sites.
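
Counting the minimum number of mutations for a given tree can be sketched with Fitch's algorithm, a standard method that the cards do not name. The example assumes a rooted binary tree written as nested tuples of leaf names and scores a single alignment column. The trees, names and bases are made up.

```python
def fitch(node, site: dict[str, str]) -> tuple[set[str], int]:
    """Return (possible ancestral states, minimum changes) for one site."""
    if isinstance(node, str):  # leaf: its observed base, no changes yet
        return {site[node]}, 0
    left, right = node
    left_states, left_cost = fitch(left, site)
    right_states, right_cost = fitch(right, site)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: one change is required on this part of the tree.
    return left_states | right_states, left_cost + right_cost + 1


# One alignment column for four hypothetical sequences.
site = {"s1": "A", "s2": "A", "s3": "G", "s4": "G"}
tree_1 = (("s1", "s2"), ("s3", "s4"))
tree_2 = (("s1", "s3"), ("s2", "s4"))
cost_1 = fitch(tree_1, site)[1]
cost_2 = fitch(tree_2, site)[1]
```

Here the first tree needs one change and the second needs two, so parsimony prefers the first tree for this site.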

Question 17

Q

What is an informative site?

Answer

A

An informative site is one where there are at least two different kinds of nucleotides at the site, each of which is represented in at least two of the sequences under study.
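
The definition translates directly into a check on each alignment column. A minimal sketch, assuming a gap-free nucleotide alignment. The function names and sequences are illustrative.

```python
from collections import Counter


def is_informative(column) -> bool:
    """At least two different nucleotides, each seen in two or more sequences."""
    counts = Counter(column)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_sites(alignment: list[str]) -> list[int]:
    """Return the 0-based indices of the informative columns."""
    return [i for i, col in enumerate(zip(*alignment)) if is_informative(col)]


# Hypothetical alignment of four sequences, five columns.
alignment = ["AAGAA", "AAGCA", "AGCGA", "AGCTG"]
sites = informative_sites(alignment)
```

Column 0 is invariant, column 3 has four different bases that each appear once, and column 4 has a base that appears only once, so only columns 1 and 2 qualify.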

Question 18

Q

Explain the Maximum Likelihood method.

Answer

A

Evaluates possible trees as Maximum Parsimony does, but also uses a model of evolution in which different rates of mutation can be applied.
For example, GAU -> UGU is in fact two changes, not one; the method uses this kind of prior knowledge.
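
The GAU to UGU example can be checked by comparing the two codons position by position. The sketch below also labels each change as a transition or a transversion, as an assumed illustration of the kind of distinction a model of evolution can assign different rates to. The function name is hypothetical.

```python
PURINES = {"A", "G"}  # the remaining bases (C, U, T) are pyrimidines


def classify_changes(before: str, after: str) -> list[tuple[int, str, str, str]]:
    """List each differing position with its type of substitution."""
    if len(before) != len(after):
        raise ValueError("codons must have the same length")
    changes = []
    for pos, (x, y) in enumerate(zip(before, after)):
        if x == y:
            continue
        # Same chemical class (purine or pyrimidine) means a transition.
        same_class = (x in PURINES) == (y in PURINES)
        kind = "transition" if same_class else "transversion"
        changes.append((pos, x, y, kind))
    return changes


changes = classify_changes("GAU", "UGU")
```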