Bioinformatics 3 Flashcards Preview

Flashcards in Bioinformatics 3 Deck (19):
1

How do hidden Markov models (HMMs) expand on scoring matrices (BLOSUM) and profiles (PSSMs)?

They include Markov chains: the amino acid at one position can influence what comes next, so the probability of the next amino acid can be calculated from the previous one.
They also model gaps (insertions and deletions).
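
To make the Markov chain idea concrete, here is a minimal Python sketch of a first-order chain, in which the probability of each residue depends only on the one before it. The two-letter alphabet and all probabilities are made-up toy values, not real amino acid statistics, and the function name `chain_log_prob` is just illustrative.

```python
import math

# Hypothetical toy parameters (assumption: NOT real amino acid statistics).
initial = {"A": 0.6, "G": 0.4}          # probability of the first residue
transition = {                           # P(next residue | previous residue)
    "A": {"A": 0.7, "G": 0.3},
    "G": {"A": 0.2, "G": 0.8},
}

def chain_log_prob(seq, initial, transition):
    """Log-probability of seq under a first-order Markov chain."""
    log_p = math.log(initial[seq[0]])
    # Each residue is scored given the residue immediately before it.
    for prev, cur in zip(seq, seq[1:]):
        log_p += math.log(transition[prev][cur])
    return log_p

# P("AAG") = P(A) * P(A|A) * P(G|A) = 0.6 * 0.7 * 0.3
prob_aag = math.exp(chain_log_prob("AAG", initial, transition))
```

Summing logarithms is equivalent to multiplying the probabilities, but avoids numerical underflow on long sequences.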

2

How do you create an HMM?

Build an MSA of homologous sequences (e.g. with Clustal Omega).
Each alignment position has a probability of a match, mismatch, deletion or insertion.
The HMM is then built by traversing the alignment and calculating the probability of each possible transition between alignment positions.
Each possible transition has a probability score.
The overall score for a sequence is calculated by multiplying the transition scores together.
The overall score is then converted to an E-value.
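
As a rough illustration of estimating transition probabilities by traversing an alignment, the sketch below labels each column of a toy MSA as a match or insert column, turns every sequence into a path of M (match), D (delete) and I (insert) states, and counts the transitions. It is a simplification under stated assumptions: the alignment, the 50% gap threshold and the function names are hypothetical, counts are pooled over all positions, and no pseudocounts are used, whereas a real profile HMM keeps separate probabilities for every position. It is not how HMMER is implemented.

```python
from collections import Counter

def state_path(aligned_seq, match_cols):
    """Map one aligned sequence to a path of M, D and I states."""
    path = []
    for ch, is_match in zip(aligned_seq, match_cols):
        if is_match:
            path.append("M" if ch != "-" else "D")  # residue or gap in a match column
        elif ch != "-":
            path.append("I")                        # residue in an insert column
    return path

def transition_probs(msa, gap_threshold=0.5):
    """Estimate pooled state-transition probabilities from a toy MSA."""
    n = len(msa)
    # Assumption: a column is a match column if under half of its cells are gaps.
    match_cols = [
        sum(seq[i] == "-" for seq in msa) / n < gap_threshold
        for i in range(len(msa[0]))
    ]
    counts = Counter()
    for seq in msa:
        path = state_path(seq, match_cols)
        counts.update(zip(path, path[1:]))          # count consecutive state pairs
    totals = Counter()
    for (src, _), c in counts.items():
        totals[src] += c
    # Normalise so the probabilities leaving each state sum to 1.
    return {pair: c / totals[pair[0]] for pair, c in counts.items()}

# Hypothetical toy alignment, for illustration only.
msa = ["AC-GT", "ACAGT", "A--GT", "AC-GT"]
probs = transition_probs(msa)
```

In this toy alignment most transitions are match to match, so `probs[("M", "M")]` dominates, while the single insertion and the single deletion each contribute one transition out of a match state.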

3

What program can be used to create and/or search HMM databases?

HMMER.

4

Which experimental methods can be used for structure determination?

X-ray crystallography
NMR
Cryo-EM.

5

What is the aim of secondary structure prediction?

To identify local structure: alpha helix, beta sheet and random coil.

6

What is the name given to 3 state prediction (alpha, beta, coil)?

Q3

7

Why is it possible to predict structure from sequence?

Because, in theory, the local sequence determines the local structure to a large extent.

8

Name a secondary structure prediction program.

Jpred

9

How is the accuracy (Q3) of secondary structure prediction calculated?

Accuracy = (no. of residues correctly predicted)/(total no. of residues)
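
The formula above can be sketched in a few lines of Python. The state strings use H (helix), E (strand) and C (coil); the example sequences and the function name `q3` are made up for illustration.

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)

# Hypothetical example: 8 of 10 residues are predicted correctly.
observed = "HHHHEEEECC"
predicted = "HHHCEEECCC"
score = q3(predicted, observed)
```

Here `score` is 0.8, because two of the ten residues (positions 4 and 8) are assigned the wrong state.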

10

What are the parameters of Q3?

Q3 takes a value between 0 and 1.

11

What does Q3=1 indicate?

A perfect prediction.

12

What does the Q3 result of a random prediction depend on?

The percentage of the different states.
e.g. with equal amounts of each state, a random prediction gives Q3 = 0.33 (33%).
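
A small sketch of why the random baseline depends on state composition: if a random predictor picks each state independently of the sequence, the expected Q3 is the sum, over states, of (true frequency of the state) x (probability of guessing it). The skewed frequencies below are hypothetical, and `expected_random_q3` is an illustrative name.

```python
def expected_random_q3(true_freqs, guess_freqs):
    """Expected Q3 when states are guessed at random, independent of the sequence."""
    return sum(true_freqs[s] * guess_freqs.get(s, 0.0) for s in true_freqs)

# Equal amounts of helix (H), strand (E) and coil (C).
equal = {"H": 1 / 3, "E": 1 / 3, "C": 1 / 3}
baseline_equal = expected_random_q3(equal, equal)

# Hypothetical skewed composition, guessed in proportion to the true frequencies.
skewed = {"H": 0.3, "E": 0.2, "C": 0.5}
baseline_skewed = expected_random_q3(skewed, skewed)

# Always guessing the most common state of the skewed composition.
always_coil = {"H": 0.0, "E": 0.0, "C": 1.0}
baseline_coil = expected_random_q3(skewed, always_coil)
```

With equal amounts the baseline is 1/3, whereas for the skewed composition it rises to 0.38 for proportional guessing and 0.5 for always guessing coil.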

13

Why is Q3=1 unrealistic?

Because secondary structure assignment in protein structures is uncertain by up to about 10%, so the best achievable Q3 in practice is about 0.9.

14

What is the similarity between Jpred software and SNP prediction software?

Both use trained algorithms.
Jpred - trained on sequences with known structures.
SNP prediction software - trained on sequences with known SNPs.

15

What does the Jpred algorithm use to predict secondary structure?

It uses PSI-BLAST, an MSA and an HMM.

16

How is the Jpred algorithm refined?

Known sequences are analysed multiple times, and the algorithm is modified each time to find the best prediction method.

17

Explain the Jpred algorithm.

The query is searched against the UniProt database with PSI-BLAST for 3 iterations.
The resulting alignment generates the following parameters:
PSI-BLAST profile frequency
PSI-BLAST PSSM
MSA - scoring using BLOSUM62
HMM from the aligned sequences.

18

The alignment produced by PSI-BLAST in the Jpred algorithm is modified by post-processing. What does this mean?

The gaps in the query and aligned sequences are removed.
This improves prediction accuracy because regions with gaps in the query sequence are most likely in the coil state, and this state has no effect on the prediction.
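
One way to picture this post-processing step is the sketch below, which drops every alignment column where the query has a gap so that the alignment has exactly one column per query residue. This is an assumed reading of the step for illustration only, not Jpred's actual code; the sequences and the function name are hypothetical.

```python
def drop_query_gap_columns(query, aligned, gap="-"):
    """Remove every alignment column in which the query sequence has a gap."""
    keep = [i for i, ch in enumerate(query) if ch != gap]
    trimmed_query = "".join(query[i] for i in keep)
    trimmed_aligned = ["".join(seq[i] for i in keep) for seq in aligned]
    return trimmed_query, trimmed_aligned

# Hypothetical aligned query and hits (all the same length).
query = "MK-LV-A"
hits = ["MKQLVSA", "MR-LI-A"]
trimmed_query, trimmed_hits = drop_query_gap_columns(query, hits)
```

After trimming, `trimmed_query` is `"MKLVA"` and every hit has the same five columns, so each remaining column corresponds to one residue of the query.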

19

What are solvent accessibility predictions?

Predictions of the extent to which the van der Waals surface of each amino acid residue is exposed to the solvent surrounding the protein.