Protein structure prediction Flashcards

Question

What is another template-based modelling program?

Answer 1

1. Split the sequence into fragments.
2. Predict the possible structures for each fragment.
3. Take trial structures for each local sequence from a database of segments of known 3D structure.
4. Put the fragments together and check whether the result makes sense.
5. Make changes and check whether each change gives a better structure, then either keep it or discard it.
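The steps above can be sketched as a toy fragment-assembly loop. This is a minimal illustration, not how any particular program works: the conformation is reduced to a list of (phi, psi) backbone angles, and `assemble`, `toy_score`, the fragment library layout and the "accept only if better" rule are all assumptions made for the example. Real fragment-assembly programs use physically motivated energy functions and usually a Monte Carlo acceptance rule rather than a strictly greedy one.

```python
import random

HELIX = (-60.0, -45.0)    # idealised alpha-helix (phi, psi), in degrees
STRAND = (-120.0, 120.0)  # idealised extended strand (phi, psi), in degrees


def toy_score(conformation):
    """Hypothetical energy: lower is better. Here it simply rewards helix-like angles."""
    return sum(abs(phi - HELIX[0]) + abs(psi - HELIX[1]) for phi, psi in conformation)


def assemble(length, fragment_library, score, n_steps=500, seed=0):
    """Greedy fragment assembly.

    fragment_library maps a start position to a list of candidate fragments,
    each fragment being a list of (phi, psi) tuples taken from known structures.
    """
    rng = random.Random(seed)
    conformation = [STRAND] * length           # start from an extended chain
    best = score(conformation)
    starts = list(fragment_library)
    for _ in range(n_steps):
        start = rng.choice(starts)             # pick a position to change
        fragment = rng.choice(fragment_library[start])
        fragment = fragment[: length - start]  # never run past the chain end
        trial = list(conformation)
        trial[start:start + len(fragment)] = fragment
        trial_score = score(trial)
        if trial_score < best:                 # keep the change only if it helps
            conformation, best = trial, trial_score
    return conformation, best


# Toy library: every window of three residues can be a helix or a strand.
library = {s: [[HELIX] * 3, [STRAND] * 3] for s in range(7)}
model, energy = assemble(9, library, toy_score)
print(energy)
```

The design choice to copy the conformation before inserting a fragment makes the "discard it" branch trivial: a rejected trial is simply dropped.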

Answer 2

* Fragment-based methods sometimes give reasonable predictions but sometimes fail
* Can be integrated with template methods to fill gaps or uncertain regions
* I-TASSER (Zhang) and Robetta (Baker) were widely used
* Now superseded by deep learning, e.g. AlphaFold

Answer 3

Residues that interact with each other tend to evolve together (coevolution), so coevolution gives you some information about the structure.
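One simple way to see a coevolution signal is to measure the mutual information between two columns of a multiple sequence alignment. The sketch below is only an illustration under that assumption: the function name and the toy alignment are made up, and practical contact predictors use more elaborate statistics that correct for indirect couplings and shared ancestry.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)              # residue counts in column i
    col_j = Counter(seq[j] for seq in msa)              # residue counts in column j
    pairs = Counter((seq[i], seq[j]) for seq in msa)    # joint counts
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Toy alignment: columns 1 and 3 change together (K with E, R with D),
# while column 0 is fully conserved and carries no covariation signal.
toy_msa = ["AKLE", "AKLE", "ARLD", "ARLD"]
print(mutual_information(toy_msa, 1, 3))
print(mutual_information(toy_msa, 0, 1))
```

A high value for a pair of columns suggests the two positions may be in contact; a conserved column always scores zero because it never varies.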

Answer 4

* The input is a multiple sequence alignment (MSA) of the query sequence
* In addition, known PDB structures provide structural data known as “templates”
* Learning is in two stages, called the Evoformer and the structure network
* The first stage, the Evoformer, learns features including residue-residue contacts at different distances (distograms)
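To make the idea of a distogram concrete, the sketch below bins the pairwise distances of a known structure into distance ranges. It only shows the discretisation step: a predicted distogram is a probability distribution over such bins for every residue pair. The function name, the coordinates and the bin edges are illustrative assumptions, not the values used by AlphaFold.

```python
import bisect
import math


def distance_bins(coords, edges):
    """Return an N x N matrix of distance-bin indices.

    coords: one (x, y, z) point per residue, in angstroms.
    edges:  ascending bin boundaries in angstroms. Bin 0 holds distances
            below edges[0]; the last bin holds everything from edges[-1] up.
    """
    n = len(coords)
    bins = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(coords[i], coords[j])
            bins[i][j] = bisect.bisect_right(edges, d)
    return bins


# Four residues on a line; the last one is far from the others.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_edges = [4.0, 8.0, 12.0]  # hypothetical boundaries
for row in distance_bins(toy_coords, toy_edges):
    print(row)
```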

Answer 5

* The second stage of learning is the “structure” network
* Each residue is treated as an independent unit (termed a “gas”); the residues are not linked together
* The positions of the main-chain residues are then predicted
* Then the side-chains are fitted
* The learning is termed “end-to-end”: the function optimised (the “loss function”) is the difference between the final model and the true structure, and all steps are learnt together
* The algorithm also predicts the expected accuracy of each part of the model

Answer 6

* Finally, the structure is refined using molecular dynamics with Amber. This did not improve the model in terms of RMSD but did correct some local stereochemistry.
* AlphaFold does not distinguish between template-based and ab initio approaches.
* AlphaFold does use the information from homologous structures, but this happens within the deep learning.
* AlphaFold does not use the Phyre/SwissModel approach of taking a known template as the starting point.

Answer 7

Per-residue confidence metric pLDDT (colour coded on EBI models) on a scale of 0 to 100
* pLDDT stands for predicted Local Distance Difference Test
* LDDT measures local agreement between two protein structures
* pLDDT > 90: expected to be modelled to high accuracy
* pLDDT between 70 and 90: expected to be modelled well (a generally good backbone prediction)
* pLDDT between 50 and 70: low confidence, should be treated with caution
* pLDDT < 50: often shown with a ribbon-like appearance and should not be interpreted; often disordered regions
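The bands above translate directly into a small lookup. This is a sketch for revision purposes: the function name and the label strings are made up for the example, and how a score falling exactly on a boundary (90, 70 or 50) is assigned is an assumption, since the card does not say.

```python
def plddt_band(plddt):
    """Map a per-residue pLDDT score (0 to 100) to the confidence bands on the card."""
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT must lie between 0 and 100")
    if plddt > 90:
        return "high accuracy"
    if plddt >= 70:   # assumption: exactly 90 falls in this band
        return "modelled well"
    if plddt >= 50:   # assumption: exactly 70 falls in the band above
        return "low confidence"
    return "do not interpret"


# Hypothetical scores for five residues of a model.
scores = [96.2, 88.0, 71.5, 55.3, 31.0]
print([plddt_band(s) for s in scores])
```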

Answer 8

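PAE: Predicted Alignment Error
* Expected error in the position of one residue when the model is aligned on another, i.e. how well the relative position of two residues is predicted
* Used to assess confidence in domain packing
* Colour coded
* Regions with high PAE (low confidence) can be totally misplaced relative to one another
* In the example shown with this card, the extracellular and intracellular regions pack together, which is biologically impossible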
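A common way to use PAE for judging domain packing is to average it over all residue pairs that span two domains. The sketch below assumes the PAE matrix is already loaded as a square list of lists in angstroms; the function name, the domain boundaries and the toy values are hypothetical. Both directions are averaged because the PAE matrix is not symmetric.

```python
def mean_interdomain_pae(pae, domain_a, domain_b):
    """Mean PAE (angstroms) over all residue pairs spanning two domains.

    pae:      square matrix; pae[i][j] is the expected error in the position
              of residue j when the model is aligned on residue i.
    domain_a: residue indices of the first domain.
    domain_b: residue indices of the second domain.
    """
    values = [pae[i][j] for i in domain_a for j in domain_b]
    values += [pae[j][i] for i in domain_a for j in domain_b]
    return sum(values) / len(values)


# Toy 4-residue model: residues 0-1 form one domain, residues 2-3 another.
# Low values within each domain, high values between them.
toy_pae = [
    [0.0, 2.0, 25.0, 25.0],
    [2.0, 0.0, 25.0, 25.0],
    [25.0, 25.0, 0.0, 2.0],
    [25.0, 25.0, 2.0, 0.0],
]
print(mean_interdomain_pae(toy_pae, [0, 1], [2, 3]))  # between the domains
print(mean_interdomain_pae(toy_pae, [0], [1]))        # within the first domain
```

A large between-domain value alongside small within-domain values suggests each domain may be modelled well on its own while their relative placement should not be trusted.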

Answer 9

* Models for >200M proteins. Amazing resource!
* Models for 98.5% of human proteins
* But only ~58% of residues in the human proteome are predicted with high confidence
* Compare PDB + Phyre, which covers ~53% of residues

(35 cards)