Lecture 4: Protein Sequence and Structure Determination Flashcards Preview


Flashcards in Lecture 4: Protein Sequence and Structure Determination Deck (43):

Uses of determining amino acid sequence in a protein?

- compare with all other known sequences (including DNA) to determine whether similarities exist

- sequence comparison of the same protein in different species can yield clues about evolutionary pathways

- sequence comparison of the same protein within one species (i.e., humans) can reveal the molecular mechanisms of (genetic) disease

* amino acids conserved between very different species suggest that those residues (and the pathway they serve) are very significant


HCl - Determination of amino acid composition


- hydrolyze polypeptide in HCl at 110 °C for 24 hours

- amino acids can then be separated by ion exchange chromatography

- ninhydrin can be used to ID amino acids

- the acid hydrolysis converts asparagine and glutamine to aspartic acid and glutamic acid


What does ninhydrin do?


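- reacts with the free amino group of amino acids to form a colored product, so the separated amino acids can be detected and quantified

- the conversion of asparagine and glutamine to aspartic acid and glutamic acid is caused by the acid hydrolysis, not by ninhydrin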


Elution profile of amino acids



- peak sizes reflect the relative amounts of each amino acid

- double height means twice as much of that amino acid compared to another


- if asparagine is present, acid hydrolysis degrades it into aspartic acid and NH3 (see those peaks)


Purpose of cleavage and reduction of polypeptides


- accuracy of amino acid sequencing generally declines as the length of the polypeptide increases

- the polypeptide must be enzymatically (proteases) or chemically fragmented to be sequenced efficiently

- if disulfide bridges are present, they must be broken (reduced) and the resulting cysteine sulfhydryl groups modified so that they cannot re-form disulfide bonds [add reductant and then modify sulfhydryl groups]


What is Edman degradation?

- sequentially removing one residue at a time from the amino end of a peptide (fragment)



Steps of Edman degradation


1. phenyl isothiocyanate reacts with the uncharged terminal amino group of the peptide to form a phenylthiocarbamoyl derivative

2. the cyclic form of the derivative is liberated and can be identified by chromatographic methods

3. in multiple labeling-release rounds, the amino acid sequence of a peptide can be determined


* keeps attacking the N-terminus over and over to remove successive amino acids
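
The repeated labeling-release rounds can be sketched in a few lines of Python. This is only an illustration: the function name `edman_cycles` and the 98% per-cycle efficiency are assumptions made for the example, not values from the lecture. It also shows why long peptides sequence poorly, since the fraction of chains still "in register" shrinks with every round.

```python
def edman_cycles(peptide, efficiency=0.98):
    """Yield (cycle, released residue, fraction of chains still in register).

    `efficiency` is a hypothetical per-cycle yield, chosen only for illustration.
    """
    remaining = peptide
    in_register = 1.0
    cycle = 0
    while remaining:
        cycle += 1
        in_register *= efficiency                 # some chains fail to react each round
        residue, remaining = remaining[0], remaining[1:]   # remove the N-terminal residue
        yield cycle, residue, in_register


# Read a short hypothetical peptide from its amino end.
sequence_read = "".join(res for _, res, _ in edman_cycles("AGKTFRLMKWE"))
print(sequence_read)

# Fraction of chains still in register after 60 rounds at the assumed efficiency.
print(round(0.98 ** 60, 2))
```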


Overlapping and Edman degradation


- cleave the same polypeptide chain in two different ways to get two different sets of segments

- arrange the segments so the two sets overlap

- this way you can tell how to connect the segments
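
A minimal sketch of the overlap idea, assuming a hypothetical 11-residue peptide cut once with trypsin (after K or R) and once with chymotrypsin (taken here to cut only after F, W or Y). The helper names `overlap` and `assemble` are invented for this example, and the greedy merge is just one simple way to do it. Real overlaps are usually longer than the one- or two-residue overlaps in this toy case.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments):
    """Greedily merge fragments by their largest suffix/prefix overlap."""
    # Fragments fully contained in a fragment from the other digest add no new ends.
    frags = sorted(f for f in set(fragments)
                   if not any(f != g and f in g for g in fragments))
    while len(frags) > 1:
        k, a, b = max(((overlap(a, b), a, b)
                       for a in frags for b in frags if a != b),
                      key=lambda item: item[0])
        if k == 0:
            break  # no overlap left, so the order cannot be decided
        frags.remove(a)
        frags.remove(b)
        frags.append(a + b[k:])
    return frags


tryptic = ["AGK", "TFR", "LMK", "WE"]        # hypothetical trypsin fragments
chymotryptic = ["AGKTF", "RLMKW", "E"]       # hypothetical chymotrypsin fragments
print(assemble(tryptic + chymotryptic))
```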


What is mass spectrometry?


- analytical technique that measures the mass to charge ratio (m/z) of charged particles in a gas phase



Steps of mass spectrometry


1. molecules to be analyzed (analyte) are first ionized in a vacuum

2. when the newly charged molecules are introduced into an electric and/or magnetic field, their paths through the field are a function of their m/z ratio (mass/charge)

3. the measured property of the ionized species can be used to deduce the mass (M) of the analyte with high precision


Three essential components of mass spectrometer


- ion source

- mass analyzer

- detector



Conversion of macromolecules into gas phase ions required?

conversion of macromolecular analytes such as proteins and nucleic acids into gas-phase ions (ionization) could not be achieved efficiently until the development of

--> electrospray ionization mass spectrometry (ESI MS)

--> matrix assisted laser desorption/ionization mass spectrometry (MALDI MS)


Mass to charge ratio of ions and molecular mass of proteins


- gas phase macromolecules acquire a variable number of protons and thus positive charges, from the solvent, which creates a spectrum of species with different mass to charge ratios

- each successive peak corresponds to a species that differs from that of its neighboring peak by a charge difference of 1 and a mass difference of 1.

- the molecular mass M can be determined from any two neighboring peaks


Mass Spectrometry Equations


compare two peaks

p1 = (M + z1) / z1

p2 = (M + z1 - 1) / (z1 - 1)

- p2 is the neighboring peak with one proton fewer: subtract 1 from both the mass and the charge

- adjacent peaks differ by 1 in protons and by 1 in mass


can solve for M and z1

1. M = z1 (p1 - 1)

2. z1 = (p2 - 1) /(p2 - p1)
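
As a quick check of these two formulas, here is a small sketch that recovers M and z1 from a pair of neighboring peaks. The function name and the example protein (mass 16950, carrying 10 and 9 protons) are made up for illustration, and the proton mass is taken as 1 as in the equations above.

```python
def mass_from_adjacent_peaks(p1, p2):
    """Return (M, z1) from two neighboring ESI peaks.

    p1 is the m/z of the peak with charge z1; p2 is the m/z of the
    neighboring peak with charge z1 - 1, so p2 must be larger than p1.
    """
    if p2 <= p1:
        raise ValueError("p2 must be the higher-m/z (lower-charge) peak")
    z1 = round((p2 - 1) / (p2 - p1))   # charges are whole numbers
    M = z1 * (p1 - 1)
    return M, z1


# Hypothetical peaks for a protein of mass 16950 with 10 and 9 protons.
p1 = (16950 + 10) / 10
p2 = (16950 + 9) / 9
M, z1 = mass_from_adjacent_peaks(p1, p2)
print(M, z1)
```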


ESI MS process


- analyte solution is passed through an electrically charged nozzle into a chamber of low pressure, evaporating the solvent and ultimately yielding the ionized analyte


How fast do different sized particles move in ESI MS


more mass per charge (higher m/z) --> slower

less mass per charge (lower m/z) --> faster, detected first


MALDI MS process


- analyte solution is evaporated to dryness in the presence of a volatile, aromatic compound (the matrix) that can absorb light at specific wavelengths

- laser pulse tuned to one of these wavelengths excites and vaporizes the matrix, converting some of the analyte into gas phase

- subsequent gaseous collisions enable the intermolecular transfer of charge, ionizing the analyte

* matrix is important to keep proteins from getting burned, makes sure they end up in gas form


MALDI TOF analyzer

1. protein sample is ionized

2. electrical field accelerates ions

3. lightest ions arrive at the detector first

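4. laser triggers a clock (the laser pulse that ionizes the sample starts the timer, so each ion's flight time can be measured)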


TOF and mass to charge ratio


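m_i/z_i = 2 e E l (t_i / l_d)^2

(e = elementary charge, E = accelerating field strength, l = length of the accelerating region, l_d = length of the drift region, t_i = flight time of ion i)


* mass to charge ratio is proportional to the square of the flight time

plug in time and get out the mass/charge ratio


--> first particles to arrive have the smallest mass/charge ratio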
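
The time-of-flight relation can be tried out numerically. The sketch below assumes the standard idealized picture: an ion of charge z gains kinetic energy z·e·E·l in the accelerating region and then drifts at constant speed over a distance l_d. The function names and the instrument dimensions are hypothetical, and the time spent inside the accelerating region is ignored.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge in coulombs
DALTON = 1.66053906660e-27     # kilograms per dalton


def mz_from_flight_time(t, field, l_acc, l_drift):
    """m/z in daltons per unit charge from drift time t (seconds)."""
    kg_per_charge = 2 * E_CHARGE * field * l_acc * (t / l_drift) ** 2
    return kg_per_charge / DALTON


def flight_time_from_mz(mz, field, l_acc, l_drift):
    """Inverse relation: drift time (seconds) for a given m/z (Da per charge)."""
    return l_drift * math.sqrt(mz * DALTON / (2 * E_CHARGE * field * l_acc))


# Hypothetical instrument: 1e6 V/m over 2 cm, followed by a 1 m drift tube.
FIELD, L_ACC, L_DRIFT = 1.0e6, 0.02, 1.0
for mz in (1000.0, 4000.0):
    t = flight_time_from_mz(mz, FIELD, L_ACC, L_DRIFT)
    print(f"m/z {mz:7.1f} -> {t * 1e6:.1f} microseconds")
```

Because m/z scales with the square of the flight time, an ion with four times the m/z takes twice as long to reach the detector.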


Tandem mass spectrometry


- alternative to Edman degradation as a means of sequencing proteins

- ionized proteins are analyzed by a first mass spectrometer and then broken down into smaller peptide chains


Steps of tandem MS


ion source --> MS1 --> collision cell --> MS2 --> detector


1. select the ion of interest in MS1 as it arrives after a certain time

2. fragment it in the collision cell and analyze the fragments in MS2


* pretty difficult to do


From protein mixtures to peptide mass identification


protein mixture (separation/2D electrophoresis) -->

individual proteins (fragmentation/spot excision, site-specific cleavage) -->

peptide fragments (fingerprinting/mass spec) -->

peptide mass spectrum (database search/MS-Fit, MASCOT, ProteinLynx) -->

protein identification
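
The cleavage and mass-matching steps of this workflow can be sketched as follows. This is an illustration only: the function names, the example sequence and the two-peptide "database" are made up, and real search tools (such as those named above) do far more. The residue masses are standard monoisotopic values; trypsin is modeled as cutting after K or R unless the next residue is P.

```python
import re

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(seq):
    """Cleave after K or R, except when the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", seq) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def matches(observed_mass, candidates, tolerance):
    """Candidate peptides whose mass lies within `tolerance` of the observed mass."""
    return [p for p in candidates
            if abs(peptide_mass(p) - observed_mass) <= tolerance]


peptides = tryptic_digest("AGKTFRLMKWE")      # hypothetical protein sequence
print([(p, round(peptide_mass(p), 4)) for p in peptides])

# K and Q differ by only about 0.036 Da, so mass accuracy decides the hit count.
print(matches(274.164, ["AGK", "AGQ"], tolerance=0.5))
print(matches(274.164, ["AGK", "AGQ"], tolerance=0.01))
```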



Unique peptide sequences


- most peptide sequences of at least 6 amino acids are unique in the proteome of an organism and map to single gene products


Effect of sig figs in m/z searches

no decimals could lead to MANY hits (478)

going up to 4 decimals could lead to only 2 hits


Determination of fidelity of "known" sequences


- recombinant proteins

- synthetic peptides


detection of natural or biosynthetic mutations



in vitro mutated proteins - random or site specific



identification of endogenous posttranslational modifications


phosphorylation - regulatory or catalytic

disulfide bonds


identification of experimental chemical modifications


affinity or group-specific labels


X ray diffraction general steps


crystal --> diffraction pattern --> electron density map --> atomic model



Components in an X-ray crystallographic analysis


- an X-ray source generates a beam, which is diffracted (scattered) by a crystal

- the resulting diffraction pattern is collected on a detector

- want a lot of crystals for it to work

- to get crystals, bring the protein solution close to its precipitation (ppt) point; crystals will sometimes form


Physical principles of x ray crystallography


1. electrons scatter x rays --> the amplitude of the wave scattered by an atom is proportional to its number of electrons

2. the scattered waves recombine

3. the way in which the scattered waves recombine depends only on the atomic arrangement


Resolution and electron density


- the better the crystal, the better the resolution of the image




NMR spectroscopy


- carried out on macromolecules in solution (X-ray crystallography is limited to molecules that can be crystallized)

- illuminates the dynamic side of protein structure, including conformational changes, protein folding and interactions with other molecules

- depends on the fact that certain atomic nuclei are intrinsically magnetic due to a nuclear spin angular momentum, which can take either of two orientations or spin states, called α and β, when a magnetic field is applied


spin states

- the energies of the two orientations of a spin-1/2 nucleus (such as 1H) depend on the strength of the applied magnetic field

- absorption of electromagnetic radiation of appropriate frequency induces a transition from the lower to the upper level (and resonance will be obtained)

- change of energy of one H will be transferred only to others in close proximity

* see chemical shifts of Hs on the neighboring C
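
To make the field dependence concrete, here is a small sketch that computes the 1H resonance frequency and the energy gap between the two spin states at a given field strength. It uses the standard relations (frequency = gyromagnetic ratio / 2π times field; energy gap = Planck constant times frequency); the function name and the example field values are chosen only for illustration.

```python
GAMMA_1H_MHZ_PER_TESLA = 42.577   # 1H gyromagnetic ratio divided by 2*pi
PLANCK = 6.62607015e-34           # Planck constant in joule seconds


def proton_resonance(b0_tesla):
    """Return (resonance frequency in Hz, energy gap in J) for 1H at field b0."""
    freq_hz = GAMMA_1H_MHZ_PER_TESLA * 1e6 * b0_tesla
    delta_e = PLANCK * freq_hz
    return freq_hz, delta_e


for b0 in (7.05, 11.74):          # example field strengths in tesla
    freq, gap = proton_resonance(b0)
    print(f"{b0:5.2f} T -> {freq / 1e6:6.1f} MHz, gap {gap:.3e} J")
```

A stronger magnet widens the energy gap between the two spin states, which is why the resonance frequency rises in step with the field.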


2D NMR - nuclear Overhauser effect (NOE)

- identifies pairs of protons that are in close proximity


- peaks on the diagonal show no interaction

- off-diagonal peaks indicate pairs of protons that interact

- when proteins fold, Hs that are far apart in the chain can still come into contact and interact



Homologs


- biochemical entities related by common ancestry

- homology is detectable by significant similarity in nucleotide or amino acid sequence and is (almost always) manifested in three-dimensional structure




Paralogs


- homologs that are present within one species

- usually a result of gene duplication

- often differ in their detailed biological function

- different versions of the same protein for different tissues


* evolution --> new genes develop from an existing gene "template" and are altered from there --> every gene begins from another




Orthologs


- homologs that are present within different species and have very similar or identical functions


statistical analysis of sequence alignments


- significant sequence similarity between two DNA, RNA or protein molecules implies that they are homologs and hence have the same evolutionary origin

- as proteins are composed of a larger number of building blocks (20 aas) than DNA or RNA (4 nucleotides) random sequence agreements are less likely
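
A rough way to see this: if every building block were equally likely (a simplifying assumption, real frequencies are uneven), the chance that two random sequences agree at every one of n positions is (1/alphabet size)^n. The function name below is invented for the example.

```python
def chance_match_probability(alphabet_size, length):
    """Probability that two random sequences agree at all `length` positions,
    assuming every building block is equally likely (a simplification)."""
    return (1.0 / alphabet_size) ** length


for n in (5, 10, 20):
    dna = chance_match_probability(4, n)        # 4 nucleotides
    protein = chance_match_probability(20, n)   # 20 amino acids
    print(f"{n:2d} positions: nucleotides {dna:.2e}, amino acids {protein:.2e}")
```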



alignment algorithms to compare sequences


- identities

- gap insertion

- conservative substitution
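
The three kinds of alignment positions can be counted with a short sketch. The grouping of "similar" residues below is a hypothetical simplification chosen for this example; actual alignment programs score substitutions with matrices such as BLOSUM62 and also apply gap penalties. The two aligned sequences are made up.

```python
# Hypothetical groups of chemically similar residues (illustration only).
CONSERVATIVE_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                       set("DE"), set("ST"), set("NQ")]


def classify_alignment(a, b):
    """Count identities, conservative substitutions, mismatches and gaps
    in two already-aligned sequences of equal length ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    counts = {"identity": 0, "conservative": 0, "mismatch": 0, "gap": 0}
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            counts["gap"] += 1
        elif x == y:
            counts["identity"] += 1
        elif any(x in group and y in group for group in CONSERVATIVE_GROUPS):
            counts["conservative"] += 1
        else:
            counts["mismatch"] += 1
    return counts


result = classify_alignment("KVLE-TAG", "RVIDGSAW")
print(result)
```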


conservation of 3d structures

3D structures of proteins or RNA relate directly to their functions and hence are more evolutionarily conserved than primary structures (amino acid or nucleotide sequences)

- similarities in structures:

--> can be detected without significant similarities in sequence alignments

--> usually indicate common functional mechanisms


similar structures but different functions


- in primates α-lactalbumin expression is upregulated in response to the hormone prolactin and increases the production of lactose

- lysozymes are generally enzymes that damage bacterial cell walls by hydrolyzing their peptidoglycan component


Convergent evolution


some proteins are structurally and functionally similar in many important ways but do not have a common ancestor

--> very different evolutionary pathways can lead to the same biochemical solution

example: chymotrypsin vs subtilisin