Flashcards in Rafferty Deck (191)
What is proteomics?
- the qualitative and quantitative comparison of proteomes under diff conditions to further unravel biological processes
What is the proteome?
- sum of all the proteins in an organism, a tissue, a cell, a subcellular organelle or simply the sample being studied
Why study proteomes, and not just genomes?
- genomes are (largely) fixed, but proteomes dynamic
How much do prot exp levels vary?
- varies hugely
- and not always correlated w/ signif, ie. some of the most important key regulators and signalling mols have v low exp levels
- anywhere between 30-80% of genes expressed at any 1 time in a cell/tissue
How well do the transcriptome and proteome correlate?
- if plot prot abundance against mRNA abundance, then quite good correlation at higher levels
- but for low abundance mRNA transcripts the no.s don't correlate well
Why does the mRNA seq ≠ prot seq?
- PTMs: permanent and temporary
--> prot splicing (inteins)
--> additions/deletions, eg. glycosylation (one state of prot might be in active form and some in inactive form, and this has effect)
How does proteomics account for diffs between mRNA seqs and the prot seq?
- proteomics seeks to identify and quantify all the prot components taking into account the variations
What diff proteomic analysis methods are there?
- 1D = polyacrylamide gel electrophoresis (PAGE)
- 2D = 2D PAGE (isoelectric focusing, then SDS-PAGE)
- liquid chromatography (LC)
- mass spectrometry
- prot microarrays
What are the advantages and disadvantages of 1D PAGE?
- easy and quick
- but poor resolution --> as shows so many proteins, one band may represent many diff prots w/ varying dominance
How does 2D PAGE work?
- separate by protein pI in 1st dimension --> isoelectric focusing gel
- then separate by mass in 2nd dimension = SDS-PAGE
What are the advantages and disadvantages of 2D PAGE?
- each signal (spot) could be a single indiv protein, and at most 4/5 prots
- resolution still not great, and can be misled by multiple proteins becoming 1 signal, but much better than 1D
- problems w/ reproducibility, but has become more routine
- messier to set up and labour intensive
What is an eg. of how 2D PAGE can be a useful technique?
- comp B. thailandensis (not pathogenic) w/ B. pseudomallei (v pathogenic)
- v similar genomes, so comp proteomes, can see what diffs there are and ask if these particular prots have a functional role in pathogenesis of B. pseudomallei
How does liquid chromatography (LC) work?
- separation of whole prots or peptide fragments in solution
- can separate by ion exchange (anion/cation), reverse phase (RP) (looks at hydrophobicity properties), affinity (‘natural’) or tags
--> fractionation of samples
What is the main technique behind proteomics?
- mass spec
How does mass spec work?
- separate samples on the basis of mass (m) : charge (z) ratio (m/z)
What are the diff parts of a mass spectrometer?
- a part to prod ionised forms of sample in the gas phase
- a device to separate ions out by m/z ratio
- a device to detect diff ions and gen a signal
What state must ions be in for mass spec?
- gas phase
How can ions be ionised for mass spec?
- some species naturally ionised
- or can be gen by molecular collisions --> typically addition or removal of protons:
M + nH+ → [M + nH]^(n+)
(or M → [M - nH]^(n-) + nH+)
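A minimal sketch of the ionisation equations above, showing how the observed m/z falls as more protons are added. The function names and the 10,000 Da example mass are illustrative assumptions, not from the notes.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz_protonated(neutral_mass, n):
    """m/z of [M + nH]^(n+): add n protons, then divide by the charge n."""
    return (neutral_mass + n * PROTON) / n

def mz_deprotonated(neutral_mass, n):
    """m/z of [M - nH]^(n-): remove n protons, then divide by the charge n."""
    return (neutral_mass - n * PROTON) / n

# Hypothetical 10 kDa protein at a few charge states
for n in (1, 5, 10):
    print(n, round(mz_protonated(10000.0, n), 3))
```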
What are the 2 primary means for generating ionised samples?
- Matrix-assisted laser desorption ionisation (MALDI)
- Electrospray ionisation (ESI)
How does MALDI work?
- sample mixed w/ matrix compound
- laser strikes matrix, puts energy into system and species becomes ionised
- causes particles to fly to 1 side of chamber (ie. if gen +ve species then to -ve side of chamber), through chamber wall and these charged gaseous particles can enter next stage of process
What is an advantage of MALDI?
- can archive sample as don’t blast everything in 1 go and can reanalyse
How does ESI work?
- solution of proteins/peptides fired down finely drawn capillary
- end of capillary is charged, so eventually fire out charged droplets from the end
- then put them into evaporation chamber
- droplets become smaller as lose water
- so repulsion increases and eventually burst apart into gas phase
What is an advantage of ESI?
- can use for larger prots, as charge picked up is greater, so m/z ratio is smaller, which is more tractable for downstream analysis of what's in the sample
Is MALDI or ESI used more?
Can MALDI/ESI be performed on all samples?
- some samples can only be used w/ 1 (but this is only around 20%), eg. some species don’t fly so difficult to get them into gaseous phase
What diff approaches/devices are used to separate out the diff ion species based on their m/z ratio, ie. mass analyser?
- time of flight (TOF) (ie. speed of movement), gives indication of size of particle
- ion trap
- Fourier-transform ion cyclotron resonance (FT-ICR)
How does TOF work as a mass analyser?
- can shoot down long tube from source to a detector
- or can make path longer, to increase resolution, as has to travel further
- ions w/ smaller m/z travel faster, so can determine sample composition by order hit the detector
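A rough sketch of why arrival order reflects m/z in an idealised linear TOF tube: every ion gains the same energy per charge, so flight time scales with the square root of m/z. The 1 m tube length and 20 kV accelerating voltage are assumed values for illustration only.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # 1 Da in kg

def flight_time(mz, tube_length_m=1.0, accel_voltage=20000.0):
    """Ideal linear TOF: z*e*V = 0.5*m*v^2, so t = L * sqrt(m / (2*z*e*V)).

    mz is m/z in Da per elementary charge; returns seconds.
    """
    return tube_length_m * math.sqrt(mz * DALTON / (2 * E_CHARGE * accel_voltage))

# Smaller m/z arrives first; a longer tube spreads the arrival times further apart
for mz in (500.0, 1000.0, 2000.0):
    print(mz, flight_time(mz))
```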
How do quadrupoles work as mass analysers?
- can select out diff m/z ratios from sample, instead of waiting for them to arrive at the detector (as in TOF)
- applying radiofrequency field to change rod charge to divert particles of given size, so only certain particles pass all the way t/ to detector at far end
- can correlate this to size of particles
How do ion traps work as mass analysers?
- can be even more selective by holding sample of charged particles in chamber and call them out 1 at a time
- fire them into a chamber
- ringed electrode around interior of chamber holds mass of charged particles in a bunch
- 2 more electrodes on ends of chamber, 1 where come in and 1 where leave --> can adjust this charge to pull out certain particles and cause them to fly out to detector
Are mass analyser methods used alone?
- often used in conjunction and not in isolation
What can ion traps often be used in conjunction w/ and why?
- quadrupoles to fractionate sample first
How does FT-ICR work as a mass analyser?
- hold pulse of samples in chamber, then apply field to get them to circulate
- causing them to start to split up into smaller and larger ones etc.
- this gives back a signal, as they pass detectors
- speed of circular motion is dep on size of particle
- can thus measure path of particle in chamber and directly measure
How does orbitrap work as a mass analyser?
- combines a no. of other diff approaches
- put packet of charged particles into chamber, traps them, then does direct measurement to find ions
- enables interaction w/ sample and to scan across diff ranges, to prod spectrum
What diff types of mass spec detectors are there?
- channel electron multipliers (CEM)
- microchannel plates (MCP)
How do mass spec detectors work?
- convert impacts of ions on their surfaces into electric voltage signal that can be amplified and read-out to detect presence and quantity of an ion species
How do types of detectors vary according to the mass analyser?
- detection system designed to suit types of devices upstream
--> quadrupole and ion trap = CEM
--> TOF = MCP
- in FT-ICR and Orbitrap the analyser itself is also the detector
What does a simple MS spectrum show?
- shows signal strength of diff mass charge ratio values
- get set of diff peaks, as ions of the same mol have the same mass but carry diff no.s of charges, hence diff m/z ratios
- this is used to work backwards and find mass of particle which would give this distribution of peaks
What is isotopic abundance and how is it found?
- get isotopic mass envelope of sizes, as when C12 replaced by C13 then affects peaks
- calculate monoisotopic mass back from these envelopes
- req upstream processing of data
Why is high power computing req to interpret complex spectra?
- w/ multiple prots get overlapping signals
What do spacings between peaks in an isotopic envelope reflect?
- both mass change and charge of the ion species
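A small sketch of reading the charge off an isotopic envelope: neighbouring isotope peaks differ by one 13C-for-12C swap (about 1.00335 Da), so the spacing on the m/z axis is roughly 1.00335/z. The peak lists are made-up values.

```python
C13_C12_DIFF = 1.00335  # mass difference between 13C and 12C in Da

def charge_from_spacing(peaks_mz):
    """Estimate charge z from the mean spacing of consecutive isotope peaks."""
    spacings = [b - a for a, b in zip(peaks_mz, peaks_mz[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_C12_DIFF / mean_spacing)

# Hypothetical envelopes: ~0.5 spacing suggests 2+, ~0.33 suggests 3+
print(charge_from_spacing([500.0, 500.5017, 501.0034]))
print(charge_from_spacing([800.0, 800.3345, 800.669]))
```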
How is the average mass of a peptide or prot envelope calculated?
- calc from the centroid of the distribution not just the middle value of the distribution
What is resolution in terms of mass spec, and how is this measured?
- the ability to distinguish 2 diff ions of similar m/z
- often measured as the mass divided by the peak width at half max height, so bigger values = better resolution
How is accuracy of mass spec calc?
- (mass (exptal) - mass (theoretical)) / mass (theoretical) x 10^6
- expressed as parts per million (ppm), thus lower values show higher accuracy
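The resolution and accuracy definitions above as a short worked sketch. The example peak values are invented for illustration.

```python
def resolution(mz, fwhm):
    """Resolution = m/z divided by peak width at half max height (bigger = better)."""
    return mz / fwhm

def ppm_error(measured, theoretical):
    """Mass accuracy in parts per million (closer to 0 = more accurate)."""
    return (measured - theoretical) / theoretical * 1e6

# Hypothetical peak at m/z 1000 that is 0.1 wide at half height
print(resolution(1000.0, 0.1))
# Hypothetical measurement 0.005 above a theoretical mass of 1000
print(ppm_error(1000.005, 1000.0))
```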
How is species charge (z) calculated?
- take 2 adjacent peaks in the charge series: can assume the ion w/ an extra proton (charge z+1) gives the peak at lower m/z, next to the peak of charge z
- m/z = (mass + z x proton mass) / z
- do simultaneous equations for the 2 peaks and can rearrange to find value of z
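A sketch of the simultaneous equations for two adjacent charge states. If the higher m/z peak has charge z and the lower one z+1, then z = (p_lo - H) / (p_hi - p_lo), where H is the proton mass. The 16,951 Da test mass is an arbitrary example.

```python
PROTON = 1.007276  # Da

def mz(mass, z):
    """m/z of a species carrying z extra protons."""
    return (mass + z * PROTON) / z

def deconvolute(p_hi, p_lo):
    """Given adjacent peaks (p_hi has charge z, p_lo has charge z+1),
    return (z, neutral mass)."""
    z = round((p_lo - PROTON) / (p_hi - p_lo))
    mass = z * (p_hi - PROTON)
    return z, mass

# Simulate two neighbouring peaks for a hypothetical 16951 Da protein
z, mass = deconvolute(mz(16951.0, 10), mz(16951.0, 11))
print(z, round(mass, 3))
```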
After z calculated, how is mass spec next analysed?
- then able to calculate mass
- use mass to search database for corresponding known prot mass
- okay if known prot as can compare this to expected mass, but problems if doing proteomics of large no.s of prots in a sample
What factors can lead to misidentification or failure when searching prot database?
- PTMs that may mislead identification
- prot degradation --> a fragment created in handling w/ a chance match to a known prot
- relative concs from single intensities can be misleading if a particular ion is suppressed (due to problems w/ getting certain ions into gas phase)
- incomplete database --> not all organisms have fully sequenced and annotated genomes (but this is becoming less true)
Why could peptide MS be carried out instead of whole prots?
- aids mass identification
- multiple matches gives confidence to assignment, as can characterise the multiple peptides of the prot
How can proteins be cleaved into smaller peptide fragments for peptide MS?
- chemical fragmentation (eg. cyanogen bromide) or enzymatic fragmentation (eg. trypsin)
- have reasonably predictable formation (eg. after Lys or Arg) of peptides of suitable size
- trypsin used as cuts v specifically
- match observed sizes against database --> get a ‘peptide mass fingerprint’
- prod similar spectrum, but shifted to smaller set of units than whole prot
What problems can there be w/ peptide MS and how are these resolved?
- occasional problems w/ size ambiguities (but multiple peptides makes this less likely) and again issues about modifications
- can also consider the results of missed cleavage sites, ie. if 2 peptides joined together, and these themselves can be quite diagnostic
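A minimal in silico digest illustrating the peptide mass fingerprint idea: cut after every Lys or Arg, optionally allow missed cleavages, then compute monoisotopic masses to match against a database. This is a simplification (it ignores any cleavage exceptions), and the test sequence is made up.

```python
WATER = 18.01056  # Da, added once per peptide
MONO = {  # monoisotopic residue masses in Da
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def cleave(seq):
    """Split a sequence after every K or R."""
    pieces, start = [], 0
    for i, aa in enumerate(seq):
        if aa in "KR":
            pieces.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        pieces.append(seq[start:])
    return pieces

def digest(seq, missed=0):
    """All peptides with up to `missed` missed cleavage sites."""
    pieces = cleave(seq)
    out = []
    for i in range(len(pieces)):
        for j in range(i, min(i + missed, len(pieces) - 1) + 1):
            out.append("".join(pieces[i:j + 1]))
    return out

def peptide_mass(pep):
    """Monoisotopic mass of a neutral peptide."""
    return sum(MONO[aa] for aa in pep) + WATER

for pep in digest("AKGRC", missed=1):
    print(pep, round(peptide_mass(pep), 4))
```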
What is MS/MS?
- use of 2 mass analysers to separate/select ion species linked by an additional fragmentation
How does MS/MS work?
- select out 1st ion size, put into chamber and break up by inducing collision to cause dissociation, then separate out ions and generate spectrum
- ratio of precursor to product ion intensities reflects energy of collision
How does peptide ion fragmentation work in MS/MS?
- can cause peptide to break on bonds down main chain
- tend to favourably break at peptide bonds
- resulting in sub-fragments of original peptide
- this additional dissoc gens ‘immonium ions’ whose mass dep on R and are thus characteristic of each AA --> so can identify seq
How is seq identification carried out after MS/MS?
- using b and y series product ions --> extend from N and C-ter respectively
- b-series and presence of immonium ions elsewhere in spectrum also give support to seq interpretation
- sequencing peptides can resolve ambiguity by distinguishing prots which have same mass, but diff sequences
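A sketch of the b and y series for a singly charged peptide: b ions grow from the N-ter, y ions from the C-ter (plus water), and the gap between consecutive ions in a series equals one residue mass, which is how the seq is read. The peptide "GAK" is an arbitrary example.

```python
PROTON = 1.007276
WATER = 18.01056
MONO = {  # monoisotopic residue masses in Da
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def b_ions(pep):
    """Singly charged b ions (N-ter fragments): b1 .. b(n-1)."""
    total, out = 0.0, []
    for aa in pep[:-1]:
        total += MONO[aa]
        out.append(total + PROTON)
    return out

def y_ions(pep):
    """Singly charged y ions (C-ter fragments): y1 .. y(n-1)."""
    total, out = WATER, []
    for aa in reversed(pep[1:]):
        total += MONO[aa]
        out.append(total + PROTON)
    return out

print([round(m, 4) for m in b_ions("GAK")])
print([round(m, 4) for m in y_ions("GAK")])
```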
How is database interrogation carried out for MS/MS results?
- a no. of program suites used w/ output from MS analysis to interrogate seq database for ‘hits’ or matches to predicted peptide fragments of all prots in a given genome(s)
- eg. MASCOT, SEQUEST, ProteinPilot
What tolerances can be set w/in database searches for the seq match?
- define how sample prepared
- accept poss missed cleavage sites
- permit inaccuracies in measured values
- allow for fixed and variable PTMs
- inc labelling of samples
Why is the location of a PTM important?
- exact location of mod is essential in many circumstances, eg. phosphorylation
- so if seq can discriminate between when phosphate attached to diff AAs, then can be fundamental to understanding of what happens when treat cell w/ particular inhibitor or agonist etc.
How can the m/z of peptides w/ phosphorylation in 2 diff places be distinguished?
- only by MS/MS sequencing
How can MS/MS be used to distinguish between PTMs?
- can scan for presence or loss of specific ion (eg. phosphorylated residue), as corresponds to release of H3PO4
- on spectrum bigger mass corresponds to protein plus eg. phosphate group
- also important to know how this changes w/ time or upon addition of other compounds, eg. signalling molecules
- mixed pops of modified and unmodified samples can be studied simultaneously and picked up in the spectrum
What is a workflow for MS/MS?
- cell sample
- protein mixture of cell sample
- affinity column so not looking at every protein in sample, eg. look at phosphoproteins (can skip this step) --> so studies focussed on modified prots in a sample, eg. the phosphoproteome
- then enzyme/chemical cleavage
- can enrich for certain peptides, eg. IMAC (immobilised metal ion affinity column)
Apart from the presence of a prot in a sample what does proteomics seek to measure?
- the quantity in absolute or relative terms
What approaches are there in proteomics to measure the quantity of prot in a sample?
- label free = does not req mod of sample
- label-based = addition of tags or use of stable isotopes and measure the levels of these
What is the steady state of prot abundance set by?
- transcrip rate (how fast pol can travel)
- translation rate
- RNA decay rate (how susceptible is transcript to degrad)
- prot decay rate (varies widely between prots)
Is MS quantitative?
- not inherently, the response of diff samples will vary based on their own unique properties
- amount 'A' cannot be compared directly to amount of 'B'
How can MS be quantitative
- amount of ‘A’ at time point 1/treatment 1 can be compared against amount of ‘A’ at time point 2/treatment 2
- if samples handled in exactly the same way
What are diff approaches to quantifying prots t/ MS based on?
- proteome coverage (do you want to sample all proteins present, or just a targeted subset)
- dynamic range of sample abundance being measured
- quantitative accuracy req (do you need to know how much exactly or just there's lots of it)
- no. of samples being compared (diff timepoints?)
What are the 2 methods for label-free quantification?
- spectral counting
- ion signal intensity from chromatograms
How does spectral counting quantify prot abundance?
- no. of times (no. of spectra) that a peptide is seen within the collection of a data set indicates abundance
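A toy sketch of spectral counting: tally how many identified spectra map to each prot and express it as a share of all spectra. The (spectrum id, prot) pairs are hypothetical, and real pipelines usually also normalise for prot length.

```python
from collections import Counter

def spectral_counts(psms):
    """psms: list of (spectrum_id, protein) pairs from a database search."""
    return Counter(protein for _, protein in psms)

def relative_abundance(counts):
    """Each protein's count as a fraction of all counted spectra."""
    total = sum(counts.values())
    return {protein: n / total for protein, n in counts.items()}

# Hypothetical search output
psms = [(1, "protA"), (2, "protA"), (3, "protB"), (4, "protA")]
counts = spectral_counts(psms)
print(counts)
print(relative_abundance(counts))
```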
How does ion signal intensity from chromatograms quantify prot abundance?
- experiment links given m/z value to area of peak in LC chromatogram
- combine values for all/multiple peptides from a particular prot
What are some problems with using ion signals from chromatograms to quantify prot abundance?
- needs to be reproducible to be reliable, and reqs high mass accuracy for identification
Why are both label-free methods of quantifying prot slightly relative?
- as giving amount as % of total prot in sample
How could label-free methods give an absolute quantity?
- could spike samples w/ heavy peptides
- this is a known peptide of known amount (v low levels), should match peptide in sample but slightly higher isotope, so overall behaviour the same, but shifted slightly on mass spec
- so can be confident when measure actual sample that both will have flown in same way
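A sketch of the spike-in calculation: because the heavy peptide flies the same way as the sample peptide, the ratio of their signals times the known spiked amount gives an absolute amount. The intensities and the 50 fmol spike are invented numbers.

```python
def absolute_amount(light_intensity, heavy_intensity, heavy_amount):
    """Amount of the sample (light) peptide, in the same units as heavy_amount."""
    return light_intensity / heavy_intensity * heavy_amount

# Hypothetical: sample peptide signal is 3x the signal of a 50 fmol heavy spike
print(absolute_amount(3.0e6, 1.0e6, 50.0))
```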
Do label-free methods involve perturbing the system?
- no, apart from spiking samples
What diff label based quantification methods are there?
- in vitro = attach labels to peptides before/after proteolytic digestion --> ICPL, ICAT, iTRAQ, TMT
- or in vivo = metabolic labelling of samples w/ stable isotopes --> SILAC
What is ICPL and what does it involve? (label based quantification)
- isotope-coded protein labelling of lysine sidechains
- can do variety of diff labels, standard is C12 = ICPL 0
- can build labels where add deuterium, looks chemically the same but heavier mass
- handle and treat in same way, but each reagent subtly diff in mass
- mix together and put in mass spec
- minimises handling errors
What is ICAT and what does it involve? (label based quantification)
- isotope coded affinity tags
- bind to cysteine
- take mixture 1 and label all w/ light version of reagent, then have a heavy version (eg. w/ deuterium)
- advantage is can deliberately pull out prots which have been labelled as is a biotin tag, and this could simplify mixture, therefore just analyse this subset
- look to see how signal has changed, this gives a relative measure
What is iTRAQ and what does it involve? (label based quantification)
- isobaric tags for relative and absolute quantification
- amine group on end where attach tag
- can adjust mass of tag
- can make heavier linker etc
- cause collisions to break tag off
- so measuring mass of tag instead of peptide and tag (shifts spectrum)
What is TMT and what does it involve? (label based quantification)
- similar to iTRAQ
- tandem mass tags (also isobaric) w/ variability via use of diff isotopes
- same principle, attach tag by amine group, select out peptides, then cause collisions in mass spectrometer, then look for tags
How can iTRAQ be used to look at how given samples evolve over time?
- label samples and after mixing, carry out MS/MS analysis --> combine digests to minimise handling errors
- diff isobaric tags fragment in diff ways to give diff characteristic reporter ions
- same size peptides w/ mixture of all tags selected at MS1
- identity and relative levels quantified for each peptide at MS2
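A sketch of the MS2 quantification step: pick out the reporter ion peaks and compare their intensities between samples. It assumes a 4-plex experiment with reporters near m/z 114 to 117 (nominal values); the peak list and tolerance are hypothetical.

```python
REPORTER_MZ = (114.1, 115.1, 116.1, 117.1)  # nominal 4-plex reporter ions (assumed setup)

def reporter_intensities(ms2_peaks, tol=0.05):
    """Sum intensity within tol of each reporter m/z.

    ms2_peaks: list of (mz, intensity) pairs from one MS2 spectrum.
    """
    return [sum(i for mz, i in ms2_peaks if abs(mz - r) <= tol) for r in REPORTER_MZ]

def relative_levels(intensities, reference=0):
    """Each channel relative to the chosen reference channel."""
    ref = intensities[reference]
    return [i / ref for i in intensities]

# Hypothetical MS2 spectrum: four reporter peaks plus one peptide fragment
peaks = [(114.11, 1000.0), (115.11, 2000.0), (116.11, 500.0), (117.12, 1000.0), (300.2, 9999.0)]
print(relative_levels(reporter_intensities(peaks)))
```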
Why is iTRAQ a long process?
- have to take 1 peptide at a time and bash them against themselves
How easily do tags break off in iTRAQ?
- designed so break off readily, v efficient cleavage
How does SILAC work?
- in vivo method
- stable isotopic labelling of AAs in cell culture
- eg. 13C, 15N-lysine
- could culture cells and feed them particular AAs to make pop of prots slightly heavier, so can run 2 cultures together and compare them
- mix labelled and unlabelled cells
- in gel or in solution approach
- measure what happens to cells over course of time
How does SILAC compare to chemical tags?
- not as invasive as chemical tag, but usually incorporation is as efficient
What is the prot coverage, quantification and no. of samples like for SILAC?
- prot coverage reasonably good and quite precise
- but quantification is relative, not absolute measure
- can do 2-3 samples, only a limited amount of labelling can do
What is the prot coverage, quantification and no. of samples like for chemical mod?
- reasonably good coverage, but not all prots will have available residues in suitable location
- can do more samples as can prod large no. of diff tags
What is the prot coverage, quantification and no. of samples like for label free?
- excellent proteome coverage as not doing anything to system so see everything
- but accuracy limited by how well can track back samples etc.
- can make absolute measures
What are the major goals of proteomics?
- identification of indiv prots (what's there?)
- quantifying prot levels (how much is there?)
- state of prot under diff conditions
- presence and composition of large prot complex, ie. what’s interacting
What goal of proteomics is difficult to achieve from a simple MS based analysis?
- looking at prot complexes
What range of methods is available for identification of prot complexes?
- molecular, pairwise detailed studies (eg. X-ray diffraction, NMR) to cellular networks (eg. affinity pulldown assays, co-immunoprecipitation)
What Ab based approaches are there for identifying prot complexes?
- Ab capture
- epitope tagging
- TAP (tandem affinity purification) tagging
How does Ab capture work?
- pull down assays of bait prots plus binding partners
How does epitope tagging work?
- add simple epitope tag via recombinant expression to bait protein and then pull down bait and partners
How does TAP tagging work?
- double tags, eg. Protein A (binds IgG) and calmodulin binding peptide, w/ cleavable linker on bait prot for highly efficient purification w/o overexp
In what situation is an Ab based approach best?
- for low abundance levels
How are prot microarrays used to identify prot complexes?
- use of DNA microarray tech to prod ‘prot chips’
- detection by various methods, eg. fluorescence labels, surface plasmon resonance (SPR) and surface enhanced laser desorption/ionisation mass spec (SELDI-MS)
- look what binds and see what levels of prots there are
What diff types of prot microarrays are there?
- analytical, functional and reverse phase
How do analytical prot microarrays work?
- Ab arrays aimed at a set of target prots
- typically used to measure exp level and binding affinities of prots
How do functional prot microarrays work?
- reqs exp prots from all ORFs
- prots are tagged (eg. His tags or TAP tags) and attached to a coated glass slide
- investigate prot-prot, prot-ligand (lipid, DNA, drug, peptide) and prot-cell interactions
How do reverse phase prot microarrays work?
- arrays of complex mixtures such as cell lysates arrayed on nitrocellulose slide, probed w/ Abs against target prots
- then Abs detected w/ fluorescent/chemiluminescent/colorimetric assays
- for quantification reference peptides printed on slides
What is protein-omics?
- use of Ab approach to utilise high sensitivity of Abs to target systems w/ the breadth of mass spectrometric methods
What are the diff types of protein-omic experiments and what do they involve?
- forward phase array = simple binding of target prot(s) carrying reporter
- sandwich array = binding of prot(s) to which 2nd reporter labelled Ab binds, must prod 2nd Ab to another epitope so is more complex
- reverse phase array = more elaborate, prots attached to chip and probed w/ 1° and 2° reporter labelled Ab
- micro western array = mini gels probed w/ labelled Ab, v complex to manufacture and expensive, but is feasible
What are the ErbB prots and what is their role?
- epidermal GF receptors
- initiate signalling networks for migration, adhesion, growth and apoptosis
What residues do ErbB prots contain and what is the significance of this?
- phosphorylated tyrosine residues
- human genome contains 153 doms (SH2 or PTB) identified as likely to bind phos tyrosine --> so may interact w/ 1 or more of these prots
What was the aim of a paper probing ErbB receptor binding? (Jones et al. 2006)
- using prot microarrays to get a global pic of phosphorylated tyr binding sites
How was ErbB receptor binding investigated experimentally to get saturation curves? (Jones et al. 2006)
- overexp/purify all SH2 or pTB doms from bacterial culture
- spot out in microtitre plates all doms as an array
- at same time, synthesise 17-19 residue peptides of all 33 identified potential tyr phos sites and add fluorescent tag
- probe array w/ all peptides at diff concs and measure fluorescence to get saturation curves (to show real binding)
How was data from ErbB receptor binding analysed? (Jones et al. 2006)
- analyse binding curves to provide quantitative pic of binding of ErbBs to prospective binding partners
- probe for binding of peptide at all the poss doms
- positions on microarray chip identified w/ a general fluorescent dye
- binding for each peptide at 8 diff concs identified via its fluorescent tag
In the ErbB receptor binding study, how was specific binding differentiated from non-specific? (Jones et al. 2006)
- single value measurement of binding not sufficiently reliable and fluorescence measure not enough to assess strength of interaction
- so calc apparent KD (eq dissoc constant) values for each binding dom and peptide combination
- specific interactions identified by fit of data points to curve varying KD and Fmax (where Fmax is the max fluorescence at saturation for each peptide tested)
- a KD value of <2 µM and Fmax at least 2 fold higher than control set indicates binding
What is being looked for when probing ErbB against specific pY residue? (Jones et al. 2006)
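A rough sketch of fitting a saturation curve and applying the two criteria above. It assumes the simple one-site model F = Fmax x [peptide] / (KD + [peptide]) and uses a plain grid search rather than whatever fitting routine the paper used. The concs, signals and control value are simulated, not data from the study.

```python
def fit_binding(concs, signals):
    """Grid search over KD (0.01 to 100, log spaced).

    For each trial KD the best Fmax has a closed-form least squares value.
    Returns (KD, Fmax) with the lowest sum of squared errors.
    """
    best = None
    for i in range(401):
        kd = 10 ** (-2 + 4 * i / 400)
        x = [c / (kd + c) for c in concs]  # fractional occupancy at each conc
        fmax = sum(f * xi for f, xi in zip(signals, x)) / sum(xi * xi for xi in x)
        sse = sum((f - fmax * xi) ** 2 for f, xi in zip(signals, x))
        if best is None or sse < best[0]:
            best = (sse, kd, fmax)
    return best[1], best[2]

def is_specific(kd, fmax, control_fmax, kd_cutoff=2.0):
    """Criteria from the notes: KD below cut-off and Fmax at least 2x control."""
    return kd < kd_cutoff and fmax >= 2 * control_fmax

# Simulated curve at 8 concs with true KD = 0.5 and Fmax = 1000
concs = [0.01, 0.03, 0.1, 0.3, 1, 3, 5, 10]
signals = [1000 * c / (0.5 + c) for c in concs]
kd, fmax = fit_binding(concs, signals)
print(round(kd, 3), round(fmax, 1), is_specific(kd, fmax, control_fmax=100))
```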
- binding curves of fluorescence vs [peptide] that match criteria for specific, high affinity interactions
What did the affinity threshold show in the ErbB receptor binding study? (Jones et al. 2006)
- reveals sensitivity of response to levels of the receptor in the cell
- at high affinity EGFR discriminated v well for certain binding prots over others
- at low affinity discrimination was less good
- variation in cellular levels for EGFR and ErbB2 have a greater effect than for ErbB3 (mirrored by observed lower variation for ErbB3 across human tissue types)
What plasma proteins were targeted in a proteomic study for clinical biomarker discovery, and why? (Schiess et al. 2009)
- conc ranges of prots in plasma have v large dynamic range --> from mg/ml to pg/ml
- cannot tell if in disease state well by measuring high conc prots (constant prots)
- prots at lower conc are diagnostic and signalling
What scheme for biomarker discovery was used for plasma prots? (Schiess et al. 2009)
- isolate N-glycopeptides by solid phase enrichment (SPEG), to find in vivo disease specific signatures --> use cells, tissue and finally blood plasma w/ MS based label-free quantification
- selected reaction monitoring (SRM) assays of these prot panels are set up, tested and then validated w/ human patients
- ideally want lots of patients and focus on a few key marker peptides
Why is looking at markers in blood advantageous for diagnosis? (Schiess et al. 2009)
- quicker and less invasive method for patients
What did stage 1 of clinical biomarker discovery involve for plasma prots? (Schiess et al. 2009)
- control and cancer cell lines
- enrichment of glycoprots (eg. lectin capture)
- release prot eg. PNGaseF
- identification of glycoprots by MS/MS
- establishment of set of “biomarker” peptides
What did stage 2 of clinical biomarker discovery involve for plasma prots? (Schiess et al. 2009)
- analyse blood plasma
- enrich (targeted approach), release and process glycoprots
- quantify levels of key peptides (identified from prior studies)
- could be used on clinical level if developed
- found subsets occurred across specific cancer types
How was SILAC applied to analyse the haploid versus diploid yeast proteome?
- 3 strategies for in depth quantification of yeast proteome by SILAC labelling and high res MS
What did SILAC reveal about the haploid diploid ratio?
- quantitative diffs between the haploid and diploid yeast proteome show the overall fold changes --> ie. looking at haploid diploid ratio, for most prots doesn’t make a difference, but some are more and some are less
- need to make sense of this pattern to see if there is a consistent picture
How were the SILAC results of haploid diploid ratio deduced?
- members of yeast pheromone response can be colour coded according to fold change
- pheromone signalling is req for mating of haploid cells and is absent from diploid cells, the top 10 haploid specific prots are components or transcriptional targets of pheromone signalling
- LC-MS/MS w/ Orbitrap
How well did proteome and transcriptome correlate w/ changes of haploid versus diploid yeast?
- overall correlation was poor
- after filtering out low mRNA signals the data correlate better
What diff labels did a SILAC study use to study localisation and turnover of prots in HeLa cells?
- cells grown in 3 diff labelled states
--> Light = normal 12C and 14N Arg and Lys
--> Medium = 13C and 14N Arg and 2H Lys
--> Heavy = fully labelled 13C and 15N Arg and Lys
How did the MS/SILAC study investigate the localisation and turnover of prots in HeLa cells?
- LC-MS/MS w/ Orbitrap
- revealed separate peaks for each L/M/H form of every peptide and the relative amounts can be assessed
- initially cells grown in L medium, culture split and half grown in M medium to replace L prots
- then H medium pulsed into growth medium to replace M prots
- samples taken at increasing time intervals for comparison of original L culture and new M --> H culture
- separation of contents of subcellular regions (nucleus, cyto and nucleolus) gives additional spatial dimension to analysis (ie. fractionation)
In the SILAC pulse study how does the %M change over time, and what does this show?
- line shows degradation rate
- M:L ratio decreases
In the SILAC pulse study how does the %H change over time, and what does this show?
- line shows synthesis rate
- H:L ratio increases
In the SILAC pulse study what does the H:M ratio show?
- measures turnover rate
- point where 2 lines cross is often the quoted turnover no.
--> this is the 50% point (50% of M and of H)
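A sketch of finding the 50% crossing point from a pulse time course: track M as a fraction of M + H and interpolate the time at which it drops to 0.5. Linear interpolation between sampled time points is an assumption here, and the time course is simulated with a 20 hour half-life.

```python
def crossing_time(times, m_fraction, level=0.5):
    """Time at which the falling M fraction (M / (M + H)) reaches `level`.

    Uses linear interpolation between neighbouring samples.
    Returns None if the level is never crossed.
    """
    for i in range(len(times) - 1):
        hi, lo = m_fraction[i], m_fraction[i + 1]
        if hi > lo and hi >= level >= lo:
            frac = (hi - level) / (hi - lo)
            return times[i] + frac * (times[i + 1] - times[i])
    return None

# Simulated decay of the M form; H fraction is 1 - M fraction, so the lines cross at 0.5
times = [0, 10, 20, 30, 40]
m_fraction = [0.5 ** (t / 20) for t in times]
print(crossing_time(times, m_fraction))
```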
How does the abundance of prots in HeLa cells differ?
- v large range of abundance (1x10^7)
- on av there are a few 1000 to 10,000 copies of a given prot
- S shaped curve (DIAG)
What prots are some of the least and most abundant in HeLa cells?
- TFs and RNA binding prots are among the least abundant
- histones among the most abundant
- nucleotide binding are among the most and least abundant prots
Why are RNA binding prots so low abundance?
- regulatory prots
Are most prots found in the nucleus, nucleolus and cyto, why?
- most prots partitioned into particular localisation
- v few are found equally across all 3 cellular compartments, but most are found in 2 locations (but rarely in all 3)
What does localisation of prots to diff compartments not tell you?
- dynamics of switch between locations, doesn't tell you how fast move between these locations, ie. its just a snapshot
Why were diff time points measured in the SILAC study on HeLa cells?
- look at distribution of prot turnover
In HeLa cells what was the distribution of prot turnover?
- most prots in range of 14-26 hours, w/ av of around 20 hours turnover
- but there were v slow and v fast prots as well
What prots have a slow turnover?
- large abundant complexes that need lots of energy to establish
- eg. translation and ic transport
What prots have a fast turnover?
- inc mitosis and cell cycle prots, as don't want them to hang around long time --> they are made, used and removed
In HeLa cells SILAC pulse experiment what did the fact that some residual M remained in the culture mean?
- implies recycling event, so couldn't replace it all
In HeLa cells how does the distribution of prot turnover vary between cellular compartments?
- generally minor peak around 10 hours, then major peak around 20 hours
- in the nucleolus there is trimodal distribution, ie. some are v rapidly turned over
- faster turnover in places where they are assembled than where they are used, eg. ribosomes want fast turnover in nucleolus, but want slow turnover in cyto where they are used
In HeLa cells are any prot characteristics related to turnover, and why might this be?
- v few trends
- more acidic prots expected to have faster turnover, but not true in HeLa cells
- but this study does not distinguish stages of cell cycle and done w/o tagging of prots
- also poor correl w/ mRNA levels
How was iTRAQ used to provide location data in live tissue samples? (Chen et al)
- need to identify proteomes w/in cells, eg. w/in mito, to fully understand specialised functions of these areas
- need to target pot tags for identification of prots from specific locations in the cell, eg. mitochondrial matrix
- target an engineered ascorbate peroxidase (APEX) to particular locations in cell
- use the APEX to add biotin onto any nearby surrounding prots and ensure only prots there are tagged
- use streptavidin column to pull out all biotin labelled prots
How were APEX constructs expressed in diff subcellular compartments of fly muscle cells? (Chen et al)
- GAL4 exp system inserted in particular desired region
- GAL4 binds UAS region v specifically
- UAS and signal peptide (to target APEX) are upstream of APEX
- signal peptide is NES, NLS or mito tag, so directs enz where want it to go in cell
- localisation confirmed by additional fluorescent tag
How was labelling of endogenous prots by APEX activity achieved? (Chen et al)
- reqs addition of biotin-phenol and H2O2
- enz generates reactive radical that adds biotin to nearby e- rich AAs such as tyr
- staining for APEX and biotin bound to streptavidin in Drosophila larval tissue showed tight overlap of stains in correct cellular compartment and no extra background noise
What was used to confirm APEX activity? (Chen et al)
- Western blot
What was MS/MS required for in regards to APEX activity? (Chen et al)
- identify and quantify the levels of prots stained by the targeted APEX activity
What did labelling enable w/ regards to MS/MS? (Chen et al)
- identification of APEX biotinylated prots vs any endogenous biotinylated prots
- removal of background or false +ves from non-specific binding of other prots to streptavidin
How did MS/MS identify and quantify prot levels stained by targeted APEX levels? (Chen et al)
- calculate iTRAQ ratios of mitochondrial expression vs control for every identified prot in the sample
- set a false +ve rate threshold (based on prior prediction location from other data sources), so only prots 10x more likely to be in the matrix than a false +ve are considered
- gen a mitochondrial matrix proteome of 389 prots that can be compared to other data sets and poss new prots and remaining false +ves identified
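The ratio and threshold steps above can be sketched in Python. This is only an illustration, not the pipeline from Chen et al: the prot names, reporter intensities and the cutoff of 5.0 are made up, and in the study the cutoff came from a false +ve rate judged against prior localisation data rather than a fixed no.

```python
def itraq_ratio(apex_signal, control_signal):
    """Reporter ion intensity in the APEX sample relative to the control."""
    return apex_signal / control_signal


def enriched_prots(reporter_signals, cutoff):
    """Keep prots whose APEX/control ratio meets or exceeds the cutoff."""
    kept = {}
    for prot, (apex_signal, control_signal) in reporter_signals.items():
        ratio = itraq_ratio(apex_signal, control_signal)
        if ratio >= cutoff:
            kept[prot] = ratio
    return kept


# Hypothetical reporter intensities: (APEX sample, control sample)
example_signals = {
    "prot_A": (8000.0, 400.0),   # strongly enriched
    "prot_B": (1200.0, 1000.0),  # looks like a non-specific streptavidin binder
    "prot_C": (5000.0, 500.0),   # enriched
}

# The cutoff is an arbitrary illustrative value, not one from the study
matrix_candidates = enriched_prots(example_signals, cutoff=5.0)
print(sorted(matrix_candidates))
```

Prots that bind streptavidin non-specifically give similar signal in both samples, so their ratio stays near 1 and they drop out.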
What do indep studies of mitochondrial located prots show? (Chen et al)
- show good overlap
- however these studies also inc total mitochondrial prot and not just the specific matrix prots found in the APEX study
What samples were analysed during a MS based draft of the human proteome?
- adult and fetal tissue types
- as well as cell types
How was a MS based draft of the human proteome prod?
- 2 separate MS based analyses of samples taken t/o the human body and from cell culture
- first used FT-ICR and label free spectral counting
- second used LC-MS/MS on peptide digests following in-solution fractionation or gel separation
What info does a MS based draft of the human proteome provide?
- provides maps/databases of the whole proteome
- and reveal tissue-specific and global features concerning the distribution and levels of the proteins
How was the human proteome database established?
- peptide based MS/MS analysis of samples from tissues/organs or cell lines
- combined w/ data from the literature (ie. other databases and colleagues)
- samples fractionated, digested and analysed at high res and high accuracy using an Orbitrap mass analyser
- tandem mass spectrometry data were searched against a known prot database using SEQUEST and MASCOT database search algorithms
What coverage of the human proteome was achieved in the database?
- approx 92% coverage of genes in human genome and approx 22% of the known isoforms
- high coverage (approx 90%) in MS experiments of previously identified samples
How many ubiquitous prots were found in the human proteome, and what is their role?
- for general control and maintenance
What is a problem w/ prot digests that can limit MS analysis and how is this resolved?
- some prots resistant to production of tryptic digests useful for MS analysis (eg. keratin)
- so other proteases such as chymotrypsin used to prod suitable peptides
What was observed by looking at the coverage across chroms of the human proteome?
- quite evenly spread and generally over 90%, apart from Y
- lowest coverage was of nasopharynx and uterus
- translation of long intervening non-coding RNAs (lincRNAs) is rare but was observed across all chroms and in many tissue types
Is the bulk of prot mass contributed by a large or small no. of genes?
- both studies made clear that it is contributed by only a small no. of genes
- only 2350 housekeeping genes account for approx 75% of proteome mass
If look at diff tissue types, are the levels of housekeeping genes constant, and if not how do they vary?
- EGF varies quite a lot --> kidney cells which regenerate quite well need high levels to respond, but others such as bone need v low levels
- prot expression in diff tissues and cell lines showed that levels of housekeeping (GAPDH), signalling (EGFR) and tumour-assoc (CTNNB1) prots can vary substantially between tissues
What did principal component analysis (PCA) show about levels of housekeeping genes across diff tissues?
- cell lines retain the prot expression characteristics of their respective 1° tissue, and that proteomes of diff organs are more diverse
What can heat maps show?
- can reveal prots which show cell or tissue type specific exp (ie. only found in certain tissue types)
What are some of the diff strategies used when assembling the human proteome?
- look at coverage across chroms
- levels of housekeeping genes
- heat maps
- functional prot exp analysis
- look at RNA translation rate
- expression profiling
Why and how can RNA translation rates be used for predictions in the human proteome?
- constant across tissue types for any given prot
- translation rates correlate well w/ prot exp levels
- so prot level can be predicted well from the RNA transcript level for a given translation rate
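A rough sketch of the prediction idea above, in Python. It assumes the simplest possible model (prot level = mRNA level x a per-prot ratio that is the same in every tissue); the function names and all the no.s are hypothetical and not taken from the study.

```python
from statistics import median


def estimate_ratio(prot_levels, mrna_levels):
    """Median prot/mRNA ratio for one prot across tissues where both were measured."""
    return median(p / m for p, m in zip(prot_levels, mrna_levels))


def predict_prot_level(mrna_level, ratio):
    """Predicted prot level in a new tissue, assuming the ratio is constant."""
    return mrna_level * ratio


# Hypothetical measurements for a single prot in three tissues
prot_levels = [200.0, 410.0, 95.0]
mrna_levels = [10.0, 20.0, 5.0]

ratio = estimate_ratio(prot_levels, mrna_levels)      # 20.0
predicted = predict_prot_level(8.0, ratio)            # 160.0
print(ratio, predicted)
```

The median is used here only so that one odd tissue does not skew the ratio; that choice is an assumption of this sketch.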
What can expression profiling be used for in the human proteome, and what did it reveal?
- can predict compositions of large complexes and reveals unexpected distributions
- immunoproteasome is unexpectedly widespread
How was an Ab and transcriptome analysis of the human proteome carried out?
- quite a diff strategy to the previous ones
- > 24,000 Abs used on samples from 44 tissues (--> 13 mil tissue based immunohistochemistry images) and an additional RNA transcript analysis carried out on 32 of these tissues
- these were analysed en masse
- produced large database of precise info about Abs (which are v specific to the targets they bind)
In RNA-seq what is the expression of a transcript relative to?
- the no. of cDNA fragments that originate from it
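One common way of turning fragment counts into an expression value is FPKM (fragments per kilobase of transcript per million mapped fragments). The deck does not say which normalisation was used, so the Python below is just a sketch of that one standard measure, w/ made-up counts.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical transcript: 500 fragments, 2,000 bp long, 10 million mapped fragments in total
example_fpkm = fpkm(500, 2000, 10_000_000)
print(example_fpkm)  # 25.0
```

Dividing by length and by library size means long transcripts and deeply sequenced samples do not look more highly exp just because they give more fragments.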
What was seen by looking at co-expression of particular sets of prots?
- testis, brain and liver tissue are notably enriched in gene products
- lung, pancreas and adipose tissue low
What were the findings of classifying and looking at prot evidence of human prot coding genes?
- for most tissues, only approx 10% of transcripts are encoded by tissue elevated genes, w/ the exception of pancreas and liver (70% and 35%, respectively)
- of the most abundant genes, the prediction of the localisation of the corresponding prots reveal that many (53%) are secreted prots
- more than 70% of the transcripts from the pancreas, approx 60% from the salivary gland and approx 40% of the transcripts in liver encode secreted prots
What is a problem and an adv w/ classifying human prot coding genes based on transcript expression levels in 32 tissues?
- problem: often need lots of downstream experiments to find out what's actually going on from the big picture and 10% weren’t detected at all so coverage is not perfect
- adv: generally unbiased
What tissues show the highest fraction of mt genes encoded, and why?
- cardiac and skeletal muscle - correlates w/ energy metabolism
What is the predicted subcellular localisation of FDA approved drugs?
- 59% of the targets are predicted membrane prots and 16% are secreted, inc those w/ both secreted and mem bound isoforms
How were genes corresponding to drug targets of FDA approved drugs classified, and what did this show?
- classified according to tissue specificity
- showing a bias for tissue elevated prots, although as many as 30% of the approved drugs target prots exp in all analysed tissues
What is the issue w/ drugs having ubiquitous exp?
- may have implications for treatments using these prots as drug targets
- because if targeting a specific receptor and it's present in many other tissues then it's not a good target due to off target effects
How many genes are implicated in the malignant transformation?
- 525 genes
How was the cancer proteome analysed and what was found?
- 60% of genes implicated in malignant exp are exp in all tissues, w/ only a fraction exp in a tissue or group-enriched manner
- lack of tissue specificity unsurprising due to involvement in normal growth regulation and cell cycle control
- but emphasised the poss adverse effects of treatment w/ drug targeting prots exp in all tissues
How does tissue vs cell line expression vary?
- many tissue enriched genes identified in normal tissues are down regulated or completely turned off in corresponding cell lines
- in contrast, the housekeeping prots are exp at the same level in both tissues and corresponding cell lines
- thus cell lines are ‘dedifferentiated’ w/ shared characteristics
What is the point of a subcellular map of the human proteome, rather than looking at tissues?
- can see where exp is w/in tissue
How was a subcellular map of the human proteome prod?
- subcellular locations of 12,003 prots determined by immunofluorescence microscopy using approx 14,000 Abs in cell lines of various origins
- enabled precise definition of location and revealed single cell variation in exp patterns
- unsure if genuine diff, as not uniform, but could be stochastic diff
- high resolution IF images mapped prots to distinct subcellular structures --> defined the proteomes of 13 major cellular organelles, revealing multi localising prots and the expression variability on a single cell level
Which organelles have the largest proteomes?
- the nucleus and its substructures, and the cytosol
What did smaller organelles show in terms of their proteome?
- such as the midbody and the nucleoli
- showed a larger diversity than previously recognised
What did the subcellular map of the human proteome show about where prots were localised?
- most prots didn’t occur in just 1 location
- 15% of the 12,003 detected prots showed a single cell variation, mainly in organelles that have cell cycle dep prots
- approx ½ of all prots were localised to multiple compartments, suggests a shared pool of prots even among functionally unrelated organelles
What is an advantage of ICPL labelling?
- samples joined together at early stage of sample prep
What is an eg. of using ICPL labelling?
- used for substrate identification of the ec protease ADAMTS1 using SDS-PAGE LC MS/MS
What is an eg. of using ICAT labelling?
- analysed tumour specific prots from aspirated fluid of breast tumour patients at early stages, suggesting pot apps to designate approp biomarkers for cancer diagnosis, eg. beta-globin was overexp
What is the only diff between iTRAQ and TMT labelling?
- the labelling specification and mol structure of label (equilibrium group)
What diff groups are iTRAQ and TMT labels comprised of?
- reporter group
- balancer group
- amine specific reactive group
How can absolute quantification be carried out for iTRAQ and TMT?
- by introducing known conc of a reference labelled peptide into sample and compare signals
- specific reference peptides available for particular types of proteomic experiment, eg. PTMs (as want them to be a good mimic for those likely to occur in sample)
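The comparison of signals can be written as a one line calculation. The Python sketch below assumes the reporter signal scales linearly w/ amount and that the reference peptide behaves like the sample peptide; the values are hypothetical.

```python
def absolute_amount(sample_signal, ref_signal, ref_amount):
    """Amount of sample peptide, scaled from a spiked reference of known amount."""
    return ref_amount * sample_signal / ref_signal


# Hypothetical: 50 fmol of reference peptide gives a reporter signal of 2000,
# and the sample peptide gives a reporter signal of 5000
sample_fmol = absolute_amount(sample_signal=5000.0, ref_signal=2000.0, ref_amount=50.0)
print(sample_fmol)  # 125.0
```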
What is SILAC a vital technique for?
- secreted pathways and secreted prots in cell culture
What is an eg. of using SILAC labelling?
- 'dynamic SILAC' to look at prot turnover rate and tissue regen
- in zebrafish found fin, intestine and liver have high regen capacity, whilst heart and brain have lowest
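A minimal sketch of how a turnover rate could be pulled out of a dynamic SILAC time point, in Python. It assumes simple first order replacement of old (light) prot by new (heavy) prot, w/ no label recycling and no dilution by cell growth, which real experiments do not fully satisfy. The intensities are hypothetical.

```python
import math


def turnover_rate(heavy, light, hours):
    """First order rate constant (per hour) from the fraction of newly made prot."""
    fraction_new = heavy / (heavy + light)
    return -math.log(1.0 - fraction_new) / hours


def half_life(rate):
    """Time (hours) for half of the old prot to be replaced."""
    return math.log(2.0) / rate


# Hypothetical prot: equal heavy and light signal 20 hours after the switch to heavy medium
rate = turnover_rate(heavy=1.0, light=1.0, hours=20.0)
print(round(half_life(rate), 2))
```

When heavy and light signals are equal, half of the prot has been replaced, so the half life equals the time since the medium switch.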
What is an eg. of how analytical microarrays have been used?
- to carry out high throughput analysis of cancer cells
What is an eg. of how functional microarrays have been used?
- 1st use was to analyse substrate specificity of prot kinases in yeast
What is an eg. of how reversed phase microarrays have been used?
- to determine altered/dysfunctional prots indicative of certain diseases
What does cell growth req in terms of prot levels?
- net increase in total prot and higher levels of translation than degradation
What is an eg. of how prot degrad can provide a flexible mech for rapid activation of gene exp in mammalian cells?
- p53 (tumour suppressor) continually synthesised, but also constantly rapidly degraded, so has low steady state levels in normal cell growth conditions
- upon oncogene activation degradation is prevented
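Why blocking degradation gives a fast rise can be seen w/ a toy steady state calculation in Python. This assumes constant synthesis and first order degradation (an assumption of the sketch, not something stated in the deck), and the rate values are invented.

```python
def steady_state_level(synthesis_rate, degradation_rate):
    """Steady state prot level when synthesis is constant and degradation is first order."""
    return synthesis_rate / degradation_rate


# Hypothetical rates: synthesis never changes, only degradation is slowed
normal_level = steady_state_level(synthesis_rate=100.0, degradation_rate=2.0)
stabilised_level = steady_state_level(synthesis_rate=100.0, degradation_rate=0.25)
print(normal_level, stabilised_level)  # 50.0 400.0
```

Because the prot is already being made continually, the cell does not have to wait for new transcription; slowing degradation alone raises the level.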
What diff human proteome studies were there?
1) draft map
2) MS based draft
3) tissue based map
4) subcellular map
What is a major limitation of SILAC?
- reqs adding isotope labels to living systems, so for higher order eukaryotes like humans need to be able to culture cell lines representative of the whole organism
Can ICPL and ICAT be used for absolute quantification?
- yes, w/ reference peptides