Week 3 Flashcards
(42 cards)
Sequencing genomes
resulted in a shift from studying one or a few genes to studying all genes simultaneously
proteome and transcriptome
Genome
all DNA and identification of all DNA elements (transcription units)
Transcriptome
all transcripts expressed (list plus analysis of expression)
Proteome
all proteins expressed (list plus analysis and modification)
Large scale ORF finder
Looking for open reading frames in bacterial genomes.
Simple for bacteria because the coding regions in their DNA are not interrupted by introns.
So you can go from DNA to the protein-coding capacity of that DNA very simply.
We can’t do the same for eukaryotic DNA.
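A minimal sketch of what an ORF finder does on uninterrupted (bacterial-style) DNA: scan all six reading frames for an ATG followed in frame by a stop codon. The function names and the `min_codons` threshold are illustrative assumptions, not a specific tool's API.

```python
# Sketch of a six-frame ORF scan; names and the min_codons cutoff are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))

def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf) for every ATG...stop run.

    Coordinates are 0-based, end-exclusive, and for the "-" strand they
    refer to positions on the reverse-complemented sequence.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

This works only because the coding region is continuous; with introns in the way, the same scan on eukaryotic genomic DNA would hit stop codons inside introns and miss the real protein.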
Eukaryotes and ORF finders
Not suitable for most eukaryotes; we can’t go from the eukaryotic genome to the eukaryotic proteome that simply
Splicing
We need the transcriptome to get from the genome to the proteome.
transcriptome is
all expressed RNA: mRNA, rRNA, tRNA, siRNA, miRNA, non-coding RNA, snRNA, crRNA, snoRNA
eukaryotic mRNA
extensively processed
5’ cap
AUG is the first codon of the ORF
Messenger RNAs are processed with the addition of a poly-A tail, which helps us annotate the proteome
reverse transcriptase
The DNA copy is made with reverse transcriptase, which requires a DNA primer. A common approach is to use an oligo-dT primer that hybridizes with the poly-A tail. Only polyadenylated RNA is copied, so the total transcriptome is not represented.
Before nanopore sequencing, only DNA could be sequenced, so RNA always had to be converted into a complementary DNA (cDNA) copy.
Post-transcriptional processing
a barrier to annotating the genome
a primary transcript is processed: splicing, poly-A tail, and cap
Therefore, any time we make a complementary DNA copy, we’re copying the mature mRNA after the intronic sequences have been removed.
A large amount of the genome is not expressed: intergenic regions, which are not transcribed, and intronic regions, which are transcribed but spliced out.
Alternative Splicing
Genes undergo alternative splicing. When you align different cDNA sequences to the genome, you find that for some genes the alignments differ considerably from one cDNA to another,
indicating that they came from transcripts that have undergone alternative splicing.
This gene produces six distinct messenger RNA transcripts
that encode three distinct polypeptides.
When you align these sequences to Drosophila DNA, you find six different alignment patterns due to six different splicing patterns of the mRNA transcripts.
Alternative splicing increases the number of proteins that can be encoded by a single gene.
Types of splicing
- Alternative poly-A sites
- Alternative promoters
- Exon included or excluded
- Mutually exclusive exon inclusion
- Alternative 5’ splice sites
- Alternative 3’ splice sites
- Retained intron: in some messages the intron remains in the mature mRNA; in other mature mRNAs the intron is removed.
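As a rough illustration of the "exon included or excluded" case, the sketch below enumerates the mature mRNAs one gene could yield when some exons are optional. The function name, the toy exon sequences, and the idea of marking optional exons by index are all assumptions made for this example.

```python
from itertools import product

def splice_isoforms(exons, cassette):
    """List mature mRNA sequences for every include/skip choice.

    exons    -- exon sequences in genomic order (introns already removed)
    cassette -- set of indices of exons that may be included or skipped
    """
    optional = sorted(cassette)
    isoforms = []
    # Try every combination of including (True) or skipping (False)
    # each optional exon; all other exons are always kept.
    for choice in product([True, False], repeat=len(optional)):
        included = dict(zip(optional, choice))
        mrna = "".join(e for i, e in enumerate(exons) if included.get(i, True))
        isoforms.append(mrna)
    return isoforms
```

With one optional exon the gene gives two transcripts, and with n independent optional exons up to 2^n, which is one way a single gene can encode several proteins.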
RNA-seq: two major goals
Count the relative number of transcripts in the sample.
Determine the structure of the transcripts in the sample.
Usually done by converting the RNA to complementary DNA (cDNA) and sequencing the cDNA.
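For the first goal, raw read counts are only meaningful relative to the total number of reads in the sample. One common normalization is counts per million (CPM); the sketch below assumes reads have already been assigned to genes, and the function name and input format are placeholders.

```python
def counts_per_million(counts):
    """Scale raw per-gene read counts to counts per million (CPM).

    counts -- dict mapping gene name to the number of reads assigned to it
    """
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}
```

CPM lets you compare the relative abundance of a transcript between samples sequenced to different depths; it does not correct for transcript length.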
How do we get distinct cell types?
differential gene expression
scRNA-seq goals
To determine the poly A+ transcriptome of individual cells
Useful in the study of development and human disease
scRNA-seq: how it works
1-In droplet single-cell sequencing, start with a suspension of cells, microparticles, and lysis buffer.
2-These are mixed in a microfluidics apparatus and encased in droplets using oil; each oil droplet contains a cell and a microparticle.
3-The lysis buffer in the droplet lyses the cell, releasing RNA and DNA.
4-The polyadenylated RNA hybridizes to a primer on the microparticle that contains oligo-dT.
5-Each barcoded primer bead carries a unique cell barcode sequence between the PCR handle and the oligo-dT tail.
6-The droplets are broken, and reverse transcription with template switching forms STAMPs (single-cell transcriptomes attached to microparticles).
7-STAMPs are amplified by PCR, so the cDNA on each microparticle, carrying that cell’s barcode, is amplified. The amplified fragments are then sequenced.
8-Generation of paired-end reads (Illumina).
One read goes through the cell barcode; the other read goes through the cDNA.
Because the reads are paired, the cell barcode in read one tells us which cell the cDNA came from.
9-Even though we are sequencing a complex PCR product from a multitude of different STAMPs, each microparticle had a distinct cell barcode that we can use to identify which cell the paired-end reads came from.
Organize the data and ask which transcripts are expressed in cell one.
Determine which genes are expressed in the cell and at what level by counting the number of observed transcripts.
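The last step, sorting reads back to their cell of origin and counting transcripts, can be sketched as below. This assumes read two has already been aligned and reduced to a gene name, and that the cell barcode occupies the first `barcode_len` bases of read one; the 12-base default, the function name, and the input format are assumptions for illustration.

```python
from collections import Counter, defaultdict

def count_transcripts_per_cell(read_pairs, barcode_len=12):
    """Group paired-end reads by cell barcode and count genes per cell.

    read_pairs  -- iterable of (read1_sequence, gene) tuples, where gene is
                   the gene that read two aligned to (alignment not shown)
    barcode_len -- assumed length of the cell barcode at the start of read one
    """
    counts = defaultdict(Counter)
    for read1, gene in read_pairs:
        barcode = read1[:barcode_len]  # read one identifies the cell
        counts[barcode][gene] += 1     # read two identifies the transcript
    return counts
```

The result is a cell-by-gene count table: one row per barcode, one count per gene, which is the starting point for asking which genes are expressed in each cell and at what level.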
Changing pattern of gene expression through development
When you start off as a single cell you have one transcriptome, but as the cells specialize during development you start to get expression of different patterns of genes in each cell.
Zebrafish development: single-cell RNA-seq on each cell during development, looking for changes in gene expression (sequencing the transcriptome).
Each point is a cell, change in gene expression of the cells as they differentiate.
Proteome
Catalogue of the proteins expressed by an organism?
What proteins are unique to an organism or shared?
What is the function of the protein?
Information encoded in the genome.
All of the proteins encoded within a genome.
Function can be determined by taking advantage of the relatedness of all organisms.
What proteins are unique to an organism or shared?
all life is related; genes are shared
homologous genes can fall into two categories:
- orthologs
- paralogs
orthologs/paralogs
Orthologs: homologous genes in different species, derived from the same gene in a common ancestor.
Paralogs: homologous genes in the same species, the result of a gene duplication.
What is the function of the protein?
If an orthologous protein is well characterized in one organism, then it may be reasonable to propose that all the orthologous proteins share its function.
Complex proteins often contain conserved protein domains of known function like DNA binding for example. Therefore, conserved domains can suggest the biochemical function of the protein in the proteome.
Interactome
Proteins interact with one another either in stable complexes, like RNA polymerase, or via transient interactions, like initiation factors for translation.
An interactome is the full set of these protein-protein interactions, obtained from a systematic analysis of the proteome.
Systematic analysis of interactomes
1-Yeast two-hybrid screen
2-Affinity purification and mass spectrometry
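Either method ultimately yields a list of interacting protein pairs, and the interactome can be represented as a network built from that list. A minimal sketch, with a hypothetical function name and made-up protein labels:

```python
from collections import defaultdict

def build_interactome(pairs):
    """Build an undirected interaction network from protein pairs.

    pairs -- iterable of (protein_a, protein_b) tuples, e.g. bait and prey
             from a two-hybrid screen (labels here are placeholders)
    """
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)  # interactions are symmetric,
        network[b].add(a)  # so record both directions
    return network
```

Looking up a protein in the network gives its interaction partners, and proteins with many partners stand out as likely members of larger complexes.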