Flashcards in Lecture 22 - Genomics and Bacterial Evolution Deck (50):
Work out how a protein works from the genetic code, and experimental data.
Study of a single genome
Study of several genomes
What did Fred Sanger initially sequence?
Size of PhiX
Sanger sequencing method
1) Break sequence of interest into fragments
2) Place in a test tube with dideoxynucleotides (ddNTPs), each labelled with its own dye. ddNTPs terminate chain elongation
3) Run fragments on a polyacrylamide gel, which can resolve to the individual base pair level
4) The dye colour of nucleotides is read
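The four steps above can be sketched as a toy simulation. This is an idealised illustration, not a model of a real sequencer: it assumes exactly one terminated fragment exists for every position, and the dye colours and function names are made up for the example.

```python
import random

# Hypothetical dye colours for the four ddNTPs (illustrative only).
DYES = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOUR_TO_BASE = {colour: base for base, colour in DYES.items()}


def terminated_fragments(strand):
    """One fragment per stop position: (fragment length, dye of the terminal ddNTP)."""
    return [(i + 1, DYES[base]) for i, base in enumerate(strand)]


def read_gel(fragments):
    """Sort fragments by length, as the gel does, then read the dye colours in order."""
    return "".join(COLOUR_TO_BASE[colour] for _, colour in sorted(fragments))


fragments = terminated_fragments("GATTACA")
random.shuffle(fragments)   # fragments are mixed together in the tube
print(read_gel(fragments))  # the gel restores the order by size
```

Because each fragment length occurs once, sorting by length alone is enough to recover the order of the bases.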
Read size of Sanger sequencing
The length of a single piece of DNA that can be sequenced by a particular method
Reads are placed together according to where their sequences overlap.
This forms a contig, which is a contiguous sequence built from overlapping reads
Where read sequences overlap, they are merged into a single consensus sequence
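A minimal sketch of how overlapping reads could be joined into a contig. It uses a simple greedy merge on exact suffix/prefix overlaps, which is an assumption made for illustration; real assemblers tolerate sequencing errors and use more sophisticated graph methods. The names `overlap`, `assemble` and `min_len` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(reads, min_len=3):
    """Greedily merge the pair of sequences with the largest overlap until none remain."""
    contigs = list(reads)
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # no overlaps left: each remaining item is a separate contig
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


reads = ["ATGGCGT", "GCGTACC", "ACCTTGA"]
print(assemble(reads))
```

If a read shares no sufficiently long overlap with any other, it stays as its own contig, so the function returns more than one sequence.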
When a computer can't find a match in reads to make a contig
Why can gaps occur?
1) DNA polymerase can't extend sequence for some reason
2) If there is a repeated region, and the read size is smaller than the size of the repeat.
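The repeat problem in point 2 can be illustrated with a toy example. Two made-up sequences differ only in how many copies of a repeat they carry. The sketch treats reads as the set of all substrings of a fixed length, which is a simplification: it ignores read depth and paired-end information that real pipelines also use.

```python
def all_reads(sequence, read_length):
    """Every distinct substring of the given length (an idealised, error-free read set)."""
    return {sequence[i:i + read_length]
            for i in range(len(sequence) - read_length + 1)}


# Hypothetical sequences: the same flanks around 3 or 4 copies of the repeat "TTAGG".
three_copies = "ACGT" + "TTAGG" * 3 + "CCAT"
four_copies = "ACGT" + "TTAGG" * 4 + "CCAT"

# Short reads: both sequences give exactly the same reads, so the copy number is lost.
print(all_reads(three_copies, 6) == all_reads(four_copies, 6))

# Reads long enough to span the whole repeat region plus flanking sequence differ.
print(all_reads(three_copies, 20) == all_reads(four_copies, 20))
```

With 6-base reads the two read sets are identical, so an assembler cannot tell the sequences apart; with 20-base reads they differ.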
Illumina sequencing method
1) Break DNA of interest into fragments
2) Adaptors of known sequence are ligated to the ends of the dsDNA
3) A glass slide is prepared, with sequences complementary to the adaptors adhered to the surface
4) Hybridisation of the adaptors to the adhered complementary sequences
5) Add unlabelled nucleotides and DNA polymerase. Bridge amplification
6) DNA synthesis, bridges become double stranded
7) Denaturation, to ssDNA
8) PCR to make high-density DNA clusters
9) Bases tagged with fluorescent dyes added. When a base is added, emits fluorescence which is detected.
Key difference between Sanger and Illumina
Illumina sequencing can continue on the same strand after a dye-tagged base is added.
The fluorescent part is cleaved off once the incorporated base has been read, so it doesn't interfere with further elongation
MiSeq output per run
NextSeq500 output per run
HiSeq2500 output per run
MiSeq read number
NextSeq500 read number
HiSeq2500 read number
MiSeq read length
NextSeq500 read length
HiSeq2500 read length
MiSeq time for run
Most inexpensive sequencing method
PacificBio RS output per run
PacificBio RS read number
PacificBio RS read length
What is PacificBio RS?
1) Single molecule, real time sequencing
2) DNA synthesis by immobilised DNA polymerase
3) Phospholinked nucleotides release light when incorporated
4) No amplification
5) Under 180 minutes per run
PacificBio RS method
1) Don't fragment DNA of interest too much (reduces read length)
2) Repair ends
3) Adaptor ligation to DNA ends
4) DNA is polymerised by DNA polymerase fixed in a zero-mode waveguide (ZMW) well
5) When a phospholinked nucleotide is incorporated, light is emitted and detected. Each base has a different dye, and emits a different wavelength of light
Size of wells used in PacificBio RS
Why is it better to not have an amplification stage in sequencing?
Not all DNA is amplified at equal levels.
This can affect results
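A small worked example of why unequal amplification matters. The per-cycle efficiencies and the cycle count below are hypothetical values chosen for illustration, and the model assumes simple exponential growth with a fixed efficiency per fragment.

```python
def copies_after(start_copies, efficiency, cycles):
    """Copies after PCR, if each cycle multiplies the count by (1 + efficiency)."""
    return start_copies * (1 + efficiency) ** cycles


# Two fragments start at equal abundance but amplify with different efficiency.
fragment_a = copies_after(100, efficiency=1.0, cycles=30)  # perfect doubling
fragment_b = copies_after(100, efficiency=0.9, cycles=30)  # slightly less efficient

print(f"A is {fragment_a / fragment_b:.1f}x more abundant than B after amplification")
```

Even a small difference per cycle compounds into a several-fold difference in the final pool, although both fragments were equally common in the original sample.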
What are long read lengths useful for?
For complex sequences of DNA, such as repeat regions.
Sanger output per run
Read number of Sanger
Sanger run time
PacificBio RS run time
30 minutes - 3 hours
Sanger cost per Mb
Illumina cost per Mb
PacificBio RS cost per Mb
A process which locates genes in a genome map
How to annotate a genome
1) Identify open reading frames
2) Experimentally identify gene function, or compare to other genes
Open reading frame
Over 100 codons that are uninterrupted by a stop codon.
See if there is an obvious ribosome binding site at the 5' end and a terminator sequence at the 3' end
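A minimal sketch of an open reading frame scan using the "over 100 codons" rule above. Two assumptions are added here that the cards do not state: an ORF is taken to begin at an ATG start codon, and only the three forward-strand frames are scanned. The function name and return format are hypothetical, and the ribosome binding site and terminator checks are not included.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=100):
    """Return (start, end, frame) for forward-strand ORFs longer than min_codons."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # assumed start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 > min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next candidate in this frame
    return orfs


# Made-up sequence: ATG followed by 120 alanine codons and a stop codon.
example = "CC" + "ATG" + "GCT" * 120 + "TAA" + "GG"
print(find_orfs(example))
```

Lowering `min_codons` makes the scan more sensitive but also picks up more short stretches that are open by chance.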
1) Analysis of a genome using computers
2) Generates information of genome structure, content, arrangement
3) Uses annotation to determine location of genes on newly-sequenced genome
Significance of an open reading frame
Presumed to encode a protein
Basic Local Alignment Search Tool (BLAST)
A tool used in bioinformatics
Compares primary sequence information from different genomes
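To give a feel for what a local alignment is, here is a sketch that scores the best local match between two sequences. This is not how BLAST itself works: BLAST uses a faster heuristic search, whereas this is a plain dynamic-programming (Smith-Waterman style) score. The scoring values are arbitrary choices for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (Smith-Waterman style)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A score never drops below 0, so an alignment can start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two made-up sequences sharing the local region "ACGT".
print(local_alignment_score("TTACGTGG", "CCACGTAA"))
```

A higher score means a longer or cleaner shared region, which is the kind of signal used when comparing a new gene against known sequences.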