Genomes and Genome Sequencing Flashcards
(95 cards)
Applications of studying genomics
Research
Health (e.g. diagnostic)
Environment (e.g. pollutants)
Agriculture (e.g. livestock, nutrients)
Health example for genomics
Causes of severe intellectual disability in children (42% of cases linked to DNA compared to 12% using other methods)
Disease example for genomics
Inflammatory Bowel Disease (Crohn’s disease)
more viral DNA = more viruses
viruses were bacteriophages
they infected gut bacteria and affected gut bacteria population -> Crohn’s disease
Disease Outbreak Tracking for genomics (only need one)
Ebola - finding point of origin, watching it change over time
HIV - identified known origin, identified species crossovers
Influenza - track current outbreaks to inform vaccine choices for the coming winter in the opposite hemisphere; identify crossover (or crossover potential) of strains
The third generation of DNA sequencing
Longer DNA sequences
Sanger Sequencing
Chain termination sequencing
Uses ddNTPs (fluorescently labelled, chain-terminating nucleotides)
How does Sanger Sequencing work
Polymerase rebuilds the double helix using normal nucleotides until it randomly incorporates a fluorescently labelled ddNTP; the polymerase then stops and the strand is terminated at that point
-> strands of DNA of varying lengths, each ending with a fluorescently-labelled base
(repeated as many times as required so that every position in the sequence gets terminated)
Then run the fragments through capillary gel electrophoresis (separates them by length)
Record fluorescence
Each base is a diff. colour
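The chain-termination idea can be sketched as a toy model. This is only an illustration: it assumes one terminated fragment per position, treats "colour" as the base letter, and ignores strand direction. The function name `sanger_read` is made up for this example.

```python
def sanger_read(template):
    """Toy model of chain-termination sequencing on a single-stranded template."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    # The strand the polymerase would synthesise (complement of the template).
    new_strand = "".join(complement[base] for base in template)
    # One terminated fragment per position: each ends with a labelled ddNTP.
    fragments = [new_strand[:i] for i in range(1, len(new_strand) + 1)]
    # Electrophoresis separates the fragments by length, shortest first.
    fragments.sort(key=len)
    # The detector records the label (here: the letter) of each final base.
    return "".join(fragment[-1] for fragment in fragments)


print(sanger_read("TACGGA"))  # ATGCCT
```

Reading the last base of each fragment in order of length gives back the synthesised sequence.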
Downsides of Sanger Sequencing
Slow
Expensive
Not high throughput
Errors in repetitive regions (lots of bases similar to each other, next to each other)
Bias in sequencing (certain regions better amplified than others)
Library Preparation
Extract DNA from cells
Fragment DNA (50-1000bp)
Add adaptors (to either end of each fragment) - one anchors the fragment, the other is the start point for the seq. reaction
Amplification
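A minimal sketch of the fragment-and-add-adaptors steps, assuming random fragment sizes in the 50-1000bp range given above. The adaptor strings `ADAPTOR_A-` and `-ADAPTOR_B` are placeholders, not real adaptor sequences, and extraction and amplification are left out.

```python
import random


def prepare_library(dna, min_len=50, max_len=1000,
                    left="ADAPTOR_A-", right="-ADAPTOR_B", seed=0):
    """Cut dna into random-sized fragments and put an adaptor on each end."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    fragments = []
    pos = 0
    while pos < len(dna):
        size = rng.randint(min_len, max_len)
        # The final fragment may be shorter than min_len.
        fragments.append(dna[pos:pos + size])
        pos += size
    # Placeholder adaptors (hypothetical labels) on both ends of every fragment.
    return [left + fragment + right for fragment in fragments]


library = prepare_library("ACGT" * 500)
print(len(library), "fragments")
```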
Issues with Library Preparation
Bias in amplification
How does Illumina Sequencing work?
Fragments added to the flow cell - bind to flow cell (adapter-flow cell)
Polymerase starts at the top (furthest from the flow cell) and adds fluorescently labelled nucleotides (randomly, one at a time)
+laser excitation, fluorescence recorded
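The cycle-by-cycle reading can be sketched as below. This is a simplified illustration, not the real chemistry: it assumes one base is added to every fragment per cycle, ignores strand direction, and uses arbitrary colour names.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Arbitrary illustrative colours, one per base.
COLOUR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}


def sequence_by_cycles(templates, cycles):
    """Each cycle adds one labelled base per fragment and records its colour."""
    reads = ["" for _ in templates]
    images = []  # one list of colours per cycle
    for cycle in range(cycles):
        image = []
        for i, template in enumerate(templates):
            if cycle < len(template):
                base = COMPLEMENT[template[cycle]]
                reads[i] += base
                image.append(COLOUR[base])
            else:
                image.append(None)  # fragment already fully read
        images.append(image)
    return reads, images


reads, images = sequence_by_cycles(["ACGT", "TTGA"], cycles=3)
print(reads)  # ['TGC', 'AAC']
```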
Benefits of Illumina Sequencing
Fast
Cheap
High throughput
Issues with Illumina Sequencing
Repetitive regions
Amplification
Length restrictions
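Why short reads struggle with repetitive regions can be shown with a small example. The reference and reads here are made up: a short read taken from inside a repeat matches in several places, while a longer read spanning the repeat matches only once.

```python
def find_positions(reference, read):
    """Return every start position where read matches reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)  # allow overlapping matches
    return positions


reference = "GGT" + "CACACACACA" + "TTG"  # unique flanks around a CA repeat
short_read = "CACA"                        # lies entirely inside the repeat
long_read = "GTCACACACACATT"               # spans the repeat plus both flanks

print(find_positions(reference, short_read))  # [3, 5, 7, 9]
print(find_positions(reference, long_read))   # [1]
```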
Third generation sequencing
remove the length restriction
remove the need for amplification
PacBio SMRT
uses Single Molecule, Real-time Technology
Zero-mode waveguides
One piece of DNA per well
Polymerase in the well adds fluorescently labelled nucleotides, as in Illumina, but to a single piece of DNA
PacBio Considerations
Higher error rates
No need for amplification
Longer, but not genome-length
Oxford Nanopore MinION
Very small
Membrane with many pores
Feeds a single strand of DNA through a pore; changes in electrical current across the membrane indicate the base, and this is read
Oxford Nanopore MinION Considerations
Does not use fluorescently-labelled nucleotides
Not as accurate as Illumina (99.9%), but close (95%)
Long read (up to 2 million bp)
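What the accuracy figures mean per read can be worked out roughly as read length x (1 - accuracy). The read lengths below (150 bp and 10,000 bp) are assumed for illustration only; the accuracies are the ones on the card.

```python
def expected_errors(read_length, accuracy):
    """Expected number of wrong bases in a read, given per-base accuracy."""
    return read_length * (1 - accuracy)


# Assumed read lengths, for illustration only.
short_read_errors = expected_errors(150, 0.999)    # about 0.15 errors per read
long_read_errors = expected_errors(10_000, 0.95)   # about 500 errors per read

print(round(short_read_errors, 2), round(long_read_errors))
```

Longer reads at lower accuracy carry many more errors per read, which is the trade-off against their length.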
What is the PromethION?
Up to 48 flow cells run in parallel (like 48 MinIONs)
large amounts of sequencing
Single-Cell Sequencing
uses Illumina
BUT with diff. lib preparation - single-cell
Each cell in a 'GEM' (gel bead in emulsion) - when the gel bead is broken open, all contents are labelled with the barcode for that indv. GEM
Can say where DNA comes from -> cell types/spatial transcriptomics
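How barcodes let reads be traced back to a cell can be sketched as follows. It assumes the barcode sits at the start of each read with a fixed length. The reads and the 4-base barcodes are invented for the example.

```python
from collections import defaultdict


def group_by_barcode(reads, barcode_length):
    """Split each read into (barcode, insert) and collect inserts per barcode."""
    cells = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]
        cells[barcode].append(insert)
    return dict(cells)


# Invented reads: first 4 bases are the barcode, the rest is the sequence.
reads = ["AAACGGTT", "TTTGCCAA", "AAACTTGG"]
print(group_by_barcode(reads, barcode_length=4))
# {'AAAC': ['GGTT', 'TTGG'], 'TTTG': ['CCAA']}
```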
Challenges to genome projects
Sequencing technologies not perfect (e.g. Illumina 99.9% not 100%)
Some DNA harder to seq. than others (e.g. centromere/telomere) - secondary structures
Population representation (variation)
Gaps, errors, lack of variation
Accuracy of assembly
Genomes keep being corrected (diff. versions from same individual)
Alignments
Reference genome available
Compare and align
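"Compare and align" can be sketched as sliding the read along the reference and keeping the position with the fewest mismatches. This is a naive illustration (no gaps, no scoring scheme), not how real aligners work. The sequences are made up.

```python
def align_read(reference, read):
    """Return (position, mismatches) of the best ungapped placement of read."""
    best_pos, best_mismatches = None, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        # Count positions where the read disagrees with the reference.
        mismatches = sum(a != b for a, b in zip(window, read))
        if mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos, best_mismatches


print(align_read("ACGTTGCAAT", "TTGGA"))  # (3, 1)
```

The single mismatch could be a sequencing error or real variation; the alignment alone cannot tell which.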
Assembly
Does not have an available reference genome
Assemble reads into a reference genome
Is a BEST REPRESENTATION not exact
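Assembly without a reference can be sketched as repeatedly merging the two reads with the longest overlap. This is a toy greedy approach on invented reads, assuming error-free reads from one strand; real assemblers are far more involved.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is a prefix of b, or 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlaps remain."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing overlaps: the pieces stay separate (gaps)
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


print(greedy_assemble(["ACGTAC", "TACGGA", "GGATTC"]))  # ['ACGTACGGATTC']
```

With repeats or errors in the reads, greedy merging can pick the wrong overlap, which is one reason an assembly is a best representation rather than exact.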