DNA sequencing Flashcards
(46 cards)
What is DNA sequencing?
DNA sequencing is the process of determining the order of nucleotides (A, T, C, G) in a DNA molecule, which is critical for understanding gene function, mutations, and evolutionary relationships.
Why is DNA sequencing important?
It helps study genes, identify mutations causing diseases, diagnose infections, understand evolutionary relationships, and trace ancestry.
What are the three generations of DNA sequencing?
First generation: Sanger sequencing (up to 1000 bases)
Second generation: Next-generation sequencing (NGS, short reads, millions of fragments at once)
Third generation: Long-read sequencing (reads over 10,000 bases).
What is Sanger sequencing and how does it work?
Sanger sequencing uses chain-terminating nucleotides (ddNTPs) to stop DNA synthesis at random positions, creating fragments of every possible length that are separated by capillary electrophoresis and read from the fluorescent label on each terminal nucleotide.
What are the key steps in Sanger sequencing?
Denature DNA into single strands
Add primer and polymerase for DNA extension
Incorporate chain-terminating nucleotides (ddNTPs) randomly
Separate fragments by size and visualize with fluorescence.
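The steps above can be sketched as a toy simulation. This is a minimal illustration, not a model of real chemistry: `sanger_fragments`, `read_sequence`, and the `dd_fraction` parameter are hypothetical names, and strand directionality is ignored for simplicity.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def sanger_fragments(template, n_molecules=2000, dd_fraction=0.1, seed=0):
    """Simulate extension products as (length, terminal_base) pairs.

    At each template position the polymerase incorporates a
    chain-terminating ddNTP with probability dd_fraction (an assumed value).
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_molecules):
        for position, base in enumerate(template, start=1):
            if rng.random() < dd_fraction:
                # Synthesis stops here; the labelled base is the complement.
                fragments.append((position, COMPLEMENT[base]))
                break
    return fragments

def read_sequence(fragments):
    """Sort fragments by size and read the labelled terminal base of each size."""
    base_at_length = {}
    for length, base in fragments:
        base_at_length[length] = base
    return "".join(base_at_length[length] for length in sorted(base_at_length))

template = "TACGGATTCA"
synthesized = read_sequence(sanger_fragments(template))
print(synthesized)  # the strand complementary to the template
```

Because termination is random, many molecules are needed so that every fragment length is represented at least once.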
What are the pros and cons of Sanger sequencing?
Pros: Simple data analysis, high accuracy, longer reads than short-read NGS (up to 1000 bases), affordable for small sample sizes.
Cons: Low throughput, impractical for large numbers of samples, requires significant DNA input.
What is next-generation sequencing (NGS)?
NGS sequences millions to billions of DNA fragments in parallel, generating short reads (typically ~50–300 bases) that are then assembled into longer sequences.
What is the process for preparing samples for NGS?
DNA is fragmented into short pieces (about 1 kb or less), then adapters are ligated to both ends of each fragment so it can be amplified and sequenced on platforms such as Illumina.
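A rough sketch of library preparation, assuming a toy genome and tiny fragment sizes so the output stays readable. The adapter strings and the names `fragment_dna` and `add_adapters` are hypothetical placeholders, not real Illumina sequences or APIs.

```python
import random

LEFT_ADAPTER = "AAAAACCCCC"   # placeholder, not a real adapter sequence
RIGHT_ADAPTER = "GGGGGTTTTT"  # placeholder, not a real adapter sequence

def fragment_dna(genome, mean_size, rng):
    """Cut the genome into consecutive pieces of roughly mean_size bases."""
    fragments = []
    start = 0
    while start < len(genome):
        size = max(1, int(rng.gauss(mean_size, mean_size * 0.1)))
        fragments.append(genome[start:start + size])
        start += size
    return fragments

def add_adapters(fragment, left=LEFT_ADAPTER, right=RIGHT_ADAPTER):
    """Attach an adapter to each end of a fragment."""
    return left + fragment + right

rng = random.Random(1)
genome = "".join(rng.choice("ACGT") for _ in range(200))
fragments = fragment_dna(genome, mean_size=20, rng=rng)
library = [add_adapters(f) for f in fragments]
print(len(library), "adapter-flanked fragments")
```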
What are the advantages of NGS?
It can sequence large numbers of DNA fragments quickly and is ideal for samples with low DNA quantity or for sequencing entire genomes.
How does third-generation sequencing (long-read) differ from previous methods?
Third-generation sequencing generates long reads (>10,000 bases), providing more accurate assembly of repetitive regions and larger genomes.
What is the purpose of adaptor sequences in NGS?
To anchor DNA fragments to a flow cell and allow primers to bind for sequencing.
What is “bridge amplification”?
PCR-based amplification of immobilised DNA fragments to form clusters.
Why are DNA fragments amplified into clusters in Illumina?
To generate strong fluorescent signals for detection.
What is “sequencing by synthesis”?
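DNA polymerase adds fluorescently labelled, reversibly terminating nucleotides one at a time; after each base is imaged, the label and the terminating block are removed so the next base can be added.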
What are paired-end vs. single-end reads?
Paired-end sequencing reads each DNA fragment from both ends; single-end sequencing reads it from one end only.
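As a small illustration of the difference, the sketch below takes one fragment and produces a single-end read and a paired-end read pair. The function names are hypothetical, and the second read is modelled as the start of the reverse-complement strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def single_end_read(fragment, read_length):
    """Read only the first read_length bases from one end."""
    return fragment[:read_length]

def paired_end_reads(fragment, read_length):
    """Read read_length bases inward from each end of the fragment."""
    forward = fragment[:read_length]
    reverse = reverse_complement(fragment)[:read_length]
    return forward, reverse

fragment = "ATGCGTACCTGAAGTC"
read_1, read_2 = paired_end_reads(fragment, 5)
print(single_end_read(fragment, 5))  # ATGCG
print(read_1, read_2)                # ATGCG GACTT
```

The unread middle of the fragment is what gives paired reads their known approximate spacing, which helps when placing reads in repetitive regions.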
What happens during preprocessing of reads?
Adaptors are removed, low-quality ends are trimmed, and poor reads are filtered.
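The three preprocessing steps can be sketched as follows. This is a simplified illustration: the adapter string, the quality threshold, the minimum length, and all function names are assumptions, and per-base quality is given as a plain list of integer scores.

```python
def remove_adapter(seq, quals, adapter):
    """Cut the read at the first occurrence of the adapter, if present."""
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx], quals[:idx]
    return seq, quals

def trim_low_quality_ends(seq, quals, min_q):
    """Trim bases from both ends while their quality is below min_q."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]

def preprocess(reads, adapter, min_q=20, min_length=5):
    """Apply adapter removal, end trimming, then discard reads that are too short."""
    kept = []
    for seq, quals in reads:
        seq, quals = remove_adapter(seq, quals, adapter)
        seq, quals = trim_low_quality_ends(seq, quals, min_q)
        if len(seq) >= min_length:
            kept.append((seq, quals))
    return kept

ADAPTER = "AGATCGG"  # illustrative placeholder adapter
reads = [
    ("ACGTACGTAGATCGG", [30] * 15),               # contains the adapter
    ("ACGTACGTTT", [30] * 7 + [10, 8, 5]),        # low-quality tail
    ("ACGTAC", [5, 5, 30, 30, 5, 5]),             # too short after trimming
]
cleaned = preprocess(reads, ADAPTER)
print([seq for seq, _ in cleaned])  # ['ACGTACGT', 'ACGTACG']
```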
What is de novo assembly?
Building a genome from overlapping short reads without a reference genome.
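One simple way to illustrate the idea is a greedy overlap merge: repeatedly join the two reads with the longest suffix-prefix overlap. This is a toy sketch under strong assumptions (error-free reads, a single strand, no repeats); production assemblers use more robust graph-based methods. The function names are hypothetical.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0

def greedy_assemble(reads, min_length=3):
    """Merge the best-overlapping pair until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    olen = overlap(a, b, min_length)
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        if best_len == 0:
            break  # remaining contigs do not overlap
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs

reads = ["CTGAAGTC", "ATGCGTAC", "GTACCTGA"]
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGCGTACCTGAAGTC']
```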
What is reference mapping?
Aligning reads to a known genome to reconstruct a sample’s sequence.
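A minimal sketch of the idea, assuming ungapped alignment by brute-force search with a small mismatch allowance. Real aligners use indexed references and handle insertions and deletions; `best_alignment`, `consensus_from_reads`, and `max_mismatches` are hypothetical names.

```python
from collections import Counter

def best_alignment(read, reference, max_mismatches=1):
    """Return the reference position with the fewest mismatches, or None."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for r, w in zip(read, window) if r != w)
        if mismatches < best_mm:
            best_pos, best_mm = pos, mismatches
    return best_pos

def consensus_from_reads(reads, reference, max_mismatches=1):
    """Pile up aligned reads and call the most common base per position."""
    columns = [Counter() for _ in reference]
    for read in reads:
        pos = best_alignment(read, reference, max_mismatches)
        if pos is None:
            continue  # read could not be placed
        for offset, base in enumerate(read):
            columns[pos + offset][base] += 1
    # Positions with no coverage fall back to the reference base.
    return "".join(
        col.most_common(1)[0][0] if col else ref_base
        for col, ref_base in zip(columns, reference)
    )

reference = "ATGCGTACCTGAAGTC"
sample_reads = ["ATGCGTGC", "GTGCCTGA", "CTGAAGTC"]  # sample carries A>G at index 6
sample = consensus_from_reads(sample_reads, reference)
variants = [(i, r, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]
print(sample)    # ATGCGTGCCTGAAGTC
print(variants)  # [(6, 'A', 'G')]
```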
Key Illumina pros?
High accuracy (roughly 0.1% error per base), cost-effective per base sequenced.
Key Illumina cons?
Short reads (~200–300 bp), poor resolution in repetitive regions, large data storage needs.
Difference between genome, exome, and targeted panel sequencing?
Whole genome: all DNA, coding and non-coding; exome: only the protein-coding regions (exons) of genes; targeted panel: a selected set of specific known genes or regions.
Key feature of long-read sequencing?
Sequences very long fragments (10 kb to >1 Mb), often single molecules.
What is SMRT sequencing?
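Single-molecule real-time (SMRT) sequencing: a polymerase fixed at the bottom of a zero-mode waveguide copies a circular DNA template, and each fluorescently labelled nucleotide is detected in real time as it is incorporated.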
Benefit of circular DNA templates?
Enables multiple passes of the same DNA for error correction.
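The error-correction idea can be illustrated with a per-position majority vote across passes. This sketch assumes the passes are already aligned, equal in length, and contain only substitution errors, which is a simplification; `circular_consensus` is a hypothetical name.

```python
from collections import Counter

def circular_consensus(passes):
    """Call the most common base at each position across all passes."""
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*passes)
    )

# Five noisy passes over the same molecule, each with at most one random error.
passes = ["ATGCGTAC", "ATGAGTAC", "ATGCGTTC", "ATGCGTAC", "TTGCGTAC"]
consensus = circular_consensus(passes)
print(consensus)  # ATGCGTAC
```

Because the errors in each pass fall at different positions, they are outvoted by the correct base from the other passes.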