Lecture 10: Genome sequencing technology Flashcards
(37 cards)
What? First generation sequencing:
one sequence at a time
- eg. Sanger sequencing
WHAT? Second generation sequencing:
massively parallel sequencing of fragments of different sequences
eg. Illumina sequencing
WHAT? Third generation sequencing:
long read massively parallel sequencing
eg. Pacific Biosciences and Nanopore
The cost of sequencing has fallen 10,000x in the past decade - MOORE’S LAW
Moore’s Law: the number of transistors on a chip doubles every two years while the costs are halved. …
SEQUENCE OF TECHNOLOGY DISCOVERY
2005:
- Automation of first generation sequencing,
‘Next generation sequencing’ and Pacific Biosciences
2007:
Illumina
2008:
SOLiD/454
2010:
Ion Torrent
2015:
Nanopore
2022:
Ultima ($100 genome)
Cost per Genome from $100M to $100 (Moore’s Law)
slide 3
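A rough arithmetic check of the figures on this card, assuming cost halves exactly once every two years as in the Moore’s Law definition above. The function names are illustrative only.

```python
import math

def halvings_needed(start_cost: float, end_cost: float) -> float:
    """Number of times the cost must halve to go from start_cost to end_cost."""
    return math.log2(start_cost / end_cost)

def years_at_moores_pace(start_cost: float, end_cost: float,
                         years_per_halving: float = 2.0) -> float:
    """Years required if the cost halves once every `years_per_halving` years."""
    return halvings_needed(start_cost, end_cost) * years_per_halving

fold = 100e6 / 100  # $100M per genome down to $100 per genome
print(f"fold reduction: {fold:,.0f}x")
print(f"halvings needed: {halvings_needed(100e6, 100):.1f}")
print(f"years at one halving per two years: {years_at_moores_pace(100e6, 100):.0f}")
```

The timeline on this card spans 2005 to 2022, so this comparison suggests sequencing cost fell faster than a strict halving-every-two-years pace would give.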
First generation sequencing: the Sanger method (1970’s on)
- what is required? What occurs?
- Based on ACTION of DNA POLYMERASE
- Requires TEMPLATE DNA
- DNA PRIMER
- POLYMERASE
- NUCLEOTIDES
- SMALL AMOUNT OF NUCLEOTIDE ANALOG included.
– The INCORPORATION OF THE ANALOGUE TERMINATES SYNTHESIS
Historical note: the first human genome was sequenced with Sanger at great cost!
SANGER SEQUENCING REACTION:
What is it? What is the process? What is needed? = 5
1. Chain-termination method
2. Uses ‘dideoxy nucleotides’
3. When ADDED IN THE RIGHT AMOUNT, the CHAIN IS TERMINATED EVERY TIME THAT BASE APPEARS IN THE TEMPLATE
4. Need a reaction for each base: A, T, C, and G
5. EXAMPLE OF FIRST GEN. SEQUENCING
Sanger Sequencing reaction: EXAMPLE
deoxyribose - OH at the 3’ position
dideoxyribose - H at the 3’ position
- without the 3’-OH it cannot form a bond with the next base
Template
3’ ATCGGTGCATAGCTTGT 5’
Sequence reaction products
5’ TAGCCACGTATCGAACA* 3’
5’ TAGCCACGTATCGAA* 3’
5’ TAGCCACGTATCGA* 3’
5’ TAGCCACGTA* 3’
5’ TAGCCA* 3’
5’ TA* 3’
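The products above can be reproduced with a minimal sketch, assuming the reaction contains dideoxy-A and that termination can happen at every A in the new strand. The primer is not modelled, and the function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesized_strand(template_3to5: str) -> str:
    """Return the new strand (5'->3') made by copying a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)

def terminated_products(template_3to5: str, dd_base: str) -> list[str]:
    """All fragments that end where the dideoxy version of `dd_base` was incorporated."""
    strand = synthesized_strand(template_3to5)
    return [strand[: i + 1] for i, base in enumerate(strand) if base == dd_base]

template = "ATCGGTGCATAGCTTGT"  # written 3' -> 5', as on the card
for product in reversed(terminated_products(template, "A")):
    print(f"5' {product}* 3'")  # * marks the terminating dideoxy-A
```

Running the same function with "C", "G" and "T" gives the products of the other three reactions.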
SEQUENCE SEPARATION:
- Gel electrophoresis = 7
1. TERMINATED chains need to be SEPARATED
2. Requires ONE-BASE-PAIR RESOLUTION
3. See difference between chain of ‘X and X+1 base pairs’
4. Gel electrophoresis
5. Very THIN GEL
6. HIGH VOLTAGE
7. WORKS WITH RADIOACTIVE OR FLUORESCENT LABELS
figure on slide 6
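Why one-base resolution matters can be seen in a small sketch of reading a gel: each of the four reactions gives a set of fragment lengths, and sorting all lengths from shortest to longest spells out the new strand. The lane data below are the lengths implied by the example template on the Sanger example card; the function name is illustrative.

```python
def read_ladder(lanes: dict[str, list[int]]) -> str:
    """Reconstruct the synthesized strand (5'->3') from fragment lengths per lane.

    `lanes` maps each terminating base to the fragment lengths seen in that
    lane. Reading the lengths in ascending order gives the sequence from the
    5' end, which only works if X and X+1 can be told apart.
    """
    length_to_base = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in length_to_base:
                raise ValueError(f"length {length} appears in two lanes")
            length_to_base[length] = base
    return "".join(length_to_base[n] for n in sorted(length_to_base))

lanes = {
    "A": [2, 6, 10, 14, 15, 17],
    "C": [4, 5, 7, 12, 16],
    "G": [3, 8, 13],
    "T": [1, 9, 11],
}
print(read_ladder(lanes))
```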
Sanger Sequencing reaction:
‘Capillary electrophoresis’ (1998) - what is it and what does it possess? = 4
1. AUTOMATED SEQUENCERS used very THIN CAPILLARY TUBES
2. USED FLUORESCENCE, NO RADIOACTIVITY
3. Run all 4 FLUORESCENTLY TAGGED REACTIONS in the SAME CAPILLARY
4. Can have 384 CAPILLARIES RUNNING at the SAME TIME.
FIGURE ON SLIDE 7:
- Robotic arm and syringe
- load bar
- 96 glass capillaries
- 96-well plate
Sanger Sequencing reaction
How are FLUORESCENTLY LABELED REACTIONS READ? = 4
1. Fluorescently labeled reactions SCANNED BY LASER as they PASS A PARTICULAR POINT
2. COLOUR PICKED UP by DETECTOR
3. OUTPUT sent DIRECTLY to COMPUTER
4. NB. BIG INCREASE IN SEQUENCING EFFICIENCY AND DECREASE IN COST
- figure on slide 8
1. Dye-labeled dideoxynucleotides are used to generate DNA fragments of different lengths
- Graph
PROS = 2
CONS = 3
FOR SANGER SEQUENCING
CONS
1. Requires MANY COPIES OF TEMPLATE (plasmid, or amplified PCR product)
2. Requires a KNOWN SEQUENCE AT THE 5’ or 3’ END (to design a primer against)
3. LIMITED LENGTH for each SEQUENCE RUN (usually max. ‘1kb’ sequenced)
PROS
1. CHEAP
2. QUICK
Best applications for Sanger: 2
- ‘SEQUENCE INSERTS’ contained WITHIN PLASMIDS AND AMPLICONS
- CHECK for SUCCESSFUL MUTAGENESIS OF KNOWN INSERT.
Second generation Sequencing Technologies
(early 2000’s on): 8
1. Massively PARALLEL SEQUENCING OF DNA FRAGMENTS
2. Many DIFFERENT STRATEGIES – MOST USE DNA POLYMERASE PRIMER EXTENSION (similar to Sanger)
3. DIFFERENCES to Sanger:
4. TEMPLATE PREPARATION,
5. SEQUENCING CHEMISTRY,
6. DETECTION OF NUCLEOTIDES
7. ‘Illumina’ is the MAIN PLATFORM USED
8. Many other platforms have been and gone, and new ones are still emerging
Second generation Seq Workflow: 5
1. Next gen sequencing does NOT REQUIRE DNA TO BE ‘CLONED’
- can use DNA, or RNA converted to DNA (RNA -> DNA)
2. DNA is FRAGMENTED
3. ADAPTERS are ADDED TO EACH END
4. PCR is used to make a LIBRARY
- …SEQUENCE LIBRARY INSERT…READ ALIGNMENT/ASSEMBLY
5. Massive parallel sequencing of the library
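The fragmentation and adapter steps can be pictured with a toy sketch. Everything specific here is an assumption for illustration: the adapter strings are placeholders (not real adapter sequences), fragmentation is done at a fixed size rather than randomly, and the PCR step is not modelled.

```python
ADAPTER_LEFT = "AAAA"   # hypothetical placeholder adapter
ADAPTER_RIGHT = "TTTT"  # hypothetical placeholder adapter

def fragment(dna: str, size: int) -> list[str]:
    """Cut DNA into consecutive pieces of at most `size` bases."""
    return [dna[i:i + size] for i in range(0, len(dna), size)]

def add_adapters(fragments: list[str]) -> list[str]:
    """Attach an adapter to each end of every fragment."""
    return [ADAPTER_LEFT + frag + ADAPTER_RIGHT for frag in fragments]

library = add_adapters(fragment("ACGTACGTAC", 4))
print(library)
```

The known adapter ends are what let every fragment be handled the same way later, even though the inserts all differ.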
Illumina – massively parallel sequencing in a flow cell… libraries? = 3
1. DNA libraries are loaded onto a ‘flow cell’
2. Individual DNA molecules are dispersed
3. These sequences are amplified to form clusters (each cluster contains identical DNA)
Illumina - Template Prep (cluster generation) - process = 7
1. Illumina/Solexa SOLID PHASE AMPLIFICATION
2. ONE DNA MOLECULE PER CLUSTER
3. Sample preparation DNA (5 µg)
4. TEMPLATE, dNTPs and POLYMERASE
5. BRIDGE AMPLIFICATION
6. 100-200 MILLION MOLECULAR CLUSTERS
7. CLUSTER GROWTH
Figure on slide 13.
THE FULL PROCESS: Illumina sequencing: step by step = 7
- The sequencing occurs as SINGLE-NUCLEOTIDE ADDITION REACTIONS because of a BLOCKING GROUP AT THE 3’-OH POSITION OF THE RIBOSE SUGAR.
- STEP 1 the nucleotide is added by polymerase
- STEP 2 unincorporated nucleotides are washed away
- STEP 3 the flow cell is imaged to identify each cluster that is reporting a fluorescent signal
- STEP 4 the fluorescent groups are chemically cleaved
- STEP 5 the 3’-OH is ‘chemically deblocked’
- This series of steps is repeated for up to ‘150 NUCLEOTIDE ADDITION REACTIONS’
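The five steps can be sketched as a loop over cycles for a single cluster. This is a simplified model, assuming error-free incorporation and a template written 3'->5'; the function name and the example template are made up for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def sequence_by_synthesis(template_3to5: str, cycles: int = 150) -> str:
    """Simulate one cluster: one blocked nucleotide is added and read per cycle."""
    read = []
    for cycle in range(min(cycles, len(template_3to5))):
        # Step 1: polymerase adds the single complementary, 3'-blocked nucleotide.
        base = COMPLEMENT[template_3to5[cycle]]
        # Step 2: unincorporated nucleotides are washed away (nothing to model).
        # Step 3: imaging records which base's fluorescent signal is present.
        read.append(base)
        # Steps 4-5: the dye is cleaved and the 3'-OH deblocked,
        # so the next cycle can extend by exactly one more base.
    return "".join(read)

print(sequence_by_synthesis("GTAGCA", cycles=6))
```

Because the block allows only one addition per cycle, the read length equals the number of cycles run, capped here at the 150 given on the card.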
Illumina – how the sequencing works with reversible terminators = 4
- INCORPORATE ALL 4 NUCLEOTIDES, EACH LABELLED WITH A DIFFERENT DYE
- WASH, 4 COLOUR IMAGING
- CLEAVE DYE AND TERMINATING GROUPS, WASH
- ….. REPEAT CYCLES
- LOOK AT FIGURE ON SLIDE 15 FOR PROPER PROCESS
Illumina – reversible terminators Detection
Imaging of fluorescent tags over cycles
Cycle 1, Cycle 2, Cycle 3, Cycle 4, Cycle 5, Cycle 6 -> CATCGT
CCCCCC
FIGURE ON SLIDE 16
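Turning the per-cycle images into a read can be sketched as picking the brightest of four colour channels at each cycle. The channel order and the intensity values below are invented for illustration and are not taken from the slide.

```python
CHANNELS = ("A", "C", "G", "T")  # assumed channel order for this sketch

def call_bases(intensities: list[tuple[float, float, float, float]]) -> str:
    """Call one base per cycle by picking the brightest of the four channels."""
    calls = []
    for cycle_signal in intensities:
        brightest = max(range(4), key=lambda i: cycle_signal[i])
        calls.append(CHANNELS[brightest])
    return "".join(calls)

# One cluster imaged over six cycles (made-up signal values).
cluster = [
    (0.1, 0.9, 0.0, 0.1),
    (0.8, 0.1, 0.1, 0.0),
    (0.0, 0.1, 0.1, 0.9),
    (0.1, 0.7, 0.2, 0.0),
    (0.0, 0.1, 0.9, 0.1),
    (0.2, 0.0, 0.1, 0.8),
]
print(call_bases(cluster))
```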
ILLUMINA CAPACITY -
Different machines have different capacity flowcells = 4
1. Different machines have different capacity flowcells.
2. Simplest models can do ‘25 million’ reads per flowcell
3. Intermediate models up to ‘250 million’ reads per flowcell
4. High end ‘20 B reads’
Capacity, multiplexing:
Illumina Multiplexing: ‘What is Multiplexing? What is so special about Illumina?’ = 5
1. Multiplexing: Combining different samples in the same flow cell
2. You probably don’t need 250 million reads for 1 sample!
3. Mix 10 samples and get 25 million reads for each
4. Multiplexing involves adding a unique sequence ‘barcode’ in each library preparation.
5. This marks each read to indicate which sample it belongs to.
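The read budget and the barcode idea can be sketched together. This is a simplification: the barcode is modelled as a prefix on each read, the barcode sequences and sample names are hypothetical, and all barcodes are assumed to be the same length.

```python
from collections import defaultdict

def reads_per_sample(flowcell_reads: int, n_samples: int) -> int:
    """Reads available to each sample if the flow cell is shared equally."""
    return flowcell_reads // n_samples

def demultiplex(reads: list[str], barcodes: dict[str, str]) -> dict[str, list[str]]:
    """Assign each read to a sample by its leading barcode, then strip the barcode."""
    barcode_len = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        tag = read[:barcode_len]
        if tag in barcodes:
            by_sample[barcodes[tag]].append(read[barcode_len:])
        else:
            by_sample["undetermined"].append(read)  # unknown barcode, keep read whole
    return dict(by_sample)

print(reads_per_sample(250_000_000, 10))
barcodes = {"ACGT": "sample1", "TGCA": "sample2"}  # hypothetical barcodes
print(demultiplex(["ACGTTTGA", "TGCAGGCC", "NNNNAAAA"], barcodes))
```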
Single end vs paired end sequencing
WHAT IS SINGLE-ENDED SEQUENCING?
Each species of DNA is represented by one read
- Linear DNA sequenced towards flow cell
diagram on slide 18
Single end vs paired end sequencing
WHAT IS PAIRED-ENDED SEQUENCING?
Permits sequencing of each end of one species of DNA
- Arc shape into flow cell
- cut into seq1 and seq2 (linear) in flow cell
‘Insert length will be equal to the length of the strand between site A1 and A2’
diagram on slide 19
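The difference between the two modes can be sketched on a single insert. This assumes read 2 is reported as the reverse complement of the far end of the insert, which is the usual convention; the example insert and function names are made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def single_end_read(insert: str, read_length: int) -> str:
    """One read from one end of the insert."""
    return insert[:read_length]

def paired_end_reads(insert: str, read_length: int) -> tuple[str, str]:
    """One read from each end; read 2 comes from the opposite strand."""
    read1 = insert[:read_length]
    read2 = reverse_complement(insert)[:read_length]
    return read1, read2

insert = "ACGTTGCAAGGC"  # the stretch between the two adapter sites
print(single_end_read(insert, 4))
print(paired_end_reads(insert, 4))
```

When the insert is longer than two read lengths, the middle is not sequenced, but the known insert length still tells you how far apart the two reads should map.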
Pros/cons of Illumina
PROS = 3
CONS = 3
PROS:
1. Massively parallel sequencing
2. Low error rate
3. Multiplexing libraries
CONS:
1. Expensive to buy and run (partly due to lack of competitors)
2. Limited sequence length
3. Library preparation needed