Lecture 10: Genome sequencing technology Flashcards

(37 cards)

1
Q

What? First generation sequencing:

A

one sequence at a time

  • eg. Sanger sequencing
2
Q

WHAT? Second generation sequencing:

A

massively parallel sequencing of fragments of different sequences

eg. Illumina sequencing

3
Q

WHAT? Third generation sequencing:

A

long read massively parallel sequencing

eg. Pacific Biosciences (PacBio) and Oxford Nanopore

4
Q

The cost of sequencing has fallen 10,000x in the past decade - MOORE’S LAW

A

Moore’s Law: the number of transistors on a chip doubles every two years while the costs are halved. …

SEQUENCE OF TECHNOLOGY DISCOVERY

2005:
- Automation of first generation sequencing,
‘Next generation sequencing’ and Pacific Biosciences

2007:
Illumina

2008:
SOLiD/454

2010:
Ion Torrent

2015:
Nanopore

2022:
Ultima ($100 genome)

Cost per genome: from $100M to $100 (Moore’s Law)
slide 3
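
Worked arithmetic (a minimal sketch, not from the slides): taking the card's own statement of Moore's Law as a halving of cost every two years, the Python below compares that pace with the figures quoted on this card. The function name `fold_reduction` is made up for illustration.

```python
import math

def fold_reduction(years: float, halving_period_years: float = 2.0) -> float:
    """Fold drop in cost if the cost halves once every halving_period_years."""
    return 2 ** (years / halving_period_years)

# Pace implied by the card's statement of Moore's Law over one decade.
moore_decade = fold_reduction(10)

# Figures quoted on the card: $100M per genome down to $100.
observed_fold = 100_000_000 / 100

# Number of successive halvings needed to cover the quoted drop.
halvings_needed = math.log2(observed_fold)

print(f"Halving every 2 years for 10 years: {moore_decade:.0f}x cheaper")
print(f"$100M -> $100: {observed_fold:,.0f}x cheaper, about {halvings_needed:.1f} halvings")
```

On these numbers a two-year halving gives a 32-fold drop per decade, so the 10,000x fall quoted on the card is much steeper than that pace.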

5
Q

First generation sequencing: the Sanger method (1970s on)

  • what is required? What occurs?
A
    • Based on the ACTION of DNA POLYMERASE
    • Requires TEMPLATE DNA, plus:
    1. DNA PRIMER
    2. POLYMERASE
    3. NUCLEOTIDES
    • A SMALL AMOUNT OF NUCLEOTIDE ANALOGUE is included.
      – The INCORPORATION OF THE ANALOGUE TERMINATES SYNTHESIS

Historical note: the first human genome was sequenced with Sanger at great cost!

6
Q

SANGER SEQUENCING REACTION:

What is it? What is the process? What is needed? - 5

A

1 * Chain-termination method

2 * Uses ‘dideoxy nucleotides’

3 * When ADDED IN THE RIGHT AMOUNT,
the CHAIN IS TERMINATED EVERY TIME THAT BASE APPEARS IN THE TEMPLATE

4 * Need a reaction for each
base: A, T, C, and G

5 * EXAMPLE OF FIRST GEN. SEQUENCING
7
Q

Sanger Sequencing reaction: EXAMPLE

A

deoxyribose - 3’ OH

dideoxyribose - 3’ H
- cannot form a bond with the next nucleotide

Template
3’ ATCGGTGCATAGCTTGT 5’

Sequence reaction products
5’ TAGCCACGTATCGAACA* 3’
5’ TAGCCACGTATCGAA* 3’
5’ TAGCCACGTATCGA* 3’
5’ TAGCCACGTA* 3’
5’ TAGCCA* 3’
5’ TA* 3’
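
The product list above can be reproduced with a short Python sketch. It assumes a single reaction containing ddATP, ignores the primer, and treats the template as written 3' to 5' as on the card; `terminated_products` is an illustrative name, not part of any library.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def terminated_products(template_3to5: str, dd_base: str) -> list[str]:
    """Return every chain (written 5'->3') that ends where dd_base was added.

    The template is written 3'->5', so the newly made strand, read 5'->3',
    is the base-by-base complement in the same left-to-right order.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    return [
        new_strand[: i + 1]
        for i, base in enumerate(new_strand)
        if base == dd_base  # a dideoxy base here would stop synthesis
    ]

products = terminated_products("ATCGGTGCATAGCTTGT", "A")

# Longest first, matching the order on the card.
for product in sorted(products, key=len, reverse=True):
    print(f"5' {product}* 3'")
```

Running the same function with "T", "C" and "G" gives the other three reactions that the chain-termination method needs.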

8
Q

SEQUENCE SEPARATION:
- Gel electrophoresis = 7

A
    • TERMINATED chains need to be SEPARATED
    • Requires ONE-BASE-PAIR RESOLUTION
      • See difference between chain of ‘X and X+1 base pairs’
    • Gel electrophoresis
      5 * Very THIN GEL
      6 * HIGH VOLTAGE
      7 * WORKS WITH RADIOACTIVE OR FLUORESCENT LABELS

figure on slide 6

9
Q

Sanger Sequencing reaction:

‘Capillary electrophoresis’ (1998) - what is it and what does it possess? 4

A
    • AUTOMATED SEQUENCERS used very THIN CAPILLARY TUBES
    • USED FLUORESCENCE, NO RADIOACTIVITY

3 * Run all 4 FLUORESCENTLY TAGGED REACTIONS in the SAME CAPILLARY

4 * Can have 384 CAPILLARIES RUNNING at the SAME TIME.

FIGURE ON SLIDE 7:
- Robotic arm and syringe
- load bar
- 96 glass capillaries
- 96-well plate

10
Q

Sanger Sequencing reaction

How are FLUORESCENTLY LABELED REACTIONS READ? = 4

A

1 * Fluorescently labeled
reactions are SCANNED BY a LASER as they PASS a PARTICULAR POINT

2 * COLOUR PICKED UP by
DETECTOR

3 * OUTPUT sent DIRECTLY to COMPUTER

4 * NB. BIG INCREASE IN SEQUENCING EFFICIENCY AND DECREASE IN COST

  • figure on slide 8
    1. Dye-labeled dideoxynucleotides are used to generate DNA fragments of different lengths
    2. Graph
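
A rough Python sketch of the reading step: fragments pass the detector shortest first, and the dye colour on each one gives the base at that position. The dye-to-base mapping and the fragment list are invented for illustration; real instruments use their own dye sets.

```python
# Hypothetical colour assignment, one dye per dideoxynucleotide.
DYE_TO_BASE = {"green": "A", "red": "T", "blue": "C", "yellow": "G"}

def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Call bases from (fragment_length, dye_colour) pairs.

    Shorter fragments migrate faster, so sorting by length reproduces
    the order in which the laser and detector see them.
    """
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(DYE_TO_BASE[dye] for _, dye in ordered)

# Fragments listed in no particular order, as they would be in the tube.
fragments = [(3, "yellow"), (1, "red"), (4, "blue"), (2, "green")]
called = read_sequence(fragments)
print(called)
```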
11
Q

PROS = 2

CONS = 3

FOR SANGER SEQUENCING

A
CONS
1. Requires MANY COPIES OF TEMPLATE (plasmid, or amplified PCR product)
2. Requires a KNOWN SEQUENCE AT THE 5’ or 3’ END (to design a primer against)
3. LIMITED LENGTH for each SEQUENCE RUN (usually max ‘1kb’ sequenced)

PROS
1. CHEAP
2. QUICK

12
Q

Best applications for Sanger: 2

A
    • ‘SEQUENCE INSERTS’ contained WITHIN PLASMIDS AND AMPLICONS
    • CHECK for SUCCESSFUL MUTAGENESIS OF KNOWN INSERT.
13
Q

Second generation Sequencing Technologies
(early 2000s on): 8

A
1 * Massively PARALLEL SEQUENCING OF DNA FRAGMENTS

2 * Many DIFFERENT STRATEGIES – MOST USE DNA POLYMERASE PRIMER EXTENSION (similar to Sanger)

3 * DIFFERENCES to Sanger:
  4. TEMPLATE PREPARATION
  5. SEQUENCING CHEMISTRY
  6. DETECTION OF NUCLEOTIDES

7 * ‘Illumina’ is the MAIN PLATFORM USED

8 * Many other platforms have come and gone, and new ones are still emerging
14
Q

Second generation Seq Workflow: 5

A

1 * Next gen sequencing
does NOT REQUIRE DNA TO BE ‘CLONED’
- you can use DNA or RNA -> DNA

2 * DNA is FRAGMENTED

3 * ADAPTERS are ADDED
TO EACH END

4 * PCR is used to make a
LIBRARY
- …SEQUENCE LIBRARY INSERT…READ ALIGNMENT/ASSEMBLY

5 * Massive parallel sequencing of the library

15
Q

Illumina – massively parallel
sequencing in a flow cell… libraries? = 3

A
    • DNA libraries are loaded
      onto a ‘flow cell’
    • Individual DNA molecules are dispersed
    • These sequences are
      amplified to form clusters
      (each cluster contains
      identical DNA)
16
Q

Illumina - Template Prep
(cluster generation) - process = 7

A
  1. Illumina/Solexa SOLID PHASE AMPLIFICATION
  2. ONE DNA MOLECULE PER CLUSTER
  3. Sample preparation: DNA (5 µg)
  4. TEMPLATE, dNTPs and POLYMERASE
  5. BRIDGE AMPLIFICATION
  6. 100-200 MILLION MOLECULAR CLUSTERS
  7. CLUSTER GROWTH

Figure on slide 13.

17
Q

THE FULL PROCESS: Illumina sequencing: step by step = 7

A
  1. The sequencing occurs as SINGLE-NUCLEOTIDE ADDITION REACTIONS
    because of a BLOCKING GROUP AT THE 3’-OH POSITION OF THE DEOXYRIBOSE SUGAR.
  2. STEP 1 the nucleotide is added by polymerase
  3. STEP 2 unincorporated nucleotides are washed away
  4. STEP 3 the flow cell is imaged to identify each cluster that is reporting a fluorescent signal
  5. STEP 4 the fluorescent groups are chemically cleaved
  6. STEP 5 the 3’-OH is ‘chemically deblocked’
  7. This series of steps is repeated for up to ‘150 NUCLEOTIDE ADDITION REACTIONS’
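
The cycle above can be sketched as a loop in Python. This is a toy model of one cluster, assuming an error-free polymerase and a template written 3' to 5'; `sequence_by_synthesis` is an illustrative name only.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def sequence_by_synthesis(template_3to5: str, max_cycles: int = 150) -> str:
    """Build a read one base per cycle, as the blocked 3'-OH enforces."""
    read = []
    for template_base in template_3to5[:max_cycles]:
        # STEP 1: polymerase adds exactly one blocked, dye-labeled nucleotide.
        incorporated = COMPLEMENT[template_base]
        # STEP 2: wash away unincorporated nucleotides (nothing to model here).
        # STEP 3: imaging reports which dye, and so which base, was added.
        read.append(incorporated)
        # STEPS 4 and 5: dye cleaved and 3'-OH deblocked, ready for the next cycle.
    return "".join(read)

print(sequence_by_synthesis("GTAGCA"))
print(len(sequence_by_synthesis("A" * 200)))  # capped by max_cycles
```

The `max_cycles` cap is why read length on this platform is set by the number of cycles run, not by the length of the fragment.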
18
Q

Illumina – how the sequencing works with reversible terminators = 4

A
  1. INCORPORATE ALL 4 NUCLEOTIDES, EACH LABELLED WITH A DIFFERENT DYE
  2. WASH, 4 COLOUR IMAGING
  3. CLEAVE DYE AND TERMINATING GROUPS, WASH
  4. ….. REPEAT CYCLES
  • LOOK AT FIGURE ON SLIDE 15 FOR PROPER PROCESS
19
Q

Illumina – reversible terminators Detection

A

Imaging of fluorescent tags over cycles

Cycle 1, Cycle 2, Cycle 3, Cycle 4, Cycle 5, Cycle 6 -> CATCGT

CCCCCC

FIGURE ON SLIDE 16

20
Q

ILLUMINA CAPACITY -

Different machines have different capacity flowcells = 4

A
1 * Different machines have different capacity flowcells.

2 * Simplest models can do ‘25 million reads per flowcell’

3 * Intermediate models up to ‘250 million reads per flowcell’

4 * High end ‘20 B reads’

21
Q

Capacity, multiplexing:

Illumina Multiplexing: ‘What is Multiplexing? What is so special about Illumina?’ = 5

A
  1. Multiplexing: Combining different samples in the same flow cell
  2. You probably don’t need 250 million reads for 1 sample!
  3. Mix 10 samples and get 25 million reads for each
  4. Multiplexing involves adding a unique sequence ‘barcode’ in each library preparation.
  5. This marks each read to indicate which sample it belongs to.
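
A minimal demultiplexing sketch in Python. As a simplification it assumes the barcode is simply the first 4 bases of each read; the barcode sequences, sample names and reads are invented for illustration.

```python
from collections import defaultdict

# Hypothetical barcodes, one per library preparation.
BARCODES = {"ACGT": "sample_1", "TGCA": "sample_2"}
BARCODE_LEN = 4

def demultiplex(reads: list[str]) -> dict[str, list[str]]:
    """Sort pooled reads back into samples using the leading barcode."""
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:BARCODE_LEN], read[BARCODE_LEN:]
        # Reads with an unrecognised barcode are kept aside, not guessed.
        by_sample[BARCODES.get(barcode, "unassigned")].append(insert)
    return dict(by_sample)

pooled = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTAAACCC", "NNNNGGGGGG"]
result = demultiplex(pooled)
print(result)

# The card's arithmetic: 250 million reads shared evenly by 10 samples.
reads_per_sample = 250_000_000 // 10
print(reads_per_sample)
```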
22
Q

Single end vs paired end sequencing

WHAT IS SINGLE-ENDED SEQUENCING?

A

Each species of DNA is represented by one read

  • Linear DNA sequenced towards flow cell
    diagram on slide 18
23
Q

Single end vs paired end sequencing

WHAT IS PAIRED-ENDED SEQUENCING?

A

Permits sequencing of each end of one species of DNA

  • Arc shape into flow cell
  • cut into seq1 and seq 2 (linear) in flow cell

‘Insert length will be equal to the length of the strand between site A1 and A2’

diagram on slide 19
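
A small Python sketch of the difference between the two modes, assuming an error-free run and an invented 16-base insert. Single-ended sequencing would return only `read_1`; paired-ended sequencing also reads in from the opposite end, on the other strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def paired_end_reads(insert: str, read_length: int) -> tuple[str, str]:
    """Read read_length bases inward from each end of one insert."""
    read_1 = insert[:read_length]
    read_2 = reverse_complement(insert)[:read_length]
    return read_1, read_2

insert = "AAACCCGGGTTTACGT"  # hypothetical insert between the two adapter sites
read_1, read_2 = paired_end_reads(insert, read_length=4)
print(read_1, read_2, len(insert))
```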

24
Q

Pros/cons of Illumina

PROS = 3

CONS = 3

A

PROS:
1. Massively parallel sequencing
2. Low error rate
3. Multiplexing libraries

CONS:
1. Expensive to buy and run (partly due to lack of competitors)
2. Limited sequence length
3. Library preparation needed
25
Q

Best applications for Illumina = 3

A

1 * Genomic DNA sequencing of known organisms (eg. human genomes)

2 * RNA-seq (eg. human transcriptome)

3 * Small scale sequencing (MiSeq) to thousands of human genomes at once (NovaSeq)
26
Q

Second generation seq - Ion Torrent ‘pH change detection - Life Technologies Inc’

SET UP = 6

A

1. Prepared Library
2. Library loaded onto Ion Spheres
3. Ion Semiconductor Sequencing Chip
4. ION SPHERES CAPTURED IN WELLS ON SEMI-CONDUCTOR CHIP
5. Ion PGM
6. Sequencing: SEQUENCE LIBRARY by pH CHANGE DETECTION
27
Q

Ion Torrent sequencing procedure = 4

A

1. The 4 bases are flooded into the wells, ONE AT A TIME
2. Polymerase integrates a nucleotide
3. If the base is incorporated, a pH change is recorded
4. A hydrogen ion (H+) and pyrophosphate are RELEASED

DIAGRAM ON SLIDE 22
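
The one-base-at-a-time flooding can be sketched in Python as below. Two things here are assumptions beyond the card: the flow order "TACG", and that a run of identical bases releases one H+ per base, giving a proportionally larger signal in a single flow.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def flowgram(template_3to5: str, flow_order: str = "TACG",
             n_rounds: int = 4) -> list[tuple[str, int]]:
    """Return (flooded_base, number_incorporated) for every flow in one well."""
    position = 0
    signals = []
    for _ in range(n_rounds):
        for base in flow_order:
            incorporated = 0
            # The polymerase keeps extending while the flooded base fits.
            while (position < len(template_3to5)
                   and COMPLEMENT[template_3to5[position]] == base):
                incorporated += 1  # each incorporation releases one H+
                position += 1
            signals.append((base, incorporated))  # 0 means no pH change
    return signals

def call_bases(signals: list[tuple[str, int]]) -> str:
    """Turn the recorded pH signals back into a sequence."""
    return "".join(base * count for base, count in signals)

signals = flowgram("ATCGG", n_rounds=2)
print(signals)
print(call_bases(signals))
```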
28
Q

Third generation seq – ‘Pacific Biosciences’

WHAT IS IT? WHAT DOES IT DO? = 6

A

1 * No need to use PCR to make a LIBRARY (single molecule sequencing)

2 * As with second generation, ADAPTERS ADDED TO EACH END of DNA fragments

3 * Specialising in LONG SEQUENCES (~5 kb long)

4 * A SINGLE LONG MOLECULE is COPIED BY A POLYMERASE ANCHORED AT THE BOTTOM OF A WELL.

5 * EACH WELL has ONE POLYMERASE within it.

6 * After a BASE IS ADDED a FLUOROPHORE IS CLEAVED OFF THE BASE, AND THAT IS DETECTED WITHIN THE WELL.
29
Q

How does ‘PacBio’ get low sequence error rate? = 3

A

1. Because of the unique nature of the ‘SMRTbell’ adapters….
2. Each INDIVIDUAL MOLECULE gets sequenced an AVERAGE OF 30 TIMES...
3. SO A CONSENSUS SEQUENCE CAN BE BUILT UP
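
Building a consensus from repeated passes can be sketched as a per-position majority vote. This Python example assumes the passes are already aligned, equal in length and differ only by substitutions, which is a simplification; the five passes are invented.

```python
from collections import Counter

def consensus(passes: list[str]) -> str:
    """Majority vote at each position across repeated reads of one molecule."""
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in this column
        for column in zip(*passes)
    )

# Five noisy passes over the same molecule, each with at most one error.
passes = ["ACGTAC", "ACCTAC", "ACGTAC", "TCGTAC", "ACGTAA"]
result = consensus(passes)
print(result)
```

Each pass here is wrong somewhere, yet the errors fall at different positions, so the vote recovers the underlying sequence. That is the intuition behind low-accuracy raw reads giving a high-accuracy consensus.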
30
Q

How does PacBio get low sequence error rate? DIAGRAM = 5

A

1. Double stranded DNA from ‘SMRTbell cDNA Library’
2. PacBio Sequencing
3. Template DNA with DNA Polymerase
4. LOW ACCURACY Raw Reads
5. High Consensus accuracy (>98%)
31
Q

Pros/cons of PacBio

PROS = 2

CONS = 3

A

PROS
1. No need for PCR to make a library (single molecule reads)
2. Generates long sequences

CONS
1. SLOW
2. NOT WIDELY AVAILABLE
3. EXPENSIVE
32
Q

Best applications for PacBio = 3

A

1 * Sequence DNA or RNA of organism with no known reference genome (‘de novo’ sequencing)

2 * Identify new alternative splice forms of transcripts

3 * Sequence transcribed pseudogenes (with only tiny sequence differences to parent gene)
33
Q

Third generation seq - Oxford Nanopore

WHAT IS IT? WHAT DOES IT DO? = 6

A

1 * World’s first MOBILE DNA SEQUENCER

2 * Plugs into your LAPTOP and runs off USB

3 * Thousands of PROTEIN PORES IN THE MEMBRANE, EACH WITH A HOLE IN THE MIDDLE

4 * SINGLE DNA molecules, WITH ADAPTERS at the ends, PASS THROUGH a PORE IN THE MEMBRANE

5 * EACH NUCLEOTIDE gives a SLIGHTLY DIFFERENT electrical signal (a change in the current through the pore)

6 * SIGNAL DETECTED AS NUCLEOTIDES PASS THROUGH THE PORE
34
Q

Pros/cons of Nanopore

PROS = 3

CONS = 2

A

PROS
1. CHEAP
2. QUICK
3. PORTABLE

CONS
1. HIGH ERROR RATE
2. LOW THROUGHPUT
35
Q

Best applications for Nanopore = 3

A

1 * Detection of alternative spliced isoforms of transcripts

2 * Detect large structural rearrangements

3 * DNA sequencing ‘in the field’
36
Q

Current state of Sequencing technology = 3

A

1 * Illumina is used by most people; it is quick, accurate and easy. BUT it is expensive (both the machines and the reagents).

2 * PacBio is used by people who need very long reads.
- This is basically for genomics of organisms with nothing close to a reference genome

3 * Oxford Nanopore/MinION has great promise.
- Its portability means it could bring DNA sequencing to the masses
37