Lecture 10: Genome sequencing technology Flashcards

1
Q

What is first-generation sequencing?

A

one sequence at a time

  • eg. Sanger sequencing
2
Q

What is second-generation sequencing?

A

massively parallel sequencing of fragments of different sequences

eg. Illumina sequencing

3
Q

What is third-generation sequencing?

A

long read massively parallel sequencing

eg. Pacific Biosciences (PacBio) and Oxford Nanopore

4
Q

The cost of sequencing has fallen 10,000x in the past decade - MOORE’S LAW

A

Moore’s Law: the number of transistors on a chip doubles every two years while the costs are halved. …

SEQUENCE OF TECHNOLOGY DISCOVERY

2005: Automation of first generation sequencing, ‘Next generation sequencing’ and Pacific Biosciences

2007: Illumina

2008: SOLiD/454

2010: Ion Torrent

2015: Nanopore

2022: Ultima ($100 genome)

Cost per genome: from $100M to $100 (Moore’s Law)
slide 3
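
A rough arithmetic check of the comparison on this card, offered as a sketch only. It assumes a clean halving of cost every two years (the Moore's Law figure quoted above) and takes the 10,000x-per-decade figure from the question; the function name is made up for illustration.

```python
def moores_law_fold_drop(years: float, halving_period_years: float = 2.0) -> float:
    """Fold reduction in cost if the cost halves once per halving period."""
    return 2 ** (years / halving_period_years)

moore_decade = moores_law_fold_drop(10)  # five halvings in ten years
sequencing_decade = 10_000               # figure quoted on the card

print(f"Moore's Law pace over 10 years: {moore_decade:.0f}x cheaper")
print(f"Sequencing over 10 years:       {sequencing_decade}x cheaper")
print(f"Ratio between the two:          {sequencing_decade / moore_decade:.1f}")
```

Under these assumptions a Moore's Law pace gives only a 32x drop per decade, so a 10,000x drop is a much steeper fall than Moore's Law alone would predict.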

5
Q

First generation sequencing: the Sanger method (1970s on)

  • what is required? What occurs?
A
  • Based on the ACTION of DNA POLYMERASE
  • Requires TEMPLATE DNA plus:
    1. DNA PRIMER
    2. POLYMERASE
    3. NUCLEOTIDES
  • A SMALL AMOUNT OF NUCLEOTIDE ANALOGUE is included.
    - The INCORPORATION OF THE ANALOGUE TERMINATES SYNTHESIS

Historical note: the first human genome was sequenced with Sanger at great cost!

6
Q

SANGER SEQUENCING REACTION:

What is it? What is the process? What is needed? - 5

A

1 * Chain-termination method

2 * Uses ‘dideoxy nucleotides’

3 * When ADDED IN THE RIGHT AMOUNT, some CHAINS ARE TERMINATED AT EACH POSITION WHERE THAT BASE APPEARS IN THE TEMPLATE

4 * Need a separate reaction for each base: A, T, C, and G

5 * EXAMPLE OF FIRST GEN. SEQUENCING
7
Q

Sanger Sequencing reaction: EXAMPLE

A

deoxyribose - OH at the 3’ position

dideoxyribose - H at the 3’ position (no OH)
- cannot form a bond with the next base

Template
3’ ATCGGTGCATAGCTTGT 5’

Sequence reaction products
5’ TAGCCACGTATCGAACA* 3’
5’ TAGCCACGTATCGAA* 3’
5’ TAGCCACGTATCGA* 3’
5’ TAGCCACGTA* 3’
5’ TAGCCA* 3’
5’ TA* 3’
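
The products listed above can be reproduced with a short sketch. It assumes the reaction shown is the one containing dideoxy-A, and that termination can happen at every A in the new strand; the function names are illustrative, not from the lecture.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def new_strand(template_3_to_5: str) -> str:
    """Strand the polymerase builds (5'->3') against a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)

def termination_products(template_3_to_5: str, dideoxy_base: str) -> list[str]:
    """Every prefix of the new strand that ends on the dideoxy base."""
    strand = new_strand(template_3_to_5)
    return [strand[: i + 1] for i, base in enumerate(strand) if base == dideoxy_base]

template = "ATCGGTGCATAGCTTGT"  # written 3'->5', as on the card
for product in reversed(termination_products(template, "A")):
    print(f"5' {product}* 3'")  # * marks the terminating dideoxy base
```

Running the same function with "C", "G" or "T" gives the fragments expected from the other three reactions.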

8
Q

SEQUENCE SEPARATION:
- Gel electrophoresis = 7

A
  1. TERMINATED chains need to be SEPARATED
  2. Requires ONE-BASE RESOLUTION
  3. See the difference between chains of ‘X and X+1 bases’
  4. Gel electrophoresis
  5. Very THIN GEL
  6. HIGH VOLTAGE
  7. WORKS WITH RADIOACTIVE OR FLUORESCENT LABELS

figure on slide 6
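
Why one-base resolution matters can be shown with a small sketch: if every fragment length can be told apart, ordering the bands from all four reactions by length spells out the sequence. The fragment lengths below are worked out from the example strand on the previous card (TAGCCACGTATCGAACA); the function name and data layout are assumptions for illustration.

```python
def read_gel(lanes: dict[str, list[int]]) -> str:
    """lanes maps each dideoxy base to the fragment lengths seen in its reaction."""
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    bands.sort()  # shortest fragments run furthest, so they are read first
    return "".join(base for _, base in bands)

lanes = {
    "A": [2, 6, 10, 14, 15, 17],
    "C": [4, 5, 7, 12, 16],
    "G": [3, 8, 13],
    "T": [1, 9, 11],
}
print(read_gel(lanes))
```

If two neighbouring lengths (say 14 and 15) could not be resolved, the order of those bases would be ambiguous, which is why a thin gel and high voltage are needed.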

9
Q

Sanger Sequencing reaction:

‘Capillary electrophoresis’ (1998) - what is it and what does it possess? 4

A
1 * AUTOMATED SEQUENCERS used very THIN CAPILLARY TUBES

2 * USED FLUORESCENCE, NO RADIOACTIVITY

3 * Run all 4 FLUORESCENTLY TAGGED REACTIONS in the SAME CAPILLARY

4 * Can have 384 CAPILLARIES RUNNING at the SAME TIME.

FIGURE ON SLIDE 7:
- Robotic arm and syringe
- load bar
- 96 glass capillaries
- 96-well plate

10
Q

Sanger Sequencing reaction

How are FLUORESCENTLY LABELED REACTIONS READ? = 4

A

1 * Fluorescently labeled reactions are SCANNED BY LASER as they PASS A PARTICULAR POINT

2 * COLOUR PICKED UP by DETECTOR

3 * OUTPUT sent DIRECTLY to COMPUTER

4 * NB. BIG INCREASE IN SEQUENCING EFFICIENCY AND DECREASE IN COST

  • figure on slide 8
    1. Dye-labeled dideoxynucleotides are used to generate DNA fragments of different lengths
    2. Graph
11
Q

PROS = 2

CONS = 3

FOR SANGER SEQUENCING

A
CONS
1. Requires MANY COPIES OF TEMPLATE (plasmid, or amplified PCR product)
2. Requires a KNOWN SEQUENCE AT THE 5’ or 3’ END (to design a primer against)
3. LIMITED LENGTH for each SEQUENCE RUN (usually max ‘1kb’ sequenced)

PROS
1. CHEAP
2. QUICK

12
Q

Best applications for sanger: 2

A
    • ‘SEQUENCE INSERTS’ contained WITHIN PLASMIDS AND AMPLICONS
    • CHECK for SUCCESSFUL MUTAGENESIS OF KNOWN INSERT.
13
Q

Second generation Sequencing Technologies (early 2000s on): 8

A
1 * Massively PARALLEL SEQUENCING OF DNA FRAGMENTS

2 * Many DIFFERENT STRATEGIES; MOST USE DNA POLYMERASE PRIMER EXTENSION (similar to Sanger)

3 * DIFFERENCES to Sanger:

4 * TEMPLATE PREPARATION

5 * SEQUENCING CHEMISTRY

6 * DETECTION OF NUCLEOTIDES

7 * ‘Illumina’ is the MAIN PLATFORM USED

8 * Many other platforms have been and gone, and new ones are still emerging
14
Q

Second generation Seq Workflow: 5

A

1 * Next gen sequencing does NOT REQUIRE DNA TO BE ‘CLONED’
- you can use DNA, or RNA converted to DNA

2 * DNA is FRAGMENTED

3 * ADAPTERS are ADDED TO EACH END

4 * PCR is used to make a LIBRARY
- …SEQUENCE LIBRARY INSERT…READ ALIGNMENT/ASSEMBLY

5 * Massively parallel sequencing of the library
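
Steps 2 and 3 of this workflow can be pictured with a toy sketch. It cuts the DNA into fixed-size pieces (real fragmentation is random) and uses placeholder adapter sequences that are not real adapters; all names here are made up for illustration.

```python
def fragment(dna: str, size: int) -> list[str]:
    """Cut DNA into consecutive pieces of at most `size` bases."""
    return [dna[i : i + size] for i in range(0, len(dna), size)]

def add_adapters(fragments: list[str], left: str, right: str) -> list[str]:
    """Attach one adapter to each end of every fragment."""
    return [left + piece + right for piece in fragments]

ADAPTER_LEFT = "AAAA"   # placeholder only, not a real adapter sequence
ADAPTER_RIGHT = "TTTT"  # placeholder only, not a real adapter sequence

library = add_adapters(fragment("TAGCCACGTATCGAACA", 6), ADAPTER_LEFT, ADAPTER_RIGHT)
for molecule in library:
    print(molecule)
```

Because every molecule now starts and ends with the same known sequences, one set of primers can amplify and sequence the whole library without cloning each fragment.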

15
Q

Illumina – massively parallel
sequencing in a flow cell… libraries? = 3

A
    • DNA libraries are loaded onto a ‘flow cell’
    • Individual DNA molecules are dispersed
    • These sequences are amplified to form clusters (each cluster contains identical DNA)
16
Q

Illumina - Template Prep
(cluster generation) - process = 7

A
  1. Illumina/Solexa SOLID-PHASE AMPLIFICATION
  2. ONE DNA MOLECULE PER CLUSTER
  3. Sample preparation: DNA (5 µg)
  4. TEMPLATE, dNTPs and POLYMERASE
  5. BRIDGE AMPLIFICATION
  6. 100-200 MILLION MOLECULAR CLUSTERS
  7. CLUSTER GROWTH

Figure on slide 13.

17
Q

THE FULL PROCESS: Illumina sequencing: step by step = 7

A
  1. The sequencing occurs as SINGLE-NUCLEOTIDE ADDITION REACTIONS because of a BLOCKING GROUP AT THE 3’-OH POSITION OF THE DEOXYRIBOSE SUGAR.
  2. STEP 1: the nucleotide is added by polymerase
  3. STEP 2: unincorporated nucleotides are washed away
  4. STEP 3: the flow cell is imaged to identify each cluster that is reporting a fluorescent signal
  5. STEP 4: the fluorescent groups are chemically cleaved
  6. STEP 5: the 3’-OH is ‘chemically deblocked’
  7. This series of steps is repeated for up to ‘150 NUCLEOTIDE ADDITION REACTIONS’
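
The cycle described above can be written as a loop, as a minimal sketch for a single cluster. It assumes error-free chemistry (exactly one base added per cycle) and a template written 3'->5'; the function name and the example template are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def sequence_by_synthesis(template_3_to_5: str, cycles: int = 150) -> str:
    """Read one base per cycle from a single cluster, up to `cycles` bases."""
    read = []
    for cycle in range(min(cycles, len(template_3_to_5))):
        # Step 1: polymerase adds exactly one 3'-blocked, dye-labelled nucleotide.
        incorporated = COMPLEMENT[template_3_to_5[cycle]]
        # Step 2: unincorporated nucleotides are washed away (nothing to model).
        # Step 3: the cluster is imaged; the dye colour identifies the base.
        read.append(incorporated)
        # Steps 4 and 5: dye cleaved and 3'-OH deblocked, so the next cycle can extend.
    return "".join(read)

print(sequence_by_synthesis("GTAGCA", cycles=6))
```

The `cycles` argument plays the role of the read-length limit: a longer template is simply not read past that many additions.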
18
Q

Illumina – how the sequencing works with reversible terminators = 4

A
  1. INCORPORATE ALL 4 NUCLEOTIDES, EACH LABELLED WITH A DIFFERENT DYE
  2. WASH, 4 COLOUR IMAGING
  3. CLEAVE DYE AND TERMINATING GROUPS, WASH
  4. ….. REPEAT CYCLES
  • LOOK AT FIGURE ON SLIDE 15 FOR PROPER PROCESS
19
Q

Illumina – reversible terminators Detection

A

Imaging of fluorescent tags over cycles

Cycle 1, Cycle 2, Cycle 3, Cycle 4, Cycle 5, Cycle 6 -> CATCGT

CCCCCC

FIGURE ON SLIDE 16 SHOWS
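
Turning the per-cycle images into a read can be sketched as picking the brightest of the four colour channels in each cycle. The intensity values below are invented for illustration, and real base callers are more involved than this; the function name is an assumption.

```python
def call_bases(cycle_intensities: list[dict[str, float]]) -> str:
    """For one cluster, call the base whose dye channel is brightest in each cycle."""
    return "".join(max(channels, key=channels.get) for channels in cycle_intensities)

# One cluster imaged over six cycles (made-up intensities per dye channel).
cluster = [
    {"A": 0.05, "C": 0.90, "G": 0.03, "T": 0.02},
    {"A": 0.88, "C": 0.04, "G": 0.05, "T": 0.03},
    {"A": 0.02, "C": 0.06, "G": 0.04, "T": 0.91},
    {"A": 0.03, "C": 0.87, "G": 0.06, "T": 0.04},
    {"A": 0.05, "C": 0.03, "G": 0.89, "T": 0.03},
    {"A": 0.04, "C": 0.02, "G": 0.05, "T": 0.93},
]
print(call_bases(cluster))
```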

20
Q

ILLUMINA CAPACITY -

Different machines have different capacity flowcells = 4

A
1 * Different machines have different capacity flowcells.

2 * Simplest models can do ‘25 million reads per flowcell’

3 * Intermediate models up to ‘250 million reads per flowcell’

4 * High end ‘20 billion reads’

21
Q

Capacity, multiplexing:

Illumina Multiplexing: ‘What is Multiplexing? What is so special about Illumina?’ = 5

A
  1. Multiplexing: combining different samples in the same flow cell
  2. You probably don’t need 250 million reads for 1 sample!
  3. Mix 10 samples and get 25 million reads for each
  4. Multiplexing involves adding a unique sequence ‘barcode’ in each library preparation.
  5. This marks each read to indicate which sample it belongs to.
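
Sorting reads back into samples by barcode can be sketched as follows. Purely for simplicity the barcode is modelled as a prefix of each read; the barcode sequences, sample names and function name are all invented for illustration.

```python
from collections import defaultdict

def demultiplex(reads: list[str], barcodes: dict[str, str]) -> dict[str, list[str]]:
    """Group reads by sample, stripping the barcode from the start of each read."""
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            # No known barcode matched this read.
            by_sample["undetermined"].append(read)
    return dict(by_sample)

barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}  # hypothetical barcodes
reads = ["ACGTTTAGG", "TGCACCATG", "ACGTGGGAA", "GGGGAAAAA"]
print(demultiplex(reads, barcodes))

# The arithmetic from the card: 250 million reads shared equally by 10 samples.
reads_per_sample = 250_000_000 // 10
print(reads_per_sample)
```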
22
Q

Single end vs paired end sequencing

WHAT IS SINGLE-ENDED SEQUENCING?

A

Each species of DNA is represented by one read

  • Linear DNA sequenced towards flow cell
    diagram on slide 18
23
Q

Single end vs paired end sequencing

WHAT IS PAIRED-ENDED SEQUENCING?

A

Permits sequencing of each end of one species of DNA

  • Arc shape into flow cell
  • cut into seq1 and seq 2 (linear) in flow cell

‘Insert length will be equal to the length of the strand between site A1 and A2’

diagram on slide 19
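
A minimal sketch of what the two reads of a pair look like, assuming read 1 is taken from one end of the insert and read 2 from the other end on the opposite strand. The function names and the example insert are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq: str) -> str:
    """Opposite strand of `seq`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def paired_end_reads(insert: str, read_length: int) -> tuple[str, str]:
    """Return (read 1, read 2): one read from each end of the same insert."""
    read1 = insert[:read_length]
    read2 = reverse_complement(insert)[:read_length]
    return read1, read2

insert = "TAGCCACGTATCGAACA"
read1, read2 = paired_end_reads(insert, 5)
print(read1, read2, len(insert))
```

Because both reads come from one molecule of known approximate insert length, they can be placed a known distance apart when aligning or assembling.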

24
Q

Pros/cons of illumina

PROS = 3

CONS = 3

A

PROS:
1. Massively parallel sequencing
2. Low error rate
3. Multiplexing libraries

CONS:
1. Expensive to buy and run (partly due to lack of competitors)
2. Limited sequence length
3. Library preparation needed
25
Q

Best applications for illumina = 3

A
1 * Genomic DNA sequencing of known organisms (eg. Human genomes)

2 * RNA-seq (eg. human transcriptome)

3 * Small scale sequencing (MiSeq) to thousands of human genomes at once (NovaSeq)

26
Q

Second generation seq - Ion Torrent

‘pH change detection- Life Technologies Inc’

SET UP =

A
  1. Prepared Library
  2. Library loaded onto Ion Spheres
  3. Ion Semiconductor Sequencing Chip
  4. ION SPHERES CAPTURED IN WELLS ON SEMI-CONDUCTOR CHIP
  5. Ion PGM
  6. Sequencing: SEQUENCE LIBRARY by pH CHANGE DETECTION
27
Q

Ion Torrent sequencing procedure = 4

A
  1. The 4 bases are flooded into the wells, ONE AT A TIME
  2. Polymerase incorporates a nucleotide
  3. If the base is incorporated, a pH change is recorded
  4. Hydrogen ions (H+) and pyrophosphate are RELEASED

DIAGRAM ON SLIDE 22
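
The one-base-at-a-time flooding can be sketched for a single well. This sketch takes the strand being synthesised directly as input, uses an arbitrary flow order, and treats a run of identical bases as a proportionally larger signal; those choices and the function names are assumptions of the sketch, not details from the card.

```python
def flow_signals(new_strand: str, flow_order: str = "TACG", max_flows: int = 400):
    """Return (base flowed, number of bases incorporated) for each flow in one well."""
    signals = []
    position = 0
    flow = 0
    while position < len(new_strand) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        count = 0
        # The polymerase keeps incorporating while the next base matches the flow.
        while position < len(new_strand) and new_strand[position] == base:
            count += 1  # each incorporation releases H+, adding to the pH signal
            position += 1
        signals.append((base, count))
        flow += 1
    return signals

def decode(signals) -> str:
    """Rebuild the sequence from the recorded flow signals."""
    return "".join(base * count for base, count in signals)

signals = flow_signals("TAGCC")
print(signals)
print(decode(signals))
```

A flow with a count of 0 corresponds to a base that was flooded in but not incorporated, so no pH change is recorded for it.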

28
Q

Third generation seq –
‘Pacific biosciences’

WHAT IS IT? WHAT DOES IT DO? = 6

A
  1. No need to use PCR to make a LIBRARY (single molecule sequencing)
  2. As with second generation, ADAPTERS are ADDED TO EACH END of DNA fragments
  3. Specialising in LONG SEQUENCES (~5 kb long)
  4. A SINGLE LONG MOLECULE is COPIED BY A POLYMERASE ANCHORED AT THE BOTTOM OF A WELL.
  5. EACH WELL has ONE POLYMERASE within it.
  6. After a BASE IS ADDED a FLUOROPHORE IS CLEAVED OFF THE BASE, AND THAT IS DETECTED WITHIN THE WELL.
29
Q

How does ‘Pac Bio’ get low sequence error rate? = 3

A
  1. Because of the unique nature of the ‘SMRTbell’ adapters (hairpin adapters that make the template circular)….
  2. Each INDIVIDUAL MOLECULE gets sequenced an AVERAGE OF 30 TIMES…
  3. SO A CONSENSUS SEQUENCE CAN BE BUILT UP
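
Building a consensus from repeated passes can be sketched as a per-position majority vote. This assumes the passes are already aligned, equal in length and differ only by substitutions, which is a simplification; the example passes and the function name are invented for illustration.

```python
from collections import Counter

def consensus(passes: list[str]) -> str:
    """Majority vote at each position across aligned, equal-length passes."""
    return "".join(
        Counter(column).most_common(1)[0][0]  # most frequent base in this column
        for column in zip(*passes)
    )

# Five noisy passes over the same molecule (made-up data, one error in two passes).
passes = [
    "TAGCCACG",
    "TAGCGACG",
    "TAGCCACG",
    "TTGCCACG",
    "TAGCCACG",
]
print(consensus(passes))
```

Each individual pass can contain errors, but as long as the errors fall at different positions in different passes the vote recovers the correct base, and more passes make that more likely.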
30
Q

How does Pac Bio get low sequence error rate? DIAGRAM = 5

A
  1. Double stranded DNA from ‘SMRT Bell cDNA Library’
  2. PacBio Sequencing
  3. Template DNA with DNA Polymerase
  4. LOW ACCURACY Raw Reads
  5. HIGH consensus accuracy (>98%)
31
Q

Pros/cons of PacBio

PROS = 2

CONS = 3

A

PROS
1. No need to use PCR to make a library (single molecule reads)
2. Generates long sequences

CONS
1. SLOW
2. NOT WIDELY AVAILABLE
3. EXPENSIVE
32
Q

Best applications for PacBio = 3

A

1 * Sequence DNA or RNA of organism with no known reference genome (‘de novo’ sequencing)

2 * Identify new alternative splice forms of transcripts

3 * Sequence transcribed pseudogenes (with only tiny sequence differences to parent gene)

33
Q

Third generation seq - Oxford Nanopore

WHAT IS IT? WHAT DOES IT DO? = 6

A
1 * World’s first MOBILE DNA SEQUENCER

2 * Plugs into your LAPTOP and runs off a USB

3 * Thousands of PROTEIN PORES IN THE MEMBRANE, EACH WITH A HOLE IN THE MIDDLE

4 * SINGLE DNA molecules, WITH ADAPTERS at the end, PASS THROUGH a PORE IN the MEMBRANE

5 * EACH NUCLEOTIDE has a SLIGHTLY DIFFERENT CHARGE

6 * CHARGE DETECTED (as a change in electrical current) AS NUCLEOTIDES PASS THROUGH the PORE

34
Q

Pros/cons of Nanopore

PROS = 3

CONS = 2

A

PROS
1. CHEAP
2. QUICK
3. PORTABLE

CONS
1. HIGH ERROR RATE
2. LOW THROUGHPUT
35
Q

Best applications for Nanopore = 3

A

1 * Detection of alternative spliced isoforms of transcripts

2 * Detect large structural rearrangements

3 * DNA sequencing ‘in the field’

36
Q

Current state of Sequencing technology = 3

A

1 * Illumina is used by most people: it is quick, accurate and easy. BUT it is expensive, both the machines and the reagents.

2 * PacBio is used by people who need very long reads. - This is basically for genomics of organisms with nothing close to a reference genome

3 * Oxford Nanopore/MinION has great promise.
- Its portability means it could bring DNA sequencing to the masses