Cloning and sequencing Flashcards Preview

Flashcards in Cloning and sequencing Deck (29)
1
Q

What is Gibson Assembly (GA) ?

A

A novel method for the easy assembly of multiple linear DNA fragments. Regardless of fragment length or end compatibility, multiple overlapping DNA fragments can be joined in a single isothermal reaction. With the activities of three different enzymes, the product of a Gibson Assembly is a fully ligated double-stranded DNA molecule.

2
Q

What are the steps involved ?

A
  • Forming 3’ single-stranded overhangs –> T5 exonuclease chews back the 5’ ends and thereby exposes the 3’ ends; the enzyme is unstable at 50 °C and is inactivated after some time
  • Annealing complementary termini
  • Gap-filling –> Phusion High Fidelity DNA polymerase
  • Nick-sealing –> Taq DNA ligase
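The outcome of these steps can be mimicked at the sequence level. The sketch below is a simplification and not a model of the enzymes: it only checks whether the end of one fragment is identical to the start of the next (the homology through which the exposed 3’ overhangs anneal) and keeps the shared region once. The function name, the 20-nt minimum overlap and the example sequences are assumptions chosen for illustration.

```python
def gibson_join(left: str, right: str, min_overlap: int = 20) -> str:
    """Join two linear fragments that share a terminal overlap.

    Looks for the longest suffix of `left` that equals a prefix of `right`
    and keeps that shared region only once in the product.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("no terminal overlap of the required length found")


# Hypothetical fragments sharing a 20-nt overlap.
overlap = "GATTACAGATTACAGATTAC"
fragment_a = "ATGGCCAAGT" + overlap
fragment_b = overlap + "TTCCGGAATAA"
product = gibson_join(fragment_a, fragment_b)
```

Chaining the same call over several fragments gives a rough picture of a multi-fragment assembly, provided every junction has its own unique overlap.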
3
Q

What are the advantages of GA ?

A
  • Well suited for cloning multiple fragments
  • Simultaneous assembly of up to 8-10 fragments
  • Fragments can be short (=annealed oligos without PCR amplification)
  • Vector can be either PCR-amplified or linearized by restriction enzymes
4
Q

Which biological macromolecule was sequenced first?

A

Protein (Primary structure of insulin 1949-1951)
Proteins are made of linear polypeptides
“They (the amino acid residues) seem to be put together in an order that is random, but nevertheless unique and most significant, since on it must depend the important physiological action of the hormone.” (from Frederick Sanger’s Nobel lecture, 1958)

5
Q

What was the implication of the double helical structure (1953) for base sequences in DNA ?

A
  • Structure places no constraint on sequence
  • Suggests mechanism for faithful replication
  • Stage set to solve the “coding problem” (“Coding problem” solved before any DNA sequence was experimentally determined)
6
Q

What were the major challenges during the early days of sequencing ?

A
  • Different DNA molecules were chemically very similar, which made their separation difficult
  • Chain length of DNA is much greater than that of proteins, so complete sequencing seemed unapproachable
  • Amino acids have widely varying properties, whereas DNA has only 4 bases
  • No base-specific DNases were known
  • Protein sequencing had depended upon specific proteases
7
Q

What was the first DNA sequence obtained ?

By which method ?

A

Lambda cos ends – 12 bases – partial 1968 – complete 1971
Method: repair reaction from 3’OH end (E. coli polymerase) using radioactive nucleotides followed by partial nuclease degradation - isolation of the synthesized oligonucleotide - sequence determination

8
Q

What are the advantages/disadvantages of “Plus and Minus” Sanger sequencing (1975) ?

A
Advantage:
- Rapid; allowed the sequencing of phiX174
Disadvantage:
- Requires single-stranded DNA
- Limited accuracy --> 8 reactions (4 for plus and 4 for minus in parallel) were needed to obtain confirmatory data
9
Q

What are the advantages/disadvantages of the “Maxam and Gilbert” sequencing method (1977) ?

A
Advantage:
- Works with double-stranded DNA
- 4 reactions are sufficient
Disadvantage:
- Requires strand separation
10
Q

What are the advantages/disadvantages of chain termination or dideoxy sequencing (Sanger sequencing – 1977) ?

A

Based on the finding by Atkinson et al. (1969) that ddTTP can inhibit DNA polymerase I
Advantage:
- Increased accuracy
- Only 4 reactions
Disadvantage:
- Requires ssDNA (from phage M13; later also obtained by alkaline denaturation)

11
Q

What are the requirements for chain-termination sequencing of DNA ?

A

Single-stranded DNA molecule (template) to be sequenced
Oligonucleotide (primer) complementary to upstream region of template
DNA polymerase
DNA synthesis reaction is performed:
- Primer/template mix distributed to four tubes
- 4 dNTPs plus DNA polymerase
- a radioactive dNTP
- in one of the four tubes add ddATP
- in a second add ddCTP
- in the third ddGTP
- in the fourth ddTTP
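The four-tube set-up can be illustrated with a toy simulation. It is an idealized sketch, not a model of the chemistry: it assumes every possible terminated product is present, and it takes the newly synthesized strand as input rather than the template. The function names are hypothetical.

```python
def termination_products(new_strand: str) -> dict[str, list[int]]:
    """Lengths of chain-terminated products in each of the four tubes.

    `new_strand` is the sequence synthesized from the primer (5'->3').
    A product of length n ends up in the ddNTP tube of base number n.
    """
    tubes = {"ddATP": [], "ddCTP": [], "ddGTP": [], "ddTTP": []}
    for length, base in enumerate(new_strand, start=1):
        tubes["dd" + base + "TP"].append(length)
    return tubes


def read_ladder(tubes: dict[str, list[int]]) -> str:
    """Read the sequence from the shortest to the longest product."""
    bands = [(length, tube[2]) for tube, lengths in tubes.items() for length in lengths]
    return "".join(base for _, base in sorted(bands))


tubes = termination_products("GATTACA")
sequence = read_ladder(tubes)
```

Each tube corresponds to one gel lane, so sorting all bands by length across the four lanes is the in silico equivalent of reading the gel from bottom to top.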

12
Q

By adjusting the ratio of dNTP to ddNTP it is possible to generate the full spectrum of terminated products with approximately equal representation. How ?

A
  • too much ddNTP gives preferentially short products
  • too little ddNTP gives preferentially long products
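The effect of the ratio can be made concrete with a simple probability model. As a simplifying assumption (not stated on the card), suppose that at every position requiring a given base the polymerase incorporates the ddNTP instead of the dNTP with a fixed probability p, which rises with the ddNTP:dNTP ratio. The fraction of chains ending at the k-th occurrence of that base is then geometric.

```python
def termination_fraction(p: float, k: int) -> float:
    """Fraction of chains that stop at the k-th occurrence of the base.

    p is the assumed chance that a ddNTP (rather than the dNTP) is
    incorporated at each occurrence: geometric distribution p * (1 - p)**(k - 1).
    """
    return p * (1 - p) ** (k - 1)


for p in (0.5, 0.01):
    first = termination_fraction(p, 1)
    hundredth = termination_fraction(p, 100)
    print(f"p={p}: 1st occurrence {first:.3g}, 100th occurrence {hundredth:.3g}")
```

With p = 0.5 practically no chains reach the 100th occurrence, whereas with p = 0.01 the band at the 100th occurrence still has roughly a third of the intensity of the first. A small p therefore spreads the signal more evenly over long products, at the cost of weaker bands overall.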

13
Q

For Sanger sequencing, how are the reaction products denatured and electrophoresed ?
How are bands detected ?
How many nt could be read per 4 lanes ?

A
  • reaction products are denatured by adding formamide and heating, which separates the newly synthesized radioactive strands from the template
  • samples are loaded on to a thin polyacrylamide gel
    containing urea and separated by electrophoresis at high voltage (~2500 V)
  • the thin gels allow a high resolution of DNA molecules:
    one base different in length
  • plates can be heated to keep the DNA denatured
  • gel is fixed and dried and exposed to X-ray film to reveal the chain terminated products as bands
  • Sequencing reads were initially ~100 nt per 4 lanes; improvements to the label (35S replaced 32P) and the gel system (wedge gels, longer gels, shark-tooth combs) increased this to ~350 nt
14
Q

Why was dye terminator sequencing more advantageous than conventional Sanger sequencing ?

A

Automation of DNA sequencing (Hood, Caltech & Applied Biosystems (ABI))
Initially a primer was fluorescently end-labelled with 4 different dyes –> a different primer was used in each of the 4 dideoxy sequencing reactions
This was later replaced by dye-terminator sequencing, where each of the dideoxy nucleotides is labelled with a different coloured fluorescent tag, so only one reaction is needed
Sequencing reactions are performed in a PCR machine in a technique called cycle sequencing

15
Q

What were the advantages of automating the dye terminator sequencing method ?

A
  • Visualization of sequences obtained with dye-termination
  • Increase the number of lanes up to 96
  • Increase length of gel
  • Bioinformatics: “gel tracking” & sequence extraction
  • Replacing the gel by capillary electrophoresis
  • Separation by size based on their total charge (5-20 kV)
  • Polymer solution replaces need of manually poured polyacrylamide gels
  • Automatic sample loading (Electrokinetic injection)
  • ABI Prism 3700 with 96 capillaries (1998) produced read length of up to 1000 nt per capillary per run
16
Q

What are the main problems of Sanger sequencing ?

A
  • Gels or polymers as separation media
  • Limited number of sequences handled in parallel
  • Difficulties in complete automation of sample preparation
17
Q

What solution was found to resolve the problems of Sanger sequencing ?

A
  • Next generation sequencing (2005) characterized by an increase in parallel handling of samples
  • Shorter read length per single read
  • Lower accuracy per single read
  • Higher degree of sequence coverage makes the final sequence highly accurate
  • Genome sequencing that took several years with Sanger methods can now be completed in weeks (2003: first sequencing of the human genome, $3 billion and 13 years; 2014: close to $1000 in days)
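Why higher coverage compensates for less accurate single reads can be illustrated with a binomial calculation. This is a sketch under strong assumptions that are not part of the card: errors are independent between reads, every read has the same per-base error rate, and a position is miscalled only when a majority of the reads covering it are wrong (worst case: all wrong reads agree). Real base callers are more sophisticated.

```python
from math import comb


def consensus_error(per_read_error: float, coverage: int) -> float:
    """Chance that a majority of `coverage` independent reads is wrong.

    Worst case assumption: all erroneous reads agree on the same wrong base.
    Use an odd coverage so that ties cannot occur.
    """
    majority = coverage // 2 + 1
    return sum(
        comb(coverage, wrong)
        * per_read_error**wrong
        * (1 - per_read_error) ** (coverage - wrong)
        for wrong in range(majority, coverage + 1)
    )


for coverage in (1, 5, 15, 31):
    print(coverage, consensus_error(0.01, coverage))
```

Under these assumptions the consensus error falls by orders of magnitude as coverage grows, even though the per-read error rate stays at 1%.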
18
Q

What is pyrosequencing ?

A

Pyrosequencing is a method of DNA sequencing (determining the order of nucleotides in DNA) based on the “sequencing by synthesis” principle, in which the sequencing is performed by detecting the nucleotide incorporated by a DNA polymerase.

19
Q

How does pyrosequencing work ?

A
  • DNA synthesis: a dNTP is attached to the 3’ end of the growing DNA strand. The two phosphates are released as pyrophosphate (PPi).
  • ATP sulfurylase quantitatively converts PPi and adenosine 5’-phosphosulfate (APS) to ATP.
  • The synthesized ATP drives the luciferase mediated conversion of luciferin to oxyluciferin.
  • The produced light is proportional to the amount of ATP and thereby proportional to the amount of nucleotides added to the DNA.
  • The light is detected by a CCD camera and seen as a peak in the Pyrogram.
20
Q

How many enzymes are needed for pyrosequencing and in what phases are they ?

A

Solid phase: 3 enzymes
Liquid phase: 3 enzymes + apyrase
Apyrase hydrolyses ATP and unincorporated dNTPs, which switches off the light production.

21
Q

How does pyrosequencing give us any information about the specific nucleotides added by DNA polymerase ?

A
  • The four dNTPs are added one at a time, with apyrase degradation / washing in between.
  • The amount of light released is proportional to the number of dNTPs added. Thus, if two adenines are added in a row, twice as much PPi is released and twice as much light is produced compared with adding only one adenine.
  • The 454 sequencer cycles between the 4 dNTPs to build up the sequence. About 300-700 nt of sequence can be read; the reads are shorter and less accurate than those obtained with the Sanger methods
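The relationship between dNTP flows and light signal can be sketched as an idealized flowgram. The code assumes a perfectly linear signal (one light unit per incorporated nucleotide), takes the newly synthesized strand as input, and uses an arbitrary flow order; the function names and the flow order "TACG" are assumptions for illustration.

```python
def flowgram(new_strand: str, flow_order: str = "TACG") -> list[tuple[str, int]]:
    """Idealized light signal (in nucleotide units) for each dNTP flow.

    `new_strand` is the sequence that the polymerase synthesizes; the
    flow order is an assumption made for this example.
    """
    if set(new_strand) - set(flow_order):
        raise ValueError("strand contains a base that is never dispensed")
    signals = []
    position = 0
    while position < len(new_strand):
        for base in flow_order:
            run = 0
            # A homopolymer stretch is filled in during a single flow.
            while position < len(new_strand) and new_strand[position] == base:
                run += 1
                position += 1
            signals.append((base, run))
            if position == len(new_strand):
                break
    return signals


def call_bases(signals: list[tuple[str, int]]) -> str:
    """Translate the flowgram back into a sequence."""
    return "".join(base * count for base, count in signals)


signals = flowgram("AAGTC")
```

For the strand AAGTC the A flow yields a signal of 2, which is how a homopolymer appears in the Pyrogram. In practice the signal is noisy, so telling a run of 7 from a run of 8 is much harder than telling 1 from 2.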
22
Q

What is 454 sequencing ?

How does it work ?

A

Generation of the sequencing library :
- DNA is sheared (300-800 bp) and ends are “blunted”
- Two different adapters are added, one to each end of the fragment –> one adapter is complementary to the oligonucleotide on the sequencing bead. The ratio of beads to DNA molecules is controlled so that most beads get only a single DNA attached to them.
- Oil is added to the beads and an emulsion is created. PCR is then performed, with each aqueous droplet forming its own micro-reactor. Each bead ends up coated with about a million identical copies of the original DNA.
- Generation of millions of clonally amplified sequencing templates on each bead
- No cloning and colony picking
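Why the ratio of beads to DNA molecules matters can be illustrated with a Poisson model. This is a modelling assumption, not something stated on the card: if molecules attach to beads independently with a given mean number per bead, the fractions of empty, single-template and mixed beads follow directly.

```python
from math import exp


def bead_loading(mean_molecules_per_bead: float) -> dict[str, float]:
    """Fractions of empty, clonal and mixed beads under a Poisson model."""
    lam = mean_molecules_per_bead
    empty = exp(-lam)
    single = lam * exp(-lam)
    return {"empty": empty, "single": single, "mixed": 1 - empty - single}


for lam in (0.1, 0.5, 1.0):
    fractions = bead_loading(lam)
    usable = fractions["single"] / (fractions["single"] + fractions["mixed"])
    print(f"mean {lam}: {usable:.0%} of the loaded beads carry one template")
```

In this model a lower mean loading leaves more beads empty but makes the loaded beads more likely to carry a single template, which is the trade-off behind controlling the ratio.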

23
Q

What are the pros and cons of 454 sequencing ?

A
  • No cloning : Fast and no bias towards particular clones
  • The average substitution error rate is in the range of 10^-3–10^-4, which is higher than the rates observed for Sanger sequencing but is a low average substitution error rate for next generation sequencing so far.
  • In vitro amplifications performed for the sequencing preparation cause a higher background error rate
  • Bead preparation: a fraction of the beads end up carrying copies of multiple different sequences.
  • A large fraction of errors are insertion/deletion errors (InDels): inaccurate calling of homopolymer lengths, single base-pair deletions or insertions due to signal noise issue.
  • A strong light signal in one well of the picotiter plate can give an incorrect reading in a neighboring well
  • Reading of later positions in the sequence is less accurate (reduction or loss of enzyme activity)
  • Phasing: not all molecules in the ensemble are extended in every cycle => loss of synchrony/phase results in an echo that is added as noise to the signal.
24
Q

What is the Illumina sequencing method ?

A
  • This method uses the basic Sanger idea of “sequencing by synthesis” of the second strand of a DNA molecule.
  • Starting with a primer, new bases are added one at a time
  • Fluorescent tags show which base was added.
  • Fluorescent tags block the 3’-OH, so the next base can only be added after the tag is removed.
  • The cycle is repeated 50-100 times
25
Q

What are the 4 steps in Illumina library generation ?

A
  1. Prepare genomic DNA sample : Randomly fragment genomic DNA and ligate 2 different adapters to both ends of the fragment.
  2. Attach DNA to surface : The surface of the flow cell carries two populations of immobilized oligonucleotides complementary to the adapter ends.
  3. Denature the double stranded molecules
  4. Bridge amplification : Add unlabeled nucleotides and enzyme to initiate solid-phase bridge amplification
26
Q

What are the 3 steps in Illumina data acquisition ?

A
  1. 1st chemistry cycle –> determine the 1st base : to initiate the 1st sequencing cycle, add all 4 labeled reversible terminators, primers and DNA polymerase to the flow cell
  2. Image of 1st chemistry cycle : after laser excitation, capture the image of emitted fluorescence from each cluster on the flow cell and record the identity of the 1st base for each cluster
  3. Before initiating the next chemistry cycle : the blocked 3’ terminus and the fluorophore from each incorporated base are removed
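The three steps repeat once per base, which can be sketched as a loop. The code is an idealized illustration for a single cluster with no phasing or imaging errors; the function name and the example template are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_cluster(template: str, cycles: int) -> str:
    """Idealized read from one cluster: one base call per chemistry cycle.

    `template` is written 3'->5' so that index 0 is the first base copied.
    """
    read = []
    for cycle in range(min(cycles, len(template))):
        # 1. Exactly one labelled reversible terminator is incorporated.
        incorporated = COMPLEMENT[template[cycle]]
        # 2. Imaging: the fluorescence colour identifies the base.
        read.append(incorporated)
        # 3. Block and fluorophore are cleaved before the next cycle
        #    (implicit here: the loop simply moves on by one position).
    return "".join(read)


read = sequence_cluster("TACGGATC", cycles=5)
```

Because the 3’ block allows only one incorporation per cycle, a homopolymer such as GG is read in two separate cycles instead of as one stronger signal, in contrast to pyrosequencing.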
27
Q

What are the pros and cons of Illumina sequencing ?

A
  • No cloning : Fast and no bias towards particular clones
  • Library and flow cell preparation include in vitro amplification steps, resulting in an error rate in the range of 10^-2–10^-3
  • Read length was initially 26 nt and has since increased to 100 nt.
  • Reading of later positions in the sequence is less accurate due to phasing (bi-directional phasing).
  • Simultaneous detection of four different fluorescent dyes with similar emission spectra contributes to error
  • Despite the higher error rate and considerably shorter read length, about 5000 Mb/day are obtained for about $0.5/Mb
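The throughput and cost figures can be turned into a back-of-the-envelope estimate. The genome size of about 3000 Mb and the 30-fold coverage are assumptions added for illustration; only the 5000 Mb/day and $0.5/Mb come from the card.

```python
MB_PER_DAY = 5000        # from the card
DOLLARS_PER_MB = 0.5     # from the card
GENOME_MB = 3000         # assumption: approximate haploid human genome size
COVERAGE = 30            # assumption: illustrative sequencing depth

total_mb = GENOME_MB * COVERAGE
days = total_mb / MB_PER_DAY
cost = total_mb * DOLLARS_PER_MB
print(f"{total_mb} Mb -> {days:.0f} days, ${cost:,.0f}")
```

Under these assumptions a single instrument would need 18 days and about $45,000 of sequencing for one human genome at 30-fold coverage.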
28
Q

How does nanopore sequencing work ?

A

A protein nanopore is set in an electrically resistant polymer membrane. An ionic current is passed through the nanopore by setting a voltage across this membrane. If an analyte passes through the pore or near its aperture, this event creates a characteristic disruption in current. Measurement of that current makes it possible to identify the molecule in question. For example, this system can be used to distinguish between the four standard DNA bases G, A, T and C, and also modified bases.

29
Q

What perspectives are there for the future of nanopore technology ?

A

Oxford Nanopore Technologies has achieved single-nucleotide reading with biological nanopores.
In 2013, a machine with an array of 2000 individual nanopores was presented. The company states that each nanopore delivers read lengths of 5-100 kb at a rate of 150 Mb per hour for up to 6 hours.
With 20 of these machines working in parallel, the company claims, the human genome could be sequenced in 15 minutes.