01: DNA Sequencing Flashcards

(45 cards)

1
Q

true/false The “health and ancestry” commercial DNA analyses available to the public use whole-genome sequencing rather than genotyping

A
  • false
  • it is the other way around: they use genotyping, not whole-genome sequencing
2
Q

true/false Most current methods of manipulating DNA, RNA, and proteins rely on prior knowledge of the nucleotide sequence of the genome of interest

A

true

3
Q

what is the most widely used method to determine nucleotide sequences in a genome of interest

A
  • dideoxy sequencing
  • aka sanger sequencing
4
Q

what is used in sanger sequencing

A
  • DNA polymerase
  • dideoxyribonucleoside triphosphates (special chain-terminating nucleotides)
5
Q

how does sanger sequencing work

A
  • DNA polymerase and the chain-terminating nucleotides produce a collection of DNA copies that terminate at every position in the original DNA sequence
  • these copies are then separated by size and visualized to read off which nucleotide is at each position
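
The logic of reading a sequence from chain-terminated copies can be sketched in Python. This is a toy model, not a simulation of the chemistry: it assumes exactly one copy ends at each position, and the function names (`sanger_ladder`, `read_ladder`) and the example sequence are made up for illustration.

```python
import random


def sanger_ladder(newly_made: str) -> list[str]:
    """Return one chain-terminated copy ending at each position."""
    # Each copy stops where a chain-terminating nucleotide was incorporated.
    return [newly_made[:i] for i in range(1, len(newly_made) + 1)]


def read_ladder(fragments: list[str]) -> str:
    """Sort copies by length (as a gel does) and read each terminal base."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


fragments = sanger_ladder("GATTACA")  # hypothetical newly synthesized strand
random.shuffle(fragments)             # the reaction holds the copies in no particular order
print(read_ladder(fragments))         # GATTACA
```

Sorting by length plays the role of the gel, and the last base of each copy plays the role of the label that identifies the terminating nucleotide.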
6
Q

what is the key difference between how sanger sequencing used to work, and how it does now

A
  • originally, four separate sequencing reactions were performed, each with a different dideoxyribonucleotide
  • the DNA copies were labeled with radioactivity and separated on polyacrylamide gels
  • the gels were then exposed to film to produce four ladders of bands that were read manually to reveal the sequence
  • now, robotic devices mix the reagents, including the four different chain-terminating dideoxyribonucleotides, each tagged with a different-coloured fluorescent dye
  • the reaction samples are loaded onto capillary gels, which separate the reaction products into a series of distinct bands
  • a detector then records the colour of each band, and a computer translates the information into a nucleotide sequence
7
Q

Automated dideoxy sequencing was used to determine the nucleotide sequences of which genomes

A
  • E. coli
  • fruit flies
  • nematode worms
  • humans
  • many others
8
Q

Due to _______ the cost of sequencing DNA has decreased dramatically, and the number of sequenced genomes has increased enormously

A

“second-generation sequencing technologies”

9
Q

what do second-generation sequencing technologies allow us to do

A
  • sequence multiple genomes in a matter of weeks
  • catalog the variation in nucleotide sequences from people around the world
  • uncover the mutations that increase the risk of various diseases, from cancer to autism
  • determine the genome sequences of extinct species
  • understand the molecular basis of key evolutionary events in the tree of life
10
Q

what is the most common second-generation sequencing method

A

Illumina sequencing

11
Q

how does Illumina sequencing work

A
  • begins with the construction of libraries of small DNA fragments that together represent the entire genome
  • each fragment is amplified by PCR
  • the amplification is done in a way that keeps all the copies clustered close to the original fragment
  • sequencing uses chain-terminating nucleotides with uniquely coloured fluorescent tags
  • DNA polymerase adds one fluorescent nucleotide
  • a photo of the reaction records the colour, revealing the identity of the nucleotide that was added
  • the coloured label and the chain-terminating group are then removed, allowing the polymerase to add the next nucleotide
  • this cycle is repeated hundreds of times
  • a computer stitches together all the fragment sequences, using the overlaps between them as guides, to reconstruct the full genome sequence
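
The final step, stitching reads together by their overlaps, can be illustrated with a small greedy merge in Python. This is only a sketch under strong assumptions: the reads are error-free, all come from the same strand, and the names `overlap`, `greedy_assemble` and `min_len` are invented for this example. Real assemblers use more sophisticated methods and must cope with sequencing errors and repeats.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0  # no overlap of at least min_len bases


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads that share the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return reads  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)


# Hypothetical reads covering a 15-base stretch, supplied out of order.
print(greedy_assemble(["ACGTTAGC", "ATGGCGTA", "GCGTACGTT"]))
# ['ATGGCGTACGTTAGC']
```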
12
Q

true/false as in conventional dideoxy sequencing, the fluorescent tag and the chemical group that blocks elongation are both removable in Illumina sequencing

A
  • False
  • they are removable in Illumina sequencing, but not in dideoxy sequencing
13
Q

what is special about third-generation sequencing methods

A

capable of sequencing much longer DNA molecules

14
Q

what are the 2 promising third-generation sequencing methods

A
  • single-molecule real-time (SMRT) sequencing
  • Nanopore sequencing
15
Q

describe single-molecule real-time (SMRT) sequencing

A
  • carried out in an array of tiny wells, each containing a single DNA polymerase anchored to the bottom
  • it uses deoxyribonucleoside triphosphates in which the fluorescent dye is attached to the terminal phosphate
  • as the polymerase copies the template DNA, the binding of a fluorescent nucleotide generates a colour signal that identifies it
  • the signal disappears when the terminal phosphate is released during its incorporation
16
Q

true/false it is possible to use circular DNA templates that are sequenced repeatedly on both strands with single-molecule real-time (SMRT) sequencing

A
  • true
  • this greatly improves the accuracy of the resulting sequence
17
Q

describe nanopore sequencing

A
  • involves the transport of a single-stranded DNA molecule through a tiny protein pore in a membrane
  • voltage is applied across the membrane, resulting in current through the pore
  • the passage of the nucleotides through the pore results in tiny shifts in electrical current across the membrane
  • measurement of these tiny current changes reveals the identity of each nucleotide
18
Q

which form of sequencing does not require DNA synthesis

A

nanopore sequencing

19
Q

using which sequencing methods can very long DNAs be sequenced

A
  • SMRT
  • nanopore
20
Q

what advantages does nanopore sequencing have that SMRT sequencing does not

A
  • can identify modified nucleotides, because their effect on the current differs slightly from that of the unmodified nucleotides
  • can be performed with portable, handheld instruments that can be taken into the field
21
Q

in SMRT sequencing, how are circular DNA templates used

A
  • by attaching hairpin adaptor DNAs to each end of the DNA to be sequenced
  • a primer is used that matches the adaptor
  • a strand-displacing polymerase separates the double-stranded DNA as it moves along the template, allowing it to continue around the entire molecule many times
22
Q

in SMRT sequencing with circular templates, what allows the experimenter to eliminate sequence errors that arise from random mistakes made by the polymerase

A

the fact that both strands of the DNA are sequenced repeatedly
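
Why repeated passes cancel random errors can be shown with a simple majority vote in Python. This sketch assumes the passes are already aligned to the same length and that reverse-strand passes have already been converted to the forward orientation. The function name `consensus` and the example reads are illustrative only.

```python
from collections import Counter


def consensus(passes: list[str]) -> str:
    """Take the most common base at each position across aligned passes."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("passes must be non-empty and aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0]  # majority base in this column
        for column in zip(*passes)
    )


# Five hypothetical passes over the same molecule; two contain one random error each.
passes = ["GATTACA", "GATTGCA", "GATTACA", "CATTACA", "GATTACA"]
print(consensus(passes))  # GATTACA
```

Because random mistakes rarely hit the same position in most passes, the majority base at each position is very likely the correct one.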

23
Q

true/false sequencing genomes has gotten more expensive with these new methods

A
  • false
  • it has gotten cheaper
24
Q

how is RNA sequencing done as of right now

A
  • by converting the RNA to cDNA (via reverse transcriptase)
  • and then sequencing the cDNA with one of the DNA sequencing methods covered above
25
what is a valuable tool for annotating genomes
RNA-seq
26
**true/false** long strings of nucleotides, at first glance, reveal nothing about how this genetic information directs the development of a living organism
true
27
what does the process of genome annotation attempt to do
- attempts to mark out all the genes (both protein-coding and noncoding) in a genome and ascribe a role to each
- also tries to understand the more subtle types of genome information
28
what is an example of the more subtle types of genome information
- the *cis*-regulatory sequences that specify the time and place that a given gene is expressed
- whether its mRNA undergoes alternative splicing to produce different protein isoforms
29
what is the first step in trying to make sense of a genome sequence
to translate *in silico* the entire genome into protein
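
As a rough illustration of *in silico* translation, the Python sketch below translates one reading frame of a DNA string using the standard genetic code. The function name `translate_frame` and the example sequence are hypothetical. Stop codons are written as `*` and any codon containing an unrecognized character as `X`.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(dna: str, offset: int = 0) -> str:
    """Translate one reading frame, starting at position offset (0, 1 or 2)."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # "X" marks an unrecognized codon
        for i in range(offset, len(dna) - 2, 3)
    )


print(translate_frame("ATGGCCTAA"))     # MA*
print(translate_frame("ATGGCCTAA", 1))  # WP
```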
30
how many different reading frames are there for any piece of double-stranded DNA
6
31
how many different reading frames are there for any piece of single-stranded DNA
3
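
The counts of 6 and 3 follow from having three possible starting offsets on each strand. A short Python sketch, with hypothetical helper names `reverse_complement` and `reading_frames`, makes this concrete by listing the codons in every frame:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def reading_frames(seq: str, double_stranded: bool = True) -> list[list[str]]:
    """Split each strand into codons at offsets 0, 1 and 2."""
    strands = [seq, reverse_complement(seq)] if double_stranded else [seq]
    frames = []
    for strand in strands:
        for offset in range(3):
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames.append(codons)
    return frames


seq = "ATGGCCTAA"  # made-up example sequence
print(len(reading_frames(seq)))                         # 6
print(len(reading_frames(seq, double_stranded=False)))  # 3
```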
32
what are open reading frames (ORFs)
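long stretches of sequence that can be read in one frame without hitting a stop codon; a random sequence contains a stop codon about once every 20 AA, so protein-coding regions stand out as much longer stop-free stretches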
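
A minimal Python sketch of scanning for such stretches is given below. It checks the three frames of one strand only and reports any stop-free run of at least `min_codons` codons. The function name `stop_free_stretches` and the threshold are assumptions chosen for illustration, and the sketch does not require a start codon, which real gene finders usually consider.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_stretches(seq: str, min_codons: int = 100) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for each stop-free run of >= min_codons codons."""
    found = []
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - run_start) // 3 >= min_codons:
                    found.append((frame, run_start, pos))
                run_start = pos + 3  # the next run begins after the stop codon
            pos += 3
        # A run may also reach the end of the sequence without meeting a stop codon.
        if (pos - run_start) // 3 >= min_codons:
            found.append((frame, run_start, pos))
    return found


# Made-up example: a start codon, five GCC codons, then a stop codon.
example = "ATG" + "GCC" * 5 + "TAA" + "GC"
print((0, 0, 18) in stop_free_stretches(example, min_codons=4))  # True
```

To cover all six frames of double-stranded DNA, the same scan would be run again on the reverse complement of the sequence.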
33
open reading frames (ORFs) often signify what
bona fide protein coding genes
34
how is the determination of an ORF typically double-checked
- by comparing the ORF AA sequence to the many databases of documented proteins from other species
- if a match is found (even an imperfect one), then it is very likely that the ORF codes for a functional protein
35
when does the "double-checking" strategy work best
for compact genomes (where introns are rare and ORFs extend for many hundreds of AA)
36
when does the "double-checking" strategy not work too well
- the strategy works best with compact genomes, so it is less effective when the genome is not compact
- the average exon size is 150–200 nucleotide pairs for many animals and plants, and additional information is usually required to unambiguously locate all the exons of a gene
37
what do we do when the genome is not compact, and we want to locate its genes
- *can search genomes for splicing signals and other features that help identify exons*
- the most powerful method, though, is to sequence all the RNA produced
38
what can RNA-seq information be used to accurately locate
all introns and exons of even complex genes
39
**true/false** RNA-seq identifies noncoding RNAs produced by a genome
true
40
what is the main reason for why we only know the approx. number of genes in the human genome
The existence of the many noncoding RNAs and our relative ignorance of their function
41
we know from __________ that many organisms share the same basic set of proteins
comparative genomics
42
**true/false** the functions of only a few identified proteins remain unknown
- **False**
- the functions of a very large number are unknown
43
approximately how many proteins encoded by a sequenced genome do not clearly resemble any protein that has been studied biochemically
approx. one third
44
what is a key limitation regarding the emerging field of genomics
- comparative analysis of genomes reveals a great deal of information about the relationships between genes and organisms
- BUT it often does not provide immediate information about how these genes function or what roles they have in the physiology of an organism
45