Exam 1 Study Flashcards

(125 cards)

1
Q

1871

A

Friedrich Miescher identified the presence of ‘nuclein’

2
Q

1953

A

James Watson and Francis Crick, with Rosalind Franklin and Maurice Wilkins, discover the double helix structure of DNA

3
Q

1977

A

Frederick Sanger develops a DNA sequencing technique

4
Q

1983

A

Kary Mullis develops the polymerase chain reaction (PCR), a technique used to amplify DNA

5
Q

1987

A

The term "genomics" is first used in scientific literature

6
Q

1990

A

Human Genome Project is launched

7
Q

2003

A

Human Genome Project is completed

8
Q

2007

A

Illumina “next-generation” sequencer is available

9
Q

Number of bases in a haploid cell

A

3 billion (about 3 Gb)

10
Q

How much of the mammalian genome is coding

A

2%

11
Q

Mitochondrial DNA inheritance

A

strictly maternal

12
Q

Chromatin

A

DNA with a protein scaffolding
DNA is wrapped around histones

13
Q

Constitutive heterochromatin

A

inactive

14
Q

Centromeres are used by the cell

A

during cell division to make sure that each daughter cell gets a copy of each chromosome

15
Q

Centromeres are

A

highly repetitive

16
Q

Telomeres are located _____ and do what

A

at the ends of chromosomes
protect the ends of the chromosomes

17
Q

Repetitive DNA is

A

Tandem
Interspersed
Segmental duplications

18
Q

LINEs are what percent of the genome

A

17%
each around 5-6 kb long

19
Q

SINEs are what percent of the genome

A

11%
each <500 bp long

20
Q

Cytoplasmic genome

A

circular
uniparental inheritance
small compared to nuclear
thousands of copies per cell
heteroplasmy

20
Q

Segmental duplications

A

low copy repeats
blocks that range from 1 to 400 kb in length
occur at more than one spot in the genome
and typically share a high level of sequence identity
about 5% of the human genome

21
Q

Centromeres are how many bases

A

100s of kb to Mb

22
Q

Telomeres are how many bases

A

10s of kb

23
Q

Three parts of DNA or RNA

A

Pentose sugar
nitrogenous base
Phosphate group attached to the 5’ carbon

24
DNA uses what sugar; RNA uses what sugar
DNA: deoxyribose; RNA: ribose
25
Purines
Adenine, Guanine
26
Pyrimidines
Cytosine, Thymine, Uracil
27
The phosphate group allows
two nucleotides to be linked; this creates the stream of information that DNA encodes
28
5' to 3' linkage between a phosphate group of one nucleotide and the 3' carbon of the next nucleotide's sugar
phosphodiester bonds
29
The two ends of the polynucleotide chain are
not the same: the 5' end has a phosphate group attached to the 5' carbon of the pentose sugar; the 3' end has a hydroxyl group
30
The polynucleotide chain has
polarity (5' to 3' ends)
31
A-T bond has how many H bonds
2
32
C-G bond has how many H bonds
3
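The two cards above (2 H bonds for A-T, 3 for C-G) can be turned into a quick count for a whole duplex. This is only a sketch: the function name `count_h_bonds` is made up for illustration, and it assumes a perfectly paired DNA duplex given as one strand.

```python
# Hydrogen bonds contributed by each base pair, keyed by the base on one strand
# (A pairs with T: 2 bonds; C pairs with G: 3 bonds).
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def count_h_bonds(strand: str) -> int:
    """Total H bonds in a fully paired duplex, given one of its strands."""
    return sum(H_BONDS[base] for base in strand.upper())


print(count_h_bonds("ATGC"))  # 2 + 2 + 3 + 3 = 10
```

GC-rich sequences score higher, which is one way to remember why they are harder to separate.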
33
A-U bond is in
RNA
34
Watson and Crick investigated the structure of DNA not by collecting new data but by
using all the available information about chemistry of DNA to construct molecular models
35
DNA Structure-3 main points
double helix; strands are antiparallel; bases are complementary
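Antiparallel plus complementary means the partner strand, read 5' to 3', is the reverse complement. A minimal sketch (the name `reverse_complement` is just an illustrative helper, and it assumes an unambiguous A/C/G/T sequence):

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    # Complement every base, then reverse because the strands are antiparallel.
    return seq.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```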
36
What type of bond is between base pairs
H bonds; weak enough to be broken so the strands can be separated and then used
37
DNA strands are arranged helically with ___ base pairs per turn of the helix
10
38
Raw materials of DNA synthesis
Template: single-stranded DNA; Enzymes: DNA polymerase; Raw materials (substrate): dNTPs; Mg2+ ions
39
DNA polymerase does what
catalyzes the formation of phosphodiester bonds; joins the 3'-OH group of the last base in the DNA chain to the incoming 5'-phosphate of a dNTP
40
Synthesis is what direction
5' to 3'
41
dNTP is
selected by the DNA polymerase using the opposing base on the template strand
42
Key features of DNA replication in Eukaryotes
occurs in the nucleus during S phase of the cell cycle; is initiated by RNA primers; occurs in the 5' to 3' direction; is semiconservative; is initiated at the same time at many points along the chromosome; heterochromatin replicates later than euchromatin
43
All DNA polymerases require a
free 3' OH
44
Gyrase makes double-strand (ds) breaks to
relieve torsional strain
45
Helicase breaks
H bonds between bases
46
SSB proteins protect
free (single-stranded) DNA; prevent secondary structure
47
Packaging of newly replicated DNA
histones must first disassemble to allow DNA synthesis (old histones are reused); synthesis of new histones is coordinated with DNA synthesis; then reassembled into new chromosomes
48
1952
Hershey-Chase experiments are carried out by Alfred Hershey and Martha Chase to demonstrate that DNA, rather than protein, carries our genetic information
49
How many years from the identification of nuclein to the demonstration of DNA as the genetic material
81 years
50
How many years from the first sequencing method to HGP start
13 years
51
2001
first draft of the human genome sequence released (3 Gb)
52
How many years from the first sequenced bacterial genome to the Human Genome Project
8 years
53
Celera vs Human Genome Project
HGP: clone-by-clone approach; Celera: whole-genome shotgun
54
2007
Solexa 1G sequencer is available (next-generation sequencing)
55
Order of bases
base pair, kilobase, megabase, gigabase, terabase, petabase
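Each step in that order is a factor of 1,000. A small sketch for converting a raw base count into the largest sensible unit (the helper name `format_bases` and the abbreviations bp/kb/Mb/Gb/Tb/Pb are illustrative choices, not from the deck):

```python
# Units in increasing order; each is 1,000 times the previous one.
UNITS = ["bp", "kb", "Mb", "Gb", "Tb", "Pb"]


def format_bases(n_bases: int) -> str:
    """Express a base count in the largest unit that keeps the value >= 1."""
    value = float(n_bases)
    for unit in UNITS[:-1]:
        if value < 1000:
            return f"{value:g} {unit}"
        value /= 1000
    # Anything that survives every division is reported in the largest unit.
    return f"{value:g} {UNITS[-1]}"


print(format_bases(3_000_000_000))  # 3 Gb (haploid human genome)
print(format_bases(150))            # 150 bp (a typical short read)
```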
56
Moore's law
the number of transistors incorporated in a chip will approximately double every 24 months
57
Number of copies of target
N × 2^c (N = starting copies, c = number of cycles)
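The copy-number formula is easy to sanity-check in code. This sketch assumes ideal PCR, where every cycle doubles the target exactly; real reactions fall short of that. The name `pcr_copies` is illustrative.

```python
def pcr_copies(n_start: int, cycles: int) -> int:
    """Target copies after `cycles` rounds of perfect doubling: N * 2^c."""
    return n_start * 2 ** cycles


print(pcr_copies(1, 30))    # 1073741824, about a billion copies from one molecule
print(pcr_copies(100, 10))  # 102400
```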
58
Requirements for DNA replication
DNA template, DNA polymerase, nucleotides, primers
59
PCR three steps
Denaturation, Annealing, Extension
60
Keys to PCR success
primer specificity, annealing temperature, Mg2+ concentration
61
Limitations of PCR
size, base complexity, secondary structure
62
Sanger sequencing uses
4 tubes, one for each base
63
ABI sequencing uses
one tube with 4 fluorescent labels
64
Key components needed for transcription
DNA template; the raw materials (ribonucleotide triphosphates); transcription apparatus
65
What has to happen to the DNA in order for a gene to be transcribed
uncoiling
66
DNA molecules undergoing transcription exhibit ___
Christmas tree-like structures
67
Regulatory regions determine
what, when, where, how much
68
Regulatory promoters are
upstream of the core promoter; affect the rate of transcription
69
mRNAs have a
5' cap and 3' poly A tail
70
Most eukaryotic organisms have
introns (non-coding regions of DNA)
71
In eukaryotes, intron size and number are related to
organism complexity
72
Introns have
regulatory roles and are longer than exons
73
In order to have collinearity, introns are
spliced out by snRNPs in a spliceosome
74
All sequences in DNA that are transcribed into a single RNA molecule
a gene
75
How many bases per codon are needed to distinguish 20 amino acids
3 (4^2 = 16 combinations is too few; 4^3 = 64 is enough)
76
The genetic code is ____, which means more than one codon can specify the same amino acid
degenerate
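Degeneracy follows from simple counting: triplets of 4 bases give 64 codons, more than the 20 amino acids, so some amino acids must have several codons. A short sketch that enumerates the codons and uses leucine's six codons as one example (variable names are illustrative):

```python
from itertools import product

BASES = "ACGU"  # RNA alphabet

# Every possible triplet: 4 * 4 * 4 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))  # 64

# Leucine is specified by six different codons, a standard example of degeneracy.
LEUCINE_CODONS = ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"]
print(all(codon in codons for codon in LEUCINE_CODONS))  # True
```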
77
sequencing is a
tool to be applied to address a question
78
4 basic steps of Illumina Sequencing
1) sample prep 2) cluster generation 3) sequencing 4) data analysis
79
Library fragment size has
downstream implications for analysis
80
Patterned flow cells give
faster scan times due to ordered cluster positions; less cluster overlap; more clusters
81
Sequence by synthesis
one nucleotide is added at a time
82
Problems with sequence by synthesis
very accurate, but sometimes the dye is not cleaved off, so both colors are seen; quality then degrades the longer the read is
83
In Illumina sequencing the dye is
covalently bonded to the base
84
Illumina sequencing is based on
reversible terminator chemistry; sequencing by synthesis
85
Types of color coding for sequencing
4 channel: each nucleotide has its own color; 2 channel: uses two colors (A is green and pink, G has none, T is green, C is pink); 1 channel: will be discussed later
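The 2-channel scheme on the card above is effectively a lookup table from (green, pink) signal to base. A toy sketch, assuming clean on/off signals per cycle; the names `TWO_CHANNEL` and `call_bases` are made up for illustration, and real base callers work from noisy intensities:

```python
# (green detected, pink detected) -> base, following the card above.
TWO_CHANNEL = {
    (True, True): "A",    # both colors
    (False, False): "G",  # no signal
    (True, False): "T",   # green only
    (False, True): "C",   # pink only
}


def call_bases(signals) -> str:
    """Decode one base per cycle from (green, pink) on/off pairs."""
    return "".join(TWO_CHANNEL[(bool(green), bool(pink))] for green, pink in signals)


print(call_bases([(1, 1), (0, 0), (1, 0), (0, 1)]))  # AGTC
```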
86
Error for Illumina Sequencing
clusters start to condense, so there is less resolution; occurs due to physical properties of SBS; equality differences; SNPs
87
How much can Illumina NovaSeq X sequence
1600 Gb
88
How does Ultima Genomics work
like a DVD or CD: wells that are spun around and read by a laser
89
The ability to resolve a repetitive sequence is dependent on
the length of the molecules in your library
90
Long Read Technology
Oxford Nanopore (ONT): protein nanopores; Pacific Biosciences (PacBio): SMRT; Bionano Genomics: optical maps; proximity ligation: assembly
91
Nanopore uses and can do how much for what price
biological nanopores; 10-20 Gb in <24 hours; around 600 dollars
92
Nanopores have a diameter that is
in the same scale as many single molecules, including DNA
93
How does nanopore sequencing work
the nanopore is embedded in a membrane that current cannot travel through; the nanopore creates a hole and the current drives things through the pore; the change in electrical current is then measured to determine which nucleotide it is; each nucleotide has a different structure, so it creates a different electrical current
94
Nanopore can sequence how much
400 bp/sec
95
nanopore errors
homopolymers
96
Nanopore accuracy
>99%
97
Which type of sequencing can detect base modification
nanopore
98
PacBio uses what type of sequencing
SMRT (single-molecule real-time)
99
PacBio uses what to do its sequencing
nano-wells called zero-mode waveguides (ZMWs); polymerase bound to the bottom of the ZMW; phospholinked nucleotides; light from nucleotide cleavage is detected as the polymerase processes DNA
100
PacBio mean read length
>20 kb with a moderate error rate
101
Process of PacBio
start with high-quality double-stranded DNA; prepare SMRTbell libraries; anneal primers and bind DNA polymerase; circularized DNA is sequenced in repeated passes; the polymerase reads are trimmed of adapters to yield subreads; consensus and methylation status are called from subreads
102
PacBio sequencing rate
10 bp/second
103
PacBio errors
homopolymers and indels
104
PacBio accuracy
>99%
105
Error types of Illumina Oxford Nanopore PacBio
Illumina: SNPs; Oxford Nanopore: homopolymers; PacBio: homopolymers and indels
106
HMW DNA is fluorescently labeled at
known sequence motifs
107
Bionano genomics process
HMW DNA is fluorescently labeled at known sequence motifs; DNA is stretched through nanochannels, then imaged; creates a map of those sequence motifs; NOT SEQUENCING
108
HiC
orders and orients contigs (sets of DNA segments or sequences that overlap in a way that provides a contiguous representation of a genomic region)
109
PacBio characteristics
increasing read lengths + increasing throughput means decreasing cost; 20-30 kb reads; per-base error rate is 10-15%; most popular long-read technology
110
Oxford Nanopore characteristics
extremely long reads but relatively few; expensive; no upper limit on size; huge potential; most promising long-read technology in a few years
111
bionano optical maps characteristics
inexpensive; significant improvement of genome assembly; both short and long reads
112
FASTA has how many parts and what are they
Two: 1) > sequence name 2) sequence
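The two-part FASTA layout is simple enough to read by hand. A minimal sketch, assuming plain text input where the sequence may wrap over several lines; `parse_fasta` and the record names are hypothetical, and a maintained library is the better choice for real work:

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each sequence name (the text after '>') to its full sequence."""
    records: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Part 1: the header line gives the sequence name.
            name = line[1:]
            records[name] = ""
        elif name is not None:
            # Part 2: sequence lines are joined back together.
            records[name] += line
    return records


example = ">chr1\nACGT\nGG\n>chr2\nTT\n"
print(parse_fasta(example))  # {'chr1': 'ACGTGG', 'chr2': 'TT'}
```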
113
FASTQ has how many parts and what are they
Four: 1) @sequence name 2) sequence 3) + some other info 4) quality value (Phred scale using ASCII)
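A FASTQ record is those four lines in order. The sketch below splits one record into its parts and checks the markers; `parse_fastq_record` and the example read are hypothetical, and it assumes one record is exactly four lines (no wrapped sequences):

```python
def parse_fastq_record(lines) -> dict[str, str]:
    """Split one four-line FASTQ record into name, sequence, and quality."""
    if len(lines) != 4:
        raise ValueError("a FASTQ record must have exactly four lines")
    name_line, sequence, plus_line, quality = (line.rstrip("\n") for line in lines)
    if not name_line.startswith("@"):
        raise ValueError("line 1 must start with '@'")
    if not plus_line.startswith("+"):
        raise ValueError("line 3 must start with '+'")
    if len(sequence) != len(quality):
        raise ValueError("there must be one quality character per base")
    return {"name": name_line[1:], "sequence": sequence, "quality": quality}


record = parse_fastq_record(["@read1\n", "ACGT\n", "+\n", "IIII\n"])
print(record)  # {'name': 'read1', 'sequence': 'ACGT', 'quality': 'IIII'}
```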
114
FASTA is used when
quality is not needed; presents only the sequence itself; e.g. chromosomes, gene structures
115
FASTQ is used
when quality is needed; e.g. sequence reads
116
Differences between FASTA and FASTQ
quality is included in FASTQ as an ASCII-coded quality value
117
At a given position in a sequence, the base present is either A/C/G/T but we
cannot directly observe that base.
118
The base that is produced from a DNA sequencer is an observation based on some biochemical/physical property that has
error
119
QPhred=
-10 log10 P(error)
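The Phred formula and its inverse are one-liners, and the same scale explains the ASCII quality line in FASTQ. In this sketch the function names are illustrative, and the ASCII offset of 33 (the common "Phred+33" encoding) is an assumption; some older files use a different offset.

```python
import math


def phred_from_error(p_error: float) -> float:
    """Q = -10 * log10(P(error))."""
    return -10 * math.log10(p_error)


def error_from_phred(q: float) -> float:
    """Inverse of the formula above: P(error) = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_quality(quality: str, offset: int = 33) -> list[int]:
    """Turn an ASCII quality string into Phred scores (assumes Phred+33)."""
    return [ord(char) - offset for char in quality]


print(phred_from_error(0.001))  # about 30: a 1-in-1,000 chance the base is wrong
print(error_from_phred(20))     # about 0.01
print(decode_quality("I!"))     # [40, 0]
```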
120
FastQC is
one of many software tools to evaluate quality; it does not actually do any filtering; provides summary metrics and visuals
121
important metrics
base quality, adapter content
122
K-mers
a polymer (stretch of sequence) of length k; k can be any integer
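Counting k-mers means sliding a window of length k along the sequence one base at a time. A minimal sketch (the name `count_kmers` is illustrative; it counts on one strand only and does not merge reverse complements):

```python
from collections import Counter


def count_kmers(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in seq."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # A sequence of length L contains L - k + 1 overlapping k-mers.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


counts = count_kmers("ATATA", 2)
print(counts["AT"], counts["TA"])  # 2 2
```

Tallying how many k-mers occur once, twice, and so on gives the kind of distribution used to spot likely errors, since error-containing k-mers tend to be rare.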
123
what are k-mers used for
to make distributions and estimate errors
124
sequence reads are typically how long
150 bp or less