Module 2: Reading the Genome Flashcards
Who is the father of DNA sequencing?
Frederick Sanger
Frederick Sanger’s first project, in 1953, was elucidating the structure of ___________.
He won a Nobel Prize in Chemistry in ______ because he showed that:
Insulin
1958; proteins have defined patterns of amino acid residues
Who is Robert W. Holley?
Robert W. Holley won a Nobel Prize in 1968 for deciphering the structure of alanine transfer RNA (tRNA).
*The first attempts at RNA sequencing were made in the 1960s.
How many years did it take researchers to determine the nucleotide sequence of alanine tRNA?
5.5 years!
It took them 3 years to process 140 kg of yeast to purify 1 g of alanine tRNA.
Then, it took them 2.5 years to sequence.
After Sanger joined the Medical Research Council in 1962 and worked with researchers such as FRANCIS CRICK, two new techniques transformed the field of sequencing in 1976.
What are they?
Chain Terminator (Sanger and Coulson) - DNA polymerase extends a radioactively labelled primer with dNTPs until a ddNTP is incorporated; the terminated fragments are separated on a polyacrylamide gel.
Chemical Cleavage (Maxam and Gilbert) - longer radio-labelled DNA is chemically cut into smaller pieces, which are separated on a polyacrylamide gel.
*Sanger sequencing dominated the field.
Sanger sequencing relies on the use of ddNTPs, also known as chain-terminators.
How are ddNTPs different from dNTPs?
ddNTPs are missing the OH group on the 3’ carbon.
This OH normally reacts with the 5’ phosphate of the incoming nucleotide to form a PHOSPHODIESTER bond that links two NTs together.
Missing OH = can’t add the next NT! Synthesis can’t continue.
How long did it take to sequence one nucleotide before the two new techniques were found?
1 month per nucleotide
Briefly describe how non-automated Sanger sequencing worked.
Four tubes were used, each containing DNA polymerase, dNTPs, template, and primers.
A distinct ddNTP (ddATP, ddCTP, ddGTP, or ddTTP) was present in each of the four tubes. These ddNTPs randomly terminated synthesis at every potential position on the template.
Then, the radioactive gel was run, exposed to X-ray film for 24 hours, and developed.
A couple of days of work would generate 100-500 base pairs of information.
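*The four-tube reaction above can be pictured with a small Python sketch. This is a toy model only: the function name `four_tube_fragments` and the example sequence are made up for illustration, fragment lengths ignore the primer, and every possible termination is assumed to occur.

```python
# Toy model of the four-tube chain-termination reaction.
# `new_strand` is the strand being synthesised, written 5'->3'.
# In the tube containing ddATP, synthesis can stop at any A, and so on.

def four_tube_fragments(new_strand):
    """Return {base: fragment lengths} for the ddA, ddC, ddG and ddT tubes."""
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        # A fragment that terminates here is `position` bases long
        # and ends in the dideoxy version of `base`.
        tubes[base].append(position)
    return tubes

fragments = four_tube_fragments("GATTACA")
for base in "ACGT":
    print(f"dd{base}TP tube: fragment lengths {fragments[base]}")
```

Each tube ends up holding only the fragments that stop at its own base, which is why four separate lanes are needed on the gel.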
What is a base call?
The identity of each base, as derived from analyzing the gel, the graph, etc.
Differentiate the migration direction vs the read direction of a polyacrylamide gel (Sanger sequencing).
Migration direction: largest fragment to shortest
Read direction: shortest fragment to largest
The 5’ end of the base call corresponds to the shortest fragment.
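*Reading a gel from the shortest fragment to the largest can be sketched in Python as follows. The lane data and the function name `call_bases` are hypothetical, and the sketch assumes one clean band per position.

```python
# Toy base calling from a four-lane gel.
# `lanes` maps each ddNTP lane to the fragment lengths (bands) seen in it.

def call_bases(lanes):
    """Read bands from shortest to longest to get the 5'->3' sequence."""
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    bands.sort()  # shortest fragment first = 5' end of the base call
    return "".join(base for _, base in bands)

lanes = {"A": [2, 5, 7], "C": [6], "G": [1], "T": [3, 4]}
print(call_bases(lanes))  # GATTACA
```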
(T/F) The first DNA genome was sequenced in 1977 and there were improvements occurring in 1986.
True!
*The first genome ever sequenced was an RNA genome, in 1976.
*Improvements were made by Leroy Hood, including fluorescent ddNTPs.
How is automated Sanger sequencing different from non-automated?
- Use of fluorescent ddNTPs instead of radioactive
- Perform all four reactions in the same tube (reduces cost, time, automated)
- DNA fragments separated by CAPILLARY electrophoresis (more precise, automated)
- Reads up to ~1kb/day
(T/F) The Department of Energy was seeking data to protect the genome from the mutagenic effects of radiation in 1986. Hence, scientists at the NCHGR proposed to sequence the genome in 1988.
True!
The National Center for Human Genome Research (NCHGR) was led by Dr. James Watson.
Sequencing the genome was thought to be ________, _______, and ________.
Impractical, impossible, overambitious
What were the three challenges of sequencing the human genome? Describe each briefly.
Challenge #1: Reliability
- Traditional gels provided about 100 bp of sequence each. We would need to run 30 million gels for 1x coverage!
Challenge #2: Availability
- Most clones (the templates to be sequenced) were randomly derived and did not cover the entire genome. A library of clones spanning the entirety of the genome had to be generated.
Challenge #3: Assembly
- BIGGEST CHALLENGE!
- Have to fragment the entire genomic DNA into millions of pieces and must put them back in the correct order
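*The 30 million gels figure in Challenge #1 follows from simple arithmetic, checked here with a short Python sketch. It assumes a genome of 3 billion bp and 100 bp read per gel, as stated in these cards; the function name `gels_needed` is made up for illustration.

```python
# Number of gels needed to read a genome at a given coverage.

def gels_needed(genome_bp, bp_per_gel, coverage=1):
    """Ceiling of (genome size x coverage) / bases read per gel."""
    total_bp = genome_bp * coverage
    return -(-total_bp // bp_per_gel)  # integer ceiling division

print(gels_needed(3_000_000_000, 100))  # 30000000
```

Higher coverage scales the count linearly, so 10x coverage under the same assumptions would need 300 million gels.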
What is the International Human Genome Sequencing Consortium (HGP)?
What did they propose?
20 research centres from the UK, USA, France, Germany, China, and Japan came together to form this Consortium.
They proposed to sequence the EUCHROMATIN region of the genome in 15 years with 3 billion dollars.
What were the 5 goals of the HGP Consortium?
- High-resolution genetic map (based on recombination frequencies)
- Physical maps (based on distances) of all human chromosomes and of the DNA of selected model organisms
- Determination of the complete sequence of human DNA and of the DNA of selected model organisms
- Development of capabilities for collecting, storing, distributing, and analyzing the data produced
- Creation of appropriate technologies necessary to achieve these objectives
Who created The Institute for Genomic Research (TIGR)?
Why?
Craig Venter created TIGR.
While at the NIH, he developed Expressed Sequence Tags (ESTs) to identify genes and wanted to patent those genes, but wasn’t allowed to.
What was the faster method of sequencing that Craig Venter developed in TIGR?
Whole genome shotgun sequencing.
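*Shotgun sequencing reads many random overlapping fragments and then relies on a computer to put them back in order. A minimal sketch of that idea in Python uses a greedy overlap merge. This is an illustrative toy, not the assembler actually used by TIGR or Celera; the reads, the function names, and the `min_len` threshold are all assumptions, and real assemblers must also handle sequencing errors, repeats, and both strands.

```python
# Toy greedy assembly: repeatedly merge the two reads with the longest overlap.

def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge reads until no pair overlaps by at least `min_len` bases."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps; remaining contigs stay separate
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))  # ['ATGGCGTACCTGA']
```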
Which genomes did the HGP consortium sequence in 1996, 1997, and 1998?
1996: Yeast (12 Mb)
1997: E. coli (4.7 Mb)
1998: C. elegans (97 Mb)
What is Celera Genomics?
Why was it founded?
What did they propose?
Celera Genomics is a for-profit genomics company that was founded by Craig Venter to patent genes.
It was founded because Craig hated the way the Human Genome Project was managed. NIH rejected funding for his Haemophilus influenzae project, and his group was left out of funding to work on the genome project.
In 1998, Celera Genomics proposed to sequence the human genome within 3 years!
Celera Genomics sequenced which genome in 1999, and what did this do?
They sequenced the D. melanogaster genome (160 Mb) in 1999.
This progress from Celera Genomics pushed the government project to redouble its efforts.
What was the 20th century’s last great scientific contest?
The race to sequence the human genome!
Public vs Private
What is the difference between sequencing DNA and sequencing genomes?
Sequencing DNA: obtaining the NT sequence of a gene or a segment without knowing where it belongs in the genome.
Sequencing genomes: determining the identity of all 3 billion bps in order, from p arm to q arm, of all chromosomes (where does the DNA go?).