Sorefan Flashcards
(54 cards)
What is whole genome sequencing?
- complete genome seq of organism at single time
- inc seq of chromosomal DNA and mito/chloro etc DNA
What are the challenges for genome sequencing?
- NA extraction from cells –> needs high quality and conc
- fragmentation
- sub-fractionation size selection –> to isolate fragments of correct size
- separating indiv molecules
- amplification of signal
- reading signal
- data analysis
What were the 3 phases of human genome project?
- genetic and physical maps of human and mouse, seq yeast and worm
- -> technology dev
- draft seq –> inc many gaps and errors
- finished seq –> fill in gaps and correcting errors
How are genetic maps made?
- analyse genetic distance between genes by measuring recombination freq
- markers rely on variation of seq between parents and individuals
- distance measured in centimorgans (cM), where 1 cM = 1% recombination freq
- mostly PCR based, eg. polymorphisms in genes and DNA markers
- linkage map by looking at relative distances of 2 or more polymorphic genes and measuring RFs
- DNA markers superseded phenotypic markers
- DNA based mol markers could be RFLPs
- -> methods to analyse are slow so moved onto using SSLPs as easy to analyse w/ PCR
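Worked sketch of the recombination freq to map distance step above. The function names and the offspring counts are made up for illustration, and the 1% RF = 1 cM conversion is only a fair approximation for closely linked markers (double crossovers make it underestimate longer distances).

```python
def recombination_frequency(recombinants: int, total: int) -> float:
    """Fraction of scored offspring that are recombinant for two markers."""
    if total <= 0:
        raise ValueError("total offspring must be positive")
    return recombinants / total


def map_distance_cm(recombinants: int, total: int) -> float:
    """Approximate genetic distance, taking 1% recombination as 1 cM."""
    return 100.0 * recombinants / total


# Hypothetical cross: 12 recombinant offspring out of 200 scored
print(recombination_frequency(12, 200))
print(map_distance_cm(12, 200))  # 6.0
```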
What are SSLPs?
- simple seq length polymorphisms
- repeat regions in genome that vary in length between pops
- usually mini and microsatellite seqs
What are minisatellites?
- repeat units up to 25bp
- not spread evenly around genome, mostly at telomeric regions
- several kb long
- difficult to PCR
What are microsatellites?
- usually di or trinucleotide repeats
- few 100 bases long
- easy to PCR
- 650,000 in genome
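Rough sketch of how di/trinucleotide repeats like those above could be spotted in a seq. The function name, the `min_repeats` threshold and the example seqs are assumptions for illustration, not a standard tool.

```python
import re
from typing import List, Tuple


def find_microsatellites(seq: str, min_repeats: int = 5) -> List[Tuple[int, str, int]]:
    """Return (start, repeat_unit, copy_number) for tandem 2-3 bp repeats.

    Assumption: a unit must occur at least `min_repeats` times in a row.
    """
    # Group 1 is the repeat unit; \1{n,} demands n further tandem copies of it
    pattern = re.compile(r"([ACGT]{2,3}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Hypothetical seqs with a (CA)8 and a (CAG)6 repeat
print(find_microsatellites("GGTT" + "CA" * 8 + "TTGA"))
print(find_microsatellites("TT" + "CAG" * 6 + "TT"))
```

Alleles differing in copy number give PCR products of different lengths, which is what makes these usable as SSLP markers.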
Why are genetic maps in humans limited?
- large pops of siblings don’t exist, so limited no. recombination events to study
- recombination events not at random genome positions –> recombination hotspots
How are physical maps created?
- restriction mapping locates relative positions on DNA molecule of recognition seqs for REs
- FISH = map marker locations by hybridising probe containing marker to intact chromosomes
- STS = map positions of short seqs by PCR
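Toy sketch of the restriction mapping idea above: locate recognition seqs on a known linear molecule and work out the fragment sizes a digest would give. Function names and the example seq are invented; EcoRI (GAATTC, cutting after the first G) is used as the example enzyme.

```python
from typing import List


def restriction_sites(seq: str, site: str) -> List[int]:
    """0-based start positions of every occurrence of the recognition seq."""
    seq, site = seq.upper(), site.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions


def fragment_lengths(seq: str, site: str, cut_offset: int) -> List[int]:
    """Fragment sizes after complete digestion of a linear molecule.

    cut_offset = bases from the start of the site to the cut (1 for G^AATTC).
    """
    cuts = [p + cut_offset for p in restriction_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 21 bp molecule carrying two EcoRI sites
dna = "AAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
print(restriction_sites(dna, "GAATTC"))
print(fragment_lengths(dna, "GAATTC", cut_offset=1))
```

In a real mapping experiment the logic runs the other way: fragment sizes are measured on a gel and the site positions are inferred from single and double digests.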
What are the advantages of creating BAC libraries from indiv chromosomes?
- BAC clone library can be used to seq genome
- BACs w/ inserts from each chromosome could be shared across consortium
How is genome sequencing carried out clone by clone?
- extract DNA
- fragment DNA
- -> ideally completely random so no parts missed out
- -> by physical methods = sonication, hydrodynamic shearing
- -> by enzymatic methods = restriction enzymes and transposase
- -> by chemical methods (mostly used to fragment RNA) = heat and divalent cation (Zn and Mg)
- size selection –> gel electrophoresis
- clone 100-200kbp fragments into BAC plasmids to create library
- transformation of bacteria for BACs
- pick indiv colonies and extract vector (each tube has many copies of indiv DNA insert)
How are clones positioned on genetic and physical maps?
- test clones for PCR markers w/ known locations
- BAC end sequencing using Sanger
- -> known seq so can design primer
- -> denature vector and Sanger seq
- -> design primer to reverse strand to seq other direction
- -> end seqs from same insert, so are paired end reads
Why are paired end reads useful?
- can physically link 1 end of seq w/ another, so can be used to resolve seq gaps
How is it decided which BAC has insert next to insert of interest?
- generate contiguous set of clones
- if any of BACs inc end seq, then insert they contain must be next to it
- test BAC library for end seq from desired vector by PCR
- repeated over and over again until all BACs placed in order on each chromosome
- creates contig
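Toy sketch of the walking procedure above, with a substring check standing in for the PCR screen of the library. The chromosome seq, clone names and `walk_contig` are all hypothetical, and the sketch assumes each end seq turns up in at most one unplaced clone.

```python
from typing import Dict, List


def walk_contig(library: Dict[str, str], seed: str, end_len: int = 4) -> List[str]:
    """Order clones left to right from `seed` by repeated end-seq screening."""
    order = [seed]
    current = seed
    while True:
        end_seq = library[current][-end_len:]  # end seq of current insert
        # Stand-in for PCR screen: which unplaced clones contain this end seq?
        hits = sorted(name for name, insert in library.items()
                      if name not in order and end_seq in insert)
        if not hits:
            return order  # nothing overlaps this end, so the contig stops
        current = hits[0]  # toy assumption: only one unplaced clone overlaps
        order.append(current)


# Hypothetical 30 bp "chromosome" and three overlapping inserts cut from it
chromosome = "ACGTTGCAAGGCTTAACCGGATCTGAGTCA"
library = {
    "bac3": chromosome[14:30],
    "bac1": chromosome[0:12],
    "bac2": chromosome[6:20],
}
print(walk_contig(library, "bac1"))  # ['bac1', 'bac2', 'bac3']
```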
Why was shotgun seq of BAC clones needed?
- as BAC end seq leaves most of middle of genome insert to seq
How was shotgun seq BAC clones carried out?
- each BAC clone broken up into 5-10kb fragments
- cloned into diff vector that accepts smaller inserts
- if seq lots of paired end seqs can assemble large fragment (=consensus seq)
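Minimal sketch of how overlapping shotgun reads can be merged into a consensus seq, using a simple greedy overlap merge. This is an illustration only, not the assembler the sequencing projects used: names, reads and `min_len` are invented, reads are assumed error-free and all on the same strand, and repeats would break it.

```python
from typing import List


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the pair of seqs with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no overlaps left: remaining seqs are separate contigs
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical 30 bp insert and three overlapping reads taken from it
insert = "ACGTTGCAAGGCTTAACCGGATCTGAGTCA"
reads = [insert[16:30], insert[0:12], insert[8:20]]
print(greedy_assemble(reads) == [insert])  # True
```

If the reads do not all overlap, the function returns more than one seq, i.e. separate contigs with gaps between them.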
How did Celera seq human genome?
- fragmented genome into 2-50kbp fragments
- cloned 2, 10 and 50kbp fragments into plasmids to create library
- assemble reads to create consensus seq and seq contigs
- draft genome had 98% bases
Why did the IHGP use clone by clone instead of whole genome shotgun seq?
- to prove feasible for complex repeat rich genome
- assembly easier and could be performed confidently
- could target gaps for finishing
- better suited to diverse international consortium
What needed to be done to finish the human genome?
- fill in sequencing gaps and physical gaps
Why were gaps present in human genome, and how could these problems be solved?
- cloning bias
- no restriction sites –> use diff RE, use physical or chem fragmentation method
- insert unstable –> use diff vector
How were seq gaps closed?
- paired end seqs align to either side of gap
- if gap < 1kbp = PCR across gap
- if gap > 1kbp = sequential seq along insert
How were physical gaps closed if know order of scaffolds?
- physical gap = region absent from all clone libraries
- PCR used to amplify genomic DNA spanning gaps and amplified DNA seq directly w/ or w/o cloning into vector
- PCR products over 3kbp hard to amplify
How were physical gaps closed if don’t know order of scaffolds?
How do we know which pairs of primers are adj and will give product?
- try every poss and look for PCR reaction products using gDNA as template
- in singleplex PCR reaction each combo of primers tested w/ genomic DNA as template
- process sped up w/ multiplex PCR, as multiple pairs of primers tested in single PCR tube, so fewer reactions need to be performed
- use algorithm to decide min no. primer combos
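Back-of-envelope sketch of why trying every primer combo gets expensive and how pooling helps. It assumes one outward-facing primer at each end of every scaffold and that any two primers from different scaffolds could flank a gap; the function names and the pool size of 5 are made up, and this simple division is not the optimising algorithm mentioned above.

```python
from itertools import combinations
from math import ceil
from typing import List, Tuple

Primer = Tuple[str, str]  # (scaffold name, which end)


def candidate_primer_pairs(scaffolds: List[str]) -> List[Tuple[Primer, Primer]]:
    """Every pair of end primers that sit on two different scaffolds."""
    primers = [(s, end) for s in scaffolds for end in ("left", "right")]
    return [(p, q) for p, q in combinations(primers, 2) if p[0] != q[0]]


def reactions_needed(n_pairs: int, pairs_per_tube: int = 1) -> int:
    """Tubes needed if each tube tests `pairs_per_tube` primer pairs."""
    return ceil(n_pairs / pairs_per_tube)


pairs = candidate_primer_pairs(["s1", "s2", "s3", "s4"])
print(len(pairs))                                     # 24 singleplex reactions
print(reactions_needed(len(pairs), pairs_per_tube=5)) # 5 multiplex tubes
```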
Where are repetitive seqs found in genome?
- repetitive seqs make up approx 45% of genome
- mini and microsatellites
- centromeres
- telomeres
- transposons
- duplicated genes