Topic D & E. DNA Sequencing... Functional Genomics Flashcards
(121 cards)
What are these large regions of chromosomes that maintain homology between grape and poplar termed? Describe an approach you would use to further characterize these regions.
Syntenic Regions
I will look within these highly conserved regions to see what proteins they code for, which would indicate their functionality.
What are the genes connected by lines between grape and poplar known as? Describe a computational approach used to define these types of genes between organisms.
Orthologs
We can use a computational approach such as sequence alignment: two genes in different species whose sequences are very similar are candidate orthologs. Example tools: BLASTZ, ClustalW.
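One way to turn alignment scores into ortholog calls is the reciprocal-best-hit heuristic, which is a common convention and not something the card itself specifies. The sketch below assumes the scores have already been produced by an alignment tool; the gene names and score values are hypothetical.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` is a dict {(query, subject): alignment_score}.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    grape_vs_poplar = {
        ("grapeA", "poplarX"): 950,
        ("grapeA", "poplarY"): 400,
        ("grapeB", "poplarY"): 700,
    }
    poplar_vs_grape = {
        ("poplarX", "grapeA"): 940,
        ("poplarY", "grapeA"): 410,
        ("poplarY", "grapeB"): 690,
    }
    print(reciprocal_best_hits(grape_vs_poplar, poplar_vs_grape))
```

Requiring the best hit in both directions is a simple guard against calling a paralog an ortholog, although it can still be misled by gene loss or recent duplication.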
If there were lines connecting genes within grape what would these genes be known as? Describe a computational approach used to define these types of genes within an organism.
Paralogs
We can use a homology search within the genome, for example with a profile HMM.
Three years after the human genome was declared essentially finished, gaps in the sequence persist. Describe briefly 3 reasons for the remaining gaps in the euchromatic region of the genome. Do you think it is possible with current technology to close the heterochromatic gaps? Why or why not?
Tandem repeats, non-uniquely mapping reads, and structural variations. We need longer reads to close the gaps.
What sequence features or genetic properties might be associated with these gaps? How might they be causing the gaps?
Repeats: it is hard to determine how long a repeat region is when reads fall entirely within it. Heterochromatic regions: it is hard to obtain the sequence at all because the DNA does not dissociate well.
Acquisition and mapping of fosmid end sequences derived from unrelated individual genomes to the current human reference sequence forms the basis for the human Structural Variation Project. What kinds of important genetic information might one expect to discover from this analysis? Give 3 examples.
CNVs, inversions, translocations, and SNPs
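The idea behind mapping fosmid end sequences can be sketched as a classification of each end pair by how it lands on the reference. Fosmid inserts are roughly 40 kb, so a pair that maps too far apart, too close together, in the wrong orientation, or to different chromosomes hints at a structural difference. The `EndPair` type, the labels, and the 32 to 48 kb window below are hypothetical choices for illustration, not values from the project.

```python
from dataclasses import dataclass


@dataclass
class EndPair:
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair, min_span=32_000, max_span=48_000):
    """Label one mapped end pair; the span thresholds are assumed values."""
    if pair.chrom1 != pair.chrom2:
        return "possible translocation"
    if pair.strand1 == pair.strand2:
        # Ends of one clone should map to opposite strands.
        return "possible inversion"
    span = abs(pair.pos2 - pair.pos1)
    if span > max_span:
        # The reference holds more sequence between the ends than the clone.
        return "possible deletion"
    if span < min_span:
        # The clone holds more sequence between the ends than the reference.
        return "possible insertion"
    return "concordant"


if __name__ == "__main__":
    example = EndPair("chr1", 100_000, "+", "chr1", 140_000, "-")
    print(classify_pair(example))
```

Deletions and insertions are reported relative to the reference sequence. In practice several independent clones supporting the same discordance would be required before calling a variant.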
Whole genome shotgun sequencing strategy:
An approach to genome sequencing where the whole genome is sheared into sequenceable fragments and then computationally assembled. All sequencing is done ahead of time using PCR products, to form shotgun libraries of sequence reads.
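The computational assembly step can be illustrated with a toy greedy overlap merger: repeatedly join the two reads with the longest suffix-to-prefix overlap. This is only a minimal sketch assuming error-free reads on a single strand with no read contained inside another; real assemblers also handle sequencing errors, reverse complements, repeats, and mate-pair constraints. The function names and the `min_len` parameter are illustrative.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b`, or 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by best overlap; returns the remaining contigs."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return reads  # no overlaps left: what remains are the contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

When the reads do not all overlap, the function returns several contigs, which is the toy analogue of gaps in an assembly.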
Clone-by-clone sequencing strategy:
An alternative to WGS in which a divide-and-conquer approach is used. First, create genomic libraries of clones immortalized in vectors such as BACs. Ideally you want 5-10x redundancy of genomic coverage in your libraries. Then form a tiling path by end-sequencing clones and aligning overlapping fragments. In so doing, you will be able to quantify gaps where clones lack coverage. You then sequence individual clones along the tiling path and assemble contigs spanning the genome. Finally, work on finishing the sequence and plugging gaps.
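The 5-10x redundancy target can be related to expected gaps with a simple idealization that is not part of the card: if clone start positions are random, the number of clones covering a given base is approximately Poisson-distributed, so the expected uncovered fraction at fold coverage c is e^(-c). The genome size, clone count, and insert length in the example are hypothetical.

```python
import math


def fold_coverage(n_clones, mean_insert_bp, genome_bp):
    """Fold redundancy: total cloned bases divided by genome size."""
    return n_clones * mean_insert_bp / genome_bp


def fraction_uncovered(coverage):
    """Expected fraction of bases in no clone, assuming Poisson coverage."""
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical library: 200,000 clones of 150 kb over a 3 Gb genome.
    c = fold_coverage(200_000, 150_000, 3e9)
    print(f"coverage = {c:.1f}x, expected uncovered fraction = {fraction_uncovered(c):.2e}")
```

Under this model the uncovered fraction falls off exponentially with coverage, which is one way to rationalize aiming for several-fold redundancy. Cloning bias and repeats make real libraries less uniform than the model assumes.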
Hybrid sequencing strategies:
A combination of clone-by-clone and WGS, which was used for the mouse and chicken genome projects. Such a compartmentalized shotgun could, for example, break the genome up into chromosomes and then do shotgun sequencing on each chromosome. Probably the best of both worlds, as many genome projects are now adopting a combined approach.
Draft Sequence:
Sequence with an error rate of 10^-3 → Q = 30
Finished Sequence:
Sequence with an error rate of 10^-4 → Q = 40
Segmental Duplications:
Duplicated genomic segments > 1 kb long with > 90% sequence similarity.
Q-value:
-10 log10(p), where p = the error rate (or probability of an error)
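The formula can be checked numerically with a short sketch. It assumes a base-10 logarithm, which is the reading consistent with an error rate of 10^-3 giving Q = 30 and 10^-4 giving Q = 40. The function names are illustrative.

```python
import math


def phred_quality(error_probability):
    """Q = -10 * log10(p) for an error probability p in (0, 1]."""
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(error_probability)


def error_probability(q):
    """Inverse of the formula: p = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


if __name__ == "__main__":
    for p in (1e-2, 1e-3, 1e-4):
        print(f"p = {p:g} -> Q = {phred_quality(p):.0f}")
```

Each additional 10 units of Q corresponds to a tenfold lower error probability.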
Mate-pair sequences:
A pair of sequences derived from the two ends of a single clone. An essential component of shotgun sequencing, as the distance between the pairs gives spatial information and assists in resolving repeats.
BAC end sequences:
Used to establish mate pairs and construct the tiling path in clone-by-clone sequencing.
mRNA sequences:
Messenger RNA. Eukaryotic transcribed sequences that have been processed (i.e., spliced and exported out of the nucleus).
EST sequences:
Expressed Sequence Tags: a sequenced piece of cDNA that may not span the whole cDNA transcript. cDNA library generation uses primers to the poly(A) tail of the mRNA transcript, and a single sequencing trace is usually performed toward the 5' portion of the gene (all of this is done on the complementary strand).
STS:
Sequence Tagged Site: any sequenced fragment of DNA derived from a library of clones that is placed on the physical map of the genome. Each STS is unique, and its primers, PCR conditions, and product size are immediately quantifiable and storable in a database. Fundamental to the HGP.
Microsatellites:
Stretch of repetitive DNA made up of a variable number (several to one hundred or more) of tandem repeats of a small number of nucleotides, e.g. (AG)n or (CAG)n. Highly polymorphic (in n at least) and heterozygous; they occur at around several per hundred kilobases in higher eukaryotes.
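Perfect repeats such as (AG)n or (CAG)n can be located with a regular-expression backreference. This is a minimal sketch: the function name, the 2 to 3 nt motif range, and the minimum of 4 copies are arbitrary illustrative choices, and imperfect repeats are not detected.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=3, min_repeats=4):
    """Return a list of (start, motif, n_repeats) for perfect tandem repeats."""
    # Group 1 is the repeat unit (shortest tried first); \1 demands more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    print(find_microsatellites("TTAGAGAGAGAGCC"))
```

The motif is reported in whatever phase the scan first matches, so a (CAG)n tract preceded by a G may come back as GCA. Comparing n at the same locus across individuals is what makes these markers polymorphic.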
SNP
Single Nucleotide Polymorphisms. Useful for mapping phenotype to gene. Highest resolution of the polymorphic markers (about 1 per kb).
Meiotic Linkage Maps:
Linkage maps based on natural meiotic breaks from homologous recombination.
Radiation Hybrid Maps:
Linkage maps based on chromosomal breaks induced by X-ray irradiation. The fragmented chromosomes are then exposed to hamster cell lines, and fragments either become incorporated into the hamster chromosomes (via homologous recombination) or segregate as minichromosomes.
Cytogenetics:
Study of chromosomes and the related disease states caused by numerical and structural chromosome abnormalities. FISH is especially used in cytogenetics.
FISH
Fluorescence In Situ Hybridization. Hybridize a fluorescent DNA probe to mitotic chromosomes at metaphase. Used in "chromosome painting", where one species' chromosomes are labeled and synteny with another species is sought.
BACs
Bacterial Artificial Chromosomes. A system to clone approximately 100 kb of DNA into bacteria.
Clone-based Physical Maps:
Assembled genomic sequence based on hierarchical sequencing of clone libraries. Contig alignment to chromosomes.
Euchromatin
Open, active DNA with genes being actively transcribed. Classically associated with acetylation of histones by histone acetyltransferases (HATs).