Lecture 6 - Establishing genomics platform for crop species Flashcards
Why is comparative genomics important?
- polyploid crops are related to species with simpler genomes
- polyploid crops are hard to work with - complex
- some species with small genomes have been developed as model systems
- lower genetic redundancy aids in the identification of genes
- model species are small and easy to grow
Give some examples of polyploid crops and their related diploid species
- Bread wheat - Brachypodium distachyon (wild Triticum species)
- Oilseed rape - Arabidopsis thaliana (wild Brassica species)
- Potato (wild Solanum species)
- Cotton (wild Gossypium species)
What are the elementary events of gene evolution?
- vertical descent (speciation) with modification
- gene duplication
- gene loss
- horizontal gene transfer
- fusion/fission/rearrangements
Define homologues
Genes sharing a common origin
Define orthologues
Genes originating from a single ancestral gene in the last common ancestor of the compared genomes
Doesn’t mean the functions are equivalent, although they normally are
Define paralogues
genes related by duplication
can exist in different genomes
What are homeologous genes?
orthologous genes in the same species as a result of recent polyploidy
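The definitions above can be sketched as a small decision rule. This is only an illustration, not a standard tool: the function name `classify_relationship` and its arguments are hypothetical, and it assumes you already know which event (speciation or duplication) separated the two homologous genes.

```python
def classify_relationship(event, same_species, from_polyploidy=False):
    """Name the relationship between two homologous genes.

    event: "speciation" or "duplication", the event that separated the genes
    same_species: True if both genes are now found in the same species
    from_polyploidy: True if recent polyploidy brought them back together
    """
    if event == "speciation":
        # Genes split by speciation are orthologues; if polyploidy has
        # reunited them in one species they are called homeologues.
        if same_species and from_polyploidy:
            return "homeologue"
        return "orthologue"
    if event == "duplication":
        # Genes related by duplication are paralogues, whichever genome they are in.
        return "paralogue"
    raise ValueError("event must be 'speciation' or 'duplication'")


# Example: the A and D genome copies of one gene in bread wheat.
print(classify_relationship("speciation", same_species=True, from_polyploidy=True))
```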
Describe the genetics of bread wheat using nomenclature from the relationships of genes
Bread wheat has homeologous genes inherited from its ancestral genomes. An ancestral species underwent a speciation event to give rise to Aegilops, and further speciation gave rise to Aegilops speltoides and Triticum urartu. These hybridised to form the polyploid Triticum turgidum (tetraploid, two genomes A and B), which then hybridised with Aegilops tauschii (diploid, one genome D) to form bread wheat (Triticum aestivum, three genomes A, B and D)
Describe the genetics of brassica using nomenclature from the relationships of genes
Hybridisation of Brassica rapa (2n=20) and Brassica oleracea (2n=18) formed Brassica napus (2n=38)
The polyploid formed from the hybridisation of two species followed by a doubling of chromosomes
The A genome of B. rapa and the C genome of B. oleracea combine to form the AACC genome of B. napus (oilseed rape)
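The chromosome numbers can be checked with a little arithmetic. The sketch below is illustrative only (the function name `allopolyploid_2n` is made up) and assumes each diploid parent contributes a normal gamete carrying half its somatic chromosome number, with doubling afterwards.

```python
def allopolyploid_2n(parent_a_2n, parent_b_2n):
    """Somatic chromosome number (2n) of an allopolyploid from two diploid parents."""
    # Each parent contributes one gamete carrying half its somatic number.
    hybrid = parent_a_2n // 2 + parent_b_2n // 2
    # Chromosome doubling gives every chromosome a pairing partner.
    return 2 * hybrid


# B. rapa (2n=20) x B. oleracea (2n=18): 10 + 9 = 19, doubled to 38.
print(allopolyploid_2n(20, 18))
```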
What is the range of plant genomes in size?
Arabidopsis thaliana: 130 000 000 bp, 14% repetitive, 25 000 genes
Human: 3 Gb
Barley: about 1/3 the size of the wheat genome
Hexaploid bread wheat: 17 Gb (17 000 000 000 bp), 80% repetitive (the repeats arise from transposon amplification and make the genome hard to assemble, because it is hard to work out which parts of the genome overlap with which), 90 000 genes (genes are not a large component of the genome size, but the high copy number makes them hard to target by genetic manipulation)
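To put these figures side by side, the sketch below works out how much of each genome is repetitive and how densely genes are packed. It uses only the numbers quoted in these notes; the dictionary layout and the `summarise` helper are illustrative choices.

```python
# Figures as given in the notes (sizes in base pairs).
genomes = {
    "Arabidopsis thaliana": {"size_bp": 130_000_000, "repeat_fraction": 0.14, "genes": 25_000},
    "Bread wheat": {"size_bp": 17_000_000_000, "repeat_fraction": 0.80, "genes": 90_000},
}


def summarise(name):
    """Return (repetitive base pairs, genes per Mb) for one genome."""
    g = genomes[name]
    repetitive_bp = g["size_bp"] * g["repeat_fraction"]
    genes_per_mb = g["genes"] / (g["size_bp"] / 1_000_000)
    return repetitive_bp, genes_per_mb


for name in genomes:
    repetitive_bp, genes_per_mb = summarise(name)
    print(f"{name}: {repetitive_bp / 1e9:.2f} Gb repetitive, {genes_per_mb:.1f} genes per Mb")
```

On these numbers wheat has under four times as many genes as Arabidopsis but a genome over a hundred times larger, which is why genes make up so little of it.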
What is the history of genome sequencing in plants?
Earliest plant genome published: Arabidopsis, in 2000
relatively few genomes sequenced for many years due to the cost of sequencing and problems of assembling repetitive genomes
Big increase in genome sequences in recent years due to improvements in next generation sequencing
Cost reduced, which enabled sequencing of more complex genomes
but mostly only of draft quality
Rice and arabidopsis well sequenced
What is the structure of the rice genome?
Oryza sativa (ssp. japonica cv. Nipponbare)
- 370 Mb of finished sequence out of around 440 Mb
- 26% repetitive
- 37500 genes
- finished to a very high standard
How are the genomes of cereals mostly related?
Mostly colinear
However, this isn’t normal
In most plant groups, polyploidy and genome rearrangement have occurred, increasing gene copy number and complicating colinearity studies
How is plant genome evolution shaped?
Cycles of polyploidy and diploidisation
Why is it particularly useful that the rice genome has been mostly sequenced?
Cereal crop genomes align very well when common markers are used that are present across multiple genomes (extensive marker colinearity)
They show a high degree of colinearity even between distantly related genomes
- Triticeae, Maize, Sorghum, Sugar cane, Foxtail millet, rice
What is the structure of the arabidopsis genome?
- ancient whole genome duplication event in Arabidopsis; it shows signs of being derived from a tetraploid ancestor when sequences are aligned between species
- within the Arabidopsis genome there are many colinearity relationships at multiple places in the genome
- shows that most species have undergone more structural rearrangement than is observed in grass species
- Arabidopsis thaliana (Columbia)
- 115Mb genome
- 14% repetitive
- 25000 genes
- finished to a very high standard
What is the structure of the Brassica rapa genome?
- first Brassica genome to be sequenced
- Brassica rapa (Chiifu)
- 285 Mb of finished sequence out of around 480 Mb (not a huge genome)
- 40% repetitive
- 41000 genes
- finished to moderate standard
How can the colinearity between species be illustrated?
Colinearity plot
Red dots - where the most closely related sequences between the two genomes are found
Diagonals - regions of colinearity between genomes
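The idea behind a colinearity plot can be sketched in a few lines. The example below is a simplified illustration, not how a real plot is produced: it assumes each genome is just an ordered list of gene names, treats identical names as "most closely related sequences", and only looks for forward diagonals (inverted segments would appear as anti-diagonals). The gene names and both function names are made up.

```python
def colinearity_dots(genome_a, genome_b):
    """Return (i, j) positions where gene i of genome_a matches gene j of genome_b."""
    index_b = {}
    for j, gene in enumerate(genome_b):
        index_b.setdefault(gene, []).append(j)
    return [(i, j) for i, gene in enumerate(genome_a) for j in index_b.get(gene, [])]


def diagonal_runs(dots, min_length=3):
    """Find runs of dots that step +1 in both genomes, i.e. colinear diagonals."""
    dot_set = set(dots)
    runs = []
    for i, j in sorted(dot_set):
        if (i - 1, j - 1) in dot_set:
            continue  # this dot continues a run that started earlier
        length = 1
        while (i + length, j + length) in dot_set:
            length += 1
        if length >= min_length:
            runs.append(((i, j), length))
    return runs


# Hypothetical gene orders: g1..g4 stay in the same order, g5 has moved.
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g5", "g1", "g2", "g3", "g4"]
dots = colinearity_dots(genome_a, genome_b)
print(dots)                 # the dots that would be drawn on the plot
print(diagonal_runs(dots))  # start position and length of each diagonal
```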
Why can brassica species be used as a model for comparative genomics?
Brassica contains a group of species that are related to each other
How have different brassica species evolved?
Evolved by polyploidy and hybridisation followed by a period of diploidisation where newly formed polyploid genomes begin to stabilise.
These events can be detected by looking at sequence divergence between species
Arabidopsis comes from an ancestral duplication event followed by a long period of diploidisation.
Brassica species share this ancestral duplication but then went through a genome triplication and divergence, followed by hybridisation to give B. napus.
They had the first polyploidy and two additional rounds of polyploidy before the diploidisation process.
How does the way crop species such as Brassica and model species such as Arabidopsis have evolved affect how easily they can be used in GE?
Highlights a problem with crops. The Arabidopsis genome is present mostly as duplicated segments, so where both copies remain both genes need to be knocked out (K/O) to see an effect, but most genes are down to a single copy.
Brassica napus has 12 related genome segments, which makes functional genomics complicated
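One way to arrive at the 12 segments is to multiply up the polyploidy events described earlier in these notes: the ancestral duplication shared with Arabidopsis, the Brassica triplication, and the hybridisation that combined the A and C genomes. The sketch below is illustrative only; the event labels and the `expected_segments` name are assumptions, and it ignores segments lost during diploidisation.

```python
def expected_segments(events):
    """Multiply the copy-number effect of each polyploidy event in turn."""
    copies = 1
    for _name, multiplier in events:
        copies *= multiplier
    return copies


arabidopsis = [("ancestral duplication", 2)]
brassica_napus = arabidopsis + [
    ("Brassica genome triplication", 3),
    ("A x C genome hybridisation", 2),
]

print(expected_segments(arabidopsis))     # 2 related segments
print(expected_segments(brassica_napus))  # 2 x 3 x 2 = 12 related segments
```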
What are accessions/ecotypes/genetic variants?
Related cultivars where the genomes are slightly different
When might polymorphisms between homeologues be mistaken for allelic variants?
Anytime but especially when sequence redundancy is low
Why is it harder to genotype polyploid sequences compared to diploid sequences?
When genotyping a diploid, there is a set of sequences all from one ecotype and a set from the other, and the SNP markers are the single-base positions at which they differ
In a polyploid species, at the same locus in one cultivar as in another, there are often complications from the homeologue (the corresponding gene in the other genome), which also contributes to the sequences observed
In addition to allelic variability, there are also many differences between homeologues, which causes confusion because there are two different types of sequence polymorphism. Inter-homeologue polymorphism is the type that isn’t wanted. It may be 100 times more abundant than allelic variation (the type used for mapping)
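The two types of polymorphism can be told apart in a toy example. The sketch below assumes the sequences have already been assigned to the right genome copy and aligned to equal length, which is exactly the step that is hard with real polyploid data. The sequences, genome labels and function name are made up for illustration.

```python
def classify_polymorphisms(cultivar_1, cultivar_2):
    """Each cultivar maps a genome copy name to an aligned sequence of equal length.

    Returns two lists of positions: allelic SNPs (between cultivars) and
    polymorphisms between the genome copies within the first cultivar.
    """
    copies = sorted(cultivar_1)
    length = len(cultivar_1[copies[0]])
    allelic, between_copies = [], []
    for pos in range(length):
        # Allelic: same genome copy, different base between the two cultivars.
        if any(cultivar_1[c][pos] != cultivar_2[c][pos] for c in copies):
            allelic.append(pos)
        # Between copies: different base in the A and C copies of one cultivar.
        if len({cultivar_1[c][pos] for c in copies}) > 1:
            between_copies.append(pos)
    return allelic, between_copies


# Made-up sequences for one locus with an A genome and a C genome copy.
cv1 = {"A": "ACGTACGT", "C": "ACCTACGA"}
cv2 = {"A": "ACGTATGT", "C": "ACCTACGA"}
print(classify_polymorphisms(cv1, cv2))
```

Here only one position is a true allelic SNP, while two positions differ between the genome copies; if reads from both copies were pooled, those two positions would also look variable.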