Week 7 Flashcards
What is genomics?
The study of the entire hereditary information in an organism, which is mostly encoded in the genome
What are genomes?
The sequence of all the DNA in a cell
What are subsets and products of the genome?
Mitogenome - the mitochondrial genome
Exome - all the exons that could potentially be expressed
Transcriptome - the expressed genes (expressed exons) in a particular tissue or set of tissues that you are studying
Proteome - the proteins
Metabolome - other metabolic products…
Microbiome - the combined microorganisms inhabiting a particular environment (e.g. your gut), which can be detected using sequencing strategies
What is needed to study a genome?
You need its DNA sequence
What species were prioritised for sequencing?
1 - Fuzzy or good to eat, e.g. tomatoes, rice, cows and dogs
2 - Evolutionarily, scientifically or economically important species, e.g. ants, bees, nematodes, mosquitoes, Arabidopsis and humans
How many bacterial genomes have been sequenced?
Thousands of species of bacteria (a few hundred dollars per genome, these days)
What are examples of genome sequencing projects?
Vertebrate Genomes Project - generate error-free genomes of all 66,000 extant vertebrate species
Darwin Tree of Life project - sequence the genomes of all life in the UK
Earth BioGenome Project - sequence the genomes of all life on Earth
What is an overview of the shotgun sequencing method?
Collect the organism and extract a lot of high-quality DNA (long strands)
Break the DNA into fragments (using enzymes or sound, i.e. sonication)
read the fragments with a high-throughput sequencer (currently Illumina, PacBio and Nanopore machines are dominant)
Piece together the fragments (assembly)
Recognise the components (annotation)
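The "piece together the fragments" step can be illustrated with a toy sketch. This is a minimal greedy overlap merge on a made-up 21-base sequence, assuming error-free reads and no repeats longer than the overlaps; the names `overlap`, `greedy_assemble`, `genome` and `reads` are hypothetical, and real assemblers use far more sophisticated methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs)
                   if n not in (best_i, best_j)] + [merged]


# Made-up "genome" and overlapping 8-base reads taken from it (toy data).
genome = "ATGGCGTACGTTAGCCTAGGA"
reads = [genome[s:s + 8] for s in (0, 4, 8, 12, 13)]

contigs = greedy_assemble(reads)
print(contigs)
```

In this toy case the reads merge back into a single contig. With real data, repeats and sequencing errors leave many separate contigs, which is why scaffolding is needed afterwards.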
How large are the DNA fragments created in shotgun sequencing?
About 300-500 bases: de novo genome assembly is like piecing together an encyclopedia from 300-500-letter fragments of sentences
What are the cons of shotgun sequencing?
It takes a LOT of work and money to make a ‘finished’ genome from raw fragments. Most published genomes are just tens of thousands of fragments (contigs and scaffolds), but these are long enough to read lots of genes
What is the order of assembly in shotgun sequencing?
Reads → Contigs → Scaffolds → Chromosomes
What are the pros and cons of long-read sequencing?
Pro - longer reads mean less rebuilding is needed afterwards
Pro - better for reading repetitive regions of the genome
Con - currently error-prone
What is the length of PacBio and Nanopore reads?
PacBio - 20-40 kb read lengths
Nanopore - Up to 100 kb read lengths
What are the uses of sequencing genomes?
Genomes themselves are interesting to study
(Try) to find all the genes involved in a phenotype, not just one or two
To reconstruct deep phylogenies (phylogenomics)
Cancer genomics
Inform conservation strategy
What are examples of organisms with different numbers of genes?
Influenza - 11
E. coli - 4,149
Fruit fly - 14,889
Chicken - 16,736
Human - 22,333
How can genome size vary?
Basic features are similar, but genome size is highly variable
10,000-fold range between fungi and flowering plants
Number of genes varies much less
What are examples of genome size and number of genes in plants?
Arabidopsis thaliana - ~25,000 genes, 135 Mb genome size
Canopy plant (Paris japonica) - ?? genes, 152,000 Mb genome size
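A quick piece of arithmetic puts the two plant genomes side by side. This is only a sketch using the figures on this card; the variable names are made up for illustration.

```python
# Figures taken from the card above (genome sizes in megabases).
arabidopsis_genes = 25_000
arabidopsis_mb = 135
paris_japonica_mb = 152_000

# How many times larger is the Paris japonica genome?
fold_difference = paris_japonica_mb / arabidopsis_mb

# Rough gene density of Arabidopsis thaliana (genes per Mb).
genes_per_mb = arabidopsis_genes / arabidopsis_mb

print(f"Paris japonica genome is ~{fold_difference:.0f}x larger")
print(f"Arabidopsis has ~{genes_per_mb:.0f} genes per Mb")
```

The roughly 1,000-fold size difference between two flowering plants shows how much genome size can vary even when the basic features are similar.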
What is a case study about investigating gene diversity?
Three different, identical-looking species of fish
One is a diploid (Corydoras maculifer), one is a tetraploid (Corydoras aragu) and one is unknown
Investigated the variation in the immune genes
Do the increased genome size and copy number lead to differences in the immune genes?
What did the investigation into the genetic diversity of the fish species look at?
TLR1 and TLR2
Two toll-like receptor genes were PCR amplified
2.5 kb each
What was the overview of the sampling of the different fish species?
Polyploid - n = 30
Diploid - n = 23
Sequenced on NextSeq platform
What was the genetic diversity metric used?
Single Nucleotide Polymorphisms leading to changes in amino acid sequence
What is a synonymous SNP?
A SNP that has a change in nucleotide sequence but not in the amino acid sequence
What is a non-synonymous SNP?
A SNP that has a change in nucleotide sequence causing a change in the amino acid sequence
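The difference between the two SNP types can be checked by translating the affected codon before and after the change. The sketch below assumes the standard genetic code, a coding sequence that starts in frame, and 0-based positions; `classify_snp` and the example sequence are hypothetical, not taken from the fish study.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(coding_seq: str, position: int, alt_base: str) -> str:
    """Classify a single-base change in an in-frame coding sequence.

    position is 0-based; alt_base is assumed to differ from the reference base.
    """
    offset = position % 3            # position of the SNP within its codon
    codon_start = position - offset
    ref_codon = coding_seq[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "non-synonymous"


# Toy coding sequence: ATG GCT GAA (Met, Ala, Glu).
seq = "ATGGCTGAA"
print(classify_snp(seq, 5, "C"))  # GCT -> GCC, both alanine
print(classify_snp(seq, 3, "A"))  # GCT -> ACT, alanine -> threonine
```

Changes in the third base of a codon are often synonymous because the genetic code is redundant, which is why the study counted only the SNPs that alter the amino acid sequence.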