Meta-Omics Flashcards
(39 cards)
What is amplicon sequencing (16S gene) for?
What is shotgun sequencing for?
Amplicon - Metataxonomics
Shotgun - Metagenomics
What was done before high-throughput sequencing to do microbiome research?
Run 16S PCR, then run the DNA product across a gradient gel, where it migrates and separates into bands
Separation of genes with mutations
- Diversity
What did scientists do when they sailed around the world?
Sampled the oceans for microbial DNA for shotgun sequencing
Drastically increased the number of proteins in genomic databanks
How did marine microbiome research differ from soil microbiome research?
Marine microbiology – Frontier of microbial ecology and technology; Used a metagenomics approach
Soil microbiology – Lagged behind due to the complexity/diversity of soil microbiome and contaminating substances
What advancements allowed for the rise of metagenomics?
Prices of sequencing dropped massively
Computational power, bioinformatics, sequencing technology all improving
Amplicon vs Metagenomic Sequencing (hint - single vs all)
Amplicon sequencing reads many copies of 1 target gene out of a mix of many DNA fragments
- Use primers to do PCR of gene of interest
Metagenomics involves sequencing short reads from all the DNA in an environmental sample
What is meant by “no genomic context” as a disadvantage for amplicon sequencing?
Certain genes move around on plasmids and mobile genetic elements; Not part of chromosomes
If you sequence these out of the environment, you have no genomic context for where these things are
These genes move around via horizontal gene transfer; Sequence phylogenies won’t match organisms they’re coming out of
Disadvantages of metagenomics? (2 main ones)
More computational power
Mis-annotation of functional genes; Assigns a gene a function which is incorrect
Amplicon sequencing uses degenerate primers. What are these?
Degenerate primers target regions of high conservation like active sites or specific folding regions; Regions encoding function
Within this region there is a degree of variability; Degenerate primers are mixtures of similar primer sequences that take this variability in specific regions into account
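To make the "mixture of similar primer sequences" idea concrete, here is a minimal sketch that expands one degenerate primer into every concrete sequence it stands for, using the standard IUPAC ambiguity codes. The primer `GTNCAYGG` and the function name `expand_degenerate_primer` are hypothetical, chosen only for illustration.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def expand_degenerate_primer(primer):
    """Return every concrete primer sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# Hypothetical primer (not a published one): N = any base, Y = C or T.
variants = expand_degenerate_primer("GTNCAYGG")
print(len(variants))  # 4 options for N times 2 options for Y
```

Each extra ambiguity code multiplies the size of the mixture, so highly degenerate primers cover more variants but each individual sequence is present at a lower concentration.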
What are the 2 problems with amplicon sequencing? (hint - primers)
Primer bias; Primers may favour some target sequences over others, e.g. amplifying 50,000 of them while not picking up the other 50,000
Need to know what you’re looking for so you can develop specific primers
What is the top-down approach for metagenomics?
Not looking at specific function
Sequence everything and see where differences are
Then generate general hypotheses
This method generates lots of new data
What is the bottom-up approach for metagenomics?
Work out novel function for a novel protein
If it’s in an environmental bacterium, you want to know what this protein is doing in the environment, its distribution, whether it associates with any environmental niches etc.
Can recycle available data in public databases
What can primer bias vs shotgun approaches give?
Differing data: There are areas of overlap, but also groups that are more enriched in either the metagenome or the 16S data
What was TARA Oceans and what meta- methods did it utilise?
What did all this data help us do?
Huge sampling effort of the oceans using metagenomic (DNA) and metatranscriptomics (RNA) data
Allowed us to understand how the oceans are working on a molecular level
What did the metagenomic data from TARA lead to?
Creation of a freely available web tool; Ocean Gene Atlas
What is the method used for assembling and quantifying gene abundance? (hint - 2 levels of assembly; Inter- and intra-site)
Assemble contigs and identify genes and then take all the small reads that made up the contig and map them back to the contig
- The more reads assigned to a contig/specific gene, the more abundant that contig/gene is in the environment; Generates abundance profiles
Then for each site, you can look for a specific gene and see which site has more reads mapped to it to see its abundance at a site
- Can also compare different genes within 1 site and see which gene is more abundant
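The read-mapping logic above can be sketched as follows. This is a toy illustration, not the pipeline any particular study used: the read-to-gene assignments are made up, and dividing by the total mapped reads per site is an assumption added here so that sites with different sequencing depth can be compared.

```python
from collections import Counter

def gene_abundance(mapped_reads):
    """Count reads per gene and express each count as a fraction of all mapped reads."""
    counts = Counter(mapped_reads)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}

# Hypothetical read-to-gene assignments for two sites (toy data).
site_reads = {
    "site_A": ["phoX", "phoX", "phoX", "phoD"],
    "site_B": ["phoX", "phoD", "phoD", "phoA"],
}
profiles = {site: gene_abundance(reads) for site, reads in site_reads.items()}

# Between sites: at which site is phoX relatively more abundant?
best_site = max(profiles, key=lambda s: profiles[s].get("phoX", 0.0))

# Within one site: which gene has the most reads mapped to it?
top_gene_B = max(profiles["site_B"], key=profiles["site_B"].get)

print(best_site, top_gene_B)
```

Real abundance profiles usually also correct for gene length, since a longer gene attracts more reads at the same copy number; that step is left out here to keep the sketch short.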
How can the method/algorithm for database searching impact conclusions?
Can show differing results, e.g. one method finds one gene to be more widely distributed or abundant than another, while the other method finds the opposite
Explain Basic Local Alignment Search Tool (BLAST) (hint - E)
Set a stringency (cut-off); Expect (E) value – Lower E means better match
Position by position comparison
- Does 1:1 matching of query protein sequence
Insertions and deletions reduce score (not E)
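As a rough sketch of how score and E relate, the E value can be computed from a normalised bit score S' as E = m * n * 2^(-S'), where m is the query length and n is the database length. The function names below are hypothetical, and real BLAST uses effective (edge-corrected) lengths, which this sketch ignores.

```python
def expect_value(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least bit_score (simplified)."""
    return query_len * db_len * 2.0 ** (-bit_score)

def passes_cutoff(bit_score, query_len, db_len, e_cutoff):
    """Apply a stringency cut-off: keep the hit only if E is at or below e_cutoff."""
    return expect_value(bit_score, query_len, db_len) <= e_cutoff

# Gaps lower the alignment score; a lower score then gives a higher (worse) E.
ungapped = expect_value(bit_score=50, query_len=300, db_len=1_000_000)
gapped = expect_value(bit_score=40, query_len=300, db_len=1_000_000)
print(ungapped < gapped)
```

Because E scales with database size, the same alignment gets a worse E value when searched against a larger database.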
Explain Profile Hidden Markov Modelling (pHMM) in relation to BLAST
Using pHMM more PhoA hits were found than using BLAST. What does this say about BLAST? (hint - false)
Takes regions of conservation of protein into account; Can be more confident
Same stringency as BLAST search (E-value cut-off of e-60)
BLAST alone is not sufficient as it doesn’t fully uncover diversity
- False positives and/or false negatives
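To illustrate why taking conserved regions into account helps, here is a minimal sketch of position-specific scoring. It is a simplification, not a full profile HMM: it models only per-column residue preferences (no insert or delete states), and the toy alignment, the uniform background frequency, and the pseudocount are all assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(alignment, pseudocount=1.0):
    """Per-column log-odds scores from an ungapped alignment of equal-length sequences."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def profile_score(profile, seq):
    """Sum the column-specific scores for each residue of seq."""
    return sum(col[aa] for col, aa in zip(profile, seq))

# Toy family: columns 1 and 4 are conserved, columns 2 and 3 vary.
family = ["DSGT", "DSAT", "DTGT"]
profile = build_profile(family)

# Both queries differ from "DSGT" at two positions.
variable_mismatch = profile_score(profile, "DTAT")   # differs in variable columns
conserved_mismatch = profile_score(profile, "ESGS")  # differs in conserved columns
print(variable_mismatch > conserved_mismatch)
```

A 1:1 comparison against a single sequence would treat both queries as equally distant, whereas the profile penalises changes in conserved columns more heavily.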
Functional genes for P cycling? (4 genes)
phoX
phoD
phoA
glpQ
How does the abundance of PhoX and other phylogenies differ across regions and sites?
Their abundance varies across regions and sites
Explain STrain Resolution ON assembly Graphs (STRONG)?
Assembly method allowing distinguishing of different strains
- Relies on building contigs
- Iterative approach
Get short reads from sequencing and co-assemble reads via algorithms to make contigs; Then map them to build up the genome and resolve different strain genomes
How do phosphonates differ from most organic phosphorus compounds?
Phosphonates contain a C-P bond, which is stronger, so specialised enzymes are needed to break them down
What molecules can C-P lyase break down?
What does it have associated? (hint - transporter)
What is meant by it being promiscuous?
Phosphonates (C-P bond)
Has associated ABC transporter
Promiscuous; Works on multiple different substrates (phosphonates)