LEC48: Applications of Next Gen Sequencing Flashcards by Allison Vise

how does Sanger sequencing work

take DNA, compartmentalize/target regions of interest in genome

generate sequence-specific primers

add DNA pol, normal bases, and modified terminator bases (dideoxynucleotides); these modified DNA bases lack the 3’-OH on the ribose moiety, so any replicating DNA chain into which they’re added by DNA pol will be unable to be further extended

these terminator bases are added into individual elongating DNA molecules

produces ladder of DNA chains of specific lengths

each chain is tagged by its terminator molecule w/ a unique fluorescent tag that’s detected by a reading device
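
A toy sketch of the ladder idea above, not a lab protocol: it assumes exactly one terminated chain per position and stands in for the fluorescent tag with the base letter itself. The function names are made up for illustration.

```python
import random

def terminated_chains(new_strand):
    """Toy model: one terminated chain per position, as (length, tagged terminator base)."""
    return [(length, new_strand[length - 1]) for length in range(1, len(new_strand) + 1)]

def read_ladder(chains):
    """Order chains by length (shortest first) and read off each terminator's tag."""
    return "".join(base for _, base in sorted(chains))

chains = terminated_chains("GATTACA")
random.shuffle(chains)          # chains start out mixed together in one reaction
sequence = read_ladder(chains)  # separating by size restores the order
```

Sorting by chain length plays the role of size separation; the terminator tag at each length gives the base at that position.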

how does next gen DNA sequencing process work

physically shear DNA, produce fragments

size select fragments for enzymatic attachment of adaptor primers

denature the DNA into single strands, adaptor primers used to capture individual fragments onto a sequencing matrix

amplify these molecules to make library, by PCR

add DNA polymerase + modified bases that aren’t extendable but are tagged w/ a fluorescent color, 1 for each base

generates a series of individual rxns, each at a specific location on the matrix and emitting a specific color, which allows ID of the base added

what is massively parallel sequencing

once have done 1 round of next gen sequencing,

wash, removing all reagents

chemically modify to remove fluorescent tags so DNA can be elongated again

add a second cycle of reagents

generates a 2nd piece of sequence info for each position

repeat this cycle multiple times, generate sequence data in parallel at massive scale
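
A minimal sketch of the cycle idea, under simplifying assumptions: each cluster is a known strand at an (x, y) spot, every cycle adds exactly one base per cluster, and the color assignment is arbitrary (not any instrument's real chemistry). All names are hypothetical.

```python
# Arbitrary color assignment, for illustration only.
COLOR_OF_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_OF_COLOR = {color: base for base, color in COLOR_OF_BASE.items()}

def run_cycles(clusters, n_cycles):
    """clusters maps (x, y) -> strand being synthesized. Returns one 'image' per cycle."""
    images = []
    for cycle in range(n_cycles):
        # Every cluster emits the color of the base added in this cycle.
        image = {pos: COLOR_OF_BASE[strand[cycle]]
                 for pos, strand in clusters.items() if cycle < len(strand)}
        images.append(image)
    return images

def call_reads(images):
    """Stack the images: the colors seen at one spot, cycle by cycle, spell one read."""
    reads = {}
    for image in images:
        for pos, color in image.items():
            reads[pos] = reads.get(pos, "") + BASE_OF_COLOR[color]
    return reads

clusters = {(0, 0): "ACGT", (5, 3): "TTGA"}
reads = call_reads(run_cycles(clusters, 3))
```

Each cycle yields one base for every cluster at once, which is where the "massively parallel" part comes from.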

how are clusters visualized w/ next gen

measure dNTP-specific fluorescence at individual rxn centers b/c each point of light represents an individual sequencing rxn

this is advanced microscopy

this requires lots of computational power

once use next gen sequencing to sequence, how do you apply that info

massively parallel sequence data analysis: probabilistic approach, align fragments against a reference genome and find variation w/in each fragment of DNA
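
A toy sketch of "align, then find variation". It swaps the probabilistic approach for a brute-force mismatch count (no gaps, no quality scores), which is an assumption made only to keep the example short; the function names are hypothetical.

```python
def best_alignment(read, reference):
    """Slide the read along the reference; keep the offset with the fewest mismatches."""
    best_pos, best_mismatches = None, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(r != g for r, g in zip(read, window))
        if mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos, best_mismatches

def call_variants(read, reference, pos):
    """List (reference position, reference base, read base) wherever they differ."""
    return [(pos + i, reference[pos + i], base)
            for i, base in enumerate(read) if reference[pos + i] != base]

reference = "ACGTTGCAAGCTTAGGCT"
read = "GCAAGGTT"  # matches reference[5:13] except for one base
pos, mismatches = best_alignment(read, reference)
variants = call_variants(read, reference, pos)
```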

how can a computer analyze a next gen genome faster

1) split genome by chromosome, create many jobs
2) run jobs concurrently on diff cluster nodes
3) combine results into single output for further analysis
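
The three steps above can be sketched as follows. This is a minimal illustration that assumes a thread pool on one machine as a stand-in for cluster nodes, and a hypothetical per-chromosome job that just counts mismatches against the reference.

```python
from concurrent.futures import ThreadPoolExecutor

def count_mismatches(job):
    """One job = one chromosome. Returns (chromosome name, number of mismatches)."""
    chrom, sample_seq, reference_seq = job
    return chrom, sum(s != r for s, r in zip(sample_seq, reference_seq))

def analyze_genome(sample_by_chrom, reference_by_chrom, workers=4):
    # 1) split by chromosome into independent jobs
    jobs = [(chrom, sample_by_chrom[chrom], reference_by_chrom[chrom])
            for chrom in sample_by_chrom]
    # 2) run the jobs concurrently (threads here, cluster nodes in practice)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(count_mismatches, jobs))
    # 3) combine the per-chromosome results into a single output
    return dict(results)

reference_by_chrom = {"chr1": "ACGTACGT", "chr2": "TTGGCCAA"}
sample_by_chrom = {"chr1": "ACGTACGA", "chr2": "TTGGCCAA"}
summary = analyze_genome(sample_by_chrom, reference_by_chrom)
```

The split works because each chromosome can be processed without knowing anything about the others.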

what is the computational challenge re: input of next gen sequencing

input n bp long sequences from sample, as short reads

map those back to reference genome to align them, map out genome

can map each read back to a specific range of the reference

but **repetitive regions are a problem for this** b/c they’re hard to unambiguously assign to the reference genome
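
A small illustration of why repeats are a problem, assuming exact matching only and a made-up reference with a CA repeat. The helper name is hypothetical.

```python
def mapping_positions(read, reference):
    """Return every offset where the read matches the reference exactly."""
    return [i for i in range(len(reference) - len(read) + 1)
            if reference.startswith(read, i)]

reference = "GGCACACACACATTGACCT"  # contains a CA repeat
unique_hits = mapping_positions("TTGACC", reference)  # read from unique sequence
repeat_hits = mapping_positions("CACA", reference)    # read from inside the repeat

# A read is unambiguous only if it has exactly one place to go.
is_ambiguous = len(repeat_hits) > 1
```

The read from the repeat fits equally well at several offsets, so there is no single correct position to assign it to.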

what are limitations to computational abilities of next gen

1) space needed is massive for storing image files and subsequent data
2) processing power needed for aligning the huge number of relatively short sequence fragments (reads) that’re generated, in order to ID positions w/ sequence variants (polymorphisms, mutations)

need **high performance computer arrays, sophisticated computational algorithms** to minimize the processing time needed to accomplish these tasks

sequence alignment vs database mapping?

sequence alignment: comparing 2 sequences of DNA

database mapping: comparing many small sequences to one really big sequence
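
A hedged sketch of the database mapping side: index the one big sequence once, then look up many short reads against it. The k-mer index used here is a common simplification chosen for illustration, not necessarily what any particular aligner does; it assumes exact matches and reads at least k bases long.

```python
from collections import defaultdict

def build_index(reference, k):
    """Map every k-base substring of the reference to the offsets where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def map_reads(reads, reference, k=4):
    """Look up each read's first k bases, then confirm the full read at each candidate."""
    index = build_index(reference, k)  # built once, reused for every read
    return {read: [pos for pos in index.get(read[:k], [])
                   if reference.startswith(read, pos)]
            for read in reads}

reference = "ACGTTGCAAGCTTAGGCT"
hits = map_reads(["TTGCAA", "AGCTTA", "GGGGGG"], reference)
```

Pairwise alignment compares two sequences directly; mapping pays an up-front indexing cost so that each of the many small reads is cheap to place.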

what happens w/ DNA fragment sequence once generated in next gen seq

must align sequence unambiguously to a specific chromosomal position

use computer to generate best fits of each fragment to a genome region

highly repetitive regions are difficult to align well, so aren’t sequenced well this way

also cannot align regions of genome which aren’t efficiently amplified by PCR b/c of sequence composition (e.g. high GC content)
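
A minimal sketch for flagging regions by GC content. The window size and the 0.7 cutoff are arbitrary assumptions for illustration, not established thresholds for PCR failure.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def high_gc_windows(reference, window=10, threshold=0.7):
    """Start offsets of non-overlapping windows whose GC fraction meets the threshold."""
    return [start for start in range(0, len(reference) - window + 1, window)
            if gc_fraction(reference[start:start + window]) >= threshold]

reference = "ATATATGCAT" + "GGCCGGCCAT" + "ATTTAAATCG"
flagged = high_gc_windows(reference)
```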

what is the significance of variants in next gen sequencing?

if sample fragments have variation from reference genome, could be incorrect seq assignment, poor alignment, or experimental noise (esp. in regions w/ few seq reads)

HOWEVER if true variation, hard to interpret b/c of:

1) incomplete knowledge of fxn of all genes
2) incomplete knowledge of range of tolerated variation in human populations
3) incomplete knowledge of effect of individual amino acid changes on protein fxn
4) incorrect assignments of pathogenicity in current mutation databases

what is the exome

protein coding portion of the genome

better understood than regulatory regions of genome that’re noncoding

why might you do exome sequencing?

to reduce the amount of variation to be interpreted in a clinical seq sample for next gen seq

can capture **specific DNA fragments representing the coding part of the genome, using specially designed primers that incorporate tag molecules**

how does exome sequencing work

what can it be used on?

primers confer specificity to your target

primers incorporate tag molecule and use its physical characteristics to yield **enrichment in DNA fragments of interest**

can capture: whole exome, subset of known disease-associated genes (medical exome), or panel of genes for a specific condition (e.g. epilepsy)

efficacy of exome sequencing?

v helpful as a clinical test, especially in undiagnosed Mendelian disease - good for rare disease detection

has been used at Baylor CoM, 25% hit rate on first 250 exomes done

what are the possible applications of next gen seq?

1) transcription profiling
2) undiagnosed diseases
3) cancer
4) infectious diseases

what info does NGS transcriptional profiling provide?

1) tissue-specific mRNA abundance, expression across tissues or pathologic states
2) alternative splicing events in normal and diseased tissue

describe process of NGS for transcriptional profiling

take total RNA

fragment it

create random hexamer primed cDNA

map to reference gene

do gene function analysis

understanding tissue-specific gene expression in healthy & disease states;

observe rare splicing events, better quantitation of expression
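
A toy sketch of the "map, then quantify" step: once cDNA reads are mapped, counting the reads that land in each gene gives a rough measure of mRNA abundance. The gene coordinates and read positions are invented, and real analyses also normalize for gene length and sequencing depth, which this skips.

```python
def count_reads_per_gene(read_positions, genes):
    """genes maps name -> (start, end), end exclusive. Count mapped reads per gene."""
    counts = {name: 0 for name in genes}
    for pos in read_positions:
        for name, (start, end) in genes.items():
            if start <= pos < end:
                counts[name] += 1
    return counts

genes = {"geneA": (0, 100), "geneB": (100, 250)}   # hypothetical coordinates
read_positions = [5, 40, 99, 100, 180, 300]        # 300 falls outside both genes
counts = count_reads_per_gene(read_positions, genes)
```

Comparing such counts between tissues or between healthy and diseased samples is what gives the expression profile.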

how can NGS help w/ undiagnosed disease?

patients who’ve exhausted medical testing and remain undiagnosed

if ID underlying genetic basis of disease, may be beneficial b/c prognostic, diagnostic (genetic counseling in the family), or therapeutic (rarely)

how many variants does WGS produce?

how do we analyze them, how does penetrance complicate analysis?

~4.8 million

must define if they have small effect size, low penetrance, and thus are polymorphic in “normal” population - requires DB w/ large control pop

or if large effect size, high penetrance, and variant is de novo in proband, or inherited from a phenotypically normal parent - presumes variant’s fully penetrant

how are variants studied?

filtration of variants by polymorphic frequency, false positives, inheritance models
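
A minimal sketch of this kind of filtration, assuming one inheritance model only (rare and de novo in the proband) and a made-up 1% population-frequency cutoff. The record fields and variant IDs are hypothetical.

```python
def filter_variants(variants, max_pop_freq=0.01):
    """Keep variants that are rare in controls and de novo in the proband."""
    kept = []
    for v in variants:
        rare = v["pop_freq"] <= max_pop_freq  # drop common polymorphisms
        de_novo = v["in_proband"] and not v["in_mother"] and not v["in_father"]
        if rare and de_novo:
            kept.append(v["id"])
    return kept

variants = [
    {"id": "var1", "pop_freq": 0.20, "in_proband": True, "in_mother": True, "in_father": False},
    {"id": "var2", "pop_freq": 0.0, "in_proband": True, "in_mother": False, "in_father": False},
    {"id": "var3", "pop_freq": 0.001, "in_proband": True, "in_mother": False, "in_father": True},
]
candidates = filter_variants(variants)
```

Requiring de novo status presumes full penetrance, so a variant inherited from a phenotypically normal parent (var3 here) is dropped even though it could still matter.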

what are limitations of NGS technology?

1) false-positive & false-negative variant calls increase w/ size of sequenced target
2) much variability among datasets in SNV and indel calls
3) sequence-specific limitations - highly GC or AT rich regions don’t amplify well & extended repetitive sequence runs won’t assemble or sequence well

how is NGS used to study cancer

1) study underlying disease biology - demonstrate clonal evolution of relapsed cancer
2) make ttmnt decisions based on ID of specific driver mutations that might be amenable to targeted therapies
3) RNA seq and look at the epigenome, which is helpful for therapy

how is NGS used to study infectious disease (ID)

rapid ID of microbial species from epidemic outbreaks, e.g. Haitian cholera outbreak after the earthquake

reconstructed phylogenetic relationships among strains of pathogen

can sequence a CSF sample, identify organism

how can genomic data be integrated into clinical practice?

personalized medicine: apply genetic data in clinical practice

use whole exome data to ID individuals who carry genomic variants that confer specific disease susceptibilities

becoming more possible as prices for genomic sequencing decline

Does NGS provide a comprehensive look at the entire genome? Why or why not?

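Largely, but not completely. You fragment the entire genome for NGS, whereas w/ the Sanger technique you target a part of the genome for study. Adaptors attached to every fragment let the whole library be captured and amplified. However, highly repetitive regions and regions that amplify poorly (e.g. high GC content) are still not sequenced well.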

Would NGS provide data on mitochondrial DNA mutations?

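Yes. Mitochondria have their own genome, but when total DNA is fragmented for whole genome sequencing, mitochondrial DNA is sequenced along w/ nuclear DNA; its reads are aligned to a mitochondrial reference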

How is quantitation of gene expression accomplished by NGS?

Massive computing. Align cDNA fragments to the reference genome, count the reads that map to each gene as a measure of mRNA abundance, then compare those counts between the patient sample and a reference (normal) sample

Which types of mutations are most likely to be unambiguously associated w/ clinical disease states?

Changes in coding regions, such as mutations in the exome or splicing changes seen on RNA analysis

List 4 challenges associated with classifying variants as pathogenic or benign

1) the reference genomes we have are incomplete
2) apparent variants may be true variants or may be misreads (computing errors)
3) variants are not necessarily pathogenic, they could just be variants; we have more info on some populations and subpopulations than others, making this especially a challenge in understudied populations
4) cannot always know if variant is de novo in the proband or inherited from a phenotypically normal parent
5) penetrance is hard to classify w/ NGS

(30 cards)