NGS Flashcards
(42 cards)
What five areas need to be considered when assessing quality of NGS
Error rates of technology. Read length. Base calling algorithms. Alignment. Read depth coverage
In NGS what can contribute to error rates
Signal to noise ratio. Cross talk from nearby clusters or beads. Homopolymer counts. Incomplete extension. Position on the read (worse at the beginning or end).
Typical error rates: 0.1% to several %.
How does read length affect NGS quality
If reads are too short they might not align correctly. Longer read lengths provide more information about relative genomic location but cost more. Paired-end sequencing helps align short reads and helps with rearrangements, but is more expensive and time consuming.
What do base calling algorithms do in NGS
Identify bases and give them a quality score (phred score) based on noise estimates from image analysis. Can help improve error rates. The higher the phred score the better the quality.
The quality score is important for rejecting low quality reads, trimming low quality bases, improving alignment accuracy and determining consensus sequences.
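Worked example: a minimal sketch of how a Phred score relates to error probability, using the standard definition Q = -10 x log10(P). The function names and the trimming threshold of Q20 are illustrative assumptions, not part of any particular base caller.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P to a Phred score Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def trim_low_quality_tail(bases: str, quals: list[int], min_q: int = 20) -> str:
    """Drop bases from the 3' end until one with quality >= min_q is reached.

    min_q=20 is an assumed threshold chosen only for illustration.
    """
    end = len(bases)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return bases[:end]


# Q20 means roughly a 1 in 100 chance that the base call is wrong.
print(phred_to_error_prob(20))
print(trim_low_quality_tail("ACGTAC", [30, 30, 30, 30, 10, 5]))  # ACGT
```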
What must you be careful of with base calling algorithms in NGS
They can remove real deletions, so special software designed for detecting deletions has to be used.
Why is alignment important and what are the issues in NGS
It’s important that alignment algorithms can cope with both sequencing errors and real differences. Alignment is more difficult than with Sanger sequencing because of the short reads. Paired-end sequencing increases the number of matched reads. Repetitive regions and regions of shared homology cause issues.
Must produce well calibrated alignment quality values.
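Worked example: a toy sketch of why short reads and repetitive regions make alignment hard. It slides a read along a reference and tolerates a set number of mismatches. This is a naive scan for illustration only, not how production aligners work, and the sequences and names are made up.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def align_read(read: str, reference: str, max_mismatches: int = 1) -> list[tuple[int, int]]:
    """Return (position, mismatches) for every placement within max_mismatches."""
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        mismatches = hamming(read, reference[pos:pos + len(read)])
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


reference = "ACGTACGTTTGCA"  # hypothetical reference containing a repeated ACGT

# A short read from the repeat maps to two places, so its location is ambiguous.
print(align_read("ACGT", reference, max_mismatches=0))  # [(0, 0), (4, 0)]

# A read with one difference (sequencing error or real SNV) still maps uniquely.
print(align_read("TTGA", reference, max_mismatches=1))  # [(8, 1)]
```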
What is depth coverage in NGS
A measurement of the number of times a region has been sequenced during a run. The higher the number of reads, the higher the data quality. Coverage across regions is variable.
Inadequate coverage can result in a false negative result (a real SNV is missed). >30-fold coverage is recommended. If this isn’t reached, the affected nucleotides need repeating (by Sanger sequencing).
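Worked example: a small sketch of flagging under-covered stretches from a list of per-base depths. The 30x cut-off comes from the card above; treating a depth of exactly 30 as a pass, and the function name, are assumptions for illustration.

```python
def low_coverage_regions(depths: list[int], min_depth: int = 30) -> list[tuple[int, int]]:
    """Return half-open (start, end) index ranges where depth is below min_depth."""
    regions = []
    start = None
    for i, depth in enumerate(depths):
        if depth < min_depth:
            if start is None:
                start = i  # a low-coverage stretch begins here
        elif start is not None:
            regions.append((start, i))  # the stretch ended at the previous base
            start = None
    if start is not None:
        regions.append((start, len(depths)))
    return regions


# Hypothetical per-base depths across a short target region.
depths = [45, 40, 12, 8, 33, 29]
print(low_coverage_regions(depths))  # [(2, 4), (5, 6)]
```

Any range returned here would be a candidate for repeat testing.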
What’s a FASTQ file
A file that contains all base calls and quality scores.
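Worked example: a minimal sketch of reading one FASTQ record (four lines: header, bases, separator, quality string). It assumes the common Phred+33 encoding, where each quality character is chr(Q + 33); older files may use a different offset. The record shown is made up.

```python
def parse_fastq_record(lines: list[str]) -> dict:
    """Parse one four-line FASTQ record into id, bases and per-base Phred scores."""
    header, bases, plus, quality = (line.strip() for line in lines)
    if not header.startswith("@") or not plus.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(bases) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Assumption: Phred+33 encoding.
    scores = [ord(ch) - 33 for ch in quality]
    return {"id": header[1:], "bases": bases, "quals": scores}


record = ["@read1", "ACGT", "+", "II5!"]  # hypothetical read
print(parse_fastq_record(record))
# {'id': 'read1', 'bases': 'ACGT', 'quals': [40, 40, 20, 0]}
```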
What’s a BAM file
A binary alignment map file that contains the reads aligned to the reference genome.
What’s a VCF file
A text file (Variant Call Format) that lists the variants found when the patient’s sequence is compared to the reference genome.
Give a basic overview of what needs to be checked for accurate detection of SNV in NGS
1) data must be aligned correctly. 2) alignment quality needs to be checked. 3) coverage of every base needs to be checked (>30x). 4) variant detection is performed. 5) check each base quality (phred) score. 6) check % reads the variant is seen in to determine real vs sequencing error (a threshold must be established).
What must be considered when assessing if a variant call is real in NGS
% of times the SNV appears on the forward and reverse strands. % of times the SNV is called vs wild type. Nature of the SNV (e.g. in a homopolymer region).
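Worked example: a sketch of the two percentages above, the fraction of reads showing the SNV and how that support is split between strands. All thresholds here are placeholder assumptions; each lab has to establish its own.

```python
def variant_support(alt_fwd: int, alt_rev: int, ref_fwd: int, ref_rev: int) -> dict:
    """Summarise read support for a candidate SNV from strand-specific counts."""
    depth = alt_fwd + alt_rev + ref_fwd + ref_rev
    alt = alt_fwd + alt_rev
    return {
        "depth": depth,
        "allele_fraction": alt / depth if depth else 0.0,
        "alt_forward_fraction": alt_fwd / alt if alt else 0.0,
    }


def looks_real(support: dict, min_depth: int = 30, min_fraction: float = 0.2,
               min_strand_fraction: float = 0.1) -> bool:
    """Apply illustrative (assumed) thresholds for depth, allele fraction and strand balance."""
    fwd = support["alt_forward_fraction"]
    return (
        support["depth"] >= min_depth
        and support["allele_fraction"] >= min_fraction
        and min_strand_fraction <= fwd <= 1 - min_strand_fraction
    )


# Seen in 22 of 50 reads, on both strands: consistent with a heterozygous SNV.
print(looks_real(variant_support(12, 10, 14, 14)))  # True
# Seen on the forward strand only: more likely a sequencing artefact.
print(looks_real(variant_support(20, 0, 10, 10)))  # False
```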
What’s a homopolymer
A stretch/run of the same base, e.g. AAAAAA
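Worked example: a short sketch that finds homopolymer runs in a sequence, which is one way to flag variant calls that fall in error-prone regions. The minimum run length of 4 is an arbitrary assumption.

```python
from itertools import groupby


def homopolymer_runs(seq: str, min_length: int = 4) -> list[tuple[int, str, int]]:
    """Return (start, base, length) for each run of one base at least min_length long."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = len(list(group))
        if length >= min_length:
            runs.append((pos, base, length))
        pos += length
    return runs


print(homopolymer_runs("ACGAAAAAATTGCCCC"))  # [(3, 'A', 6), (12, 'C', 4)]
```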
Name 3 causes of error in NGS
Base calling errors. Alignment errors. Low coverage.
For targeted NGS, what has to happen before sequencing
An enrichment step. Either PCR based or hybridisation based
Discuss PCR based enrichment technologies for NGS
Requires a small starting amount of DNA. It’s cheap. Products will contain unwanted introns. Originally long range PCR was performed. Now multiplexed enrichment kits are used.
Nextera (Illumina). 1) tagmentation (transposons simultaneously fragment and tag the DNA with adapters). 2) reduced cycle amplification (adds more motifs to fragments).
Fluidigm Access Array. 1) hybridisation of a sequence specific binder to the DNA (primer contains universal tag sequence which allows binding of …). 2) annealing of barcode primer (contains a capture sequence appropriate for the sequencing technology). 3) final amplicon has the barcode sequence, patient ID, and is tagged for capture.
What tags need to be added to the fragmented DNA in the library prep stage of NGS
Index sequence to ID pt sample. A primer site for the sequencing primer to anneal to. Capture sequence complementary to the sequencing technology for binding to the cell.
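Worked example: a sketch of what the index sequence is for, sorting pooled reads back to the patient sample they came from (demultiplexing). The index sequences and sample names are invented, and exact matching is an assumed simplification.

```python
def demultiplex(index_reads: list[tuple[str, str]],
                sample_indexes: dict[str, str]) -> dict[str, list[str]]:
    """Group read IDs by sample using an exact match on the index sequence."""
    by_sample: dict[str, list[str]] = {name: [] for name in sample_indexes.values()}
    by_sample["undetermined"] = []
    for read_id, index_seq in index_reads:
        sample = sample_indexes.get(index_seq, "undetermined")
        by_sample[sample].append(read_id)
    return by_sample


# Hypothetical index-to-sample sheet and index reads.
sheet = {"ACGT": "patient_A", "TTAG": "patient_B"}
reads = [("r1", "ACGT"), ("r2", "TTAG"), ("r3", "GGGG")]
print(demultiplex(reads, sheet))
# {'patient_A': ['r1'], 'patient_B': ['r2'], 'undetermined': ['r3']}
```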
Discuss hybridisation enrichment methods for NGS
Based on capture of target regions. Fragmentation of DNA, tagging of DNA, capture of fragments using an RNA or DNA library.
Describe how SureSelect works
SureSelect (Agilent). 1) shear DNA to produce sequence ready DNA. 2) prepare a biotinylated library of 120mer RNA baits of the RoI with adapter MIDs. 3) hybridise together. 4) separate out hybridised regions using streptavidin beads and magnets. 5) wash beads and digest RNA. Prep is ready for sequencing.
Describe how HaloPlex works
Agilent. Involves an initial restriction endonuclease step.
1) digest and denature DNA (6 digests using different REs). 2) prepare probe library (biotinylated probe consisting of a universal primer site, a sequencing primer motif, an index for pt ID and sequences corresponding to the RE sites). 3) hybridise probe library to fragmented DNA (probe designed to bind to both ends of the fragmented DNA, resulting in circular DNA). 4) purify and ligate (purify using streptavidin and magnets, then close the circular DNA by ligation). 5) amplify enriched fragments (PCR using a universal barcoded primer that amplifies the circular DNA, producing linear tagged fragments ready for sequencing).
Describe bridge amplification in the Illumina NGS platform
Careful quantification of the library concentration is required.
1) the template hybridises to the immobilised adapter region on the flow cell (P7). 2) initial extension results in a ds strand attached to the flow cell. 3) dsDNA is denatured, removing the template DNA and leaving the new sequence attached to the flow cell. 4) the sequence then folds over and anneals to the complementary adapter sequence (P5), forming a bridge. 5) 1st cycle extension results in a dsDNA bridge. 6) 2nd cycle denaturation results in two ssDNA strands (forward and reverse, one attached to P7 and one attached to P5). 7) cycle repeated x35 (folding, annealing, denaturing). 8) cluster is now formed, ready for sequencing.
Describe NGS process for Illumina MiSeq, HiSeq, NextSeq
Reversible terminator sequencing is carried out:
a) forward strand is read, b) indexes are read, c) reverse strand is read.
1) forward strand is sequenced first, so the reverse strands are cleaved and washed off. 2) 3’ ends of the strands are chemically blocked (to prevent folding over) and primed. 3) all 4 fluorescently tagged nucleotides are added at once and are provided each cycle. Only a single nucleotide extension occurs as there’s a blocking group at the 3’ OH of the deoxyribose. 4) all unincorporated nucleotides are washed away. 5) flow cell is illuminated and each cluster’s fluorescent signal is recorded. 6) fluorescent group is cleaved from the nucleotide. 7) the 3’ OH is unblocked, allowing a further nucleotide to be added. 8) cycle is repeated for every nucleotide added.
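Worked example: a toy sketch of turning one cluster's recorded fluorescence into a read, one base per cycle. It assumes four intensity channels in the order A, C, G, T and simply picks the brightest. Real base callers also correct for cross talk and loss of synchrony, and the intensity values are invented.

```python
CHANNELS = "ACGT"  # assumed channel order for this illustration


def call_bases(cycles: list[tuple[float, float, float, float]]) -> str:
    """Call one base per cycle by taking the channel with the highest intensity."""
    read = []
    for intensities in cycles:
        brightest = max(range(len(CHANNELS)), key=lambda i: intensities[i])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Three sequencing cycles for a single cluster (hypothetical intensities).
cluster_signal = [
    (0.90, 0.10, 0.05, 0.10),
    (0.10, 0.20, 0.80, 0.10),
    (0.00, 0.10, 0.10, 0.70),
]
print(call_bases(cluster_signal))  # AGT
```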
Describe emulsion PCR required for the ion torrent NGS platform
The library molecules are clonally amplified onto beads in spheres. Spheres are produced using water and oil. Each sphere contains 1 bead, 1 molecule and the reagents required for amplification.
Each bead has probes attached that are complementary to the adapters of the library molecule. The molecule is amplified and attached to the bead.
Describe sequencing using the ion torrent
The emulsion is broken and cleaned up, and the individual beads are loaded into the sensor wells by centrifugation.
Chip: high density array of micro wells. Beneath each well is an ion-sensitive layer and an ion sensor (pH meter).
1) the nucleotides (not fluorescently labelled) are added in order. 2) incorporation into the chain results in hydrolysis of the nucleotide triphosphate and net release of an H+ ion. 3) release of the H+ ion results in a shift in the pH of the surrounding solution that’s PROPORTIONAL to the number of nucleotides incorporated (0.02 pH units/nt). 4) the pH change is detected by a semiconductor sensor, converted into a voltage and digitised.
After each flow of nucleotides, a wash step ensures nucleotides don’t stay in the wells. Due to the small size of the wells, diffusion into and out of the wells takes on the order of 1/10 of a second, so there’s no need for enzymatic removal of reagents.
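Worked example: a simplified simulation of the flow-based chemistry above, showing why the signal scales with homopolymer length. It uses the 0.02 pH units/nt figure from the card. The flow order "TACG", the template, and the idea of passing the template in the order it is copied are all assumptions for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flows(template: str, flow_order: str = "TACG",
                   ph_per_nt: float = 0.02) -> list[tuple[str, int, float]]:
    """Return (nucleotide flowed, bases incorporated, pH shift) for each flow."""
    if set(flow_order) != set("ACGT"):
        raise ValueError("flow_order must contain A, C, G and T")
    pos = 0
    signals = []
    while pos < len(template):
        for nt in flow_order:
            count = 0
            # Every consecutive template base complementary to this nucleotide
            # is extended within the same flow, so homopolymers give one larger signal.
            while pos < len(template) and COMPLEMENT[template[pos]] == nt:
                count += 1
                pos += 1
            signals.append((nt, count, round(count * ph_per_nt, 4)))
            if pos >= len(template):
                break
    return signals


for flow in simulate_flows("AAACG"):
    print(flow)
# The first flow (T) incorporates 3 bases at once: ('T', 3, 0.06)
```

Because a run of three gives one 0.06 shift rather than three separate 0.02 shifts, miscounting the run length is a plausible source of homopolymer errors.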