Scientific Flashcards
PCR, Sanger, NGS, MLPA, Flanking PCR, TP-PCR, QF PCR. Advantages, limitations. Alternatives, Main steps (38 cards)
How is DNA extracted?
- Lyse cells
- Degrade RNA (RNase enzyme)
- Precipitate and remove proteins (high-salt solution)
- Precipitate DNA (100% ethanol precipitates DNA out of solution; salt and alcohol dehydrate the DNA)
- Wash DNA (70% ethanol removes remaining salt)
- Rehydrate DNA (TE buffer)
Explain Sanger sequencing
• Want to sequence a specific region in the DNA template (extracted from patient blood sample)
• Design primers (forward and reverse) which are complementary to the sequence flanking the region of interest
• Primers have specific annealing temperatures, at which they hybridise to the DNA
• DNA polymerase and deoxynucleotides (dNTPs) are then added so that the template region can be amplified by PCR
Sanger Sequencing
• Requires
o DNA template (from PCR stage)
o Primer for region of interest (forward or reverse – obtain a sequence for both directions)
o DNA polymerase
o dNTPs (dATP, dGTP, dCTP, dTTP)
o fluorescently labelled ddNTPs (each labelled with different colour to identify which base has been added)
- Strand separation (heating to denature DNA)
- Primer annealing (cooler temperature)
- Extension (temperature raised to the polymerase's working temperature; the polymerase uses the template strand as a guide)
- Termination (ddNTPs are incorporated at random, and no further nucleotides can be added after one. This creates a pool of different-sized fragments, each carrying the fluorescent label of its final base)
- The pool of labelled fragments of different sizes is run through the sequencer
- Fragments migrate through the sequencer in size order, smallest to largest (smaller fragments travel faster)
- Fluorescent dyes are detected by laser and converted to signal on computer
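A toy sketch of why the fragment pool can be read as a sequence. The strand, the termination probability and the function names are all made up for illustration and do not come from any sequencing software. Each simulated molecule stops at a random ddNTP; sorting the fragments by length and reading the dye on each final base recovers the sequence of the newly synthesised strand.

```python
import random

def chain_termination_fragments(new_strand, n_molecules=5000, p_terminate=0.2, seed=1):
    """Simulate extension products as (length, terminal_base) pairs."""
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for position, base in enumerate(new_strand, start=1):
            # With probability p_terminate a labelled ddNTP is added here
            # instead of a dNTP, and this molecule stops growing.
            if rng.random() < p_terminate:
                fragments.add((position, base))
                break
    return fragments

def read_sequence(fragments):
    """Order fragments shortest first and read the dye on each final base."""
    return "".join(base for _, base in sorted(fragments))

strand = "ATGCGTAC"  # hypothetical 8-base region
print(read_sequence(chain_termination_fragments(strand)))
```

With many molecules every position ends up represented by at least one fragment, which is why the real reaction needs an amplified pool rather than a single molecule.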
When would Sanger be useful
- Detect familial variant
- Screening smaller genes
- Carrier testing
- Confirming variant identified on NGS
- Gaps in coverage on NGS
Fleximix in lab
*Plate layout determined from Excel spreadsheet that pulls information from internal database
- Determines where DNA and primers need to be put in separate racks
- DNA dilutions are manual
- Robot performs all other transfers (these are checked)
*Presymptomatics - all manual transfer steps are observed (therefore no need to repeat)
Main Steps in NGS
- Sample prep (library prep: Nextera Rapid Capture, tagmentation)
- Cluster generation (library hybridised onto flow cell, then bridge amplification)
- Sequencing (sequencing by synthesis: incorporation of fluorescent nucleotides)
- Data analysis and bioinformatics (reads aligned to reference sequence with bioinformatics software; differences between reads and reference are called as variants)
What are paired end reads
*Single end: fragment only sequenced from one end
*Paired end: fragment sequenced from both ends
*The paired reads can therefore be matched up, which helps clarify difficult sequences
*Helps resolve deletions / insertions
*Improves assembly of repetitive regions
Common problems associated with NGS? What can affect quality?
- Cluster density poor
- Too much DNA leads to undertagmentation (not enough transposome for the amount of DNA, therefore fragments are large, >1kb, and this leads to inefficient clustering)
- Too little DNA leads to overtagmentation: DNA is overly fragmented into small <200bp fragments, which are then washed away at the clean-up step
- Problems with tagmentation can also occur when contaminants in the DNA (enzymatic inhibitors) prevent tagmentation
- Poor cluster density could lead to a low number of reads, and therefore lower confidence
- For this reason the quantity of DNA added to the NGS reaction must be accurate, and must be measured using Qubit (reliable and accurate compared to NanoDrop)
Bioinformatic steps following NGS
Raw reads (FASTQ file) → Quality check (FastQC) → Map to reference genome (Novoalign; output is a SAM file) → Post-alignment processing (remove duplicates; stored as a BAM file, the binary version of the SAM file) → Variant calling
What depths of sequencing would you use for different samples?
Detecting germline variants: 20-30x (based on a 2008 paper by Illumina, which showed that 15x detected the majority of homozygous variants)
Cancer: tumours are genetically heterogeneous, so much deeper sequencing is required (>100x)
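A rough worked example of what these depths mean in reads. Mean depth is commonly approximated as (number of reads × read length) / size of the target region. The sketch below assumes every read maps on target and coverage is uniform, which real runs do not achieve, so treat the numbers as a lower bound on reads needed. The function names and the 5 Mb target are hypothetical.

```python
import math

def mean_depth(n_reads, read_length_bp, target_size_bp):
    """Approximate mean depth: total sequenced bases divided by target size."""
    return n_reads * read_length_bp / target_size_bp

def reads_needed(depth, target_size_bp, read_length_bp):
    """Minimum number of reads to reach a mean depth under the same assumptions."""
    return math.ceil(depth * target_size_bp / read_length_bp)

target = 5_000_000  # hypothetical 5 Mb panel
print(mean_depth(1_000_000, 150, target))  # depth from 1 million 150 bp reads
print(reads_needed(30, target, 150))       # reads for 30x germline testing
print(reads_needed(100, target, 150))      # reads for 100x tumour testing
```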
What is Phred score
- In sequencing by synthesis, each base call is assigned a quality score (Phred-like algorithm)
- This provides a measure of the probability that a base has been called incorrectly
- Phred (quality) score of 30 = good, indicates a 1/1000 chance that a base has been called incorrectly
- Score of 20 = 1/100 chance
- Score of 10 = 1/10 chance
- The Phred score is derived by analysing parameters relevant to the sequencing chemistry
- Low Q scores can lead to increased false-positive rates
- SBS generates the highest percentage of error-free reads (most Q>30)
- TruSeq is latest version of chemistry
- Optimised for accurate base calling
- HiSeq and MiSeq
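The scores above follow the standard Phred relationship Q = -10 × log10(P), where P is the probability that the base call is wrong. A minimal sketch of the conversion in both directions (function names are illustrative only):

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score corresponding to an error probability p."""
    return -10 * math.log10(p)

for q in (10, 20, 30):
    p = phred_to_error_prob(q)
    print(f"Q{q}: about 1 in {round(1 / p)} base calls expected to be wrong")
```

Because the scale is logarithmic, each 10-point increase in Q means a ten-fold drop in the expected error rate.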
What would bad quality NGS data look like
- Poor cluster density
- Low phred scores
- Many false reads / artefacts
What is FASTQC
o Tool that reports multiple QC metrics for raw sequence data
o Can be used to quickly identify common problems with NGS data
o Per base sequence quality (box plot of Q scores at each position in the raw reads)
o Input is the FASTQ file: raw sequence reads produced by the sequencer
o FASTQ file contains a Phred score (Q score) for each base of each read
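A simplified sketch of the idea behind the per base sequence quality plot. It assumes four-line FASTQ records with Phred+33 quality encoding (each quality character is the Q score plus 33 in ASCII) and reports a mean per position, whereas FastQC itself draws box plots. The function names and reads are hypothetical and this is not FastQC's own code.

```python
def parse_fastq(lines):
    """Yield (read_name, sequence, quality_string) from FASTQ text lines."""
    rows = [ln.rstrip("\r\n") for ln in lines if ln.strip()]
    if len(rows) % 4:
        raise ValueError("FASTQ records are four lines each")
    for i in range(0, len(rows), 4):
        header, seq, plus, qual = rows[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+") or len(seq) != len(qual):
            raise ValueError(f"malformed record starting at line {i + 1}")
        yield header[1:], seq, qual

def decode_qualities(quality_string, offset=33):
    """Convert quality characters to Q scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_string]

def per_position_mean_quality(quality_strings):
    """Mean Q score at each read position, across reads of any length."""
    totals, counts = [], []
    for qs in quality_strings:
        for pos, q in enumerate(decode_qualities(qs)):
            if pos == len(totals):
                totals.append(0)
                counts.append(0)
            totals[pos] += q
            counts[pos] += 1
    return [t / c for t, c in zip(totals, counts)]

def percent_q30(quality_strings):
    """Percent of all bases with Q score of 30 or more."""
    scores = [q for qs in quality_strings for q in decode_qualities(qs)]
    return 100 * sum(q >= 30 for q in scores) / len(scores)

fastq = ["@read1", "ACGT", "+", "II5+", "@read2", "AC", "+", "??"]
quals = [qual for _, _, qual in parse_fastq(fastq)]
print(per_position_mean_quality(quals))
print(percent_q30(quals))
```

In the toy reads the mean quality drops towards the end of the read, which is the pattern the plot is usually checked for.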
What are SAM / BAM files
o SAM (Sequence Alignment/Map) file: result of mapping the FASTQ reads to the reference genome
o Stores alignment of all reads against genome
o BAM file: binary version of SAM file (more compact)
What are the quality measures checked for NGS
• Library quality control
• Libraries checked on Bioanalyzer (verifies fragment sizes are as expected and that there are no contaminating adapter-dimers)
• Sequencing quality control
• FastQC (quality report for each sequencing lane)
• Yield: Number of bases generated in run
• Error rate: percent of bases called incorrectly in a cycle (calculated from reads aligned to Illumina’s control). Error rate increases along the length of the read
• %Q30: percent of bases with quality score ≥30, a second check of base quality (Illumina approx. 80% ≥Q30)
• Density of clusters on flow cell: can help to explain low-quality data (over/underloading of DNA)
• Phasing:
Advantages and disadvantages of WGS, Exome and panel sequencing
- Depth of coverage: Highest for panel sequencing, lowest for WGS
- Data storage: High for WGS, lower for exome and panel
- Exome is approx. 1% of the genome, but contains >80% of the genetic variants known to be responsible for disease
- Exome and panel cannot reliably detect larger CNVs or translocations (important in disorders of intellectual disability)
- Deeper coverage more appropriate to detect mosaicism
What are the NGS tests in your lab
MiSeq: TSCP, CRUK, NIPD, Myeloid panel
HiSeq: TSO, NIPT
How does MLPA work?
o DNA heated and denatured
o Hybridisation master mix added (SALSA MLPA probe and buffer)
o Probes target specific region of DNA
o Each probe consists of two oligonucleotides (one carries the sequence recognised by the forward primer, the other the sequence recognised by the reverse primer)
o One of the two probe oligonucleotides contains a stuffer sequence (length varied so that a range of targets can be amplified and separated in a single experiment)
o Only when both probe oligonucleotides are hybridised to their targets can they be ligated into a complete probe. (Be careful with apparent single-exon deletions, as these can be caused by a SNP under the probe binding site preventing hybridisation or ligation)
o Bind adjacent target sites
o Bind sample DNA (60-80nt target site)
o Each probe has unique length
o Probes incubate overnight to hybridise to sample DNA
o Ligase master mix added
o Ligase joins the left and right probe oligos, bonding the 2 together into 1
o Mismatch of a nucleotide between probe and target, or absence of the region, means the probe won’t be ligated
o Successfully hybridised and ligated probes can be amplified by PCR
o All probes exponentially amplified using single set of primers
o F primer is fluorescently labeled
o Capillary electrophoresis: fragments separated based on length / size
o Size standard used to determine length of amplicons
o Size standard has fragments of known length labelled with a different fluorescent dye
o Migration of MLPA amplicons compared to migration of size standard
o Determines length of each amplicon
o Once lengths determined, each amplicon peak can be linked to correct probe and therefore quantified
o Normalise data (compares test sample to set of reference samples)
o Calculates ratio for each probe in each sample in comparison to reference
o 2 copies (normal) = ratio of 1.0
o Heterozygous deletion = 0.5, heterozygous duplication = 1.5, etc.
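A simplified sketch of the normalisation and ratio steps, assuming peak heights have already been assigned to probes. Each sample is first normalised to its own reference probes (probes expected to be present in 2 copies), then each test probe is divided by the mean of the same probe in the reference samples. The probe names, peak heights and the 0.7 / 1.3 cut-offs are illustrative assumptions only; analysis software and individual labs apply their own validated methods and thresholds.

```python
from statistics import mean

def intra_normalise(peaks, reference_probes):
    """Divide every probe peak by the mean reference-probe peak of that sample."""
    denom = mean(peaks[p] for p in reference_probes)
    return {probe: height / denom for probe, height in peaks.items()}

def dosage_ratios(test_peaks, reference_samples, reference_probes):
    """Ratio of each probe in the test sample relative to the reference samples."""
    test_norm = intra_normalise(test_peaks, reference_probes)
    ref_norms = [intra_normalise(s, reference_probes) for s in reference_samples]
    return {
        probe: test_norm[probe] / mean(r[probe] for r in ref_norms)
        for probe in test_peaks
    }

def classify(ratio, low=0.7, high=1.3):
    """Illustrative cut-offs only (assumed, not taken from any kit insert)."""
    if ratio < low:
        return "reduced (possible deletion)"
    if ratio > high:
        return "increased (possible duplication)"
    return "normal"

# Hypothetical peak heights: two normal controls and one patient.
ref_probes = ["ref1", "ref2"]
controls = [
    {"ref1": 1000, "ref2": 1000, "exon1": 1000},
    {"ref1": 2000, "ref2": 2000, "exon1": 2000},
]
patient = {"ref1": 1500, "ref2": 1500, "exon1": 750}

for probe, ratio in dosage_ratios(patient, controls, ref_probes).items():
    print(probe, round(ratio, 2), classify(ratio))
```

Normalising within each sample first means that differences in overall signal between samples (more or less DNA, injection differences) do not change the ratio.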
Main steps in MLPA (Short version)
Detects deletions / dups
Denaturation, Hybridisation, Ligation, Amplification, Detection
Probes have 2 parts (bind adjacent targets)
Only when target sequence is present can probes hybridise and be successfully ligated
Successfully ligated probes are PCR amplified (they carry a fluorescent marker for detection and are of different sizes, determined by comparison to a size standard)
Compare relative differences in patient probes to reference to detect regions of over / under representation
What does high Primer flare on MLPA mean?
o PCR Failure: can be caused by contaminants in DNA inhibiting polymerase or any other PCR failure (tetrad not heating properly)
o Could use less DNA to dilute contaminants
o Perform extra purification step
What does MLPA sloping signify?
o Evaporation during overnight hybridization
o Injection voltage set too high
o Causes shorter fragments to be injected first, causing signal intensity of these fragments to become higher than longer fragments
High DQ peaks on MLPA?
o Not enough DNA in MLPA reaction
Low DD peaks on MLPA
o Denaturation unsuccessful
o Sample DNA denaturation problems causing (part of) the DNA template to be unavailable for the MLPA probes.
o Likely due to high level of contaminants in sample
What is the purpose of M13 tags
M13 tags allow universal sequencing primers to be used.
The M13 sequence is added to the 5' end of all of our primers, enabling all samples to be sequenced using the same M13 primers.
Advantages of MLPA
o Multiplex reaction (can investigate multiple targets simultaneously)
o Cost effective
o Large number of samples simultaneously
o Can detect known point mutations by designing probes to sit over known location of variant