Session 2 Flashcards

1
Q

How do you calculate odds ratio?

Why is it useful?

A

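OR = (number affected with variant/number unaffected with variant)/(number affected without variant/number unaffected without variant)

It measures the strength of association between a variant and a disease: OR > 1 means the variant is associated with higher odds of disease, OR = 1 means no association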
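
As a quick numerical check of the ratio above, here is a minimal sketch. The function name and the counts are hypothetical and only illustrate the arithmetic.

```python
def odds_ratio(affected_with, unaffected_with, affected_without, unaffected_without):
    """Odds of disease with the variant divided by odds of disease without it."""
    odds_with = affected_with / unaffected_with
    odds_without = affected_without / unaffected_without
    return odds_with / odds_without


# Hypothetical counts: carriers 30 affected / 10 unaffected,
# non-carriers 20 affected / 40 unaffected.
print(odds_ratio(30, 10, 20, 40))
```

An odds ratio well above 1 in a table like this would suggest the variant is associated with disease, although a confidence interval is needed before drawing conclusions.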

2
Q

What is sensitivity?

How do you calculate?

A

Sensitivity is the ability of a test to correctly identify individuals who are affected by a disease (the true positive rate)

True positive/(true positive+false negative)

3
Q

What is specificity?

How do you calculate?

A

Specificity is the ability of a test to correctly identify individuals who are not affected by a disease (the true negative rate)

True negative/(true negative+false positive)

4
Q

What is PPV?

How do you calculate?

A

Positive predictive value (PPV)= The proportion of positive tests that are true positives

True positive/(true positive+false positive)

5
Q

What is NPV?

How do you calculate?

A

Negative predictive value (NPV) = The proportion of negative tests that are true negatives

True negative/(true negative+false negative)
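
Sensitivity, specificity, PPV and NPV all come from the same four counts, so they can be sketched together. The function name and the example counts below are hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return the four test-performance measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # proportion of positive tests that are true
        "npv": tn / (tn + fn),          # proportion of negative tests that are true
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=20, tn=180, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and specificity are properties of the test, whereas PPV and NPV also depend on how common the disease is in the tested population.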

6
Q

What is MLPA?

Outline the principle

A

Multiplex Ligation-dependent Probe Amplification

DNA is denatured and hybridised to pairs of probes. Each half-probe carries a universal primer sequence for amplification, and one also carries a stuffer sequence so that each probe pair gives a product of a unique length. The two half-probes bind directly beside each other on the target and a ligase joins them. Rounds of PCR with the universal primers then amplify only ligated probes, giving fragments of different sizes corresponding to each region of interest. Peak heights relative to reference probes and normal controls are used to detect CNVs
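
The comparison of peak heights to reference probes and controls can be sketched as a dosage quotient calculation. This is a simplified illustration, not any analysis package's actual algorithm; the function name, probe names and peak heights are hypothetical.

```python
def dosage_quotients(sample_peaks, control_peaks, reference_probes):
    """Normalise each target peak to the reference probes, then to a normal control."""

    def normalise(peaks):
        reference_total = sum(peaks[probe] for probe in reference_probes)
        return {
            probe: height / reference_total
            for probe, height in peaks.items()
            if probe not in reference_probes
        }

    sample = normalise(sample_peaks)
    control = normalise(control_peaks)
    return {probe: sample[probe] / control[probe] for probe in sample}


# Hypothetical peak heights: exon1 shows half the expected signal.
sample = {"ref1": 1000, "ref2": 1000, "exon1": 500, "exon2": 1000}
control = {"ref1": 1000, "ref2": 1000, "exon1": 1000, "exon2": 1000}
print(dosage_quotients(sample, control, reference_probes={"ref1", "ref2"}))
```

A quotient near 1.0 is consistent with two copies, near 0.5 with a heterozygous deletion and near 1.5 with a heterozygous duplication.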

7
Q

What is MS-MLPA?

Outline the principle

A

Methylation-specific Multiplex Ligation-dependent Probe Amplification

Mostly similar to MLPA. After hybridisation the reaction is split into two tubes: one undergoes standard MLPA to detect CNV. The other is also treated with a methylation-sensitive restriction endonuclease, so probes bound to unmethylated DNA are cut and that fragment is not amplified. The dosage of methylated sequence can then be calculated, e.g. to detect UPD

8
Q

What are the common file types from bioinformatics pipeline?

A

BCL - raw file produced by Illumina sequencer. Has base call per cycle for each tile on the flow cell

FASTQ - text based format of nucleotide sequence and quality

BAM - binary file of reads (from FASTQ) aligned to a reference genome

CRAM - compressed BAM

VCF - variant call format; the basic list of variants called from the BAM

Annotated VCF - VCF with extra useful annotations
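
To make the FASTQ format concrete, here is a minimal sketch that parses one four-line record (header, sequence, separator, quality string). It assumes the common Phred+33 quality encoding; the function name and the example read are hypothetical.

```python
def parse_fastq_record(lines):
    """Parse one 4-line FASTQ record into its ID, sequence and Phred scores."""
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a valid FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    return {
        "id": header[1:],
        "sequence": sequence,
        # Assumption: Phred+33 encoding (ASCII code minus 33).
        "phred": [ord(char) - 33 for char in quality],
    }


record = parse_fastq_record(["@read1\n", "GATTACA\n", "+\n", "IIIII5#\n"])
print(record["id"], record["sequence"], record["phred"])
```

In practice an established parser would be used rather than hand-written code, but the sketch shows that each base has exactly one quality character.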

9
Q

What is most commonly used to check quality of FASTQ?

A

FastQC

10
Q

What are some things measured by FastQC?

A

Per Base Sequence Quality score

Per Sequence Quality Scores

Per Base Sequence Content

Per Base GC Content

Per Sequence GC Content

Sequence Length Distribution

Overrepresented Sequences

11
Q

What are the basic steps of an NGS pipeline?

A

Demultiplex
Alignment
Variant calling
Annotation

12
Q

Name an alignment tool

A

BWA

13
Q

What type of variants can be detected by SR-NGS?

A

SNVs
Indels
CNV (with caller)
Structural (if coverage of breakpoints)

14
Q

Name a tool for variant calling

A

GATK (better for SNV)

15
Q

What is the main problem with Roche 454 sequencing?

A

The variance in signal intensity for a given homopolymer length is large, resulting in high error rates in insertion and deletion (indel) calls

16
Q

What is Phred Score?

What is considered high quality?

A

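Phred-scaled quality score expressing the probability that a base has been called incorrectly. The higher the score, the more likely the call is correct.

Phred >30 (less than a 1 in 1,000 chance that the base call is wrong)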
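
The Phred scale is conventionally defined as Q = -10 x log10(P), where P is the probability that the base call is wrong. A minimal sketch of the conversion in both directions, with hypothetical function names:

```python
import math


def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is incorrect."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score corresponding to an error probability p."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:.4f}, accuracy {100 * (1 - p):.2f}%")
```

Each step of 10 on the Phred scale is a tenfold drop in error probability, which is why Q30 is a common threshold for high quality.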

17
Q

Why are paired reads better than single end?

A

Paired reads identify the relative positions of reads, making it easier to resolve structural rearrangements such as gene insertions, deletions, or inversions

Improve the assembly of repetitive regions

18
Q

What should be involved in a Bioinformatic pipeline validation?

A

Assess the pipeline’s output against a truth set, e.g. Genome in a Bottle

Sensitivity should be calculated from at least 10 individuals and be >95%

Data must be collected over 3 independent runs for reproducibility

Confirm ability to detect known variants (all types needed for the testing)
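
Comparing pipeline output with a truth set can be sketched as a set comparison. This is a simplified illustration that assumes variants are already normalised to identical (chromosome, position, ref, alt) tuples; the function name and all variants are hypothetical.

```python
def validation_sensitivity(called, truth):
    """Compare called variants with a truth set of (chrom, pos, ref, alt) tuples."""
    called, truth = set(called), set(truth)
    true_positives = called & truth
    false_negatives = truth - called  # truth variants the pipeline missed
    false_positives = called - truth  # calls not present in the truth set
    sensitivity = len(true_positives) / len(truth)
    return sensitivity, sorted(false_negatives), sorted(false_positives)


# Hypothetical truth set and pipeline calls.
truth_set = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "GA"),
    ("chr3", 750, "TC", "T"),
]
pipeline_calls = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "GA"),
    ("chr4", 100, "A", "C"),
]

sensitivity, missed, extra = validation_sensitivity(pipeline_calls, truth_set)
print(f"sensitivity: {sensitivity:.2f}, missed: {missed}, extra: {extra}")
```

Real comparisons need to handle different representations of the same indel, which is why dedicated benchmarking tools are normally used.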

19
Q

What is library preparation?

A

Process of fragmenting DNA and adding adapters and indexes needed for sequencing.

For exome/targeted panels an enrichment step is also required to capture regions of interest (not needed for WGS)

20
Q

What are the two main types of enrichment method?

A

Amplicon

Hybridisation

21
Q

How does Amplicon enrichment work?

What are benefits and drawbacks?

A

PCR amplification of regions of interest while adding adapters and indexes

Cheaper and faster but preferential amplification leads to non-uniform coverage and bias, can introduce artefacts and cannot be used for CNV analysis

22
Q

How does Hybridisation enrichment work?

What are benefits and drawbacks?

A

Fragmentation and adapter/index ligation happens first. Then oligo probes designed to target regions of interest are bound. Beads are used to pull out bound fragments.

Achieves much more uniform coverage and true representation with different fragments. CNV calling is possible. BUT needs more DNA, costs more and has longer prep time

23
Q

How does Illumina sequencing work?

A

Sequencing by synthesis

Flow cell is covered in a lawn of oligos complementary to the adapters at either end of the fragments. Extension of the bound fragment on the lawn followed by rounds of bridge amplification leads to cluster formation.

Reverse strands are cleaved to leave only forward strands for read 1. Sequencing primers bind and rounds of fluorescently labelled nucleotide addition extend the read, with the fluorescence indicating which nucleotide was incorporated in each cycle. The indexes are then read using index primers by the same method.

The reverse strand is then remade and the forward strand removed. The process is repeated for read 2.

24
Q

How does ion torrent sequencing work?

What are the disadvantages?

A

Template DNA is bound to beads and enriched, with each bead having its own well. Measures the change in pH caused by release of a hydrogen ion during incorporation of each nucleotide.

Relatively poor performance at homopolymer regions. Higher rate of sequencing errors.

25
Q

How does Roche 454 sequencing work?

What are the disadvantages?

A

The four nucleotides are added cyclically, one at a time. When DNA polymerase incorporates a nucleotide, pyrophosphate is released, which is converted enzymatically into a chemiluminescent light signal (pyrosequencing)

High reagent cost.
High error rates in homopolymer regions. Low capacity.

26
Q

What are the advantages of WGS?

A

Allows examination of SNVs, indels, SV and CNVs in coding and non-coding regions of the genome

Detection of structural variants

WGS has more reliable sequence coverage

Coverage uniformity

WGS doesn’t suffer from the reference bias introduced by capture probes

Can go back to re-analyse later

27
Q

What is ChIP-Seq?

A

Chromatin immunoprecipitation followed by sequencing

Used for studying transcriptional regulation and epigenetic mechanisms

28
Q

What are the advantages of RNA-Seq?

A

Look at single base changes, splice changes, gene boundaries and expression levels.

29
Q

What is western blotting?

A

Used to detect presence/absence of a protein, compare protein levels, assess purity or estimate relative molecular mass.

30
Q

What is qPCR?

A

Quantitative PCR/ real time PCR

Amplification of DNA is monitored in real time, with simultaneous amplification, detection and continuous quantification of DNA templates during each PCR cycle using fluorescence. During the exponential phase of the reaction, the amount of product is directly proportional to the amount of starting template.

31
Q

What is RT-qPCR?

A

Reverse transcription qPCR - RNA is first reverse transcribed to cDNA, which is then used as the qPCR template

32
Q

What are two types of detection chemistry for qPCR?

A

Non-specific fluorescent dyes that bind any dsDNA (e.g. SYBR Green) - small molecules that show very little fluorescence when free in solution, but whose fluorescence increases when bound to the minor groove of dsDNA, so signal rises as PCR product accumulates

Sequence-specific DNA probes consisting of oligonucleotides labelled with a fluorescent reporter and a quencher, e.g. TaqMan. The probe binds the region of interest and is cleaved by the 5’ exonuclease activity of DNA polymerase during extension, which separates reporter from quencher and releases fluorescence.

33
Q

When can quantification occur in qPCR?

A

Only during the exponential phase: after fluorescence crosses the cycle threshold (Ct, the cycle number at which fluorescence rises above background) but before the reagents start to become limiting and the amount of amplified DNA starts to affect primer binding

34
Q

What two types of quantification are used in qPCR?

A

Absolute quantification (standard curve analysis) - Ct values from a dilution series of a standard are plotted against the log of their known concentrations, and the concentration of the test sample is read off the curve from its Ct

Relative quantification - determines fold differences in expression of the target gene relative to a housekeeping gene. Removes possible dilution error but requires PCR efficiency for both target and housekeeper to be the same
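
Both approaches can be sketched in a few lines. The relative calculation below uses the widely taught 2^-ddCt method, which is an assumption here (the card does not name a method) and itself assumes roughly 100% efficiency for both assays. All function names and Ct values are hypothetical.

```python
def fit_standard_curve(log10_concentrations, ct_values):
    """Least-squares line Ct = slope * log10(concentration) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_concentrations) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_concentrations, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_concentrations)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_ct(ct, slope, intercept):
    """Absolute quantification: read a concentration off the standard curve."""
    return 10 ** ((ct - intercept) / slope)


def fold_change_ddct(target_ct_test, housekeeper_ct_test,
                     target_ct_control, housekeeper_ct_control):
    """Relative quantification by 2^-ddCt (assumes equal, ~100% efficiencies)."""
    delta_test = target_ct_test - housekeeper_ct_test
    delta_control = target_ct_control - housekeeper_ct_control
    return 2 ** -(delta_test - delta_control)


# Hypothetical standards at 10^5, 10^4 and 10^3 copies.
slope, intercept = fit_standard_curve([5, 4, 3], [15, 18, 21])
print(concentration_from_ct(24, slope, intercept))
print(fold_change_ddct(24, 20, 26, 20))
```

A lower Ct means more starting template, so the standard curve has a negative slope and a test sample with a smaller delta Ct than the control shows a fold change above 1.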

35
Q

What are some applications for qPCR?

A

Minimal residual disease (MRD) monitoring in haemato-oncology
Single base mutation detection
SNP Genotyping
Genomic Copy Number Measurement

36
Q

What are some difficulties of using RNA?

A
  • Short half life
  • Specialist extraction kits and reagents required
  • Ultra-clean laboratory areas required
  • Limited expression patterns may mean that the required tissue is not available for analysis
37
Q

What methods can be used for DNA sizing?

A

Agarose gel electrophoresis
Polyacrylamide gel electrophoresis (PAGE)
Pulsed-field gel electrophoresis
Capillary electrophoresis
Nanowire structures
Agilent Bioanalyzer
Southern blotting

38
Q

What is triplet primed PCR?

A

Method for sizing triplet/quadruplet repeat expansions when the expansion is too large for conventional PCR.

Uses three primers (P1, P3, P4). P1 binds upstream of the repeat. P4 binds within the repeat at different places to make fragments of different sizes. P3 is complementary to the 5’ end of P4 and is used to amplify those fragments.

39
Q

What is chimeric PCR?

A

Used for HD

Forward primer is before the expansion. Reverse primer is “chimeric”: its 5’ end is complementary to the sequence after the repeat and its 3’ end is complementary to 5 CAG repeats. It will bind at the end of the repeat and amplify the fragment accordingly. Can detect expansions up to 101 (+/-1)

40
Q

What is inverse PCR?

Give an example use

A

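PCR that can be used to detect inversions. DNA is first cut with a restriction enzyme and self-ligated into circles, so that primers which face away from each other on the linear sequence face each other around the circle and work in PCR. An inversion changes which sequences are joined in the circle, so a different product is amplified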

E.g. Haemophilia A - factor VIII intron 22 inversion

41
Q

When is Southern blotting useful?

Outline the method

A

Detection of large fragments not amplifiable by PCR and investigating methylation status

Input DNA (large amount)
Restriction digestion to isolate fragment of interest
Gel electrophoresis to separate and then denature to make single stranded
Transfer DNA to a positively charged membrane (absorbent material soaks up buffer through the gel and membrane, taking the DNA with it)
Fix DNA to membrane (e.g. baking or UV)
Add probe to label fragments
Wash unbound probe
Visualise fragments

42
Q

What is a SNP?

A

Single nucleotide polymorphism - a single base DNA sequence change occurring commonly in the population

43
Q

What SNP pattern is seen for normal diploid?

A

3 bands - 1 homozygous B (at 1 B allele frequency), 1 het A/B (at 0.5 B allele frequency) and 1 homozygous A (at 0 B allele frequency)

44
Q

What SNP pattern is seen for a duplication?

A

Only het SNPs are visibly affected, so that band splits, making 4 bands.

Het bands now sit at 0.67 (if B allele duplicated, ABB) and 0.33 (if A allele duplicated, AAB) B allele frequency

45
Q

What SNP pattern is seen for a deletion?

A

Loss of an allele means every SNP will be hemizygous for either B or A, so the middle band at 0.5 B allele frequency is lost

46
Q

What SNP pattern is seen for a mosaic duplication?

A

Increasing level of mosaicism separates the heterozygous track further, until it makes the 4 bands of a non-mosaic duplication

47
Q

What SNP pattern is seen for a mosaic deletion?

A

Increasing level of mosaicism separates the heterozygous track further, until it makes the 2 bands of a non-mosaic deletion
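
The way the heterozygous track separates with increasing mosaicism can be worked out from allele counts. A minimal sketch, assuming a mixture of normal (AB) cells and abnormal cells at a given fraction; the function names are hypothetical.

```python
def het_baf_mosaic_duplication(abnormal_fraction, duplicated_allele="B"):
    """Expected B allele frequency at a het SNP when a fraction of cells carry a duplication."""
    total_copies = 2 + abnormal_fraction
    b_copies = 1 + abnormal_fraction if duplicated_allele == "B" else 1
    return b_copies / total_copies


def het_baf_mosaic_deletion(abnormal_fraction, deleted_allele="A"):
    """Expected B allele frequency at a het SNP when a fraction of cells carry a deletion."""
    total_copies = 2 - abnormal_fraction
    b_copies = 1 if deleted_allele == "A" else 1 - abnormal_fraction
    return b_copies / total_copies


for fraction in (0.0, 0.25, 0.5, 1.0):
    print(
        fraction,
        round(het_baf_mosaic_duplication(fraction, "B"), 3),
        round(het_baf_mosaic_duplication(fraction, "A"), 3),
        round(het_baf_mosaic_deletion(fraction, "A"), 3),
        round(het_baf_mosaic_deletion(fraction, "B"), 3),
    )
```

At a fraction of 0 both functions give 0.5 (the normal het band), and at a fraction of 1 they give the non-mosaic values: 0.67 and 0.33 for a duplication, 1 and 0 for a deletion.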

48
Q

What does maternal cell contamination look like on SNP array?

A

Muddied allele frequencies but normal copy number by LogR

49
Q

What SNP pattern is seen for a nullisomy?

A

Very low (strongly negative) LogR ratio because there is almost no signal, and SNP assignment not in tracks but random and not coloured

50
Q

What types of UPD can SNP array detect?

A

Isodisomy - in full or partial with some heterodisomy

51
Q

What can Copy number neutral LOH indicate?

What can be calculated from this?

A

Consanguinity

Identity by descent
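
One common way to put a number on this is the proportion of the autosomal genome lying in copy-number-neutral LOH regions, which approximates the inbreeding coefficient. The sketch below is illustrative only: the function name, the region sizes and the autosomal length are hypothetical inputs, not values from the source.

```python
def loh_fraction(loh_region_lengths_mb, autosomal_length_mb):
    """Fraction of the autosomal genome covered by copy-number-neutral LOH regions."""
    return sum(loh_region_lengths_mb) / autosomal_length_mb


# Hypothetical LOH regions (Mb) and a hypothetical autosomal length (Mb).
fraction = loh_fraction([50, 40, 90], autosomal_length_mb=2880)
print(f"{fraction:.4f}")
```

For comparison, the theoretical inbreeding coefficient is 1/16 for the child of first cousins, 1/8 for the child of second-degree relatives and 1/4 for the child of first-degree relatives, so the observed fraction can suggest how closely related the parents are.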

52
Q

What is third generation sequencing?

A

Single molecule sequencing

53
Q

What are the three categories of third generation sequencing?

A

Sequencing by Synthesis

Nanopore

Synthetic long read

54
Q

Outline an example LR sequencing by synthesis method

A

Single molecule real time (SMRT) sequencing (developed by Pacific Biosciences)

  1. Template fragments are processed into circular DNA molecule.
  2. 1 DNA polymerase on bottom of each zero-mode waveguide (ZMW)
  3. Fluorescent dNTPs are added at high concentration. Unincorporated nucleotides diffuse in and back out of the ZMW within microseconds, so they are only in the detection volume briefly, resulting in a 100-fold reduction of background noise.
  4. When incorporated, a dNTP is held in the detection volume for a longer time
  5. Fluorescence detected and fluorophore attached to phosphate is cleaved
  6. Cycle is repeated
55
Q

What are benefits and disadvantages of PacBio LR?

A
  Benefits
  • Very fast
  • Circular template allows the molecule to be sequenced multiple times to get a consensus with fewer errors
  • Methylation status can be detected
  Disadvantages
  • Limited number of ZMWs per cell
  • Cost