PRE FI DNA SEQUENCING Flashcards by Sunwoo's Berry

Refers to the ORDER OF THE NUCLEOTIDES in the DNA molecule.

DNA SEQUENCE

Applications of DNA sequencing in the medical laboratory:

o Detection of mutations
o Typing of microorganisms
o Identifying human haplotypes
o Designating polymorphisms
o Treatment strategies

SEQUENCING METHODS:
• DIRECT DETERMINATION OF THE ORDER, or sequence, of nucleotides in a DNA polymer.
• Most specific and direct method for identifying genetic lesions (mutations) and polymorphisms.
• Types:
1. Manual sequencing (chemical (Maxam-Gilbert) and dideoxy chain termination (Sanger) sequencing)
2. Automated fluorescent sequencing (dye primer and dye terminator sequencing)

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

C. Direct sequencing: manual and automated

MANUAL SEQUENCING:
• Developed by Allan M. Maxam and Walter Gilbert
• Requires a double- or single-stranded version of the DNA region to be sequenced, with 1 end radioactively labeled (32P)
• Sequencing proceeds in 4 SEPARATE REACTIONS
• Template: LABELED FRAGMENT

A. Chemical (Maxam-Gilbert) Sequencing
B. Dideoxy Chain Termination (Sanger) Sequencing

Chemical (Maxam-Gilbert) Sequencing

Chemical (Maxam-Gilbert) Sequencing:
Addition of a _________ = ssDNA breaks at specific nucleotides

strong reducing agent (10% piperidine)

Chemical (Maxam-Gilbert) Sequencing:
o Sequence = pattern of bands on the gel
o Lane in which a band appears = identity of the nucleotide
o Sequence is read from the ____ to the ______ of the gel

BOTTOM (5’ end) to the TOP (3’ end) of the gel

Chemical (Maxam-Gilbert) Sequencing:

Run times of short fragments (up to 50 bp)?

1-2 hours

Chemical (Maxam-Gilbert) Sequencing:

Run times of long fragments (>150 bp)?

7-8 hours

MANUAL SEQUENCING:
• Developed by Frederick Sanger
• Uses DIDEOXYNUCLEOTIDES (ddNTPs) to determine the order (sequence) of nucleotides in a nucleic acid
• Requires a PRIMER complementary to the DNA to be sequenced
• Detection of sequencing products:
o The primer is attached at its 5’ end to a 32P- or fluorescent dye-labeled nucleotide, or
o 32P- or 35S-labeled dNTPs are incorporated from the sequencing reaction mix (INTERNAL LABELING)
• ddNTPs are added, terminating DNA synthesis (chain termination)
o ddNTPs lack the 3’-OH group, so the 5’-3’ phosphodiester bond needed to incorporate the next nucleotide cannot form.
• Components: mixed in 4 reaction tubes
1. DNA template (PCR product)
2. Radioactively labeled primer
3. Enzyme (DNA polymerase)
4. dNTPs (all 4)
5. Buffer (20 mM EDTA, formamide, gel tracking/loading dyes)
6. A different ddNTP in each of the 4 tubes

A. Chemical (Maxam-Gilbert) Sequencing
B. Dideoxy Chain Termination (Sanger) Sequencing

Dideoxy Chain Termination (Sanger) Sequencing
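
The chain-termination idea on this card can be sketched in a few lines of Python. This is a simplified illustration, not a laboratory protocol: it ignores the primer length, assumes every possible termination product is present, and the function names `sanger_fragments` and `read_ladder` are made up for the example.

```python
def sanger_fragments(template):
    """Return the newly synthesized strand and, per ddNTP tube, the
    lengths of the chain-terminated products (primer length ignored)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    # The new strand grows 5'->3' and is complementary to the template,
    # which the polymerase reads 3'->5'.
    new_strand = "".join(complement[base] for base in reversed(template))
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        # In the tube holding the matching ddNTP, synthesis can stop here.
        tubes[base].append(length)
    return new_strand, tubes


def read_ladder(tubes):
    """Read the ladder from the shortest fragment (bottom of the gel)
    to the longest fragment (top of the gel)."""
    bands = sorted(
        (length, base) for base, lengths in tubes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


new_strand, tubes = sanger_fragments("TACGGA")
print(tubes)
print(read_ladder(tubes))
```

Reading the bands from shortest to longest gives the sequence of the newly synthesized strand in the 5’ to 3’ direction.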

SEQUENCING REACTION of Dideoxy Chain Termination (Sanger) Sequencing?

Thermal cycler (cycle sequencing)

• Automated reading of the DNA sequence ladder requires fluorescent dyes (4 distinct colors) to label primers or sequencing fragments
1. Fluorescein
2. Rhodamine
3. Bodipy (4,4-difluoro-4-bora-3a,4a-diaza-s-indacene)

• Fluorescent dyes can be distinguished by AUTOMATED SEQUENCERS
• Approaches (to label fragments according to their terminal ddNTP): DYE PRIMER and DYE TERMINATOR SEQUENCING

Automated Fluorescent Sequencing

• 4 different fluorescent dyes are attached to 4 separate aliquots of the primer.
• Dye molecules are attached to the 5’ end of the primer = 4 versions of the same primer with different dye labels.
• Products are LABELED AT THE 5’ END with the dye color associated with the ddNTP at the end of the fragment.

DYE PRIMER OR DYE TERMINATOR SEQUENCING?

Dye Primer Sequencing

• 1 of the 4 fluorescent dyes is attached to each of the ddNTPs.
• All 4 sequencing reactions are performed in the same tube.
• Product fragments are LABELED AT THE 3’ END.

DYE PRIMER OR DYE TERMINATOR SEQUENCING?

Dye Terminator Sequencing

• The 4 sets of sequencing products in each reaction are loaded onto a single gel lane or capillary.
• Fluorescent dye colors distinguish which nucleotide is at the end of each fragment.
• Fluorescent detection equipment reports the results as an electropherogram.
• Base calling: identification of the bases in a sequence by sequencing software.

Automated Electrophoresis
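
A toy version of base calling from a four-color electropherogram might look like the sketch below. It assumes the trace has already been reduced to one intensity per dye channel at each peak; the function name `call_bases`, the example intensities, and the 0.5 ambiguity threshold are all invented for illustration and do not describe any particular sequencer's software.

```python
def call_bases(peaks, ambiguity_ratio=0.5):
    """peaks: list of dicts mapping base -> signal intensity at one peak.
    Calls the base whose dye channel is strongest; calls 'N' when the
    second-strongest channel exceeds ambiguity_ratio * strongest."""
    calls = []
    for intensities in peaks:
        ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
        (best_base, top), (_, second) = ranked[0], ranked[1]
        calls.append(best_base if second <= ambiguity_ratio * top else "N")
    return "".join(calls)


# Hypothetical peak intensities (arbitrary fluorescence units).
example_peaks = [
    {"A": 900, "C": 20, "G": 15, "T": 30},
    {"A": 10, "C": 800, "G": 25, "T": 12},
    {"A": 400, "C": 30, "G": 380, "T": 10},  # two overlapping peaks
]
print(call_bases(example_peaks))
```

A position with two overlapping peaks of similar height, as in the third example peak, is the kind of site a reviewer would inspect for a heterozygous variant or a poor-quality read.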

Software Programs Used to Analyze and Apply Sequence Data:
- Compares an input sequence with all sequences in a selected database

A. FASTA (FAST-All, derived from the FAST-P (protein) and FAST-N (nucleotide) search algorithms) / FASTQ (biological sequence data with quality scores)
B. BLAST (Basic Local Alignment Search Tool)
C. Phred
D. GRAIL (Gene Recognition and Assembly Internet Link)

B. BLAST (Basic Local Alignment Search Tool)

Software Programs Used to Analyze and Apply Sequence Data:
- Finds gene-coding regions in DNA sequences

GRAIL (Gene Recognition and Assembly Internet Link)

Software Programs Used to Analyze and Apply Sequence Data:
- Rapidly aligns pairs of sequences by sequence patterns rather than individual nucleotides

FASTA (FAST-All, derived from the FAST-P (protein) and FAST-N (nucleotide) search algorithms)
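
Matching by "sequence patterns" can be pictured as looking up short words (k-tuples) shared by two sequences instead of comparing single nucleotides. The sketch below shows only that word-lookup step under simplifying assumptions; it is not the FASTA program itself, and the function name `shared_words` is hypothetical.

```python
def shared_words(query, subject, k=3):
    """Return (query_position, subject_position) pairs where the two
    sequences share an identical word of length k."""
    # Index every word of length k in the subject sequence.
    index = {}
    for i in range(len(subject) - k + 1):
        index.setdefault(subject[i:i + k], []).append(i)
    # Look up each query word in the index.
    hits = []
    for j in range(len(query) - k + 1):
        for i in index.get(query[j:j + k], []):
            hits.append((j, i))
    return hits


hits = shared_words("GATTACA", "TTGATTAC", k=3)
print(hits)
# Hits that share the same offset (subject position - query position)
# lie on one diagonal and suggest a candidate region to align.
print({i - j for j, i in hits})
```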

Software Programs Used to Analyze and Apply Sequence Data:
- Reads bases from original trace data and recalls the bases,
assigning quality values to each base

Phred

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies single-nucleotide polymorphisms (SNPs) among the traces and assigns a rank indicating how well the trace at a site matches the expected pattern for an SNP

A. Polyphred
B. TIGR Assembler (The Institute for Genomic Research)
C. Phrap (Phragment Assembly Program)
D. Factura

A. Polyphred

Software Programs Used to Analyze and Apply Sequence Data:
- Uses USER-SUPPLIED and internally computed data quality information to improve the accuracy of assembly in the presence of repeats

A. Polyphred
B. TIGR Assembler (The Institute for Genomic Research)
C. Phrap (Phragment Assembly Program)
D. Factura

C. Phrap (Phragment Assembly Program)

Software Programs Used to Analyze and Apply Sequence Data:
- Developed by TIGR as an assembly tool to BUILD A CONSENSUS SEQUENCE from smaller-sequence fragments

A. Polyphred
B. TIGR Assembler (The Institute for Genomic Research)
C. Phrap (Phragment Assembly Program)
D. Factura

B. TIGR Assembler (The Institute for Genomic Research)

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies sequence features such as flanking vector sequences, restriction sites, and ambiguities

A. Polyphred
B. TIGR Assembler (The Institute for Genomic Research)
C. Phrap (Phragment Assembly Program)
D. Factura

D. Factura

Software Programs Used to Analyze and Apply Sequence Data:
- Provides MUTATION and SNP DETECTION and analysis, pathogen subtyping, allele identification, and sequence confirmation

A. Matchmaker
B. SeqScape
C. Assign

B. SeqScape

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies alleles for haplotyping

A. Matchmaker
B. SeqScape
C. Assign

A. Matchmaker and C. Assign

SEQUENCING METHODS:
• Determines a DNA sequence WITHOUT HAVING TO MAKE A SEQUENCING LADDER
• Relies on the generation of light (luminescence) when nucleotides are added to a growing DNA strand.
• No gels, fluorescent dyes, or ddNTPs
• Reaction mix components:
1. ssDNA template
2. Sequencing primer
3. Sulfurylase
4. Luciferase
5. Substrates: adenosine-5’-phosphosulfate (APS) and luciferin
6. 1 of the 4 dNTPs, added to the reaction one at a time in a predetermined order

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

Pyrosequencing
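
The one-dNTP-at-a-time logic can be illustrated with a small simulation. It assumes an idealized reaction in which the light signal is exactly proportional to the number of nucleotides incorporated in a dispensation; the function name `pyrogram` and the A, C, G, T dispensation order are assumptions made for the example.

```python
def pyrogram(read, order="ACGT", max_dispensations=40):
    """Simulate idealized pyrosequencing signals.

    read: the strand being synthesized (5'->3').
    Returns (nucleotide, signal) pairs, where signal is the number of
    bases incorporated when that nucleotide was dispensed."""
    signals = []
    position = 0
    for n in range(max_dispensations):
        if position >= len(read):
            break
        nucleotide = order[n % len(order)]
        incorporated = 0
        # A run of identical bases is incorporated in one dispensation,
        # giving a proportionally larger light signal.
        while position < len(read) and read[position] == nucleotide:
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
    return signals


signals = pyrogram("GGAT")
print(signals)
# Rebuild the sequence from the signal heights.
print("".join(nucleotide * height for nucleotide, height in signals))
```

A dispensation with no light means that nucleotide was not the next base; a double-height peak means two identical bases in a row.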

SEQUENCING METHODS:
• AKA METHYLATION-SPECIFIC SEQUENCING
• Chain termination sequencing designed to DETECT METHYLATED CYTOSINE NUCLEOTIDES
• 2-4 ug of genomic DNA is cut with restriction enzymes to facilitate denaturation.
• DNA is denatured (97°C for 5 min) and exposed to bisulfite solution (sodium bisulfite, NaOH, hydroquinone) for 16-20 hours.
o Cytosines are deaminated --> uracil
o 5-methylcytosines are unchanged
o Can be detected by Sanger sequencing or pyrosequencing

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

Bisulfite DNA sequencing
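
The effect of bisulfite treatment on a sequence read can be sketched as follows. The example assumes complete conversion and a single strand, and it writes converted cytosines as T because uracil is read as thymine after amplification and sequencing; the function names and the example sequence are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Return the sequence as read after bisulfite treatment.
    Unmethylated C -> U (read as T); 5-methylcytosine stays C."""
    converted = []
    for index, base in enumerate(sequence):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def methylated_cytosines(reference, converted):
    """Positions where a reference C is still read as C after treatment."""
    return [
        index
        for index, (ref_base, read_base) in enumerate(zip(reference, converted))
        if ref_base == "C" and read_base == "C"
    ]


reference = "ACGTCGCA"
treated = bisulfite_convert(reference, methylated_positions={1})
print(treated)
print(methylated_cytosines(reference, treated))
```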

SEQUENCING METHODS:
• Early approaches: used RNase to cut end-labeled RNA at specific nucleotides
• Other approaches:
o Based on amino acid sequence
o Based on sequencing of its complementary DNA

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

A. RNA sequencing

• Based on single-molecule sequencing technology and virtual terminator nucleotides
• mRNA is captured by immobilized poly(dT) oligomers (through their poly(A) tails).
o RNA without poly(A) tails: initial treatment with poly(A) polymerase
o 4 reversible dye-labeled nucleotides are sequentially added.

Direct RNA sequencing

SEQUENCING METHODS:
• AKA MASSIVELY PARALLEL SEQUENCING
• Designed to sequence LARGE NUMBERS OF TEMPLATES carrying millions of bases.
• POWERFUL COMPUTER DATA ASSEMBLY SYSTEMS (bioinformatics, computer software and support) are required.
• Requires the preparation of a sequencing library (sets of DNA fragments representing the regions to be sequenced).

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

Next-generation sequencing

• Collection of genes that have been grouped for testing, enabling simultaneous sequencing of all genes (2 to >1000 genes).

GENE PANELS

TYPES OF GENE PANELS:
- Target regions of SPECIFIC GENES known to affect treatment response, disease state, or clinical condition.

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels

C. “Hot-spot” panels

TYPES OF GENE PANELS:
- Critical genes in particular diseases (hematological-cancer specific, solid-tumor specific).

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels

B. Targeted panels

TYPES OF GENE PANELS:
- Used for diagnostic, prognostic, and discovery purposes.

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels

A. Very large panels (≥3000 genes)

collection of DNA library fragments (100-1000 bp) to be sequenced.

Sequencing library

SYNTHETIC SHORT dsDNA carrying sequences complementary to a single primer pair, which may contain short sequences that will ID the sample (indexing/bar coding).

Adapters

The regions to be sequenced are enriched by:
1. Probe hybridization
o Probes: biotinylated oligonucleotides complementary to specific gene regions.
2. Amplification with region-specific primers (amplicon-based targeted libraries)
o Selected by multiplex PCR with gene-specific primers tailed with binding sites for a secondary primer set.

Targeted Libraries

loss of library fragments from the sequenced regions.

Allele dropout

Sequencing Platforms:
- Indexed libraries (gene panels) are AMPLIFIED USING PRIMERS immobilized on microparticles (BEADS) in an aqueous oil emulsion, using ADAPTERS on the library fragments that are complementary to the immobilized primers.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing

B. Ion-conductance

Sequencing Platforms:
o Captured/amplified fragments are HYBRIDIZED to primers IMMOBILIZED on a SOLID SURFACE (FLOW CELL).
o Labeled nucleotides are applied to the flow cell and incorporated into growing chains by DNA polymerase at each polony location.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing

D. Reversible dye terminator sequencing

Sequencing Platforms:
o Uses a POOL OF LABELED OLIGONUCLEOTIDES and DNA LIGASE to identify the template sequence through the known probe sequences.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing

A. Sequencing by ligation

Sequencing Platforms:
o DOES NOT REQUIRE FRAGMENTATION and amplification of the template DNA.
o Each nucleotide can be identified by a disruption in current as it passes through the pore.
o Also USED FOR DIRECT RNA SEQUENCING

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing

C. Nanopore sequencing

DATA ANALYSIS: optical signals are translated to a nucleotide sequence

BASE CALLING

Data Analysis: Optical signals are translated to a nucleotide sequence (BASE CALLING), the quality of which is measured by the ____; acceptable = 20-30 (100- to 1000-fold certainty of a correct call).

Phred score
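
The Phred quality score Q is defined from the estimated probability P that a base call is wrong, Q = -10 log10(P), so Q20 means 1 error in 100 calls and Q30 means 1 error in 1000. A minimal sketch of the conversion (the function names are made up for the example):

```python
import math


def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score for an estimated base-call error probability p."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40):
    p = phred_to_error_probability(q)
    print(f"Q{q}: error probability {p:g}, accuracy {100 * (1 - p):.2f}%")
```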

Data Analysis: Each sequence is compared to a REFERENCE SEQUENCE (“normal”) through ___________

read alignment

Based on comparison with the reference sequence (SNVs, indels, rearrangements, CNVs).

VARIANT IDENTIFICATION

Sequence variations from the reference are arranged in a ______

Variant Call Format (VCF) file
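
A VCF data line is tab-separated, with eight fixed columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, and INFO. The sketch below parses one such line with plain string handling. The example record is invented for illustration, and real pipelines would normally use a dedicated VCF library rather than this hypothetical `parse_vcf_line` helper.

```python
def parse_vcf_line(line):
    """Parse the eight fixed columns of one VCF data line."""
    chrom, pos, variant_id, ref, alt, qual, filt, info = line.rstrip("\n").split("\t")[:8]
    info_fields = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info_fields[key] = value
        else:
            info_fields[item] = True  # flag without a value
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": variant_id,
        "ref": ref,
        "alt": alt.split(","),
        "qual": None if qual == "." else float(qual),
        "filter": filt,
        "info": info_fields,
    }


def classify(ref, alt):
    """Rough variant class from the REF and ALT allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "other"  # for example a multi-nucleotide substitution


# Made-up record, not a real variant.
record = parse_vcf_line("chr1\t12345\t.\tG\tA\t50\tPASS\tDP=600;AF=0.25")
print(record["chrom"], record["pos"], classify(record["ref"], record["alt"][0]))
```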

Performed to identify critical variants

ANNOTATIONS

ANNOTATIONS: Confidence in the variant call is determined by _______ and _________

Sequence quality and coverage (at least 500x coverage is recommended).
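
Coverage at a position is the number of reads that overlap it. The sketch below counts depth per position from read intervals and flags positions under a chosen threshold. The interval convention (0-based start, end-exclusive) and the function names are assumptions for the example, and the tiny threshold is used only to keep the demonstration readable.

```python
def depth_per_position(reads, region_length):
    """reads: (start, end) intervals, 0-based and end-exclusive.
    Returns the number of reads covering each position of the region."""
    depth = [0] * region_length
    for start, end in reads:
        for position in range(max(start, 0), min(end, region_length)):
            depth[position] += 1
    return depth


def low_coverage_positions(depth, minimum):
    """Positions whose depth falls below the required minimum."""
    return [position for position, d in enumerate(depth) if d < minimum]


depth = depth_per_position([(0, 4), (2, 6), (3, 5)], region_length=6)
print(depth)
print(low_coverage_positions(depth, minimum=2))
```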

Variants that remain after filtering may be annotated by searching in disease-specific databases:

1. The Cancer Genome Atlas (TCGA)
2. Catalogue of Somatic Mutations in Cancer (COSMIC)
3. My Cancer Genome
4. Leiden Open (source) Variation Database (LOVD)
5. Human Gene Mutation Database (HGMD)

Involves using computer technology (in silico) to collect, store, analyze, and disseminate biological data and information (computational biology).

BIOINFORMATICS

BIOINFORMATICS TERMINOLOGY: The extent to which two sequences are the same.

Identity
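
Identity is usually reported as a percentage over an alignment. The sketch below assumes the two sequences are already aligned to equal length with "-" marking gaps, and it divides by the number of aligned columns; other denominators are also in use, so treat this convention and the function name as assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both
    sequences. Columns that are gaps in both sequences are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


print(round(percent_identity("GATTACA", "GACTATA"), 1))
```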

BIOINFORMATICS TERMINOLOGY: - Lining up two or more sequences to search for the maximal regions of identity (the EXTENT TO WHICH TWO OR MORE SEQUENCES ARE THE SAME) in order to assess the extent of biological relatedness or homology.

Alignment

BIOINFORMATICS TERMINOLOGY: - Alignment of some portion of two sequences.

Local alignment
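
One classic way to compute a local alignment is the Smith-Waterman dynamic programming algorithm. The sketch below is a compact version with a linear gap penalty; the scoring values (match 2, mismatch -1, gap -2) are arbitrary choices for the example, not recommended parameters.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_cell = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score[i][j] = max(0, diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_cell = score[i][j], (i, j)
    # Trace back from the best cell until the score returns to 0.
    i, j = best_cell
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diagonal:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTAGG", "CCACGTACC"))
```

Only the best-matching portion of each sequence is returned, which is what distinguishes a local alignment from an end-to-end (global) one.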

BIOINFORMATICS TERMINOLOGY: - Alignment of THREE or MORE sequences arranged with gaps so that common residues are aligned together.

MULTIPLE SEQUENCE ALIGNMENT

BIOINFORMATICS TERMINOLOGY: - The alignment of two sequences with the BEST DEGREE OF IDENTITY

OPTIMAL ALIGNMENT

BIOINFORMATICS TERMINOLOGY: - Specific sequence changes (usually protein sequence) that maintain the properties of the original sequence.

CONSERVATION

- Established by the National Institutes of Health (NIH) under JAMES WATSON
• Primary mission: to decipher the sequence of the complete human genetic material (the entire genome, about 2.9 billion bp)

HUMAN GENOME PROJECT (HGP)

1st complete genome sequence (1984)

Epstein-Barr virus

WHO completed the: o 1st sequence of a free-living organism (Haemophilus influenzae) o Sequence of the smallest free-living organism (Mycoplasma genitalium)

Craig Venter and colleagues (The Institute for Genomic Research, TIGR)

SEQUENCING APPROACH OF THE 2 PROJECTS: - hierarchical shotgun approach – to sequence from KNOWN REGIONS

NIH METHOD

SEQUENCING APPROACH OF THE 2 PROJECTS: - whole-genome shotgun sequencing – to sequence RANDOM FRAGMENTS

Celera (established by Venter)

1st chromosome to be sequenced completely.

Chromosome 22

most GC-rich (66%)

Chromosome 2

fewest GC bp (25%)

Chromosome X
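
GC content is the percentage of bases in a sequence that are G or C. A minimal sketch, assuming ambiguous bases such as N are left out of the calculation (the function name and the example sequences are hypothetical):

```python
def gc_percent(sequence):
    """Percent of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


print(gc_percent("GGCCAATT"))
print(gc_percent("atgcNN"))
```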

most gene-rich per unit length (23 genes/ Mbp)

Chromosome 19

OTHER GENOME PROJECTS:
• Goal: to find BLOCKS of sequences that are inherited together.
• Revealed >1,000 disease-associated regions of the genome (coronary artery disease and diabetes).

Human Haplotype Mapping (HapMap) Project

OTHER GENOME PROJECTS:
• Provides a RESOURCE of STRUCTURAL VARIANTS in different populations.
• Over 88 million variants were verified: 84.7 million SNPs, 3.6 million short insertions/deletions, and 60,000 structural variants.

1000 GENOMES PROJECT

PRE FI DNA SEQUENCING Flashcards

(67 cards)