PRE FI DNA SEQUENCING Flashcards

(67 cards)

1
Q

Refers to the ORDER OF THE NUCLEOTIDES in the DNA molecule.

A

DNA SEQUENCE

2
Q

Applications of DNA sequencing in the medical laboratory:

A

o Detection of mutations
o Typing microorganisms
o Identifying human haplotypes
o Designating polymorphisms
o Treatment strategies

3
Q

SEQUENCING METHODS:
• DIRECT DETERMINATION OF THE ORDER, or sequence, of nucleotides in a DNA polymer.
• Most specific and direct method for identifying genetic lesions (mutations)/polymorphisms.
• Types:
1. Manual sequencing (chemical (Maxam-Gilbert) and Sanger sequencing)
2. Automated fluorescent sequencing (dye primer and dye terminator sequencing)

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

A

C. Direct sequencing: manual and automated

4
Q

MANUAL SEQUENCING:
• Allan M. Maxam & Walter Gilbert
• Requires a ds/ss version of the DNA region to be sequenced, with 1 end radioactively labeled (32P)
• Sequencing proceeds in 4 SEPARATE REACTIONS
• Template: LABELED FRAGMENT

A. Chemical (Maxam-Gilbert) Sequencing
B. Dideoxy Chain Termination (Sanger) Sequencing

A

Chemical (Maxam-Gilbert) Sequencing

5
Q

Addition of a _________ causes the ssDNA to break at specific nucleotides

A

strong reducing agent (10% piperidine)

6
Q

Chemical (Maxam-Gilbert) Sequencing:
o Sequence = bands
o Lane in which the band appeared = ID of the nucleotide
o Sequence is read from the ____ to the ______ of the gel

A

BOTTOM (5’ end) to the TOP (3’ end) of the gel
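Reading the gel from bottom to top can be sketched in a few lines of Python. The sketch assumes the four standard Maxam-Gilbert reactions (G, A+G, C+T, C), which the card does not name, and the band positions are made-up toy data; `read_maxam_gilbert` is a hypothetical helper, not part of any sequencing package.

```python
def read_maxam_gilbert(lanes):
    """Read a sequence 5'->3' from band positions (1 = bottom of the gel).

    lanes maps the assumed reaction names 'G', 'A+G', 'C+T' and 'C'
    to sets of band positions.
    """
    positions = sorted(set().union(*lanes.values()))
    sequence = []
    for pos in positions:
        if pos in lanes["G"]:        # band in G (and A+G) lanes -> G
            sequence.append("G")
        elif pos in lanes["A+G"]:    # band in A+G lane only -> A
            sequence.append("A")
        elif pos in lanes["C"]:      # band in C (and C+T) lanes -> C
            sequence.append("C")
        elif pos in lanes["C+T"]:    # band in C+T lane only -> T
            sequence.append("T")
    return "".join(sequence)


# Toy gel: band positions are invented for illustration.
toy_gel = {"G": {2, 5}, "A+G": {1, 2, 5}, "C+T": {3, 4, 6}, "C": {4}}
print(read_maxam_gilbert(toy_gel))  # AGTCGT
```

The shortest fragments run farthest, so position 1 (bottom) is the nucleotide nearest the labeled 5’ end.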

7
Q

Chemical (Maxam-Gilbert) Sequencing:

Run times of short fragments (up to 50 bp)?

A

1-2 hours

8
Q

Chemical (Maxam-Gilbert) Sequencing:

Run times of long fragments (>150 bp)?

A

7-8 hours

9
Q

MANUAL SEQUENCING:
• Frederick Sanger
• Uses DIDEOXYNUCLEOTIDES (ddNTPs) to determine the order/sequence of nucleotides in a nucleic acid
• PRIMER complementary to the DNA to be sequenced
• Product detection of sequencing:
o Primer is attached at the 5’ end to a 32P- or fluorescent dye-labeled nucleotide
o Incorporate 32P/35S-labeled dNTPs in the nucleotide sequencing reaction mix (INTERNAL LABELING)
• ddNTPs are added, terminating the DNA synthesis (chain termination)
o ddNTPs lack the 3’-OH, so the 5’-3’ phosphodiester bond cannot be established to incorporate a subsequent nucleotide.
• Components: Mixed in 4 reaction tubes
1. DNA template (PCR product)
2. Radioactively labeled primer
3. Enzyme (DNA polymerase)
4. dNTPs (all 4)
5. Buffer (20 mM EDTA, formamide, gel tracking/loading dyes)
6. A different ddNTP in each of the 4 tubes

A. Chemical (Maxam-Gilbert) Sequencing
B. Dideoxy Chain Termination (Sanger) Sequencing

A

Dideoxy Chain Termination (Sanger) Sequencing
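A minimal Python sketch of the chain-termination idea, under idealized assumptions: every possible termination product is present, and the template is written 3’->5’ so the new strand grows 5’->3’. The function names and the toy template are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template_3to5):
    """Return {ddNTP base: [fragment lengths]} for four idealized reactions."""
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        # A ddNTP of this base can end the chain here, giving a fragment
        # of this length in the tube that contains that ddNTP.
        lanes[base].append(length)
    return lanes


def read_ladder(lanes):
    """Read the ladder from shortest to longest fragment (5'->3')."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = sanger_fragments("TACGGTCA")
print(lanes)               # {'A': [1, 6], 'C': [4, 5], 'G': [3, 7], 'T': [2, 8]}
print(read_ladder(lanes))  # ATGCCAGT
```

Each tube yields fragments ending at one base only, which is why the lane of a band identifies the nucleotide.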

10
Q

SEQUENCING REACTION of Dideoxy Chain Termination (Sanger) Sequencing?

A

Thermal cycler (cycle sequencing)

11
Q

• Automated reading of the DNA sequence ladder requires fluorescent dyes (4 distinct colors) to label primers/sequencing events
1. Fluorescein
2. Rhodamine
3. BODIPY (4,4-difluoro-4-bora-3a,4a-diaza-s-indacene)

• Fluorescent dyes can be distinguished by AUTOMATED SEQUENCERS
• Approaches (to label fragments according to their terminal ddNTP): DYE PRIMER and DYE TERMINATOR SEQUENCING

A

Automated Fluorescent Sequencing

12
Q

• 4 different fluorescent dyes are attached to 4 separate aliquots of the sample.
• Dye molecules are attached to the 5’ end of the primer = 4 versions of the same primer with different dye labels.
• Products are LABELED AT THE 5’ end using the dye color associated with the ddNTP at the end of the fragment.

DYE PRIMER OR DYE TERMINATOR SEQUENCING?

A

Dye Primer Sequencing

13
Q

• 1 of the 4 fluorescent dyes is attached to each of the ddNTPs.
• All 4 sequencing reactions are performed in the same tube.
• Product fragments are LABELED AT THE 3’ end.

DYE PRIMER OR DYE TERMINATOR SEQUENCING?

A

Dye Terminator Sequencing

14
Q

• The 4 sets of sequencing products from each reaction are loaded onto a single gel lane/capillary.
• Fluorescent dye colors distinguish which nucleotide is at the end of each fragment.
• Fluorescent detection equipment yields results as an electropherogram.
• Base calling: the process by which sequencing software identifies the bases in a sequence.

A

Automated Electrophoresis
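Base calling from an electropherogram can be pictured as choosing the strongest dye signal at each peak. The Python sketch below is a toy illustration only: the dye-to-base color assignment, the `min_ratio` cutoff, and the intensities are all assumptions, and real base callers use far more elaborate signal processing.

```python
# Assumed dye-to-base assignment; real chemistries define their own.
DYE_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}


def call_bases(peaks, min_ratio=2.0):
    """Call one base per peak from per-dye signal intensities.

    The strongest dye gives the base. If it is not at least min_ratio
    times the runner-up, the position is reported as N (ambiguous).
    """
    calls = []
    for peak in peaks:
        ranked = sorted(peak.items(), key=lambda item: item[1], reverse=True)
        (top_dye, top_signal), (_, second_signal) = ranked[0], ranked[1]
        if second_signal > 0 and top_signal / second_signal < min_ratio:
            calls.append("N")
        else:
            calls.append(DYE_TO_BASE[top_dye])
    return "".join(calls)


# Invented intensities for three peaks.
trace = [
    {"green": 900, "blue": 40, "yellow": 30, "red": 25},
    {"green": 35, "blue": 20, "yellow": 850, "red": 30},
    {"green": 400, "blue": 30, "yellow": 20, "red": 380},
]
print(call_bases(trace))  # AGN
```

The third peak has two dyes of similar height, so the sketch reports an ambiguity rather than guessing.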

15
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Compares an input sequence with all sequences in a selected
database

A. FASTA / FASTQ
FASTA: FAST-All, derived from the FAST-P (protein) and FAST-N (nucleotide) search algorithms
FASTQ: biological sequence data with quality scores
B. BLAST Basic Local Alignment Search Tool
C. Phred
D. GRAIL Gene Recognition and Assembly Internet Link

A

BLAST Basic Local Alignment Search Tool

16
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Finds gene-coding regions in DNA sequences

A. FASTA / FASTQ
FASTA: FAST-All, derived from the FAST-P (protein) and FAST-N (nucleotide) search algorithms
FASTQ: biological sequence data with quality scores
B. BLAST Basic Local Alignment Search Tool
C. Phred
D. GRAIL Gene Recognition and Assembly Internet Link

A

GRAIL Gene Recognition and Assembly Internet Link

17
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Rapidly aligns pairs of sequences by sequence patterns rather
than individual nucleotides

A. FASTA / FASTQ
FASTA: FAST-All, derived from the FAST-P (protein) and FAST-N (nucleotide) search algorithms
FASTQ: biological sequence data with quality scores
B. BLAST Basic Local Alignment Search Tool
C. Phred
D. GRAIL Gene Recognition and Assembly Internet Link

A

A. FASTA / FASTQ
FASTA: FAST-All, derived from the FAST-P (protein) and FAST-N (nucleotide) search algorithms
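Matching by sequence patterns rather than single nucleotides can be illustrated with short words (k-tuples). The Python sketch below only counts shared words per diagonal; it is a simplified illustration of the idea, not the actual FASTA program, and the function name, word size, and sequences are assumptions.

```python
from collections import Counter, defaultdict


def shared_word_diagonals(query, subject, k=3):
    """Count shared k-letter words per diagonal (subject offset - query offset)."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)

    diagonals = Counter()
    for j in range(len(subject) - k + 1):
        # Every query position holding the same word is a word hit.
        for i in index.get(subject[j:j + k], []):
            diagonals[j - i] += 1
    return diagonals


hits = shared_word_diagonals("ACGTACGG", "TTACGTACGG")
best_offset, word_hits = hits.most_common(1)[0]
print(best_offset, word_hits)  # 2 6
```

The diagonal with the most word hits marks where the two sequences line up best; here the query matches the subject shifted by 2 positions.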

18
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Reads bases from original trace data and recalls the bases,
assigning quality values to each base

A. FASTA / FASTQ
FASTA: FAST-All, derived from the FAST-P (protein) and FAST-N (nucleotide) search algorithms
FASTQ: biological sequence data with quality scores
B. BLAST Basic Local Alignment Search Tool
C. Phred
D. GRAIL Gene Recognition and Assembly Internet Link

A

Phred

19
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies single- nucleotide polymorphisms (SNPs) among the traces and assigns a rank indicating how well the trace at a site matches the expected pattern for an SNP

A. PolyPhred
B. TIGR Assembler The Institute for Genomic Research
C. Phrap Phragment Assembly Program
D. Factura

A

A. PolyPhred

20
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Uses USER - SUPPLIED and internally computed data quality information to improve accuracy of assembly in the presence of repeats

A. PolyPhred
B. TIGR Assembler The Institute for Genomic Research
C. Phrap Phragment Assembly Program
D. Factura

A

Phrap Phragment Assembly Program

21
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Developed by TIGR as an assembly tool to BUILD A CONSENSUS SEQUENCE from smaller-sequence fragments

A. PolyPhred
B. TIGR Assembler The Institute for Genomic Research
C. Phrap Phragment Assembly Program
D. Factura

A

TIGR Assembler (The Institute for Genomic Research)

22
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies sequence features such as flanking vector sequences, restriction sites, and ambiguities

A. PolyPhred
B. TIGR Assembler The Institute for Genomic Research
C. Phrap Phragment Assembly Program
D. Factura

A

Factura

23
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Provides MUTATION and SNP DETECTION and analysis, pathogen
subtyping, allele identification, and sequence confirmation

A. Matchmaker
B. SeqScape
C. Assign

A

SeqScape

24
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies alleles for haplotyping

A. Matchmaker
B. SeqScape
C. Assign

A

Matchmaker & Assign

25
SEQUENCING METHODS:
• Determines a DNA sequence WITHOUT HAVING TO MAKE A SEQUENCING LADDER
• Relies on the generation of light (luminescence) when nucleotides are added to a growing DNA strand.
• No gels, fluorescent dyes, or ddNTPs
• Reaction mix components:
1. ssDNA template
2. Sequencing primer
3. Sulfurylase
4. Luciferase
5. Substrates: adenosine-5’-phosphosulfate (APS) and luciferin
6. 1 of the 4 dNTPs is added to the reaction in a predetermined order

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing
Pyrosequencing
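The dispense-and-detect cycle can be sketched in Python. This is an idealized model that assumes the light signal is exactly proportional to the number of nucleotides incorporated in one dispensation; the function names, template, and dispensation order are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pyrogram(template_3to5, dispensation_order):
    """Return (dNTP, light units) for each dispensation, idealized."""
    to_synthesize = [COMPLEMENT[base] for base in template_3to5]
    position = 0
    signals = []
    for dntp in dispensation_order:
        incorporated = 0
        # A dispensed dNTP is incorporated as long as it pairs with the
        # next template base, so a homopolymer run gives a taller peak.
        while position < len(to_synthesize) and to_synthesize[position] == dntp:
            incorporated += 1
            position += 1
        signals.append((dntp, incorporated))
    return signals


def call_sequence(signals):
    """Rebuild the synthesized strand from the peak heights."""
    return "".join(base * units for base, units in signals)


signals = pyrogram("CCATG", "ACGTACGT")
print(signals)
print(call_sequence(signals))  # GGTAC
```

No ladder is needed: the order of dispensation plus the peak heights is enough to read the sequence.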
26
SEQUENCING METHODS:
• AKA METHYLATION-SPECIFIC SEQUENCING
• Chain termination sequencing designed to DETECT METHYLATED CYTOSINE NUCLEOTIDES
• 2-4 µg of genomic DNA is cut with restriction enzymes to facilitate denaturation.
• DNA is denatured (97°C for 5 min) and exposed to bisulfite solution (sodium bisulfite, NaOH, hydroquinone) for 16-20 hours.
o Cytosines are deaminated --> uracil
o 5-methylcytosines are unchanged
o Can be detected by Sanger sequencing/pyrosequencing

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing
Bisulfite DNA sequencing
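The conversion logic can be sketched in Python, assuming complete conversion and that uracil is read as T after amplification and sequencing. The function names, sequence, and methylated position are made up for illustration.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Convert unmethylated C to T (via U); 5-methylcytosine stays C."""
    converted = []
    for i, base in enumerate(sequence):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, treated_read):
    """Return 0-based positions where a reference C is still C after treatment."""
    return [
        i
        for i, (ref, obs) in enumerate(zip(reference, treated_read))
        if ref == "C" and obs == "C"
    ]


reference = "ACGTCCGA"
treated = bisulfite_convert(reference, methylated_positions={1})
print(treated)                               # ACGTTTGA
print(call_methylation(reference, treated))  # [1]
```

Comparing the treated read with the untreated reference is what reveals which cytosines were protected by methylation.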
27
SEQUENCING METHODS:
• Early approaches: used RNase to cut end-labeled RNA at specific nucleotides
• Other approaches:
o Based on amino acid sequence
o Based on sequencing of its complementary DNA

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing
A. RNA sequencing
28
• Based on single-molecule sequencing technology and virtual terminator nucleotides
• mRNA is captured by immobilized poly(dT) oligomers (through their poly(A) tails).
o RNA without poly(A) tails: initial treatment with poly(A) polymerase
o 4 reversible dye-labeled nucleotides are sequentially added.
Direct RNA sequencing
29
SEQUENCING METHODS:
• AKA MASSIVELY PARALLEL SEQUENCING
• Designed to sequence LARGE NUMBERS OF TEMPLATES carrying millions of bases.
• POWERFUL COMPUTER DATA ASSEMBLY SYSTEMS (bioinformatics, computer software and support) are required.
• Requires the preparation of a sequencing library (sets of DNA fragments representing the regions to be sequenced).

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing
Next-generation sequencing
30
Collection of genes that have been grouped for testing, enabling simultaneous sequencing of all genes (2 to >1000 genes).
GENE PANELS
31
TYPES OF GENE PANELS:
– target regions of SPECIFIC GENES known to affect treatment response, disease state, or clinical condition.

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels
C. “Hot-spot” panels
32
TYPES OF GENE PANELS:
– critical genes in particular diseases (hematological-cancer specific, solid-tumor specific).

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels
B. Targeted panels
33
TYPES OF GENE PANELS:
– diagnostic, prognostic, discovery purposes.

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels
A. Very large panels (≥3000 genes)
34
collection of DNA library fragments (100-1000 bp) to be sequenced.
Sequencing library
35
SYNTHETIC SHORT dsDNA carrying sequences complementary to a single primer pair, which may contain short sequences that will ID the sample (indexing/bar coding).
Adapters
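Indexing (bar coding) lets pooled samples be separated again after sequencing. The Python sketch below assumes, for simplicity, that the index sits at the start of each read and must match exactly; instruments often read the index separately and allow mismatches. Barcodes, sample names, and reads are invented.

```python
def demultiplex(reads, barcodes):
    """Group reads by an exact-match barcode at the start of each read."""
    bins = {sample: [] for sample in barcodes.values()}
    bins["undetermined"] = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                # Store the read with its index sequence trimmed off.
                bins[sample].append(read[len(barcode):])
                break
        else:
            bins["undetermined"].append(read)
    return bins


barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = ["ACGTGGGAAA", "TGCATTTCCC", "GGGGAAAACC"]
bins = demultiplex(reads, barcodes)
print(bins)
```

Reads whose start matches no known index are kept in an "undetermined" bin instead of being assigned to a sample.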
36
The regions to be sequenced are enriched by:
1. Probe hybridization
o Probes: biotinylated oligonucleotides complementary to specific gene regions.
2. Amplification with region-specific primers (amplicon-based targeted libraries)
o Selected by multiplex PCR with gene-specific primers tailed with binding sites for a secondary primer set.
Targeted Libraries
37
loss of library fragments from the sequenced regions.
Allele dropout
38
Sequencing Platforms:
- Indexed libraries (gene panels) are AMPLIFIED USING PRIMERS immobilized on microparticles (BEADS) in aqueous oil emulsion using ADAPTERS on the library fragments complementary to the immobilized primers.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing
B. Ion-conductance
39
Sequencing Platforms:
o Captured/amplified fragments are HYBRIDIZED to primers IMMOBILIZED on a SOLID SURFACE (FLOW CELL).
o Labeled nucleotides are applied to the flow cell and incorporated into growing chains by DNA polymerase at each polony location.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing
D. Reversible dye terminator sequencing
40
Sequencing Platforms:
o Uses a POOL OF LABELED OLIGONUCLEOTIDES and DNA LIGASE to identify the template sequence through the known probe sequences.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing
A. Sequencing by ligation
41
Sequencing Platforms:
o DOES NOT REQUIRE FRAGMENTATION and amplification of the template DNA.
o Each nucleotide can be identified by a disruption in current as it passes through the pore.
o Also USED FOR DIRECT RNA SEQUENCING

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing
C. Nanopore sequencing
42
DATA ANALYSIS: optical signals are translated to a nucleotide sequence
BASE CALLING
43
Data Analysis: Optical signals are translated to a nucleotide sequence (BASE CALLING), the quality of which is measured by the ____; acceptable = 20-30 (a 1-in-100 to 1-in-1,000 chance of an incorrect call).
Phred score
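The Phred quality score is defined as Q = -10 x log10(P), where P is the probability that the base call is wrong. The short Python sketch below converts in both directions; the function names are illustrative only.

```python
import math


def phred_quality(error_probability):
    """Q = -10 * log10(P), where P is the probability of a wrong call."""
    return -10 * math.log10(error_probability)


def error_probability(quality):
    """Invert the Phred formula: P = 10 ** (-Q / 10)."""
    return 10 ** (-quality / 10)


for p in (0.01, 0.001):
    print(p, round(phred_quality(p)))
```

An error rate of 1 in 100 therefore corresponds to Q = 20, and 1 in 1,000 to Q = 30.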
44
Data Analysis: Each sequence is compared to a REFERENCE SEQUENCE (“normal”) through ___________
read alignment
45
based on comparison with the reference sequence (SNVs, indels, rearrangement sequences, CNVs).
VARIANT ID
46
Sequence variations from the reference are arranged in a ______
variant call format (VCF) file
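A VCF data line is tab-separated, with eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The Python sketch below parses one line and makes a rough variant-type call from allele lengths; the example record is invented, and the helper names are hypothetical rather than part of any VCF library.

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Split one tab-separated VCF data line into its 8 fixed columns."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])
    record["ALT"] = record["ALT"].split(",")  # ALT may list several alleles
    return record


def variant_type(record):
    """Rough classification by allele length, using the first ALT allele."""
    ref, alt = record["REF"], record["ALT"][0]
    if len(ref) == len(alt):
        return "SNV" if len(ref) == 1 else "MNV"
    return "insertion" if len(alt) > len(ref) else "deletion"


# Invented record for illustration only.
example = "chr1\t12345\t.\tA\tT\t50\tPASS\tDP=812"
record = parse_vcf_line(example)
print(record["CHROM"], record["POS"], variant_type(record))  # chr1 12345 SNV
```

Real files also carry header lines starting with `#` and optional per-sample columns, which this sketch ignores.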
47
performed to identify critical variants
ANNOTATIONS
48
ANNOTATIONS: Confidence in the variant call is determined by _______ and _________
sequence quality and coverage (at least 500x recommended).
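Coverage (depth) at a position is simply the number of aligned reads that span it. A minimal Python sketch, assuming reads are given as 1-based inclusive (start, end) intervals; the intervals and function names are toy assumptions.

```python
def depth_at(position, alignments):
    """Count the reads whose aligned interval covers the position."""
    return sum(1 for start, end in alignments if start <= position <= end)


def meets_minimum_depth(position, alignments, minimum=500):
    """Check a position against a minimum depth (500x by default)."""
    return depth_at(position, alignments) >= minimum


# Three invented read alignments on one reference.
alignments = [(1, 100), (50, 150), (90, 200)]
print(depth_at(95, alignments))             # 3
print(depth_at(120, alignments))            # 2
print(meets_minimum_depth(95, alignments))  # False
```

A variant seen at a position covered by only a few reads is less trustworthy than the same variant at 500x.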
49
Variants that remain after filtering may be annotated by searching in disease-specific databases:
1. The Cancer Genome Atlas (TCGA)
2. Catalogue of Somatic Mutations in Cancer (COSMIC)
3. My Cancer Genome
4. Leiden Open (source) Variation Database (LOVD)
5. Human Gene Mutation Database (HGMD)
50
Involves using computer technology (in silico) to collect, store, analyze, and disseminate biological data and information (computational biology).
BIOINFORMATICS
51
BIOINFORMATICS TERMINOLOGY: The extent to which two sequences are the same.
Identity
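Identity is usually reported as a percentage of matching positions. The Python sketch below assumes the two sequences are already aligned to equal length, with "-" marking gaps, and divides by the full alignment length; other denominators are also in use, so treat this as one convention among several.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(
        a == b and a != "-" for a, b in zip(aligned_a, aligned_b)
    )
    return 100.0 * matches / len(aligned_a)


print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5
```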
52
BIOINFORMATICS TERMINOLOGY: - Lining up two or more sequences to search for the maximal regions of identity (the extent to which the sequences are the same) in order to assess the extent of biological relatedness or homology.
Alignment
53
BIOINFORMATICS TERMINOLOGY: - Alignment of some portion of two sequences.
Local alignment
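A standard way to score a local alignment is the Smith-Waterman dynamic-programming recurrence, which the card does not name. The Python sketch below returns only the best local score; the scoring values (match +2, mismatch -1, gap -2) and sequences are arbitrary choices for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (Smith-Waterman)."""
    rows, cols = len(a) + 1, len(b) + 1
    scores = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = scores[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            # Flooring at 0 lets an alignment restart anywhere, which is
            # what makes the alignment local rather than global.
            scores[i][j] = max(
                0, diagonal, scores[i - 1][j] + gap, scores[i][j - 1] + gap
            )
            best = max(best, scores[i][j])
    return best


print(local_alignment_score("TTTACGTAAA", "GGACGTCC"))  # 8
```

Only the shared portion "ACGT" contributes (4 matches x 2), and the unrelated flanks are ignored.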
54
BIOINFORMATICS TERMINOLOGY: - Alignment of THREE or MORE sequences arranged with gaps so that common residues are aligned together.
MULTIPLE SEQUENCE ALIGNMENT
55
BIOINFORMATICS TERMINOLOGY: - The alignment of two sequences with the BEST DEGREE OF IDENTITY
OPTIMAL ALIGNMENT
56
BIOINFORMATICS TERMINOLOGY: - Specific sequence changes (usually protein sequence) that maintain the properties of the original sequence.
CONSERVATION
57
• Established at the National Institutes of Health (NIH) by JAMES WATSON
• Primary mission: to decipher the sequence of the complete human genetic material (entire genome; about 2.9 billion bp)
HUMAN GENOME PROJECT (HGP)
58
1st complete genome sequence (1984)
Epstein-Barr virus
59
WHO completed the: o 1st sequence of a free-living organism (Haemophilus influenzae) o Sequence of the smallest free-living organism (Mycoplasma genitalium)
Craig Venter and colleagues (The Institute for Genomic Research, TIGR)
60
SEQUENCING APPROACH OF THE 2 PROJECTS: - hierarchical shotgun approach – to sequence from KNOWN REGIONS
NIH METHOD
61
SEQUENCING APPROACH OF THE 2 PROJECTS: - whole-genome shotgun sequencing – to sequence RANDOM FRAGMENTS
Celera (established by Venter)
62
1st chromosome to be sequenced completely.
Chromosome 22
63
most GC-rich (66%)
Chromosome 2
64
fewest GC bp (25%)
Chromosome X
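GC content is the percentage of bases in a sequence that are G or C. A minimal Python sketch follows; the function name is hypothetical, and ambiguous bases such as N are simply left out of the count, which is one of several possible conventions.

```python
def gc_percent(sequence):
    """Percentage of G and C among the A/C/G/T bases of a sequence."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("no A/C/G/T bases to count")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


print(gc_percent("ATGC"))  # 50.0
print(gc_percent("GCGC"))  # 100.0
```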
65
most gene-rich per unit length (23 genes/ Mbp)
Chromosome 19
66
OTHER GENOME PROJECTS:
• Goal: to find BLOCKS of sequences that are inherited together.
• Revealed >1,000 disease-associated regions of the genome (e.g., coronary artery disease and diabetes).
Human Haplotype Mapping (HapMap) Project
67
OTHER GENOME PROJECTS:
• Provides a RESOURCE of STRUCTURAL VARIANTS in different populations.
• Over 88 million variants were verified: 84.7 million SNPs, 3.6 million short insertions/deletions, and 60,000 structural variants.
1000 GENOMES PROJECT