PRE FI DNA SEQUENCING Flashcards

1
Q

Refers to the ORDER OF THE NUCLEOTIDES in the DNA molecule.

A

DNA SEQUENCE

2
Q

Applications of DNA sequencing in the medical laboratory:

A

o Detection of mutation
o Typing microorganisms
o Identifying human haplotypes
o Designating polymorphisms
o Treatment strategies

3
Q

SEQUENCING METHODS:
• DIRECT DETERMINATION OF THE ORDER, or sequence, of nucleotides in a DNA polymer.
• Most specific and direct method for identifying genetic lesions (mutations)/polymorphisms.
• Types:
1. Manual sequencing (chemical [Maxam-Gilbert] and Sanger sequencing)
2. Automated fluorescent sequencing (dye primer and dye terminator sequencing)

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

A

C. Direct sequencing: manual and automated

4
Q

MANUAL SEQUENCING:
• Allan M. Maxam & Walter Gilbert
• Requires a ds/ss version of the DNA region to be sequenced, with 1 end radioactively labeled (32P)
• Sequencing proceeds in 4 SEPARATE REACTIONS
• Template: LABELED FRAGMENT

A. Chemical (Maxam-Gilbert) Sequencing
B. Dideoxy Chain Termination (Sanger) Sequencing

A

Chemical (Maxam-Gilbert) Sequencing

5
Q

Addition of a _________ =
ssDNA would break at specific nucleotides

A

strong base (10% piperidine)

6
Q

Chemical (Maxam-Gilbert) Sequencing:
o Sequence = bands
o Lane in which the band appeared = ID of
the nucleotide
o Sequence is read from the ____ to the ______ of the gel

A

BOTTOM (5’ end) to the TOP (3’ end) of the gel
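Worked sketch: reading a Maxam-Gilbert gel from bottom to top. It assumes the conventional four lanes (G, G+A, C+T, C), which this card does not list; the function name and the band encoding are illustrative only.

```python
def read_maxam_gilbert(bands):
    """Read a chemical-sequencing gel.

    bands: list of sets of lane names, ordered from the BOTTOM of the gel
    (shortest fragment, 5' end) to the TOP (longest fragment, 3' end).
    Lane names G, G+A, C+T, C are an assumption, not taken from the card.
    """
    sequence = []
    for lanes in bands:
        if lanes == {"G", "G+A"}:
            sequence.append("G")   # band in both purine lanes
        elif lanes == {"G+A"}:
            sequence.append("A")   # band in G+A only
        elif lanes == {"C", "C+T"}:
            sequence.append("C")   # band in both pyrimidine lanes
        elif lanes == {"C+T"}:
            sequence.append("T")   # band in C+T only
        else:
            sequence.append("N")   # pattern that cannot be interpreted
    return "".join(sequence)


gel = [{"G", "G+A"}, {"G+A"}, {"C+T"}, {"C", "C+T"}]
print(read_maxam_gilbert(gel))
```

Each band position gives one nucleotide, so the order of the list is the 5’ to 3’ order of the sequence.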

7
Q

Chemical (Maxam-Gilbert) Sequencing:

Run times of short fragments (up to 50 bp)?

A

1-2 hours

8
Q

Chemical (Maxam-Gilbert) Sequencing:

Run times of long fragments (>150 bp)?

A

7-8 hours

9
Q

MANUAL SEQUENCING:
• Frederick Sanger
• Uses DIDEOXYNUCLEOTIDES (ddNTPs) to determine the order/sequence of nucleotides in a nucleic acid
• PRIMER complementary to the DNA to be sequenced
• Product detection of sequencing:
o Primer is attached at its 5’ end to a 32P- or fluorescent dye-labeled nucleotide
o Incorporate 32P- or 35S-labeled dNTPs in the sequencing reaction mix (INTERNAL LABELING)
• ddNTPs are added, terminating the DNA synthesis (chain termination)
o ddNTPs lack the 3’-OH, so the 5’-3’ phosphodiester bond needed to incorporate a subsequent nucleotide cannot be established.
• Components: mixed in 4 reaction tubes
1. DNA template (PCR product)
2. Radioactively labeled primer
3. Enzyme (DNA polymerase)
4. dNTPs (all 4)
5. Buffer (reactions are stopped with 20 mM EDTA, formamide, and gel tracking/loading dyes)
6. A different ddNTP in each of the 4 tubes

A. Chemical (Maxam-Gilbert) Sequencing
B. Dideoxy Chain Termination (Sanger) Sequencing

A

Dideoxy Chain Termination (Sanger) Sequencing
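Worked sketch: why four ddNTP tubes give a readable ladder. This is a simplified model, assuming every possible termination product is formed and ignoring the primer length; the function names are hypothetical.

```python
def chain_termination_lengths(new_strand):
    """For each ddNTP tube, list the fragment lengths that terminate in it.

    new_strand is the strand being synthesized, written 5'->3'.
    A fragment of length n ends in the tube whose ddNTP matches base n.
    """
    tubes = {"ddA": [], "ddC": [], "ddG": [], "ddT": []}
    for length, base in enumerate(new_strand, start=1):
        tubes["dd" + base].append(length)
    return tubes


def read_ladder(tubes):
    """Read the ladder from shortest to longest fragment (bottom to top)."""
    ladder = sorted(
        (length, tube[2])              # tube[2] is the base letter in "ddA"
        for tube, lengths in tubes.items()
        for length in lengths
    )
    return "".join(base for _, base in ladder)


tubes = chain_termination_lengths("GATTACA")
print(tubes)
print(read_ladder(tubes))
```

Sorting by fragment length mirrors reading the gel from the fastest-migrating band upward.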

10
Q

SEQUENCING REACTION of Dideoxy Chain Termination (Sanger) Sequencing?

A

Thermal cycler (cycle sequencing)

11
Q

• Automated reading of the DNA sequence ladder requires fluorescent dyes (4 distinct colors) to label primers/sequencing events
1. Fluorescein
2. Rhodamine
3. Bodipy (4,4-difluoro-4-bora-3a,4a-diaza-s-indacene)

• Fluorescent dyes can be distinguished by AUTOMATED SEQUENCERS
• Approaches (to label fragments according to their terminal ddNTP): DYE PRIMER and DYE TERMINATOR SEQUENCING

A

Automated Fluorescent Sequencing

12
Q

• 4 different fluorescent dyes are attached to 4 separate aliquots of the primer.
• Dye molecules are attached to the 5’ end of the primer = 4 versions of the same primer with different dye labels.
• Products are LABELED AT THE 5’ end using the dye color associated with the ddNTP at the end of the fragment.

DYE PRIMER OR DYE TERMINATOR SEQUENCING?

A

Dye Primer Sequencing

13
Q

• 1 of the 4 fluorescent dyes is attached to each of the ddNTPs.
• All 4 sequencing reactions are performed in the same tube.
• Product fragments are LABELED AT THE 3’ end.

DYE PRIMER OR DYE TERMINATOR SEQUENCING?

A

Dye Terminator Sequencing

14
Q

• The 4 sets of sequencing products in each reaction are loaded onto a single gel lane/capillary.
• Fluorescent dye colors distinguish which nucleotide is at the end of each fragment.
• Fluorescent detection equipment yields results as an electropherogram.
• Base calling: the process by which sequencing software identifies the bases in a sequence.

A

Automated Electrophoresis
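Worked sketch: base calling from an electropherogram. Real base callers are far more elaborate; this toy version only picks the strongest dye signal at each peak and reports N when the top two signals are too close. The `min_ratio` threshold and the data layout are assumptions made for illustration.

```python
def call_bases(trace, min_ratio=2.0):
    """Call one base per peak.

    trace: list of dicts mapping base -> dye signal intensity at that peak.
    min_ratio: hypothetical cutoff; the best signal must be at least this
    many times the second-best signal, otherwise the call is 'N'.
    """
    calls = []
    for signals in trace:
        ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
        (best_base, best), (_, second) = ranked[0], ranked[1]
        calls.append(best_base if best >= min_ratio * second else "N")
    return "".join(calls)


trace = [
    {"A": 900, "C": 30, "G": 20, "T": 10},   # clean A peak
    {"A": 40, "C": 35, "G": 800, "T": 20},   # clean G peak
    {"A": 400, "C": 20, "G": 10, "T": 380},  # mixed A/T signal
]
print(call_bases(trace))
```

A mixed peak like the third one is the kind of position a reviewer would inspect by eye on the electropherogram.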

15
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Compares an input sequence with all sequences in a selected
database

A. FASTA / FASTQ
FASTA: FAST-All, derived from FAST-P (protein) and FAST-N (nucleotide) search algorithms
FASTQ: biological sequence data with quality scores
B. BLAST Basic Local Alignment Search Tool
C. Phred
D. GRAIL Gene Recognition and Assembly Internet Link

A

BLAST Basic Local Alignment Search Tool

16
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Finds gene-coding regions in DNA sequences

A. FASTA / FASTQ
FASTA: FAST-All, derived from FAST-P (protein) and FAST-N (nucleotide) search algorithms
FASTQ: biological sequence data with quality scores
B. BLAST Basic Local Alignment Search Tool
C. Phred
D. GRAIL Gene Recognition and Assembly Internet Link

A

GRAIL Gene Recognition and Assembly Internet Link

17
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Rapidly aligns pairs of sequences by sequence patterns rather
than individual nucleotides

A. FASTA / FASTQ
FASTA: FAST-All, derived from FAST-P (protein) and FAST-N (nucleotide) search algorithms
FASTQ: biological sequence data with quality scores
B. BLAST Basic Local Alignment Search Tool
C. Phred
D. GRAIL Gene Recognition and Assembly Internet Link

A

FASTA
FAST-All, derived from FAST-P (protein) and FAST-N (nucleotide) search algorithms

18
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Reads bases from original trace data and recalls the bases,
assigning quality values to each base

A. FASTA / FASTQ
FASTA: FAST-All, derived from FAST-P (protein) and FAST-N (nucleotide) search algorithms
FASTQ: biological sequence data with quality scores
B. BLAST Basic Local Alignment Search Tool
C. Phred
D. GRAIL Gene Recognition and Assembly Internet Link

A

Phred

19
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies single-nucleotide polymorphisms (SNPs) among the traces and assigns a rank indicating how well the trace at a site matches the expected pattern for an SNP

A. PolyPhred
B. TIGR Assembler The Institute for Genomic Research
C. Phrap Phragment Assembly Program
D. Factura

A

A. PolyPhred

20
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Uses USER-SUPPLIED and internally computed data quality information to improve the accuracy of assembly in the presence of repeats

A. PolyPhred
B. TIGR Assembler The Institute for Genomic Research
C. Phrap Phragment Assembly Program
D. Factura

A

Phrap Phragment Assembly Program

21
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Developed by TIGR as an assembly tool to BUILD A CONSENSUS SEQUENCE from smaller-sequence fragments

A. PolyPhred
B. TIGR Assembler The Institute for Genomic Research
C. Phrap Phragment Assembly Program
D. Factura

A

TIGR Assembler (The Institute for Genomic Research)

22
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies sequence features such as flanking vector sequences, restriction sites, and ambiguities

A. PolyPhred
B. TIGR Assembler The Institute for Genomic Research
C. Phrap Phragment Assembly Program
D. Factura

A

Factura

23
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Provides MUTATION and SNP DETECTION and analysis, pathogen
subtyping, allele identification, and sequence confirmation

A. Matchmaker
B. SeqScape
C. Assign

A

SeqScape

24
Q

Software Programs Used to Analyze and Apply Sequence Data:
- Identifies alleles for haplotyping

A. Matchmaker
B. SeqScape
C. Assign

A

Matchmaker & Assign

25
Q

SEQUENCING METHODS:
• Determines a DNA sequence W/OUT HAVING TO MAKE A SEQUENCING LADDER
• Relies on the generation of light (luminescence) when nucleotides are added to a growing DNA strand.
• No gels, fluorescent dyes, or ddNTPs
• Reaction mix components:
1. ssDNA template
2. Sequencing primer
3. Sulfurylase
4. Luciferase
5. Substrates: adenosine-5’-phosphosulfate (APS) and luciferin
6. 1 of the 4 dNTPs is added at a time, in a predetermined order

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

A

Pyrosequencing
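Worked sketch: how a predetermined dNTP order turns into a readable signal. It assumes the usual idealization that the light signal is proportional to the number of nucleotides incorporated per dispensation; the function name and the default dispensation order are hypothetical.

```python
from itertools import cycle


def simulate_pyrogram(strand_to_synthesize, dispensation="ACGT"):
    """Return (dNTP, bases incorporated) for each dispensation.

    strand_to_synthesize: the strand built on the ssDNA template, 5'->3'.
    A signal of 0 means the dispensed dNTP did not match the next base;
    a signal of 2 or more reflects a homopolymer run.
    """
    if set(strand_to_synthesize) - set(dispensation):
        raise ValueError("strand contains a base that is never dispensed")
    signals = []
    position = 0
    for nucleotide in cycle(dispensation):
        if position >= len(strand_to_synthesize):
            break
        incorporated = 0
        while (position < len(strand_to_synthesize)
               and strand_to_synthesize[position] == nucleotide):
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))
    return signals


signals = simulate_pyrogram("GGAT")
print(signals)
# Rebuild the sequence from the signal heights
print("".join(base * count for base, count in signals))
```

No ladder is needed because the sequence is inferred from which dispensation produced light, and how much.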

26
Q

SEQUENCING METHODS:
• AKA METHYLATION-SPECIFIC SEQUENCING
• Chain termination sequencing designed to DETECT METHYLATED CYTOSINE NUCLEOTIDES
• 2-4 µg of genomic DNA is cut with restriction enzymes to facilitate denaturation.
• DNA is denatured (97°C for 5 min) and exposed to bisulfite solution (sodium bisulfite, NaOH, hydroquinone) for 16-20 hours.
o Cytosines are deaminated → uracil
o 5-methylcytosines are unchanged
o Can be detected by Sanger sequencing/pyrosequencing

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

A

Bisulfite DNA sequencing
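Worked sketch: bisulfite conversion and how methylation is inferred afterward. It assumes complete conversion of unmethylated cytosines and that uracil is read as T after amplification and sequencing; positions are 0-based and both function names are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions=()):
    """Deaminate every unmethylated C to U; 5-methylcytosines stay C."""
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence):
        if base == "C" and index not in methylated:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def infer_methylation(reference, sequenced_read):
    """Positions where the reference C is still read as C were methylated."""
    return [
        index
        for index, (ref, obs) in enumerate(zip(reference, sequenced_read))
        if ref == "C" and obs == "C"
    ]


reference = "ACGTCG"
treated = bisulfite_convert(reference, methylated_positions=[1])
read = treated.replace("U", "T")   # U is read as T downstream
print(treated, read, infer_methylation(reference, read))
```

Comparing the treated read with the untreated reference is what makes the unchanged cytosines stand out.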

27
Q

SEQUENCING METHODS:
• Early approaches: used RNase to cut end-labeled RNA at specific nucleotides
• Other approaches:
o Based on amino acid sequence
o Based on sequencing of its complementary DNA

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

A

A. RNA sequencing

28
Q

o Based on single-molecule sequencing technology and virtual terminator nucleotides
• mRNA is captured by immobilized poly(dT) oligomers (through their poly(A) tails).
o RNA without poly(A) tails: initial treatment with poly(A) polymerase
o 4 reversible dye-labeled nucleotides are sequentially added.

A

Direct RNA sequencing

29
Q

SEQUENCING METHODS:
• AKA MASSIVELY PARALLEL SEQUENCING
• Designed to sequence LARGE NUMBERS OF TEMPLATES carrying millions of bases.
• POWERFUL COMPUTER DATA ASSEMBLY SYSTEMS (bioinformatics, computer software and support) are required.
• Requires the preparation of a sequencing library (sets of DNA fragments representing the regions to be sequenced).

A. RNA sequencing
B. Next-generation sequencing
C. Direct sequencing: manual and automated
D. Bisulfite DNA sequencing
E. Pyrosequencing

A

Next-generation sequencing

30
Q

• Collection of genes that have been grouped for testing, enabling simultaneous sequencing of all genes (2 to >1000 genes).

A

GENE PANELS

31
Q

TYPES OF GENE PANELS:
– target regions of SPECIFIC GENES known to affect treatment response, disease state, or clinical condition.

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels

A

C. “Hot-spot” panels

32
Q

TYPES OF GENE PANELS:
–critical genes in particular diseases (hematological-cancer specific, solid-tumor specific).

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels

A

B. Targeted panels

33
Q

TYPES OF GENE PANELS:
– diagnostic, prognostic, discovery purposes.

A. Very large panels (≥3000 genes)
B. Targeted panels
C. “Hot-spot” panels

A

A. Very large panels (≥3000 genes)

34
Q

Collection of DNA library fragments (100-1000 bp) to be sequenced.

A

Sequencing library

35
Q

SYNTHETIC SHORT dsDNA carrying sequences complementary to a single primer pair; may also contain short sequences that identify the sample (indexing/bar coding).

A

Adapters
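Worked sketch: using index (bar code) sequences to sort pooled reads by sample. It assumes the simplest layout, with the index at the start of each read; real platforms may read the index separately. The barcodes, sample names, and function name are made up for illustration.

```python
def demultiplex(reads, barcodes):
    """Assign each read to a sample by the index sequence at its 5' end.

    barcodes: dict mapping index sequence -> sample name (hypothetical values).
    The index is trimmed from the read once it has been matched.
    """
    bins = {sample: [] for sample in barcodes.values()}
    bins["unassigned"] = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            bins["unassigned"].append(read)   # no known index found
    return bins


barcodes = {"ACGT": "sample1", "TTAG": "sample2"}
reads = ["ACGTGGGA", "TTAGCCCA", "GGGGAAAA"]
print(demultiplex(reads, barcodes))
```

This is why several indexed libraries can share one sequencing run.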

36
Q

The regions to be sequenced are enriched by:
1. Probe hybridization
o Probes: biotinylated oligonucleotides complementary to specific gene regions.
2. Amplification with region-specific primers
(amplicon-based targeted libraries)
o Selected by multiplex PCR with gene-specific primers tailed with binding sites for a secondary primer set.

A

Targeted Libraries

37
Q

loss of library fragments from the sequenced regions.

A

Allele dropout

38
Q

Sequencing Platforms:
- Indexed libraries (gene panels) are AMPLIFIED USING PRIMERS immobilized on microparticles (BEADS) in an aqueous-oil emulsion, using ADAPTERS on the library fragments complementary to the immobilized primers.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing

A

B. Ion-conductance

39
Q

Sequencing Platforms:
o Captured/amplified fragments are HYBRIDIZED to primers IMMOBILIZED on a SOLID SURFACE (FLOW CELL).
o Labeled nucleotides are applied to the flow cell and incorporated into growing chains by DNA polymerase at each polony location.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing

A

D. Reversible dye terminator sequencing

40
Q

Sequencing Platforms:
o Uses a POOL OF LABELED OLIGONUCLEOTIDES and DNA LIGASE to identify the template sequence through the known probe sequences.

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing

A

A. Sequencing by ligation

41
Q

Sequencing Platforms:
o DOES NOT REQUIRE FRAGMENTATION and
amplification of the template DNA.
o Each nucleotide can be identified by a disruption in current as it passes through the
pore.
o Also USED FOR DIRECT RNA SEQUENCING

A. Sequencing by ligation
B. Ion-conductance
C. Nanopore sequencing
D. Reversible dye terminator sequencing

A

C. Nanopore sequencing

42
Q

DATA ANALYSIS:
Optical signals are translated to a nucleotide sequence.

A

BASE CALLING

43
Q

Data Analysis:
Optical signals are translated to a nucleotide
sequence (BASE CALLING), the quality of which is measured by
the ____; acceptable = 20-30 (a 1 in 100 to 1 in 1,000
chance of an incorrect call).

A

Phred score
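Worked sketch: converting between a Phred score and the chance of a wrong base call, using the standard definition Q = -10 × log10(P). Under that definition the 100-fold and 1,000-fold certainty figures correspond to Q = 20 and Q = 30. The function names are illustrative.

```python
import math


def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score for a base call with error probability p."""
    return -10 * math.log10(p)


for q in (10, 20, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: about 1 wrong call in {round(1 / p)}")
```

Each step of 10 in the score is a tenfold drop in the error probability.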

44
Q

Data Analysis:
Each sequence is compared to a REFERENCE SEQUENCE
(“normal”) through ___________

A

read alignment

45
Q

Based on comparison with the reference
sequence (SNVs, indels, rearrangements, CNVs).

A

VARIANT ID
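Worked sketch: finding single-nucleotide variants (SNVs) by comparing an aligned read with the reference. It assumes the read is already aligned without gaps, so indels, rearrangements, and CNVs are out of scope; the function name and the `offset` parameter are hypothetical.

```python
def find_snvs(reference, read, offset=0):
    """List SNVs as (1-based reference position, reference base, read base).

    offset: 0-based reference index where the gap-free read starts.
    """
    variants = []
    for i, (ref_base, read_base) in enumerate(zip(reference[offset:], read)):
        if ref_base != read_base:
            variants.append((offset + i + 1, ref_base, read_base))
    return variants


reference = "ACGTACGT"
print(find_snvs(reference, "ACGAACGT"))
print(find_snvs(reference, "TACCT", offset=3))
```

Real variant callers combine many overlapping reads before reporting a variant; a single read is used here only to show the comparison step.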

46
Q

Sequence variations from the reference are
arranged in a ______

A

variant call format (VCF) file
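Worked sketch: what one variant looks like as a VCF data line. The eight fixed, tab-separated columns are CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, with "." standing for a missing value. The helper below is a hypothetical illustration, not a full VCF writer (no header lines, no genotype columns).

```python
VCF_COLUMNS = ("CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO")


def vcf_record(chrom, pos, ref, alt, variant_id=".", qual=".", filt=".", info="."):
    """Format one variant as a tab-separated VCF data line."""
    fields = (chrom, pos, variant_id, ref, alt, qual, filt, info)
    return "\t".join(str(field) for field in fields)


print("#" + "\t".join(VCF_COLUMNS))
print(vcf_record("chr1", 7, "G", "C"))
```

Position 7 with REF G and ALT C reads as "at position 7 the reference has G and the sample has C".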

47
Q

Performed to identify critical variants

A

ANNOTATIONS

48
Q

ANNOTATIONS:
 Confidence in the variant call is determined by
_______ and _________

A

sequence quality and coverage = at least 500x
(recommended).
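Worked sketch: what coverage means at a single position. Coverage (depth) is the number of aligned reads that overlap the position; the 500x default below is taken from this card, while the interval representation and function names are assumptions.

```python
def depth_at(position, read_intervals):
    """Count reads covering a position.

    read_intervals: list of (start, end) alignment coordinates,
    1-based and inclusive.
    """
    return sum(start <= position <= end for start, end in read_intervals)


def has_enough_coverage(position, read_intervals, minimum=500):
    """True when the depth at the position meets the recommended minimum."""
    return depth_at(position, read_intervals) >= minimum


reads = [(1, 10), (5, 15), (12, 20)]
print(depth_at(7, reads), depth_at(11, reads))
print(has_enough_coverage(7, reads))
```

A variant seen at a position with low depth is called with less confidence than the same variant at high depth.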

49
Q

Variants that remain after filtering may be
annotated by searching in disease-specific
databases:

A

1. The Cancer Genome Atlas (TCGA)
2. Catalogue of Somatic Mutations in Cancer (COSMIC)
3. My Cancer Genome
4. Leiden Open (source) Variation Database (LOVD)
5. Human Gene Mutation Database (HGMD)

50
Q

Involves using computer technology (in silico) to
collect, store, analyze, and disseminate biological data and information (computational biology).

A

BIOINFORMATICS

51
Q

BIOINFORMATICS TERMINOLOGY:
The extent to which two sequences are the same.

A

Identity
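Worked sketch: identity expressed as a percentage. It assumes the two sequences are already aligned to the same length, with "-" marking gaps, and counts gaps as non-matching; that convention and the function name are choices made for this example.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    matches = sum(a == b and a != "-" for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT"))
```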

52
Q

BIOINFORMATICS TERMINOLOGY:
- Lining up two or more sequences to search for the maximal regions of identity in order to assess the extent of biological relatedness or homology.

A

Alignment

53
Q

BIOINFORMATICS TERMINOLOGY:
- Alignment of some portion of two sequences.

A

Local alignment
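Worked sketch: scoring a local alignment with the Smith-Waterman dynamic-programming method, named here as one standard technique; the card itself does not specify an algorithm. The match, mismatch, and gap values are arbitrary example settings.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Cell values never drop below 0, which lets an alignment start and
    end anywhere, so only the best-matching portion is scored.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best


# The shared stretch ACGT (4 matches x 2) drives the score
print(smith_waterman_score("TTTACGTAAA", "GGACGTCC"))
```

The flanking bases that do not match are simply left out of the alignment rather than penalized.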

54
Q

BIOINFORMATICS TERMINOLOGY:
- Alignment of THREE or MORE sequences
arranged with gaps so that common residues are aligned together.

A

MULTIPLE SEQUENCE ALIGNMENT

55
Q

BIOINFORMATICS TERMINOLOGY:
- The alignment of two sequences with
the BEST DEGREE OF IDENTITY

A

OPTIMAL ALIGNMENT

56
Q

BIOINFORMATICS TERMINOLOGY:
- Specific sequence changes (usually protein sequence) that maintain the properties of the original sequence.

A

CONSERVATION

57
Q

• Established by the National Institutes of Health (NIH) under JAMES WATSON
• Primary mission (2.9 million): to decipher the sequence of the complete human genetic material (entire genome)

A

HUMAN GENOME PROJECT (HGP)

58
Q

1st complete genome sequence of a clinically important organism (1984)

A

Epstein-Barr virus

59
Q

WHO completed the:
o 1st sequence of a free-living organism (Haemophilus influenzae)
o Sequence of the smallest free-living organism (Mycoplasma genitalium)

A

Craig Venter and colleagues (The Institute for Genomic Research, TIGR)

60
Q

SEQUENCING APPROACH OF THE 2 PROJECTS:
- hierarchical shotgun approach
– to sequence from KNOWN REGIONS

A

NIH METHOD

61
Q

SEQUENCING APPROACH OF THE 2 PROJECTS:
- whole-genome shotgun sequencing
– to sequence RANDOM FRAGMENTS

A

Celera (established by Venter)

62
Q

1st chromosome to be
sequenced completely.

A

Chromosome 22

63
Q

most GC-rich (66%)

A

Chromosome 2

64
Q

fewest GC bp (25%)

A

Chromosome X
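Worked sketch: how a GC percentage like the ones on these cards is computed from a sequence. The example sequence is made up and the function name is hypothetical.

```python
def gc_fraction(sequence):
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


print(f"{gc_fraction('GGCCAATT'):.0%}")
```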

65
Q

most gene-rich per unit length (23 genes/ Mbp)

A

Chromosome 19

66
Q

OTHER GENOME PROJECTS:
• Goal: to find BLOCKS of sequences that are inherited together.
• Revealed >1,000 disease-associated regions of the genome (coronary artery disease and diabetes).

A

Human Haplotype Mapping (HapMap) Project

67
Q

OTHER GENOME PROJECTS:
• Provides a RESOURCE of STRUCTURAL VARIANTS in different populations.
• Over 88 million variants were verified: 84.7 million SNPs, 3.6 million short insertions/deletions, and 60,000 structural variants.

A

1000 GENOMES PROJECT