Lecture 11: genomics Flashcards

1
Q

How big is the human genome?

A

About 3 billion base pairs (nucleotide pairs) in the haploid genome

2
Q

What is the difference between genetics and genomics?

A
  1. Genetics refers to the study of inheritance and the ways that traits or conditions are passed down from one generation to the next
  2. Genomics describes the study of all of a person’s DNA (the whole genome)
3
Q

What are the different types of gene sequencing methods?

A

1) Sanger sequencing
2) Microarrays
3) Illumina DNA sequencing
4) PacBio Long-read Sequencing
5) Nanopore Long-read sequencing

4
Q

What is the process of Sanger sequencing?

A
  1. The gene of interest is cloned into a vector, and the double-stranded DNA in the vector is converted to single-stranded DNA by denaturation with alkali or boiling.
  2. Thermal cycle sequencing is carried out by DNA polymerase using one primer: the polymerase synthesizes a second strand of DNA, complementary to the existing template.
  3. Fluorescent dideoxynucleotides (ddNTPs) are added at low concentrations and get randomly incorporated into the new strand. They lack the 3' hydroxyl group, so the chain is terminated. This creates products of different lengths, each labelled with a terminal fluorescent ddNTP.
  4. This process is repeated separately with each of the 4 dideoxy bases (4 concurrent strand synthesis reactions). Thus, 4 separate reactions result in 4 families of terminated strands.
  5. The double-stranded DNA is separated by heating, and the fragments are separated by size by electrophoresis.
  6. The fluorescence of each terminal dideoxy base is read by the detector, giving the sequence in order of fragment length.
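
The read-out logic of steps 3 to 6 can be sketched in Python. This is a toy model, not a simulation of the chemistry: it assumes chain termination happens at least once at every position, and the function names are made up for illustration. Sorting the terminated fragments by length, as electrophoresis does, and reading off each terminal base recovers the newly synthesized strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def terminated_fragments(template):
    """Return (length, terminal_base) for every chain-terminated product.

    Toy assumptions: the template is written 3'->5', so the new strand is
    its base-by-base complement, and a ddNTP terminates synthesis at every
    position in at least one molecule.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    return [(i + 1, new_strand[i]) for i in range(len(new_strand))]

def read_sequence(fragments):
    """Order fragments shortest-first and read each terminal base."""
    return "".join(base for _, base in sorted(fragments))

fragments = terminated_fragments("TACGGA")
print(read_sequence(fragments))  # ATGCCT
```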
5
Q

What is the process of microarray sequencing?

A
  1. DNA is added to a microarray chip that contains probes for hundreds of thousands of sequences
  2. Each spot has DNA oligonucleotide probes for either a reference or a variant sequence
  3. DNA from an individual hybridizes with a probe if it is complementary to that probe. This produces a fluorescent signal, which is read

(this is very outdated now)

6
Q

What is the process of Illumina sequencing?

A
  1. Library prep: The DNA to be sequenced is randomly sheared, and sequencing adaptors are ligated onto both ends of each DNA fragment.
  2. These fragments are added to a flow cell whose surface carries oligonucleotides complementary to the adaptor sequences, allowing the DNA fragments to bind to the flow cell surface.
  3. Bridge amplification is performed to generate multiple copies of the same DNA
    - The DNA bends over and binds to a neighbouring oligonucleotide on the flow cell, is replicated by DNA polymerase, and the two strands then separate to give new DNA strands
  4. Paired-end sequencing is carried out
    - As each fluorescently labelled nucleotide is incorporated, its fluorescence is recorded to identify the base
    - DNA is read from both the left and right ends of the fragment (paired-end sequencing)
7
Q

Why is there a need to conduct bridge amplification during illumina sequencing?

A

To amplify the fluorescent signal during sequencing: a cluster of identical copies emits a much stronger signal than a single DNA molecule would

8
Q

What is the process of PacBio SMRT long read sequencing?

A
  1. A single DNA molecule is put into each nanowell
  2. The DNA is replicated using fluorescently labelled nucleotides, and each nucleotide added is visualised in real time
  3. Colour signal is converted into ATGC base calls
9
Q

What is the process of nanopore long read sequencing?

A
  1. Single stranded DNA is pulled through a nanopore
  2. Different DNA sequences cause different electrical current profiles (the electrical current through the nanopore is perturbed slightly)
  3. Measured electrical current signal is converted into ATGC base calls
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the importance of a reference genome?

A

A reference genome allows you to readily identify genetic variations by comparing your sequence to it
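
As a minimal sketch of that comparison, the Python snippet below lists single-nucleotide differences between a sample and a reference. It assumes the sample is already aligned to the reference, has the same length, and contains no indels; the function name is made up for illustration, and real variant callers do far more than this.

```python
def find_snvs(reference, sample):
    """List (position, reference_base, sample_base) for each mismatch.

    Assumes the two sequences are pre-aligned and of equal length.
    Positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    ]

print(find_snvs("ACGTACGT", "ACGAACGT"))  # [(4, 'T', 'A')]
```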

11
Q

How is a reference genome sequenced?

A
  1. Multiple copies of the same genome are shredded into many random pieces, and each piece is then sequenced (by Illumina sequencing)
  2. We then try to piece the genome back together by overlapping the DNA sequences with each other (de novo assembly)
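
The overlapping idea in step 2 can be illustrated with a toy greedy assembler in Python. This is only a sketch under strong assumptions (error-free reads, all from the same strand, no repeats); the function names and the `min_len` parameter are made up for illustration, and real assemblers use much more sophisticated graph-based methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlaps left; remaining contigs stay separate
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))  # ['ATGGCGTACCTGA']
```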
12
Q

What type of DNA variations are there?

A
  1. Single nucleotide variants
  2. Small indels (insertion or deletion)
  3. Copy number alterations (changes in the number of copies of a DNA segment, up to a whole chromosome)
  4. Structural variations
13
Q

What is the purpose of whole exome/targeted sequencing?

A

Protein-coding regions represent only about 2% of the whole genome, so we can save cost by sequencing the protein-coding regions alone
Whole exome sequencing: exons of all genes captured
Targeted sequencing: exons of a subset of genes captured (fewer probes)

14
Q

What is the process of whole exome/targeted sequencing?

A
  1. Shear the DNA into fragments
  2. Add oligo probes that bind directly to the DNA of interest (exons)
  3. Wash off the undesired fragments
  4. Sequence the remaining exons
15
Q

What are the sources of DNA variation?

A
  1. Germline variation
  2. Somatic mutation
  3. De novo mutation
16
Q

What are germline variations?

A
  1. DNA variation that is found in every single cell of an individual
  2. Passed from our parents to us through the sperm and egg
  3. ~1 germline single-nucleotide variant for every 1000 basepairs
17
Q

What are somatic mutations?

A
  1. DNA mutations that we acquire in a subset of cells in our body
  2. Not present in every single cell of our body
  3. Not passed on to offspring
  4. ~1 somatic mutation per 1 million basepairs
  5. Somatic mutations in specific genes are known to be a cause of cancer
18
Q

What are de novo mutations?

A
  1. A DNA variant that is present in all cells of the child, but is not found in the father and mother
  2. Due to a new mutation arising in the sperm/egg or in the zygote
  3. ~ 1 de novo variant per 100 million basepairs
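
To put the three rates from cards 16 to 18 on the same scale, the short Python calculation below multiplies each by a genome size of 3 billion base pairs (card 1). It is a rough order-of-magnitude estimate only; the variable and function names are made up for illustration.

```python
GENOME_SIZE_BP = 3_000_000_000  # genome size from card 1

# Base pairs per variant, as quoted on cards 16 to 18
BP_PER_VARIANT = {
    "germline SNV": 1_000,
    "somatic mutation": 1_000_000,
    "de novo variant": 100_000_000,
}

def expected_variants(genome_size_bp, bp_per_variant):
    """Expected number of variants across the whole genome."""
    return genome_size_bp // bp_per_variant

for kind, spacing in BP_PER_VARIANT.items():
    print(kind, expected_variants(GENOME_SIZE_BP, spacing))
# germline SNV 3000000
# somatic mutation 3000
# de novo variant 30
```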
19
Q

In DNA methylation, where are the methyl groups added?

A

The 5-carbon of cytosine (to produce 5-methylcytosine)

20
Q

What is the effect of DNA methylation?

A

When the DNA is methylated, transcription factors can’t bind to the sequences, so transcription is downregulated

21
Q

How is bisulfite sequencing used to read DNA methylation?

A
  1. Bisulfite converts unmethylated “C” to uracil. This is read as a “T” during sequencing
  2. A methylated “C” is unaffected and will still be read as a “C” during sequencing, so we can see which cytosines are methylated
  3. Can also compare this with the reference genome to see which “T”s in the reads are actually unmethylated “C”s
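
The comparison in step 3 can be sketched in Python. This is a simplified illustration that assumes the bisulfite read is already aligned to the reference on the same strand, that conversion is complete, and that there are no sequencing errors or true C-to-T variants; the function name is made up for illustration.

```python
def call_methylation(reference, bisulfite_read):
    """Classify each reference cytosine from a bisulfite-converted read.

    A reference C still read as C was protected, so it is methylated.
    A reference C read as T was converted, so it is unmethylated.
    Positions are 1-based.
    """
    calls = {}
    for pos, (ref, obs) in enumerate(zip(reference, bisulfite_read), start=1):
        if ref == "C":
            if obs == "C":
                calls[pos] = "methylated"
            elif obs == "T":
                calls[pos] = "unmethylated"
    return calls

print(call_methylation("ACGTCCGA", "ACGTTCGA"))
# {2: 'methylated', 5: 'unmethylated', 6: 'methylated'}
```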
22
Q

What are the types of histone modifications and what do they do?

A
  1. H3K27ac: acetylation of lysine 27 of the H3 histone; makes the DNA more accessible, upregulates transcription, gene active
  2. H3K4me3: tri-methylation of lysine 4 of the H3 histone; found at the promoters of actively transcribed genes. (Repressive methylation marks, which make DNA less accessible and downregulate transcription, include H3K27me3 and H3K9me3.)
23
Q

What is the process used to analyse histone marks?

A

ChIP-seq
Chromatin Immunoprecipitation, followed by sequencing

24
Q

What is the process of ChIP-seq?

A
  1. Immunoprecipitation: An antibody that binds specifically to a histone modification is used to pull down the DNA wrapped around histones carrying that mark
  2. Pulled-down DNA fragments are then subjected to standard Illumina DNA sequencing
  3. We can tell where these histone marks are by mapping the reads to the reference genome
25
Q

What is the transcriptome?

A

A transcriptome is the full range of messenger RNA, or mRNA, molecules expressed by an organism

26
Q

Why is it beneficial to study the transcriptome of a cell in the context of breast cancer?

A

Breast cancer has many subtypes that have different marker proteins
Sequence the transcriptome of the tumour cell to see what kind of mRNA it expresses, and use this to determine the subtype
Helpful in determining clinical outcomes, and in deciding best therapeutic to apply

27
Q

What are the ways in which one can sequence the transcriptome?

A
  1. Bulk RNA sequencing
  2. Single cell RNA sequencing
  3. Spatial transcriptomics
28
Q

What is the process of Bulk RNA sequencing?

A
  1. Total RNA is extracted from a population of cells - it's a mixture of multiple types of RNA
  2. mRNA is enriched using probes that bind to the poly-A tail of mRNA
  3. mRNA is converted to cDNA using reverse transcriptase, as Illumina sequencers read DNA, not RNA
  4. cDNA is then sequenced via standard Illumina DNA sequencing
29
Q

What is the process of single-cell RNA sequencing?

A
  1. A single cell and a barcoded bead are encapsulated in a single droplet
  2. The cell is lysed within the droplet
  3. RNA from the cell hybridizes with barcoded DNA oligos on the bead (due to the interaction between the poly-A tail and oligo-dT on the bead)
  4. This allows a reverse transcription reaction to take place to convert the RNA to cDNA
  5. The cDNA is then sequenced, and the barcode sequence is used to determine which cell each RNA originated from.
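
The barcode step can be sketched in Python as a simple grouping operation. This is a simplified illustration: it assumes the cell barcode is the first `barcode_len` bases of each read, which is an assumption made for this example (real protocols differ in where the barcode is read), and the function name is made up.

```python
from collections import defaultdict

def group_reads_by_cell(reads, barcode_len):
    """Group cDNA sequences by their cell barcode.

    Assumes each read starts with the barcode, followed by the cDNA.
    Reads sharing a barcode came from the same droplet, hence the same cell.
    """
    cells = defaultdict(list)
    for read in reads:
        barcode, cdna = read[:barcode_len], read[barcode_len:]
        cells[barcode].append(cdna)
    return dict(cells)

reads = ["AAAACGTACG", "CCCCTTGACA", "AAAAGGCTAA"]
print(group_reads_by_cell(reads, barcode_len=4))
# {'AAAA': ['CGTACG', 'GGCTAA'], 'CCCC': ['TTGACA']}
```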
30
Q

What is the process of spatial transcriptomics?

A
  1. A glass slide carries beads that have barcoded DNA
  2. The tissue sample is placed on top of the glass slide.
  3. The DNA barcodes are attached to the RNA from the tissue sample during RNA-seq library prep
  4. These DNA barcodes are used to determine the X-Y position of each RNA sequence
31
Q

How can we employ gene sequencing for early cancer detection?

A
  1. Take a blood sample: tumor cells die and release DNA into the blood. This is observed as cell-free DNA (cfDNA) in the blood.
  2. Detect DNA mutations in well characterized oncogenes
  3. If we see DNA methylation signal that is typically associated with lung tissues in one’s blood (where such signals are unexpected), it could indicate a tumor growth in the lung
32
Q

How can we employ gene sequencing for non-invasive prenatal testing?

A
  1. Fetal DNA from the placenta leaks into the mother’s blood
  2. The mother’s blood is collected for sequencing
  3. Extra DNA from the affected chromosome would be detected if there is a chromosomal abnormality (e.g. extra chromosome 21 reads in trisomy 21, Down syndrome)
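
One way to make step 3 concrete is to compare the fraction of reads from a chromosome against samples known to be unaffected. The Python sketch below uses a z-score for this; the z-score approach, the function names, and all numbers are illustrative assumptions rather than anything stated in the lecture, and the values are not real data.

```python
from statistics import mean, stdev

def chromosome_fraction(read_counts, chrom):
    """Fraction of all sequenced reads that map to one chromosome."""
    return read_counts[chrom] / sum(read_counts.values())

def z_score(sample_fraction, reference_fractions):
    """How many standard deviations the sample lies from unaffected samples."""
    return (sample_fraction - mean(reference_fractions)) / stdev(reference_fractions)

# Hypothetical chr21 read fractions from unaffected pregnancies
unaffected = [0.0130, 0.0131, 0.0129, 0.0130]

# Hypothetical test sample with slightly more chr21 reads than expected
z = z_score(0.0136, unaffected)
print(z > 3)  # True: an excess of chr21 reads, which would prompt follow-up testing
```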
33
Q

How can we employ gene sequencing for targeted cancer therapeutics?

A
  1. Tumor samples can be sequenced to determine what somatic mutations are present
  2. A corresponding drug is then given to the patient to target these mutations
34
Q

How can we employ gene sequencing for identifying the pathogen responsible for an infection?

A

Sequence microbes responsible for infection to identify them.
