Lecture 4: Genomics and Health Part 1 Flashcards

1
Q

What sequencing method was used for the Human Genome Project?

A

Sanger sequencing

2
Q

What percentage of the human genome was sequenced by the Human Genome Project?

A

92%

3
Q

What sequencing method was used to resolve the remaining 8% of the human genome that wasn’t identified by the HGP? What was this 8% found to be?

A

Long-read next generation sequencing

The remaining 8% was found to consist largely of duplicated and repetitive regions

4
Q

The complete human genome was described in which year (1) and by which consortium (2)?

A

(1) 2022

(2) the Telomere-to-Telomere (T2T) consortium

5
Q

What is one of the major advantages of NGS over Sanger sequencing?

A

Cheaper per base - many thousands of reads are sequenced in parallel

6
Q

What is a reference genome and why is it useful?

A

A reference genome is a standard representation of the human genome

It is useful because it forms the foundation of genomic studies - it provides a common point of reference for genomic loci and a template against which reads can be aligned.
Genetic variations can be characterised against the reference genome
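
As a rough illustration of characterising variation against a reference, the sketch below lists the single-nucleotide differences between a sample and a reference sequence. The sequences and the function name `single_nucleotide_differences` are made up for this example, and it assumes the two sequences are already aligned and of equal length (no insertions or deletions), which real data rarely satisfies.

```python
def single_nucleotide_differences(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumption: the sequences are pre-aligned and the same length,
    so only substitutions can be detected here.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Hypothetical toy sequences, not real genomic data.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"
variants = single_nucleotide_differences(reference, sample)
print(variants)  # [(5, 'A', 'T'), (10, 'C', 'A')]
```

Reporting positions relative to the reference is what lets different studies refer to the same locus.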

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are the two main genetic variations that can be investigated using genome sequencing?

A
  1. Single nucleotide polymorphisms (SNPs)
  2. Structural variants (E.g. deletion, insertion, duplication, inversion, translocation, copy number variation)
8
Q

True or false: structural variations are more easily detected with short-read sequencing techniques?

A

False: more easily detected with long-read sequencing techniques

9
Q

What are SNPs and why are they often investigated? (3 points)

A

Single nucleotide substitutions that vary compared to the reference genome

They are easily analysed and identified with genome sequencing

They are present in more than 1% of the population (common)

10
Q

What is the difference between a SNP and a single nucleotide variation (SNV)?

A

SNPs are common (present in >1% of the population) and each individual has roughly 4-5 million SNPs, whereas SNVs are less common and are not present in >1% of the population. SNVs refer more to the single nucleotide changes that arise through mutation.
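
The 1% cut-off can be sketched as a simple frequency check. This is a minimal illustration that assumes "present in >1% of the population" is measured as minor allele frequency in a sampled cohort; the function name `classify_variant` and the counts are hypothetical.

```python
COMMON_THRESHOLD = 0.01  # the 1% cut-off used on this card


def classify_variant(alt_allele_count: int, total_alleles: int) -> str:
    """Label a single-nucleotide change as 'SNP' (common) or 'SNV' (rare).

    Assumption: frequency is measured as minor allele frequency, i.e. the
    frequency of the less common of the two alleles in the cohort.
    """
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    frequency = alt_allele_count / total_alleles
    minor_allele_frequency = min(frequency, 1 - frequency)
    return "SNP" if minor_allele_frequency > COMMON_THRESHOLD else "SNV"


# A cohort of 1,000 diploid individuals carries 2,000 alleles at each locus.
common_example = classify_variant(alt_allele_count=150, total_alleles=2000)  # 7.5%
rare_example = classify_variant(alt_allele_count=3, total_alleles=2000)      # 0.15%
print(common_example, rare_example)  # SNP SNV
```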

11
Q

What is a haplotype?

A

The combination of SNP alleles found together along a single chromosome

12
Q

What is a haplotype block?

A

A block on a single chromosome containing SNPs that are associated and tend to be inherited together. Because the SNPs are close to each other, recombination between them is rare, so each block has only a few (4-6) alternative haplotypes.

13
Q

True or false: a disease susceptibility allele and a marker SNP are often in the same haplotype block?

A

True (although the SNP marker is not necessarily the cause of the disease susceptibility)

14
Q

What are the three main factors to consider when choosing a sequencing technology?

A
  1. Cost
  2. Time (sample prep, run time, sample transport)
  3. Information capture (accuracy, complex variant detection, feature length)
15
Q

What are the two main types of genetic information that are sequenced?

A
  1. DNA (Genome, exome)
  2. RNA (transcriptome)
16
Q

What does depth mean in terms of sequencing?

A

The number of sequencing reads that cover a specific position or region of the genome
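
A minimal sketch of how depth could be counted, assuming each mapped read is described only by a 0-based start position and a length. The function name `depth_per_position` and the toy reads are hypothetical.

```python
def depth_per_position(genome_length: int, reads: list[tuple[int, int]]) -> list[int]:
    """Count how many reads cover each position.

    Each read is (start, length) with a 0-based, non-negative start.
    Reads running past the end of the genome are clipped.
    """
    depth = [0] * genome_length
    for start, length in reads:
        for position in range(start, min(start + length, genome_length)):
            depth[position] += 1
    return depth


# Three hypothetical 4-base reads mapped to an 8-base region.
reads = [(0, 4), (2, 4), (3, 4)]
depth = depth_per_position(8, reads)
mean_depth = sum(depth) / len(depth)
print(depth)       # [1, 1, 2, 3, 2, 2, 1, 0]
print(mean_depth)  # 1.5
```

Position 3 is covered by all three reads (depth 3), while the last position is not covered at all, which is why mean depth alone can hide poorly covered regions.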

17
Q

Describe the steps of most DNA sequencing methods
1. preparation
2. sequencing
3. post-sequencing

A
  1. fragmentation of DNA (physical, chemical, enzymatic)
  2. sequence individual fragments (reads)
  3. assemble reads through overlaps and map the reads to the reference genome
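
The three steps can be sketched in miniature as below. This is a toy under strong assumptions: fragmentation is simulated with fixed-length overlapping windows rather than random shearing, reads are error-free, and mapping is an exact string search. The names `fragment` and `map_reads` are made up for the example.

```python
def fragment(sequence: str, read_length: int, step: int) -> list[str]:
    """Cut overlapping fixed-length fragments (a stand-in for random fragmentation)."""
    return [
        sequence[i:i + read_length]
        for i in range(0, len(sequence) - read_length + 1, step)
    ]


def map_reads(reference: str, reads: list[str]) -> list[tuple[str, int]]:
    """Pair each read with the 0-based position of its first exact match (-1 if none)."""
    return [(read, reference.find(read)) for read in reads]


# Hypothetical toy data: the sample is identical to the reference.
reference = "ACGTTGCAAGCT"
sample = reference

reads = fragment(sample, read_length=6, step=3)  # steps 1 and 2: fragment and read
mapping = map_reads(reference, reads)            # step 3: map reads to the reference
print(mapping)  # [('ACGTTG', 0), ('TTGCAA', 3), ('CAAGCT', 6)]
```

Real aligners tolerate mismatches and use indexed references rather than a linear search.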
18
Q

Why does the read-length of a sequencing method matter?

A
  1. longer reads are easier to assemble
  2. short-read sequences are generally more accurate per base
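
One way to see why longer reads are easier to place is to count how many positions a read could have come from. In this toy sketch (hypothetical sequences, exact matching only), a short read that falls entirely inside a repeat matches several positions, while a longer read that spans the repeat and its flanks matches only one.

```python
def count_placements(reference: str, read: str) -> int:
    """Count every position (overlapping matches included) where the read matches exactly."""
    return sum(
        reference.startswith(read, i)
        for i in range(len(reference) - len(read) + 1)
    )


# Hypothetical reference: unique flanks around a CA repeat.
reference = "GGTA" + "CACACACA" + "TTGC"

short_read = "CACA"        # lies entirely within the repeat
long_read = "TACACACACAT"  # spans the repeat plus unique flanking bases

short_placements = count_placements(reference, short_read)
long_placements = count_placements(reference, long_read)
print(short_placements, long_placements)  # 3 1
```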
19
Q

How can sequencing costs be reduced and sensitivity increased?

A

Filtering sequencing inputs, e.g. amplicon sequencing (using PCR to amplify only the sequences of interest, the amplicons) and enrichment/depletion (such as excluding introns by capturing only exons)

Microarray

20
Q

When might you want to use amplicon sequencing?

A

When you know the genomic loci you are interested in sequencing, you can amplify these specific regions using PCR and sequence only the amplicons to reduce cost and achieve the same depth.
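
The cost argument is simple arithmetic: for a fixed number of reads, mean depth scales inversely with the size of the region being sequenced. The sketch below assumes uniform coverage and that every read lands on target; the read count, read length, panel size and the rounded 3 billion bp genome size are illustrative assumptions only.

```python
def expected_mean_depth(num_reads: int, read_length: int, target_size: int) -> float:
    """Mean depth = total sequenced bases / size of the region they are spread over."""
    return num_reads * read_length / target_size


# Illustrative numbers (assumptions, not values from the lecture).
NUM_READS = 1_000_000
READ_LENGTH = 150
GENOME_SIZE = 3_000_000_000   # rough size of the human genome in bp
AMPLICON_PANEL_SIZE = 50_000  # hypothetical panel of PCR amplicons in bp

whole_genome_depth = expected_mean_depth(NUM_READS, READ_LENGTH, GENOME_SIZE)
amplicon_depth = expected_mean_depth(NUM_READS, READ_LENGTH, AMPLICON_PANEL_SIZE)
print(f"whole genome: {whole_genome_depth:.2f}x")  # whole genome: 0.05x
print(f"amplicon panel: {amplicon_depth:.0f}x")    # amplicon panel: 3000x
```

Read the other way round, far fewer reads (and so less cost) are needed to reach a given depth over the amplicons than over the whole genome.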

21
Q

When might you use target enrichment for sequencing methods?

A

Target enrichment allows more targets to be enriched at once compared to amplicon sequencing
- for example: target enrichment of the exons (which account for about 85% of known disease-related variants) in exome sequencing to reduce the cost

22
Q

Why might target enrichment/depletion be used in transcriptomics?

A

Poly-A-selection to enrich mRNA (protein coding transcripts)

Ribodepletion to remove the majority of rRNA (can look at tRNA, lncRNA as well as mRNA without large amounts of unnecessary rRNA)

23
Q

Give three examples of hybridisation-based target enrichment methods

GO OVER THIS IN MORE DETAIL

A
  1. Microarray hybridisation
  2. In-solution hybridisation
  3. Molecular Inversion Probes
24
Q

Why might microarray be used in genomics?

A

It is a hybridisation method that can be used for target enrichment as well as quantification of a known DNA sequence in a sample

It is easier to analyse than sequencing data and costs less

Good at detecting copy number variations, as the hybridisation signal can be quantified

25
Q

What are three possible uses of microarray in genomics?

A

Can be used for:
- Array-based comparative genomic hybridisation (aCGH) to detect copy number variations
- Single-nucleotide polymorphism (SNP) genotyping
- Transcriptomics (gene expression profiling)
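
For the copy number use case, a common way to summarise array data is the log2 ratio of test to reference signal at each probe. The sketch below is a simplified illustration: the intensities are invented, the function names are hypothetical, and the 0.3 threshold is an arbitrary assumption rather than a recommended cut-off.

```python
import math


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """log2(test / reference): about 0 means equal copy number in both samples."""
    return math.log2(test_intensity / reference_intensity)


def call_copy_number(ratio: float, threshold: float = 0.3) -> str:
    """Call gain or loss when the ratio exceeds an (assumed) threshold."""
    if ratio > threshold:
        return "gain"
    if ratio < -threshold:
        return "loss"
    return "no change"


# Hypothetical probe intensities: (test sample, reference sample).
probes = {"probe_1": (1500.0, 1000.0), "probe_2": (500.0, 1000.0), "probe_3": (1000.0, 1000.0)}
calls = {name: call_copy_number(log2_ratio(test, ref)) for name, (test, ref) in probes.items()}
print(calls)  # {'probe_1': 'gain', 'probe_2': 'loss', 'probe_3': 'no change'}
```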

26
Q

What three controls may be used when comparing the genetic makeup of an individual?

A
  1. compare DNA of diseased cell (E.g. cancer) with healthy cell of the same individual
  2. compare DNA of affected individual with unaffected family members (E.g. in paediatric disorders to determine if it's a de novo mutation or a familial genetic disorder)
  3. Compare DNA of individual with control population that doesn’t have that phenotype
27
Q

What is HapMap?

A

HapMap is a human haplotype map that describes the chromosome regions containing sets of strongly associated SNPs, built from a geographically diverse cohort

28
Q

Why is it important to expand the genome datasets (large-scale multi-omic projects)?

A
  • reduce cost in long run
  • identify common features for different ethnic groups and generate different reference genomes
  • allow more statistically accurate conclusions to be made about SNP associations
  • identify more genomic variations that will improve identification of disease causing genes.
  • multi-omic (E.g. genomic and transcriptomic sequencing) studies allow us to see the impact of genomic variation on gene expression and the encoded protein
29
Q

What are genome-wide association studies (GWAS)?

A

Studies that genotype SNPs across the genome in many individuals and test each SNP for a statistical association with a disease or trait
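
The core test for one SNP can be sketched as a chi-square test on allele counts in cases versus controls. This is a simplified illustration, not a full GWAS pipeline: the counts are invented, the function names are hypothetical, and it assumes every row and column total is non-zero.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int, control_alt: int, control_ref: int) -> float:
    """Pearson chi-square statistic for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    column_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * column_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    return statistic


def p_value_one_df(statistic: float) -> float:
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Hypothetical allele counts for a single SNP.
chi_square = allelic_chi_square(case_alt=60, case_ref=40, control_alt=40, control_ref=60)
p_value = p_value_one_df(chi_square)
print(chi_square)          # 8.0
print(round(p_value, 4))   # 0.0047
```

A real study repeats this kind of test for a very large number of SNPs, so the significance threshold has to be corrected for multiple testing.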

30
Q

What are the major challenges in variant discovery and diagnosis?

A

Biased data collection:
- ethnicity (reference genome predominantly of European ancestry means there is poorer ability to identify contributing genetic variants in non-European populations)
- Participation bias (recruitment of participants is skewed towards certain demographics - ethnicity, age, gender, socio-economic status, education, etc)

31
Q

The boundaries between haplotype blocks represent hot spots for what?

A

Recombination

32
Q

What is linkage disequilibrium?

A

The non-random association of alleles at different loci: particular alleles of different SNPs within a haplotype block tend to be inherited together
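
Linkage disequilibrium between two SNPs is commonly summarised with D and r². The sketch below computes both from a list of phased two-SNP haplotypes. The haplotypes and the function name are hypothetical, and it assumes both SNPs are biallelic and polymorphic in the sample (otherwise r² is undefined).

```python
def linkage_disequilibrium(haplotypes: list[str], allele_a: str = "A", allele_b: str = "B") -> tuple[float, float]:
    """Return (D, r_squared) for two SNPs.

    Each haplotype is a 2-character string: the allele at SNP 1, then at SNP 2.
    D = P(AB) - P(A) * P(B); r^2 = D^2 / (P(A)(1-P(A))P(B)(1-P(B))).
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


# Alleles always travel together: complete linkage disequilibrium.
linked = ["AB"] * 5 + ["ab"] * 5
# All four combinations equally frequent: no association.
unlinked = ["AB", "Ab", "aB", "ab"]

linked_d, linked_r2 = linkage_disequilibrium(linked)
unlinked_d, unlinked_r2 = linkage_disequilibrium(unlinked)
print(linked_d, linked_r2)      # 0.25 1.0
print(unlinked_d, unlinked_r2)  # 0.0 0.0
```

An r² close to 1 is what allows a marker SNP to stand in for a nearby susceptibility allele in the same haplotype block.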