Genetics Flashcards

(52 cards)

1
Q

CADD score

A

Combined Annotation Dependent Depletion score
- predicts how deleterious (potentially disease-causing) single-nucleotide variants and small indels are

1
Q

2 scores for predicting LoF constraint and their cut-offs

A
  • pLI (probability of being LoF intolerant): ≥ 0.9
  • LOEUF (LoF observed/expected upper bound fraction): <0.35
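A minimal sketch of how the two cut-offs on this card could be applied together. The function name and the choice to treat either score passing as sufficient are assumptions for illustration, not a gnomAD rule.

```python
def is_lof_constrained(pli=None, loeuf=None, pli_cutoff=0.9, loeuf_cutoff=0.35):
    """Return True if any supplied score passes its cut-off from the card.

    Hypothetical helper: the "either score is enough" rule is an assumption.
    """
    by_pli = pli is not None and pli >= pli_cutoff
    by_loeuf = loeuf is not None and loeuf < loeuf_cutoff
    return by_pli or by_loeuf


print(is_lof_constrained(pli=0.99, loeuf=0.12))  # passes both cut-offs
print(is_lof_constrained(pli=0.10, loeuf=0.80))  # passes neither
```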
2
Q

How to judge missense constraint on gnomAD

A

Use the missense constraint Z-score: a Z-score >3.1 indicates significant missense constraint

3
Q

Concept of “constraint” in genetics

A

Constraint describes how tolerant a gene is to genetic variation, i.e. a gene with high constraint is intolerant to variation.
E.g. LoF constraint (measured by pLI and LOEUF) and missense constraint (Z-score)

4
Q

Concept of “depletion” and “enrichment” in genetics

A

Depletion/depleted: a genetic variant (or class of variants) is observed less frequently than expected
Enrichment/enriched: a variant is more common (over-represented) in a specific population than expected
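Both terms compare an observed count with an expected count, so a simple observed/expected ratio illustrates them. This is only a sketch: the function names are hypothetical and it ignores the statistical uncertainty that real analyses account for.

```python
def obs_exp_ratio(observed, expected):
    """Observed/expected ratio; expected must be a positive count."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def describe(observed, expected):
    """Label a ratio below 1 as depleted and above 1 as enriched."""
    ratio = obs_exp_ratio(observed, expected)
    if ratio < 1:
        return "depleted"
    if ratio > 1:
        return "enriched"
    return "as expected"


print(describe(3, 20))   # fewer variants seen than expected
print(describe(45, 20))  # more variants seen than expected
```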

5
Q

What is the pext score

A

Proportion expressed across transcripts score: a per-base measure of the proportion of a gene's total expression that comes from transcripts containing that base, shown across transcripts and exons as well as in the tissue of interest

6
Q

When is the pext score useful in gnomAD?

A

Indicates the biological relevance of a variant. Most useful when the variant is LoF and there is otherwise strong evidence that it is disease-causing. A low pext score (<0.2) suggests the variant is not biologically relevant (as the affected region is not expressed across the transcripts or in the tissues of interest).

7
Q

What is a Mendelian disease and some examples

A

Aka monogenic disorders. Caused by mutations in a single gene.
E.g. cystic fibrosis, Huntington’s disease, sickle cell disease, Duchenne muscular dystrophy, Tay-Sachs, PKU, Marfan syndrome, ADPKD

8
Q

Define Expressivity and Penetrance

A

Expressivity: the severity of the phenotype that develops in a patient with the pathogenic variant
Penetrance: the proportion of individuals carrying the pathogenic variant who display a phenotype

9
Q

What is the “seed sequence” in relation to CRISPR

A

10-12 bp adjacent to the PAM (3’ end of the gRNA) that determine Cas9 specificity
- 1-5 bp = true seed region (from immunoprecipitation and ChIP-seq data - Zhang 2015)
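As a small illustration, the seed can be read off the PAM-proximal (3’) end of the spacer. The sketch below assumes the spacer is written 5’ to 3’ with the PAM immediately 3’ of it; the function name and the example spacer are made up.

```python
def seed_region(spacer, seed_len=12):
    """Return the PAM-proximal seed: the last `seed_len` bases of the spacer.

    Assumes the spacer is given 5'->3' and the PAM lies directly 3' of it.
    """
    spacer = spacer.upper()
    if not 0 < seed_len <= len(spacer):
        raise ValueError("seed_len must be between 1 and the spacer length")
    return spacer[-seed_len:]


spacer = "GACGTTACCGGATTACGCTA"  # hypothetical 20-nt spacer
print(seed_region(spacer))              # 12-bp seed next to the PAM
print(seed_region(spacer, seed_len=5))  # 5-bp "true seed" from the card
```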

10
Q

What are the causes of LoF pathogenic variants appearing in gnomAD?

A
  • Transcript error
  • Sequencing error
  • Mapping error
  • Last exon
  • Other annotation error
  • Rescue
11
Q

What is a homopolymer

A

Homopolymer refers to a stretch of DNA or RNA sequence where only one type of nucleotide is repeated consecutively, eg AAAAAAAAA
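A short sketch of how homopolymer runs could be located in a sequence. The minimum run length of 4 is an arbitrary assumption; the card does not define a threshold.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for each run of one repeated base.

    `start` is 0-based; `min_len` is an illustrative threshold only.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


print(homopolymer_runs("ACGTAAAAAAGGCTTTTC"))
```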

12
Q

What is a “rescue splice variant”

A

A type of rescue mechanism in which alternative splicing of the mRNA mitigates the effect of a LoF/pathogenic mutation, preserving the function of the protein

13
Q

Definition of nonsense-mediated decay

A

Surveillance pathway that reduces errors in gene expression by eliminating mRNA transcripts that contain premature stop codons

14
Q

Types of mRNA surveillance pathways

A
  1. Nonsense-mediated decay (NMD)
  2. Non-stop decay (NSD)
  3. No-go decay (NGD)
15
Q

How does the location of the termination codon (from truncating mutations) in the last exon affect NMD?

A

What matters is the location of the stop codon relative to the last exon-exon junction, where the final exon junction complex (EJC) is deposited. If the stop codon is downstream of the final EJC, or within 50 nucleotides upstream of it, the transcript is translated normally. If it is more than 50 nucleotides upstream, NMD occurs.
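The 50-nucleotide rule can be written as a single comparison. This is a simplified sketch: the function name is hypothetical, both positions are assumed to be coordinates on the spliced mRNA, and real NMD prediction has exceptions that it does not model.

```python
def predicted_to_trigger_nmd(stop_pos, last_junction_pos, threshold=50):
    """Apply the 50-nt rule from the card.

    Both positions are coordinates on the spliced transcript. NMD is
    predicted only when the stop codon lies more than `threshold`
    nucleotides upstream of the last exon-exon junction.
    """
    return (last_junction_pos - stop_pos) > threshold


print(predicted_to_trigger_nmd(stop_pos=100, last_junction_pos=200))  # far upstream
print(predicted_to_trigger_nmd(stop_pos=160, last_junction_pos=200))  # within 50 nt
print(predicted_to_trigger_nmd(stop_pos=250, last_junction_pos=200))  # in the last exon
```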

16
Q

Why are truncating mutations in the last exon generally not pathogenic

A
  • Not subject to NMD
  • mutations in 3’ UTR region
  • Protein truncation tolerance - critical domains not affected
  • haploinsufficiency tolerance - better tolerated if a functional copy is still present
17
Q

Define linkage disequilibrium

A

Non-random association of alleles at different loci: the alleles are transmitted together more or less often than expected by chance alone - usually caused by close proximity of the loci on the same chromosome
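"More or less often than expected by chance" can be quantified with the standard LD statistics D, D' and r². The sketch below assumes two biallelic loci and known haplotype and allele frequencies; the function and parameter names are illustrative.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r2) for alleles A and B at two biallelic loci.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: frequencies of allele A and allele B (each strictly between 0 and 1)
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must be strictly between 0 and 1")
    d = p_ab - p_a * p_b  # deviation from independence
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0  # keeps the sign of D
    return d, d_prime, r2


print(ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5))   # alleles always travel together
print(ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5))  # alleles are independent
```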

18
Q

Define epistasis

A

Phenomenon in genetics whereby the effect of a gene mutation is dependent on the presence or absence of mutations in one or more other genes, termed modifier genes

19
Q

Define heritability

A

The proportion of the phenotypic variance in a population that can be attributed to genetic differences

20
Q

What is the “missing heritability” question

A
  • clear conclusion from multiple GWAS that highly significant hits account for only a small proportion of the heritability of disease
  • the amount of heritability explained by GWAS findings is much smaller than the heritability estimated from family and twin studies
21
Q

Narrow sense heritability (h2)

A

Narrow-sense heritability (h2) is an important genetic parameter that quantifies the proportion of phenotypic variance in a trait attributable to the additive genetic variation generated by ALL causal variants
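As a ratio, h2 is the additive genetic variance divided by the total phenotypic variance. A minimal sketch with made-up variance components follows; estimating those components from real data is the hard part and is not shown.

```python
def narrow_sense_h2(additive_var, phenotypic_var):
    """h2 = additive genetic variance / total phenotypic variance."""
    if phenotypic_var <= 0:
        raise ValueError("phenotypic variance must be positive")
    if not 0 <= additive_var <= phenotypic_var:
        raise ValueError("additive variance must lie between 0 and the phenotypic variance")
    return additive_var / phenotypic_var


# Hypothetical variance components, for illustration only
print(narrow_sense_h2(additive_var=0.3, phenotypic_var=1.2))
```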

22
Q

What are the explanations for the missing heritability problem?

A
  1. large number of common variants of small effects not yet discovered
  2. rare variants with large effect sizes not tagged on genotyping arrays
  3. overestimation of h2 (narrow sense heritability) in siblings/families due to environmental factors or epigenetics
23
Q

Define tag SNP

A

A representative SNP in a genomic region of high LD that stands in for a group of correlated SNPs, called a haplotype

24
Q

Define fine-mapping

A

Process of determining the genetic variant(s), i.e. causal variant(s), responsible for complex traits, given evidence of association of a genomic region with a trait and assuming at least one causal variant exists
25
Q

Types of chromatin annotations

A
  1. Open chromatin regions (indicate regions available for TF binding)
  2. Histone modifications (highlight enhancer and promoter regions)
  3. DNA methylation
26
Q

What are DNase Hypersensitive Sites (DHS)?

A
  • DHS are regions of DNA that are particularly accessible to cleavage by DNase I, characterised by a lack of nucleosomes
  • Indicate regions of high regulatory activity/regulation of gene expression. Correspond to regulatory elements, e.g. promoters, enhancers, silencers etc.
  • Used in SNP enrichment analysis (a type of chromatin mark)
27
Q

What is ATAC-Seq

A

Assay for Transposase-Accessible Chromatin with sequencing. Technique used to investigate chromatin accessibility at genome-wide scale. Used to identify areas of open chromatin, i.e. areas of high regulatory activity (promoters, enhancers and TF binding sites)
28
Q

Steps of ATAC-Seq

A
  1. Transposase tagmentation: uses hyperactive Tn5 transposase enzyme that simultaneously fragments DNA and adds adapters to the ends of the DNA
  2. Selective fragmentation: Tn5 transposase selectively inserts adapters into regions of open chromatin
  3. Library preparation: PCR amplification of tagged sequences
  4. Sequencing
29
Q

What is pleiotropy

A

A phenomenon in which one gene influences two or more seemingly unrelated phenotypic traits, i.e. a gene that exhibits multiple phenotypic expressions
30
Q

What is allelic heterogeneity

A

Phenomenon in which different variants (alleles) in the same gene cause the same or similar phenotype
31
Q

Difference between epistasis and allelic heterogeneity (AH)

A

Epistasis: the effect of one gene variant is dependent on (or modifies) another gene variant at a DIFFERENT locus
AH: different mutations within the SAME locus of the SAME gene influence a particular trait
32
Q

What are phenocopy conditions?

A

Variations in phenotype that are caused by environmental conditions and not by genotype
33
Q

Which domains in MYH7 are mostly affected in HCM

A

Globular head and hinge regions
34
Q

What is the Non-stop Decay pathway and its mechanisms

A

NSD - targets mRNA (and peptide) for degradation if lacking a proper stop codon (i.e. translation keeps going past where the stop codon should have been).
  1. Recognition of non-stop mRNAs - ribosome stalls and signals NSD machinery
  2. Ski complex (Ski2, Ski3 and Ski8) recruited to stalled ribosome, interacts with exosome (3' to 5' exonuclease activity)
  3. Ribosome disassembled (by Pelota-Hbs1) - recycles ribosome
  4. Faulty mRNA degraded by exosome assisted by Ski complex
  5. Proteasomal degradation of the peptide after ubiquitination
35
Q

Causes of non-stop mRNAs

A

Errors in transcription, splicing or premature polyadenylation
36
Q

Causes of ribosome stalling

A
  1. Defective mRNA:
     - non-stop mRNA
     - damaged mRNA
     - secondary structures within mRNA (e.g. hairpins)
     - rare codons (due to low availability of corresponding tRNAs)
  2. Amino acid deprivation - leading to shortage of charged tRNA
  3. Aberrant translation events - misincorporation of amino acids or other errors during translation
  4. Protein quality control mechanisms - interaction with faulty nascent polypeptides that do not fold properly
37
Q

Slipped Strand Mispairing

A

SSM, aka replication slippage, is a mutation process that occurs during DNA replication. It involves denaturation and displacement of the DNA strands, resulting in mispairing of the complementary bases. Slipped strand mispairing is one explanation for the origin and evolution of repetitive DNA sequences.
  • Occurs at sites of tandem repeats
  • Leads to dinucleotide or trinucleotide repeats
38
Q

Key mechanisms of ASO action

A
  1. RNA degradation via RNase H1 - binding of ASO to mRNA = DNA-RNA duplex --> recruits RNase H1 --> cleaves RNA strand = degradation (gapmers)
  2. RNA degradation using RNAi - using siRNAs --> recruited into RISC (RNA-induced silencing complex) --> cleaves target mRNA
  3. Steric blocking - binding of ASO to mRNA physically blocks access of splicing factors or ribosomes = blocks splicing and/or translation
  4. Modulation of splicing (e.g. exon skipping or exon inclusion): in exon skipping, ASOs are designed to target splice sites in pre-mRNA - inhibits binding of spliceosome --> skips over exon (with frameshift mutation, e.g. in Duchenne's) and removes the segment from the mRNA; in exon inclusion, the ASO can sterically block an intronic splicing silencer (in SMA)
39
Q

Monocistronic vs polycistronic mRNA

A

Monocistronic = mRNA is translated into only 1 single protein chain
Polycistronic = mRNA contains multiple ORFs that are translated into multiple peptides
40
Q

Examples of gene expression roles of UTRs (untranslated regions) of mRNAs

A

mRNA stability, mRNA localisation and translational efficiency
41
Q

What are nuclear speckles

A

Regions in the nucleus associated with pre-mRNA splicing and transcriptional regulation
Aka interchromatin granule clusters
42
Q

What is MALAT-1

A

MALAT-1 = metastasis-associated lung adenocarcinoma transcript 1
  • long non-coding RNA (lncRNA) widely expressed in many tissues, with roles in gene expression and splicing etc
  • localised in nuclear speckles
  • over-expressed in many CAs (e.g. lung, breast, liver)
  • promotes proliferation
43
Q

Key differences between gapmers and mixmers

A

2 different types of ASOs
  • Gapmer = central DNA gap flanked by modified nucleotides vs mixmer = alternating modified nucleotides
  • Gapmer = RNase H-mediated RNA degradation vs mixmer = steric blocking
  • Gapmer = long-lasting vs mixmer = transient
44
Q

What is an operon

A

Functioning unit of DNA containing a cluster of genes under the control of a single promoter.
Genes are transcribed together into an mRNA and translated together or spliced into monocistronic mRNAs.
Common in prokaryotes and rare in eukaryotes
45
Structure of an operon
1. Promoter - nucleotide sequence that enables a gene to be transcribed 2. Operator - DNA segment to which a repressor binds, defined in the lac operon as between the promoter and structural genes 3. Structural genes Others: - repressor protein (coded by regulatory gene) - inducer - displaces repressor
46
Q

Explain the lac operon mechanism

A

Encoded in E. coli. Mechanism by which the bacterium switches on transcription of the enzymes that process lactose when lactose is present and glucose is low. Not always fully active, as this would waste energy. There is always background expression (of lacY - B-galactoside permease) to enable detection of lactose in the cell.
Lac operon: 3 structural genes, promoter, terminator, regulator and operator
  • lacZ = B-galactosidase (cleaves lactose into glucose and galactose)
  • lacY = B-galactoside permease - transmembrane symporter that pumps B-galactosides into the cell
  • lacA = B-galactoside transacetylase - transfers acetyl group from acetyl-CoA to thiogalactoside
In the absence of lactose, the repressor is bound to the operator, repressing transcription (by blocking DNA-dependent RNA polymerase), albeit imperfectly = background expression.
In the presence of lactose (but not glucose), allolactose (a lactose metabolite) binds to the repressor and inactivates it = transcription
47
Q

Types of RNA secondary structures

A
  • Helices
  • Hairpin loop
  • Bulge loop
  • Interior loop
  • Junction/Intersection
48
Q

What is the cre-lox recombination

A

Gene editing (deletion/inversion and translocation) technique that allows for spatiotemporal control. Derived from P1 bacteriophage.
Cre recombinase = 38 kDa enzyme that recognises Lox sites and catalyses recombination between them:
  • LoxP sites in the same direction = deletion
  • LoxP sites in opposite directions = inversion
  • interchromosomal recombination = translocation
Used in lineage tracing, as a reporter gene (GFP) can be activated by Cre to track cell populations (using cell-specific promoters)
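The deletion and inversion outcomes can be sketched on a toy sequence. This is a deliberately simplified model: the loxP sites themselves are not represented, translocation is not covered, and the function name and sequences are hypothetical.

```python
def cre_recombine(left, floxed, right, same_orientation):
    """Toy model of Cre acting on a segment flanked by two loxP sites.

    same_orientation=True  -> the floxed segment is excised (deletion)
    same_orientation=False -> the floxed segment is inverted (reverse complement)
    The loxP sites are omitted from the sequences for simplicity.
    """
    if same_orientation:
        return left + right
    complement = str.maketrans("ACGT", "TGCA")
    inverted = floxed.translate(complement)[::-1]
    return left + inverted + right


print(cre_recombine("AA", "CGT", "TT", same_orientation=True))   # deletion
print(cre_recombine("AA", "CGT", "TT", same_orientation=False))  # inversion
```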
49
Q

What is the Bayesian Pairwise Analysis

A

Statistical approach used to model pairwise relationships between 2 variables, incorporating PRIOR KNOWLEDGE or ASSUMPTIONS about the data distribution. Involves updating PRIOR BELIEFS with new evidence to form a POSTERIOR DISTRIBUTION
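The prior-to-posterior update at the heart of any Bayesian analysis can be illustrated with Bayes' rule over a small set of hypotheses. This is a generic sketch rather than a specific pairwise method; the hypothesis labels and probabilities are made up.

```python
def posterior(prior, likelihood):
    """Update prior beliefs with new evidence using Bayes' rule.

    prior: dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(evidence | hypothesis)
    Returns a dict mapping hypothesis -> P(hypothesis | evidence).
    """
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return {h: value / total for h, value in unnormalised.items()}


# Hypothetical numbers, for illustration only
prior = {"related": 0.1, "unrelated": 0.9}
likelihood = {"related": 0.8, "unrelated": 0.2}
print(posterior(prior, likelihood))
```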
50
Q

Concept of Locus (Loci), Alleles, LD, Tag SNP and Haplotype NB!!!

A

A LOCUS is a specific location on the chromosome. Depending on context, it refers to either a specific base-pair position for a Single-Base Pair Locus (eg rs-113488022 at chr14:23822594), or a Multi-Base Pair Locus (referring to an entire gene - gene locus, structural variants - SV locus, QTLs - eg eQTL locus, or GWAS-locus).
The ALLELE is the version of the sequence at a specific LOCUS, and also depends on context: for a Single-Base Pair Locus, it is the specific nucleotide (A, T, C or G) at that location; for a Multi-Base Pair Locus, it refers to the entire SEQUENCE of nucleotides.
LD - Linkage Disequilibrium - refers to the NON-RANDOM association of ALLELES at different LOCI in a population, where certain ALLELES (eg SNPs) are inherited together more often than expected by chance due to physical proximity on a chromosome (nearby loci are rarely separated by CROSSING OVER during meiotic recombination).
The TAG-SNP is a representative SNP in a region of the genome with high LD that represents a group of SNPs called a HAPLOTYPE.
Therefore a HAPLOTYPE is a group of SNPs in high LD with each other.
51
Q

Concepts of fine mapping, SNP enrichment and Colocalisation studies

A
  • Fine-mapping: methods aimed to define causal VARIANTS
  • SNP-enrichment: prioritises disease-relevant CELL TYPES
  • Colocalisation: nominates likely target GENES