describe the differences in genetic diseases in foetuses, babies and children, and adults and the elderly
Foetuses: Up to 50% of all conceptions result in miscarriage
- Chromosomal defects not compatible with life
New-born babies and children: 0.1% chromosomal, e.g. Down syndrome
~5-6% single gene rare diseases
Mitochondrial
Adults and the elderly: Common diseases
Genetic and environmental
Many alleles
Complex
explain the phrase ‘genetic disorders are individually rare but collectively common’
each individual genetic disorder is by itself rare, however collectively, genetic diseases are common.
what pattern of inheritance do single gene disorders follow?
mendelian inheritance
state the differences in single gene disorders and complex disorders
Single gene disorders:
- Simple ‘Mendelian’ inheritance
- Individually rare but collectively common
- High penetrance
- Tests are predictive
- Alleles at single genes
Complex diseases and phenotypic traits:
- Run in families, no simple inheritance pattern
- Common
- Influenced by environment
- No reliable tests yet
- Susceptibility, not deterministic
- Risk alleles at many polygenes
how many kb long is mitochondrial DNA?
~16.6 kb (16,569 bp)
what are the problems associated with mitochondrial disorders?
Multi-system failures:
- Basal ganglia of brain
- Heart
- Endocrine system, especially pancreas
- Sight and hearing
- Skeletal muscles
define heteroplasmy
A mixture of wild-type and mutant mitochondrial DNA within the cell.
what is the effect of heteroplasmy in a mother’s children?
as mtDNA is maternally inherited, a heteroplasmic mother passes on varying proportions of mutant mtDNA, so the severity of the phenotype can vary between her children (and between tissues within one individual).
what is the significance between the disease phenotypes of damaged mtDNA and aging?
mtDNA damage results in multisystem failures, e.g. sight/hearing loss. This is similar to the effects of aging, which suggests that the accumulation of mtDNA damage may be involved in aging.
how many base pairs are there in the human genome? what % of the genome is protein encoding sequence? how many protein encoding genes are there?
3.2 Gb (3.2 x 10^9 bp)
protein encoding sequence: ~1.1% genome
~20,500 protein encoding genes
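A quick arithmetic check of what these three figures imply, as a rough sketch only. All inputs are the approximate values from the answer above, and the per-gene figure is a simple mean that ignores how much gene sizes vary.

```python
# Approximate figures taken from the notes above.
GENOME_BP = 3.2e9          # total base pairs in the human genome
CODING_FRACTION = 0.011    # ~1.1% is protein encoding
N_GENES = 20_500           # protein encoding genes

# Total amount of protein encoding sequence.
coding_bp = GENOME_BP * CODING_FRACTION

# Crude mean amount of coding sequence per gene.
mean_coding_bp_per_gene = coding_bp / N_GENES

print(f"coding sequence: {coding_bp / 1e6:.1f} Mb")
print(f"mean coding sequence per gene: {mean_coding_bp_per_gene:.0f} bp")
```

So roughly 35 Mb of the genome is coding, which is why whole exome sequencing is so much smaller a job than whole genome sequencing.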
at what point in protein synthesis are genes spliced?
genes transcribed into primary mRNA. this undergoes splicing to form mature mRNA which is then translated (mostly)
describe the significance of NGS in genome sequencing
NGS takes a few hours and costs less than $1000 per genome
100,000s of genomes now being sequenced
Whole genome sequencing: massive amount of info
Whole exome sequencing: just the protein encoding sequences (~1% of the whole genome)
describe the different types of single gene disorders (ie sex-linked/autosomal etc)
1) Autosomal recessive
- Both copies are defective
- Parents are unaffected heterozygotes
2) Autosomal dominant
- One copy of the gene is mutant
- Gain of function or haploinsufficient
- 50% transmission from affected parent
- But often occur de novo
3) Sex-linked
- Generally only affects boys
- Passed on through the female line (X-linked)
what is a de novo mutation?
A genetic alteration that is present for the first time in one family member as a result of a mutation in a germ cell of one of the parents, or a mutation that arises in the fertilized egg itself during early embryogenesis.
describe the class III, IV and VI cystic fibrosis mutations and the medication that treats them
Class III: reduced gating - the CFTR gate is closed (Gly551Asp; Phe508del)
Class IV: reduced conductance (restriction of movement of Cl- through channel)
Class VI: high surface turnover (Phe508del) (CFTR internalised and degraded too rapidly)
Medication:
Potentiators - Ivacaftor (forces the gate to be open)
describe the class II cystic fibrosis mutations and the medication that treats them
Class II: Misfolded and degraded (Phe508del) Correctors - Lumacaftor (correct folding)
describe the class I and V cystic fibrosis mutations and the medication that treats them
Class I: no CFTR (Gly542X)
Could get ribosome to not read STOP - would result in protein that works slightly
Class V: v low CFTR levels (3849 + 10kb C→T mutation 10kb into intron affects splicing)
Medication:
Production correctors - Ataluren
what is Orkambi?
a combined drug therapy of lumacaftor and ivacaftor
which may be a more effective treatment for CF
what is NICE and what do they do?
The National Institute for Health and Care Excellence
they weigh the clinical benefit of a drug against its cost (cost-effectiveness) to decide whether the NHS should fund it
what is ‘standard of care’ and how is it used in drug trials?
SOC (standard of care): the treatments already available - what you compare new treatment to
what is ppFEV?
percent predicted forced expiratory volume (usually measured over 1 second, ppFEV1)
define: annual rate of exacerbations
how often patients have to go into hospital due to their condition and its related conditions
define quality-adjusted life years and explain how its used in drug trials
how many (good quality) years the patient survives for (has to be economically viable, eg incremental cost-effectiveness ratio (ICER) (£/QALY) has to be
describe the results of trials with ivacaftor and CF, how much it costs and who the treatment is approved for
~5% of CF patients have the mutation(s) that can be treated by Ivacaftor
Trials show 10% increase in ppFEV and a reduction to half in the rate of exacerbations, reduced sweat salt concs, reduced Pseudomonas infections and improved QoL
Costs:
- $30,000
- ICER £335,000-1,274,000 per QALY
- Treatment approved in 2016 for children 2-5
what are the symptoms and associated problems with CF?
Symptoms:
thick, dehydrated mucous
chronic inflammation, overproduction of elastase and irreversible lung damage
Associated problems:
repeated bacterial infections by Staphylococcus aureus, Pseudomonas aeruginosa
Pancreatic exocrine deficiency, diabetes, Congenital Bilateral absence of the vas deferens, congenital bowel obstruction, salty sweat (this can be used in diagnostics)
what are the majority of CF mutations
Phe508del
~15 mutations responsible for 1/2 remaining cases among Europeans
what does the CFTR do?
transports Cl- ions across the plasma membrane (it is an ATP-gated chloride channel)
what is the effect of the CFTR not pumping out Cl- ions?
Cl- stays in the cell, H2O enters the cells for osmoregulation. This means the periciliary layer isn't as large and the mucus on top of it cannot move
name 5 methods of treating the symptoms of CF
physio
DNAse to reduce mucus viscosity
antibiotics
anti-inflammatories (eg steroid)
mannitol spray to increase osmolarity of mucus
name 2 gene therapies for treating CF and the problems with each
viral vectors (uses virus to insert WT allele) - immune response prevents repeated therapy
liposomes - innate response to CpG (modified nucleotide) in vector
describe the trait pattern of Huntington’s disease, its incidence and its effects
late onset
autosomal dominant
incidence 1:6700
effects:
- movement disorder (chorea)
- personality changes
- cognitive decline
- weight loss
describe the neuropathology of Huntington’s disease
in the corpus striatum:
- neuronal death
- generalised atrophy
- general brain shrinkage
- multisystem CNS disorder
when was the Huntington’s gene mapped and cloned?
mapped: 1983
cloned: 1993
what is the mutation that causes Huntington’s disease?
trinucleotide repeat expansion in ORF
CAG repeated 6-35: normal
CAG repeated 40 times: adult onset
CAG repeated 70 times: juvenile onset
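A minimal sketch of how the repeat ranges above could be applied to a sequence. The function names are made up for illustration, and two things are assumptions rather than facts from the notes: 40 and 70 are treated as lower bounds, and counts the notes do not mention (e.g. 36-39) are simply reported as not covered. The counter also ignores reading frame, so it is only a rough illustration.

```python
import re

def longest_cag_run(seq: str) -> int:
    """Return the number of CAG units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)

def classify_repeats(repeats: int) -> str:
    """Map a repeat count onto the ranges given in the notes."""
    if repeats >= 70:          # assumption: 70 is a lower bound
        return "juvenile onset"
    if repeats >= 40:          # assumption: 40 is a lower bound
        return "adult onset"
    if 6 <= repeats <= 35:
        return "normal"
    return "not covered by the ranges in the notes"

# Made-up test sequence: 42 CAG repeats with arbitrary flanking bases.
example = "ATG" + "CAG" * 42 + "CCG"
n = longest_cag_run(example)
print(n, classify_repeats(n))
```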
what does CAG encode? what is the effect of the expansion of this trinucleotide
glutamine - expanded forms insoluble protein inclusion body
what is the most abundant body tissue?
muscle
23%♀, 40%♂
what is the age of onset for Duchenne Muscular Dystrophy (DMD)
3-5yrs
wheelchair bound by 12
death usually before 30 from respiratory failure
describe the morphology of skeletal muscle
cells called fibres run length of muscle
striated appearance (orderly arrangement of actin and myosin)
how big is the DMD gene? which protein does it encode? what does this protein do? is it sexlinked or autosomal? what are most mutations? what is BMD and name its mutation
V large (2.4Mb) - 79 exons
Encoded protein called dystrophin
Dystrophin links F-actin to sarcolemma which is attached to basal lamina
Sarcolemma can’t repair damage => muscle degradation
sex-linked (on X-chromosome)
Most mutations are newly arisen deletions (exon deletions => frameshift => nonsense)
Exon 45 or 47 missing (PCR test shows this)
Becker Muscular Dystrophy: no frameshift ∴ no nonsense
describe antisense oligonucleotide treatment for DMD
Morpholino - chemically modified antisense oligonucleotide (AON) that base-pairs with a target RNA; here it binds the pre-mRNA and masks exon 51 so the splicing machinery skips it
In DMD patients with exons 49-50 deleted, the reading frame shifts and a premature STOP codon appears in exon 51, so no dystrophin is made
If exon 51 is also skipped, the reading frame is restored and a shortened but functional dystrophin protein is produced
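The frame logic above comes down to whether the total length removed is a multiple of 3. A small sketch, assuming illustrative exon lengths (these are not given in the notes and should be checked against a reference annotation before being relied on):

```python
# Assumed exon lengths in bp, for illustration only (not from the notes).
EXON_LENGTHS = {49: 102, 50: 109, 51: 233}

def is_in_frame(removed_exons):
    """True if removing these exons keeps the downstream reading frame."""
    removed_bp = sum(EXON_LENGTHS[exon] for exon in removed_exons)
    return removed_bp % 3 == 0

# Deletion of exons 49-50 alone: length is not a multiple of 3 -> frameshift.
print("49-50 removed, in frame:", is_in_frame([49, 50]))

# Also skipping exon 51: total is a multiple of 3 -> frame restored.
print("49-51 removed, in frame:", is_in_frame([49, 50, 51]))
```

The same check explains Becker Muscular Dystrophy: deletions that happen to remove a multiple of 3 bases leave the frame intact, so a shortened protein is still made.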
are there many or a few undocumented single gene disorders?
OMIM catalogues 8575 suspected Mendelian phenotypes of which 5225 are known at a molecular level. There may be many more that aren’t documented.
why identify variants (mutations) responsible for a single gene disorder?
Stopping repeated, invasive, distressing medical tests
Introducing appropriate treatment and stopping inappropriate treatment
Psychological benefits to affected and family
Scientific knowledge
describe the issues with allelic heterogeneity in diagnosing a disorder
Allelic heterogeneity: a similar phenotype is produced by different alleles within the same gene
therefore we don’t know the exact mutation that has caused the disease - this can be problematic in working out whether parents will pass on a disease to their children (2 diff mutations = no disease)
what is atypical disease presentation and how is it problematic in diagnosis?
phenotype is different to normal disease phenotype - the common symptoms/signs lead you to believe it's another disease = incorrect tests, therefore no evidence of disorder
explain why novel variants in a known gene are problematic in diagnosis
patient presenting with a disease may have a new/unknown mutation, therefore the tests will not give proof of disorder
why choose whole exome sequencing and not whole genome sequencing when diagnosing diseases? name a disadvantage of this
Causative variants of Mendelian disorders found so far are almost all in coding sequence
(cf. complex disease variants, which are nearly all outside coding sequence)
Mutation in exon disrupts splicing downstream
Cheaper
Fewer variants to analyse
WES will miss some causative alleles and WGS is becoming increasingly feasible
how can a causative variant of a SGD be identified after whole exome sequencing?
each individual will show ~20,000 variants after WES so:
Predicted effect on protein function
Disease allele will be rare – use population databases to see if this is the case
Pedigree information
Same allele in unrelated individuals with same disorder
Expert appraisal of biological relevance to disorder phenotype - (Recapitulation (the repetition of an evolutionary or other process) in model system/organism)
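The first three filters in this list can be sketched as a simple shortlist step. Everything here is hypothetical: the `Variant` fields, the gene names, the set of consequence labels and the 0.1% frequency cut-off are assumptions chosen for illustration, not values from the notes or from any real pipeline.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    gene: str               # hypothetical gene label
    consequence: str        # predicted effect on the protein
    population_af: float    # allele frequency in a population database
    fits_pedigree: bool     # segregates with disease in the family

# Assumed set of consequences treated as potentially damaging.
DAMAGING = {"frameshift", "stop_gained", "splice_site", "exon_deletion", "missense"}

def shortlist(variants, max_af=0.001):
    """Keep variants that are predicted damaging, rare, and fit the pedigree."""
    return [
        v for v in variants
        if v.consequence in DAMAGING
        and v.population_af <= max_af
        and v.fits_pedigree
    ]

# Made-up example records.
calls = [
    Variant("GENE_A", "synonymous", 0.0001, True),   # no predicted effect
    Variant("GENE_B", "missense", 0.12, True),       # too common
    Variant("GENE_C", "frameshift", 0.00001, False), # does not segregate
    Variant("GENE_D", "missense", 0.00002, True),    # passes all filters
]
print([v.gene for v in shortlist(calls)])
```

The remaining steps (same allele in unrelated patients, expert appraisal) need outside evidence and are not something a filter like this can do.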
how can predicting the deleterious effect on protein function help in identifying the causative variant of a SGD?
Likely to lead to LoF: frameshift (FS), protein terminating variant (PTV), splice sites, exon deletion (how will that change the function eg polar substitution in transmembrane domain)
Sequence conservation across organisms
Assessed by computer programs - computer can list variants that may cause disease
how can the exome aggregation consortium help in identifying the causative variant of a SGD?
Database of ~60,000 exomes, used to identify v rare alleles
We all carry 100s of potentially harmful mutations
Population frequency – identify which alleles are very rare and which potentially harmful alleles occur frequently in apparently healthy individuals.
- Size of database is key in determining this
how can pedigree information help in identifying the causative variant of a SGD?
Allows a comparison with known genetic trait patterns
Recessive
- May be compound heterozygote (2 different mutant alleles at a particular gene locus 1 on each chromosome)
- Heterozygous parents
- Heterozygous siblings not affected
Consanguineous
- Same allele in each gene
- Only affected members of pedigree are homozygous
De novo dominant
- New allele mutation in gamete
- Not present in either parent
briefly describe the case study of the consanguineous autosomal recessive disease (include: her symptoms/her treatment/whether it was successful/the mutation she carried)
16yr old Saudi Arabian girl with complex symptoms shared with a total of 8 consanguineous relatives
Suggestive of complex neurotransmitter disorders:
- Dopamine: Parkinson’s symptoms, couldn’t walk, global development delay, decreased muscle tone
- Serotonin: sleep and mood disturbance
- Epinephrine – diaphoresis, temperature instability
Neurotransmitter levels normal, but dopamine breakdown products elevated in urine
Was treated by increasing dopamine levels – immediate worsening of condition
A 3.2Mb region of homozygosity was found:
- 8 genes in homozygous region, 1 gene associated with neurotransmitter function (transmembrane protein)
- Identified p.Pro387Leu (p. = protein-level change; the 387th amino acid, proline, has been substituted by leucine) mutation in SLC18A2 encoding the VMAT2 dopamine transporter
- Highly conserved residue
- Homozygous in all affected but not in unaffected
- Not in >1000 patients with Parkinson’s
what does VMAT2 do? in the consanguineous relationship case study how was the patient treated?
VMAT2 transports serotonin and dopamine into presynaptic vesicles
Expressed WT and mutant VMAT2 in tissue culture – reduction in activity in mutant VMAT2
Treatment with dopamine receptor agonist (chemical that binds to a receptor and activates it):
- Resulted in dramatic improvement in the patient within 7 days and maintained for 32 months
describe the characteristics of a complex disease. give 2 other names for a complex disease
Relatives of affected have higher than population risk but no Mendelian inheritance
- Familial
Common major diseases
Complex genetic and environmental factors and complex interactions between and within each
2 other names:
- Multifactorial
- Polygenic
state the evidence for and against psychiatric disorders being genetic
44 risk variants for major depression
But only explains 1.9% of the population variation in liability
Top decile of risk variant profile 2.4x more likely to suffer depression than the bottom decile
Genes identified enriched in brain function and overlap with genes associated with other psychiatric disorders
Families share same genes but also the same environment, need to distinguish between these
define heritability
The proportion of the observed variance in a population that can be attributed to genetic variance
how can phenotypic variance be calculated?
phenotypic variance = genetic variance + environmental variance
give the equation to work out total genetic variance/broad sense heritability (H2)
genetic variance / (genetic variance + environmental variance)
name the 3 components that make up genotypic variance
G = A + D + I
additive (A): the mean of 2 expressed alleles (eg tall allele + short allele = medium)
dominance (D): interaction between alleles that results in phenotypic expression that is not purely additive
interaction/epistasis (I): interactions between genes at different loci that act on the same characteristic
give the equation to work out narrow sense heritability/additive genetic variance (h2)
h2 = A/P = A/(A+D+I+E)
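A worked example of the two heritability measures, taking phenotypic variance as P = A + D + I + E in line with the components defined above. The variance values are made up purely to show the arithmetic.

```python
def heritabilities(A, D, I, E):
    """Return (broad sense H2, narrow sense h2) from variance components."""
    G = A + D + I      # total genetic variance
    P = G + E          # phenotypic variance = genetic + environmental
    return G / P, A / P

# Made-up variance components, for illustration only.
H2, h2 = heritabilities(A=4.0, D=1.0, I=1.0, E=4.0)
print(f"H2 = {H2:.2f}, h2 = {h2:.2f}")
```

h2 can never be larger than H2, because additive variance is only one part of the genetic variance.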
how can we estimate heritability in a human context?
use of identical and non-identical twin studies
define monozygotic and dizygotic twins
MZTs – monozygotic twins – identical
DZTs – dizygotic twins – non-identical
describe the ACE model in studying twins
ACE model:
- Additive variance A
- Common environment C
- Non-shared environment E
give the correlation due to genetics; due to common environment and due to non-shared environment for MZTs
Correlation due to their genetics = 1 (same genetics)
Correlation due to their common environment = 1 (same environment)
Correlation due to their non-shared environment = 0 (by definition not shared)
Phenotypic correlation of MZTs (rMZ) = A+C
give the correlation due to genetics; due to common environment and due to non-shared environment for DZTs
Correlation due to their genetics = 0.5 (50% same genes)
Correlation due to their common environment = 1 (same environment)
Correlation due to their non-shared environment = 0 (by definition not shared)
Phenotypic correlation of DZTs (rDZ) = 0.5A + C
what is the equation that gives A (additive variance) using the phenotype values of MZTs and DZTs
Additive variance = 2(rMZ - rDZ)
give the equation to work out E (non-shared env)
E = 1-rMZ
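The two equations above can be combined into a small calculator. The A and E formulas are the ones in the notes; the line for C is not stated in the notes and is derived here from rMZ = A + C. The twin correlations in the example are made up.

```python
def ace_from_twins(r_mz, r_dz):
    """Estimate A, C and E from MZ and DZ twin phenotypic correlations."""
    a = 2 * (r_mz - r_dz)   # additive variance, as in the notes
    c = r_mz - a            # common environment, derived from rMZ = A + C
    e = 1 - r_mz            # non-shared environment, as in the notes
    return a, c, e

# Made-up correlations, for illustration only.
a, c, e = ace_from_twins(r_mz=0.8, r_dz=0.5)
print(f"A = {a:.2f}, C = {c:.2f}, E = {e:.2f}")
```

The three estimates sum to 1, since the model splits the whole (standardised) phenotypic variance between A, C and E.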
what are association studies?
population studies
answers the question: Is there a particular allele that is more frequent in a population of cases (of disease) compared to a population of controls (not affected)?
not looking for an all-or-nothing correlation, but for small differences in allele frequency between cases and controls
what are SNPs?
single nucleotide polymorphisms - single base changes in individuals
what is the minor allele frequency? what value does the MAF have to be to be considered a common allele?
The frequency of the SNP’s less frequent allele in a given population.
common = MAF >5%
MAFs used in HapMap project
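A short sketch of how a MAF could be worked out from genotype counts at one SNP and compared with the 5% cut-off above. The function name and the genotype counts are made up for illustration.

```python
def minor_allele_frequency(n_AA, n_Aa, n_aa):
    """MAF at a two-allele SNP from counts of the three genotypes."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)   # each person carries 2 alleles
    freq_a = (n_Aa + 2 * n_aa) / total_alleles
    # The minor allele is whichever of the two is less frequent.
    return min(freq_a, 1 - freq_a)

# Made-up genotype counts for 1000 people.
maf = minor_allele_frequency(n_AA=810, n_Aa=180, n_aa=10)
print(f"MAF = {maf:.2f}, common = {maf > 0.05}")
```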
name the 2 major alleles for complex disease
apolipoprotein E ε4
HLA locus DQβ1 self/non‐self
describe the effect of the presence of the ε4 allele
allele increases risk of Alzheimer’s disease
- Transports lipids round the bloodstream
- 16% gene pool have the ε4 allele that codes for apolipoprotein E
- 1 copy = 3x risk of general population
- 2 copies 8x population risk
describe the effect of the presence of the HLA locus DQβ1 allele
Allows immune system to differentiate between self and non-self – protein presented on cell surfaces
- Residue 57 Asp → any other amino acid = greatly increases risk of Type I diabetes
- 1 copy 6x population risk
- 2 copies 18x population risk
describe Genome Wide Association Studies
Identify panel of SNPs that span the genome ~1 million
Assemble a set of cases and unaffected controls
- Each subject examined for 1 million SNPs
- Microarrays or DNA chips (not genome sequencing) – specifically examine 1 million different positions in genome
Identify SNPs that have a higher frequency in the cases compared to controls.
- The SNP is associated with the disease
- The SNP is unlikely to be the causative allele – it marks the region of the genome where the real risk allele is located
what is the value for genome-wide significance? how was this value reached?
genome-wide significance: p < 10^-8
Nominal significance: result seen less than 1 time in 20 by chance (p < 0.05)
Each SNP = independent test
So with a million SNPs expect 10^6/20 = 50,000 false positives at nominal significance
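The multiple-testing arithmetic above, written out as a rough sketch. The Bonferroni correction shown here (nominal threshold divided by the number of tests) is one standard way to reach a genome-wide threshold; it gives 5 x 10^-8, the same order of magnitude as the value quoted in the notes. Whether the lecture derived its figure exactly this way is an assumption.

```python
N_SNPS = 1_000_000   # independent tests, one per SNP
ALPHA = 0.05         # nominal significance (1 in 20)

# Expected number of SNPs passing nominal significance by chance alone.
expected_false_positives = N_SNPS * ALPHA

# Bonferroni-corrected per-SNP threshold (assumed method, see lead-in).
bonferroni_threshold = ALPHA / N_SNPS

print(f"expected false positives: {expected_false_positives:.0f}")
print(f"Bonferroni threshold: {bonferroni_threshold:.0e}")
```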
what is the problem with using the genome-wide significance value? how can this be partially overcome?
false negatives
use a massive sample size - the smaller the risk allele contributes the more subjects needed (meta-analysis used)
how is the risk of an allele measured in GWAS?
using the odds ratio (OR)
Most risk alleles OR<1.2
OR=1.2: the odds of disease in people carrying the risk allele are 1.2x the odds in people without it (12 vs 10)
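A minimal sketch of an odds ratio calculated from a 2x2 table of allele carriers in cases and controls. The counts are made up and chosen so that the result is 1.2; the function name is hypothetical.

```python
def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    odds_cases = case_carriers / case_noncarriers
    odds_controls = control_carriers / control_noncarriers
    return odds_cases / odds_controls

# Made-up counts, for illustration only.
example_or = odds_ratio(case_carriers=300, case_noncarriers=1000,
                        control_carriers=200, control_noncarriers=800)
print(f"OR = {example_or:.2f}")
```

An OR of 1 means the allele is equally frequent in cases and controls; values this close to 1 are why very large sample sizes are needed.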
name the 2 genes and 1 gene family that when mutated increases the risks of breast cancer. how much does each mutation increase risk?
5% cases due to BRCA1/2 autosomal dominant alleles
- 60-85% risk (10-50x pop risk)
- High risk ovarian cancer
Other rare alleles with high risk (10x pop risk)
- Eg Li Fraumeni syndrome (TP53)
Genes in DNA repair pathway
- Moderate effect 2-4x pop risk
describe a Manhattan plot
Chromosome number along the bottom (autosome)
Each dot on the graph represents a SNP
The y-axis is -log10 of the P value for association, so SNPs more strongly associated with cases plot higher
p=0.05 (nominal significance) covers a large number of SNPs
p=10^-8 (genome-wide) covers a small number of SNPs
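A small sketch of how P values map onto the y-axis and onto the two thresholds above. The SNP names and P values are made up; the genome-wide threshold is the 10^-8 figure used in these notes.

```python
import math

NOMINAL_P = 0.05
GENOME_WIDE_P = 1e-8   # threshold quoted in the notes

# Hypothetical SNPs and P values, for illustration only.
p_values = {"snp_a": 3e-10, "snp_b": 2e-4, "snp_c": 0.3}

def band(p):
    """Say which significance band a P value falls into."""
    if p < GENOME_WIDE_P:
        return "genome-wide significant"
    if p < NOMINAL_P:
        return "nominal only"
    return "not significant"

for snp, p in p_values.items():
    # The height of the dot on a Manhattan plot is -log10(P).
    print(snp, round(-math.log10(p), 2), band(p))
```

On the plot, the genome-wide line sits at a height of 8 and the nominal line at about 1.3, so only the tallest peaks clear the upper line.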
what is genetic ‘dark matter’? how does it relate to GWAS?
the genetic material that is unknown or poorly understood. GWAS only explains a maximum of 20% of genetic variance
describe the law of diminishing returns in relation to the GWAS
as you have more subjects you have more power to detect those contributing SNPs, but there is a point at which number of subjects won’t influence the number of SNPs detected
what is a possible cause for the lack of genetic variance shown by GWAS?
V large number of risk alleles each with a v small effect
Can show the existence of more risk alleles in the grey area (between genome-wide threshold and nominal significance) of the probability spectrum, but can’t distinguish genuine alleles from false positives