Flashcards in Genomics Deck (195):
What is the difference between complex and simple genomes?
Simple genomes have no introns, not much repetitive content and are mostly protein coding
Why do prokaryotes have such small genomes?
Are limited by power - DNA replication costs energy, so limit on genome size depending on the power the organism can produce
When eukaryotes engulfed bacteria (the ancestors of mitochondria), they decoupled energy production from genome size, allowing larger, messier genomes to develop
What is the C-value enigma?
Genome size doesn't correlate to organism complexity (C-value is amount of DNA in haploid nucleus). Is resolved as very little of genome is protein coding in eukaryotes.
What is the structure of chromosomes?
Have a short arm (p = petite) and a long arm (q = the letter after p)
Have a centromere (where the kinetochore forms)
Have telomeres (for replication and stability, are conserved tandemly repeating sequences)
What are the different types of eukaryotic chromosomes?
Metacentric (centromere in the middle)
Submetacentric (centromere off centre)
Acrocentric (satellite p arms)
Telocentric (no p arms)
What are the different types of tandem repeats?
Mini satellites (10-100bp units), found in telomeric regions in humans
Micro satellites (1-6bp units), found throughout the genome, is the large majority of all repeats
Macro satellites (>100bp) difficult to analyse with PCR
Useful for fingerprinting and population genetics
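The size classes above can be sketched as a small lookup. This is only an illustrative sketch: the function name `classify_tandem_repeat` is hypothetical, and the cut-offs are simply the ones quoted in this card (unit lengths of 7-9bp fall between the quoted ranges, so they are returned as unclassified).

```python
def classify_tandem_repeat(unit_length_bp):
    """Return the satellite class for a repeat unit length, using the card's cut-offs."""
    if unit_length_bp < 1:
        raise ValueError("unit length must be at least 1bp")
    if unit_length_bp <= 6:
        return "microsatellite"      # 1-6bp units
    if 10 <= unit_length_bp <= 100:
        return "minisatellite"       # 10-100bp units
    if unit_length_bp > 100:
        return "macrosatellite"      # >100bp units
    return "unclassified"            # 7-9bp is not covered by the quoted ranges


if __name__ == "__main__":
    for length in (2, 33, 3300, 8):
        print(length, classify_tandem_repeat(length))
```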
What is satellite DNA?
Short, tandemly repeated sequences, including mini, micro and macro satellites. Named as they appear as a 'satellite' when centrifuging sheared DNA in a caesium chloride density gradient as they are AT rich
What are pseudogenes?
Genes that are inactivated due to mutation (frame shift, nonsense) or regulation
Often occurs if the gene is non-essential, or there are 2 copies (second copy accumulates mutation)
What are paralogs?
Homologous genes separated by gene duplication - genes with a common ancestor (have been duplicated)
What are orthologs?
Homologous genes separated by speciation - e.g. pig working vitamin C vs human not-working vitamin C synthesis gene
What are examples of transposed sequences?
Transposable elements - DNA that can move around the genome
Processed pseudogenes - integration of cDNA back into a genome; has a poly A tail, no introns and no promoter
What are the characteristics of retroviral elements?
3' and 5' target sites for integration
Could interrupt a gene
Replicate DNA as they insert (target site duplication), disrupting gene expression
What are the characteristics of class I retrotransposons?
Copy and paste mechanism via an RNA intermediate
Type 1 are LTR: similar to retroviruses without env, don't form infectious particles
Type 2 are non-LTR: LINEs (encode reverse transcriptase, make up 21% of human genome, most are non-functional) and SINEs (no functional protein, need other mobile elements to move)
What are the characteristics of class II transposons?
Cut and paste mechanism
Encode a transposase enzyme
Most are inactive (e.g. deletions)
What are processed pseudogenes?
Mature mRNA is reverse transcribed and integrated into the genome. Lacks a promoter (so is dead on arrival) and introns, and has a poly A tail
Often have 5' truncations due to low processivity of reverse transcriptase
Are dispersed throughout the genome (not near original gene)
Have target site duplication from insertion
What is the evolutionary story behind the IRGM gene family?
Immunity Related GTPase gene family
3 copies of the family in most mammals (humans only have 2)
50 million years ago, all but one copy was inactivated in monkey/great ape ancestor
24 million years ago, a retrovirus inserts at the start of a gene and forms a new promoter
12 million years ago, functional copy was fixed in gorilla, chimp and human lineage
Today is expressed in several tissues in humans
Where is variation in genomes seen?
Base modification (e.g. methylation)
Chromosome structure (length, inversions, duplications, deletions)
How does variation arise?
Mistakes in replication and chromosomal recombination and segregation. Has to be inherited i.e. in the germline to persist
What are SNPs?
Single Nucleotide Polymorphisms (or Variants, SNVs). Can be a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine, or pyrimidine to purine). Could also be a single nucleotide deletion. Arise due to natural mutation or exposure to a carcinogen
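The transition/transversion distinction can be sketched in a few lines. This is a minimal illustration rather than a variant-calling tool; the name `classify_substitution` is hypothetical, and only the four DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    # Same chemical class on both sides means a transition
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    print(classify_substitution("A", "G"))  # purine to purine
    print(classify_substitution("A", "C"))  # purine to pyrimidine
```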
What are the consequences of DNA variation?
Most are neutral and tolerated (a lot of DNA doesn't encode protein; genetic code is degenerate so amino acid may not be altered; some amino acids can be interchanged). Some somatic mutations contribute to the changes seen in cancer. Occasionally there is positive or negative selection for a mutation
What are the most sensitive parts of the genome to mutation?
CpG dinucleotides that are subject to methylation. Methylated C can be deaminated to make a T. This can either be repaired (using the G on the other strand as a template) or fully converted to a T:A pair
What are CNVs?
Larger regions of DNA subject to duplication or deletion. They are evolutionarily important as sequences can diverge after a duplication. They usually arise due to non-allelic recombination (they are flanked by sequences with high homology).
What are the consequences of CNVs?
Pathways in which there is tight regulation of gene expression are most commonly disrupted such as control of foetal growth and brain development (or revealing a mutation on the 'normal' allele - loss of heterozygosity)
What is an example of non-sequence/DNA related mutation?
Epigenetic mutation. Could either be nucleic acid (e.g. methylating DNA) or protein (e.g. histone) modification
What are large chromosomal abnormalities?
Chromosomal deletions or unbalanced translocations that result in an allelic imbalance of many genes, thus disrupting key pathways.
Normally arise through errors in chromosome pairing and segregation in meiosis and germ cell maturation
Why do diseases such as sickle cell anaemia and cystic fibrosis persist?
Whilst the homozygous mutation is detrimental, the heterozygous mutation aids survival of another disease e.g. carriers of the sickle cell mutation are protected against malaria
What are functional polymorphisms?
Variation in DNA with an impact on phenotype. Will either be in an open reading frame or in regulatory elements (e.g. promoters, enhancers, ncRNAs)
What is linkage analysis?
Used with pedigrees to identify genomic regions and loci responsible for a disease phenotype. Microsatellites and SNPs are used.
What is association analysis?
Uses SNPs (on a microarray) to see if SNPs associate with a specific allele/phenotype
How can we use variation?
Genetic maps - placing loci in relative order based on recombination events
Physical maps - compare to a reference genome
How can epigenetic mutation be studied?
Chemical modification to allow sequencing of methylated cytosine (either directly or through arrays)
Antibodies to histone modifications to pull down these areas of the genome and analyse with direct sequencing or arrays
How can we identify a SNP from a mutation?
Through SNP and mutation database
Individual labs contribute results from gene studies of patients and normal family members
Completion of a draft genome sequence
HapMap and 1000 Genomes projects
Improving sequencing technologies has increased throughput
How can you create a SNP chip?
Know the bases that are variable in normal populations and have a database of all polymorphisms (1000 Genomes and HapMap projects)
What is haploinsufficiency?
When the one functional copy of a gene remaining (after mutation of the other) is not enough for normal function, so a phenotype is still seen. This is rare in the population, as we all carry many mutations. Most haploinsufficient genes have a specific expression profile and are often involved in early development
What are the advantages of GWAS studies?
Include genes and loci that may not have been considered (e.g. if functions are unknown) in their analysis, unlike candidate gene approaches
What controls are necessary in GWAS studies?
Case-control statistical analysis
This relies upon having 2 groups from the same population, rigorous phenotypic analysis (avoid phenotypes that can have a phenocopy - where the same phenotype can be achieved through many gene combinations) and if appropriate also match age and sex
p values must be adjusted for multiple testing, e.g. by dividing the genome wide significance threshold by the number of markers used (or more complicated methods) to reduce false positives
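The simplest adjustment of this kind is a Bonferroni correction, sketched below. The function names are hypothetical and the 0.05 starting threshold is an assumption for illustration; real studies often use other corrections.

```python
def bonferroni_threshold(alpha, n_markers):
    """Per-marker significance threshold after dividing alpha by the number of tests."""
    if n_markers < 1:
        raise ValueError("need at least one marker")
    return alpha / n_markers


def significant_markers(p_values, alpha=0.05):
    """Return the indices of markers whose p value passes the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < threshold]


if __name__ == "__main__":
    # With a million markers, 0.05 shrinks to a per-marker threshold of about 5e-8
    print(bonferroni_threshold(0.05, 1_000_000))
    print(significant_markers([0.04, 0.001, 0.5, 0.0124]))
```

A marker with p = 0.04 would look significant on its own, but it does not survive the correction once four markers are tested together.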
What novel approaches can be applied to GWAS?
Simplifying the phenotype by being quantitative
Grouping genes to spot common themes or pathways
Combining different genetic analysis techniques to improve confidence
What are the drawbacks of GWAS?
If there are many loci, need larger studies and larger sample sizes to reach statistical significance
Only looks at DNA sequence as opposed to epigenetic mutations
Doesn't tell you about parental origin of the locus
How do micro satellites form?
Slippage of polymerase - strands dissociate and a stem loop forms, resulting in expansion or contraction
How can chromosomes be prepared?
Cells must be dividing and arrested. Cells are swollen osmotically to spread the chromosomes. Cells are fixed to glass slide and chromosomes are stained for identification
When are chromosome preparations used?
Blood or tissue samples are used in post natal diagnosis e.g. bone marrow in leukaemia studies
Amniotic fluid etc are used in prenatal diagnosis
Sperm are used in fertility studies
What is amniocentesis?
Using amniotic fluid cells for pre-natal diagnostics
What is chorionic villus?
Finger like projections of the placenta into the uterine wall. Used in pre-natal diagnostics of high risk pregnancies as can be done earlier than amniocentesis
What are the advantages and disadvantages to different pre-natal diagnostic techniques?
CVS - placenta often has a different chromosomal constitution to the foetus
Amniotic preps are good, but can only be done from around 14 weeks
New technology allows analysis of foetal DNA in the mother's blood
How can chromosomal preparations be stained?
G-banding, using a Giemsa stain and trypsin digest (cuts grooves at AT rich regions). Gives dark AT rich and light GC rich bands
FISH (fluorescent in-situ hybridisation), using DNA probes with fluorescent markers to light up complementary regions. Strands must be separated (e.g. use of formamide)
What are the applications of FISH?
Multicolour chromosome banding
Counting chromosomes in nuclei
How does chromosome painting work?
A few dyes can give many colours due to overlapping sequences. Gives different colours for all chromosomes
What is karyotyping?
Taking a photograph of G-stained chromosomes and separating and pairing them up. Allows observation of any large-scale abnormalities and gene mapping
What are the different examples of chromosomal disorders?
Numerical abnormalities (aneuploidy, polyploidy - usually embryonic lethal)
Structural abnormalities (deletions, duplications, insertions, unbalanced translocations - all severe; balanced translocations, inversions, Y chromosome deletions - mild symptoms, may lead to infertility)
What chromosomal numerical abnormalities are commonly found in humans?
Trisomy - one extra chromosome. Get 21, 18 and 13 along with sex chromosomes in live births
Monosomy - only see monosomy X in live births
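Naming autosomal aneuploidies from copy counts can be sketched as follows. This is a toy illustration: the function name and the input format (a mapping of autosome name to copy number) are assumptions, and sex chromosomes are deliberately left out because their expected count depends on sex.

```python
def autosomal_aneuploidies(copy_counts):
    """List deviations from the expected two copies of each autosome."""
    names = {1: "monosomy", 3: "trisomy"}
    findings = []
    for chromosome, copies in sorted(copy_counts.items()):
        if copies != 2:
            # Fall back to a plain description for counts other than 1 or 3
            label = names.get(copies, f"{copies} copies of")
            findings.append(f"{label} {chromosome}")
    return findings


if __name__ == "__main__":
    print(autosomal_aneuploidies({"1": 2, "21": 3}))
    print(autosomal_aneuploidies({"13": 3, "18": 3}))
```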
What is Down syndrome?
Trisomy of chromosome 21. Most common aneuploidy in live births
What is Patau syndrome?
Trisomy of chromosome 13
What are the different sex chromosome aneuploidies?
XO - Turner syndrome: short, webbed neck
XXY - Klinefelter syndrome: get breast development
XYY syndrome: very tall, may have learning difficulties
Turner and Klinefelter patients are usually infertile; XYY males are usually fertile
How does aneuploidy arise?
Failure of chromosomes to disjoin properly at first division (non-disjunction)
What are the different types of deletions in chromosomes?
Terminal deletion - only requires 1 break point
Interstitial - requires 2 break points
Often get serious clinical features
What are the consequences of deletions on the Y chromosome?
Has very few genes and a lot of 'junk' DNA (evolved from a fully functional chromosome)
Only has genes to do with spermatogenesis and 'male-ness'
Deletions often result in infertility but clinical features aren't too severe as there are very few genes
What are the clinical implications of duplications and insertions?
Extra DNA so leads to severe clinical features
Duplications are extra piece copied next to the original
Insertions are extra pieces inserted from another chromosome
What is the difference between unbalanced and balanced translocations?
Unbalanced - loss or gain of genetic material leads to partial trisomy or monosomy. Severe clinical abnormalities
Balanced - no net gain or loss of genetic material, so usually no clinical effect unless a gene is disrupted. Risk to offspring (often get unbalanced translocations or infertility) as meiosis is messed up - reduced recombination in pairing cross, unbalanced gametes produced
What are the types of balanced translocations?
Robertsonian translocation - end to end fusion of acrocentric chromosomes
Reciprocal translocations - breaks in 2 chromosomes and fusion of one to the other. Important in cancer cells. Can be detected with chromosome painting
How do inversions arise?
2 break points, piece in-between inverts. Can be paracentric (no centromere) or pericentric (centromere involved).
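The paracentric/pericentric distinction depends only on whether the centromere lies between the two break points, which can be sketched as below. The function name and the use of a single coordinate for the centromere are simplifying assumptions.

```python
def classify_inversion(break_point_1, break_point_2, centromere):
    """Classify an inversion from its two break points and the centromere position."""
    start, end = sorted((break_point_1, break_point_2))
    if start == end:
        raise ValueError("an inversion needs two distinct break points")
    # Pericentric: the inverted piece contains the centromere
    if start < centromere < end:
        return "pericentric"
    return "paracentric"


if __name__ == "__main__":
    print(classify_inversion(10, 40, centromere=25))  # spans the centromere
    print(classify_inversion(30, 40, centromere=25))  # confined to one arm
```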
What are the clinical consequences of inversions?
No clinical features unless a gene is disrupted. Can lead to reduced fertility due to messing up meiosis - reduced recombination within pairing loop, producing unbalanced gametes that may not develop
What are the risk factors in gamete aneuploidy?
Eggs - age
Sperm - age, smoking, chemotherapy
When are patients referred for cytogenetic testing?
Is expensive, so must be relevant
4-12 weeks gestation - observe spontaneous abortions here, trisomy and unbalanced rearrangements
12 weeks to term - abnormalities picked up on ultrasound/if at risk (e.g. older mothers or a family history or if balanced rearrangement in one of the parents)
Neonatal period - if have congenital abnormalities. Looking for trisomy, unbalanced translocations, deletions etc
Early development - if milestones are not met. Subtle chromosomal abnormalities e.g. fragile X
Puberty - inappropriate sexual development
Infertility and reproductive failure - balanced rearrangements
As part of a study
What are the problems with pre-natal cytogenetic screening?
Mosaicism - some cells normal, some not
Contamination of maternal cells
Risk to the foetus - so target screening to at risk groups
What are the outcomes following pre-natal cytogenetic screening?
Offered choice of abortion
Prepare for affected child
What alterations in chromosome position are associated with development or disease?
X inactivation - X at the periphery
Random arrangement in senescent and quiescent cells
Sex chromosomes to middle during spermatogenesis
Chromosome 18 to the centre in cancer
How are genes positioned on chromosomes in the nucleus?
Active genes tend to be towards the edge of chromosome territory for access to transcriptional machinery
Near foci of RNA polymerase II
Active genes towards nuclear centre
What is polygenic inheritance?
Traits/diseases caused by the impact of many different genes each having a small individual effect on a phenotype
What are quantitative traits?
All individuals can be placed on the spectrum based on a defined value
What are threshold traits?
Traits in which individuals must carry a sufficient number of risk alleles to have the phenotype
What are the examples of model free, non-parametric analysis?
Linkage analysis - affected siblings or extended pedigrees
Homozygosity mapping - specific type of linkage analysis in pedigrees (founder effect)
Transmission disequilibrium test - pedigree based association study
Association mapping - population based
What are the principles of model free linkage analysis?
Looking at whether affected relatives share a chromosomal segment more often than would be expected - shared segment analysis
Don't need to specify the mode of inheritance, number of loci, gene frequency or penetrance
What are the advantages and disadvantages of model free linkage analysis?
Can use smaller family clusters
More robust to errors (no model to have errors in assumptions)
Less powerful - need more individuals for statistical significance
What is identical by descent, and how can it be determined?
Two alleles are identical by descent if they are copies of the same ancestral allele. Determining it means working out which parent the allele is inherited from, and also which allele of that parent. If one parent is heterozygous (A/C), the other is homozygous (C/C) and the allele is inherited from the heterozygous parent, then it is clear which allele must cause the disease in heterozygous children (A/C). Otherwise, it is necessary to use markers with multiple alleles (SNPs, microsatellites etc).
How does sibling pair analysis work?
2 siblings each with a disorder. Looking at allele inheritance from heterozygous parents (A/C)- if there is no linkage, then ¼ will be homozygous for A, ½ will be heterozygous and ¼ will be homozygous for C. If there is a difference in this, it suggests linkage of the allele with the disorder. Want to be able to estimate identical by descent sharing - know which allele links with the disease. To do this, highly polymorphic markers are required
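The comparison against the ¼ : ½ : ¼ expectation can be sketched as a chi-square goodness-of-fit statistic. This is a minimal sketch and not a full sib-pair method: the function name is hypothetical and the counts in the example are invented purely for illustration. With 2 degrees of freedom, a value above about 5.99 corresponds to p < 0.05.

```python
def chi_square_vs_mendelian(count_aa, count_ac, count_cc):
    """Chi-square statistic for observed genotype counts against a 1/4 : 1/2 : 1/4 ratio."""
    observed = (count_aa, count_ac, count_cc)
    total = sum(observed)
    if total == 0:
        raise ValueError("need at least one observation")
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Invented counts: no deviation, then an excess of A/A among affected siblings
    print(chi_square_vs_mendelian(25, 50, 25))
    print(chi_square_vs_mendelian(40, 50, 10))
```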
How can model free linkage analysis be extended beyond sibling pair analysis?
Can look at affected pedigree member analysis - calculate the fraction of genes shared between members of a pedigree and work out the null hypothesis for pairwise comparisons
What is homozygosity mapping?
Searching for shared homozygous segments (both alleles being inherited from a common ancestor)
What is autozygosity mapping?
Homozygosity mapping with small, interrelated family pedigrees
What can homozygosity mapping be used for?
Rare recessive conditions
What is a transmission disequilibrium test?
Looking at whether a heterozygous parent is more likely to pass on one allele compared to the other
Avoids false association due to differences in the population
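The test statistic usually quoted for this comparison is a McNemar-type chi-square with 1 degree of freedom, sketched below. The function name is hypothetical and the counts in the example are invented; only transmissions from heterozygous parents are counted.

```python
def tdt_statistic(transmitted_allele_1, transmitted_allele_2):
    """Chi-square (1 df) comparing how often each allele is passed on by heterozygous parents."""
    b, c = transmitted_allele_1, transmitted_allele_2
    if b + c == 0:
        raise ValueError("need at least one informative transmission")
    return (b - c) ** 2 / (b + c)


if __name__ == "__main__":
    # Invented counts: allele 1 passed to affected children 60 times, allele 2 only 40 times
    print(tdt_statistic(60, 40))
    print(tdt_statistic(50, 50))
```

Because each parent acts as their own control, a population-wide difference in allele frequency does not inflate the statistic.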
What are the advantages and disadvantages of linkage studies?
Good at detecting genes of large/medium effect
For genes with a weak effect need a very large sample, but this is computationally difficult
What are the advantages and disadvantages of association studies?
More powerful than linkage for small gene effects
Suitable for high throughput genotyping
Need to be aware of false associations e.g. population differences
What can't linkage analysis and association studies pick up on?
Parent of origin effects (e.g. imprinting)
Epigenetics (DNA methylation and histone modification)
What is endocrine signalling?
Long distance signalling via the bloodstream
What is paracrine signalling?
Short range signalling - a cell talking to its neighbours
What is autocrine signalling?
A cell talking to itself
What are CNVs?
>1kb segment of DNA present in varying copy number between individuals in comparison with the reference genome. Doesn't include insertions or deletions of transposable elements. Found all over, though some regions of the genome are more prone than others. Can be implicated in disease susceptibility and normal phenotypic variation
What is a segmental duplication?
A large region of DNA being duplicated. Lower copy number, larger and less frequent than CNVs
How can CNVs be detected?
Microarray based on comparative genomic hybridisation. Reference DNA and test DNA are dyed in 2 different colours, then cut and combined and allowed to form dsDNA segments. DNA where there is more than expected will show as a different colour than the 2 combined.
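The colour comparison is usually summarised as a log2 ratio of test to reference signal, sketched below. The function names are hypothetical and the 0.3 calling threshold is an arbitrary assumption for illustration; in a diploid sample a single-copy gain gives a ratio of 3/2 and a single-copy loss gives 1/2.

```python
import math


def log2_ratio(test_intensity, reference_intensity):
    """log2 of test signal over reference signal for one array probe."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


def call_copy_change(test_intensity, reference_intensity, threshold=0.3):
    """Label a probe as gain, loss or no change (threshold is an assumed value)."""
    ratio = log2_ratio(test_intensity, reference_intensity)
    if ratio > threshold:
        return "gain"
    if ratio < -threshold:
        return "loss"
    return "no change"


if __name__ == "__main__":
    print(call_copy_change(3, 2))  # three copies against two
    print(call_copy_change(1, 2))  # one copy against two
    print(call_copy_change(2, 2))  # equal copy number
```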
What are CNVRs?
Copy number variable regions. Comprised of overlapping CNVs from different individuals.
Describe the process of recombination
Homologous chromosomes pair in meiosis
Double strand breaks form
Cut strands invade and form Holliday junction
Cross over resolves as heteroduplexes with or without recombinants depending on which strands are cut
Repair of mismatches in the heteroduplex region can lead to gene conversion
What is non-allelic homologous recombination?
Where recombination occurs between 2 non-allelic sequences (e.g. multi copy sequences). Results in duplications/deletions, translocations, inversions, conversions
What is non-homologous end joining?
Repairing double strand breaks in DNA; breakpoints are directly ligated with no need for a template. 2 types: D-NHEJ and B-NHEJ
What is D-NHEJ?
Restores genomic integrity without ensuring sequence restoration. Scaffold of Ku70/80 binds the double strand break; DNA-PKcs allows access to the free ends; Artemis nuclease trims ends; Ligase IV complex joins free ends
What is B-NHEJ?
An alternative pathway of end joining operating with slower kinetics and mainly as a back up for D-NHEJ. Scaffold of PARP binds the double strand break; complex proteins that are less efficient than D-NHEJ join the free ends
What is the importance of NHEJ?
Non-homologous end joining is critical in VDJ recombination. Error prone repair is good as it maximises diversity. If there are mutations in NHEJ pathways, then patients can't produce functioning T or B cells
How does V(D)J recombination work?
DNA is cleaved at a recombination signal by RAG1/RAG2 nuclease. DNA is opened and D-NHEJ begins (Artemis nuclease etc)
What are the outcomes of NHEJ?
Can be intrachromosomal (resulting in small indels, deletions, inversions or complex rearrangements), inter-homologous chromosomes (resulting in deletions/duplications or complex rearrangements) or inter-non-homologous chromosomes (resulting in translocations)
What are the differences between NAHR and NHEJ?
Non-allelic homologous recombination and non-homologous end joining. NAHR breakpoints are clustered whilst NHEJ breakpoints are scattered. NAHR involves low copy repeats whilst NHEJ involves LINE elements. NAHR requires transposons/minisatellites whilst NHEJ requires triplet repeats/telomeric repeats. At an NAHR event, you may observe gene conversion whilst at an NHEJ event you may observe added bases at junctions
What is an evolutionary break point region?
Enriched for structural variants, copy number variants and single nucleotide polymorphisms. As CNVs drive NAHR/NHEJ (and NAHR/NHEJ drives CNV formation), they must be linked to structural rearrangements which must be linked to evolutionary breakpoints. They are also often close to cancer breakpoints.
How are EBRs associated with cancer?
Evolutionary breakpoint regions are often close to cancer breakpoints. Tumour breakpoints are characterised by clusters of paralogs and retrotransposed pseudogenes
What are homologous synteny blocks?
Regions of the genome where the gene order is conserved between species. Different gene content to evolutionary break point regions. Enriched for ancient conserved pathways e.g. Notch and Wnt signalling, HOXD, voltage gated sodium channels etc. Breakages in these regions may lower fitness
What genes are found around evolutionary breakpoint regions?
Genes involving responses to external stimuli. Some chromosome rearrangements may have adaptive value
How does nuclear organisation affect evolution?
Where chromosomes are positioned in the nucleus dictates whether they will be able to interact. There is conservation of 3D order between human and mouse genomes - 2 sequences adjacent in the mouse genome but distant in the human genome will often be found in close proximity within the nucleus.
What is interesting about muntjac karyotypes?
In the Reeves's (Chinese) muntjac, 2n = 46. In the Indian muntjac, 2n = 6 (female)/7 (male). This is due to extensive fusions of ancestral chromosomes with interstitial telomeres marking fusion points. Indian and Chinese muntjac can produce sterile offspring, showing that large changes to the genome structure may have few consequences for genome regulation and expression.
How do sex organs develop in the embryo?
The Wolffian duct (male) and the Mullerian duct (female) are the initial structures, and both exist. If the gonad is removed, the Wolffian duct regresses and the Mullerian duct persists (default state is female). In the presence of testosterone, the Wolffian duct persists but the Mullerian duct doesn't regress - there is a second hormone, AMH (anti-Mullerian hormone), that represses female development.
How do we know that Y is sex-determining?
Turner syndrome - XO karyotype - affects females and results in no/very small ovaries and short stature
Klinefelter syndrome - XX(n)Y karyotype - affects males and results in sterility, small testes, tall, may be some breast development
How was the gene for maleness found on the Y chromosome?
By looking at males with an XX karyotype. A mistake in recombination occurred that allowed the X and Y to recombine in which the Y chromosome lost the gene for maleness and the X chromosome gained it. Found the gene Sry.
What is Sry?
The gene for maleness. Encodes a DNA binding protein and acts as a transcription factor/assists other transcription factors. Regulates genes for maleness e.g. production of AMH.
What are the differences between the X and Y chromosomes?
Y is small, degenerate and gene poor whilst X is large and gene rich. Y has a variable gene content between species whilst X is highly conserved across mammalian species
What are the similarities between the X and Y chromosomes?
Enriched for repeat sequences (Y more than X)
Both chromosomes have many amplified gene families, some in large palindromes
What is the key switch that determines male or female?
Foxl2 (ovary determining) vs Dmrt1 (testis determining). Each downregulates the other. Important for maintaining the male/female choice triggered by other genes (e.g. Sry in mammals, others in other species)
Describe the crossing over of sex chromosomes?
They are heteromorphic (different in size, sequence and structure), so don't cross over down the majority of their length. Short regions of identity are retained where crossing over is obligatory to pass cell cycle checkpoints to ensure correct segregation of homologs
What are the 2 key factors when considering degeneration of the Y chromosome?
Reduced population size - quarter of the copy number compared to autosomes
All genes in the non-recombining region are linked
This results in a greater random chance effect. Deletions spread and become fixed.
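The "quarter of the copy number" figure follows from counting chromosome copies in one female and one male, as sketched below. This is simple bookkeeping under the assumption of an equal sex ratio; the function names are hypothetical.

```python
from fractions import Fraction


def copies_per_breeding_pair():
    """Count copies of each chromosome type in one XX female plus one XY male."""
    female = {"autosome": 2, "X": 2, "Y": 0}
    male = {"autosome": 2, "X": 1, "Y": 1}
    return {name: female[name] + male[name] for name in female}


def relative_to_autosomes(chromosome):
    """Copy number of a chromosome type as a fraction of the autosomal copy number."""
    copies = copies_per_breeding_pair()
    return Fraction(copies[chromosome], copies["autosome"])


if __name__ == "__main__":
    print(relative_to_autosomes("Y"))  # one Y for every four autosome copies
    print(relative_to_autosomes("X"))  # three X copies for every four autosome copies
```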
What is Muller's Ratchet model for explaining Y chromosome degeneration?
Stochastic loss of the Y chromosome haplotype with the lowest mutational load. High mutation rate and inefficient selection means mutations persist and can't be repaired (no homologous recombination). Therefore genes are gradually lost.
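The ratchet can be illustrated with a toy simulation of a non-recombining population. This is a sketch under loud assumptions: the function name, population size, mutation probability and selection coefficient are all invented for illustration, each offspring gains at most one new mutation, and there is no back mutation. Without recombination the lowest mutational load in the population can stay the same or rise, but never fall.

```python
import random


def simulate_ratchet(pop_size=50, generations=200, mutation_prob=0.3,
                     selection=0.02, seed=0):
    """Track the lowest mutational load per generation in an asexual population."""
    rng = random.Random(seed)
    loads = [0] * pop_size          # every haplotype starts mutation free
    min_load_history = [0]
    for _ in range(generations):
        # Fitness falls multiplicatively with each deleterious mutation carried
        weights = [(1 - selection) ** load for load in loads]
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Offspring inherit the parental load and may gain one new mutation
        loads = [load + (1 if rng.random() < mutation_prob else 0)
                 for load in parents]
        min_load_history.append(min(loads))
    return min_load_history


if __name__ == "__main__":
    history = simulate_ratchet()
    print("lowest load at start:", history[0])
    print("lowest load at end:", history[-1])
```

Each increase in the tracked minimum is one "click" of the ratchet: the least-loaded haplotype has been lost by chance and cannot be recreated.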
What is the background selection model for explaining Y chromosome degeneration?
Newly-arising weakly beneficial mutations can't escape from the haplotype they arose on and may be lost if the background has a high mutational load. Beneficial mutations can't be recombined.
What is the hitch-hiking model for explaining Y chromosome degeneration?
A strongly-selected beneficial mutation can drag many other weakly deleterious mutations to fixation
How did sex chromosomes evolve?
Environmental sex determination (the ancestral state) becomes superseded by a dominant sex determining allele (random or due to environmental change)
Recombination between the incipient sex chromosomes becomes suppressed in the region around the sex determining allele (arises to preserve a favourable gene combination e.g. linkage between sperm production and maleness genes. Chromosome inversions are a factor to prevent recombination)
What genes are retained on the Y chromosome despite degeneration?
Genes with direct benefit to maleness
Housekeeping genes where dosage is critical
Why doesn't the Y chromosome disappear?
Amplification of Y genes by unequal sister chromatid exchange
Acquisition of new content e.g. direct transposition from other chromosomes/via the recombination region
What is the PAR region of the Y chromosome?
Pseudo autosomal region. Bit that still does crossing over
Why does the Y chromosome have variable content between species?
The Y chromosome can acquire new content through direct transposition or recombination at the PAR region.
New content is subsequently degraded
The Y chromosome grows by additions to the PAR and different genes are lost through degeneration, resulting in variable content.
What do deletions in Yq result in?
Azoospermia - lack of sperm (oligozoospermia is very few mature sperm made). Deletions in different regions result in different phenotypes. 3 regions defined: AZFa,b,c, each with specific phenotypes associated. B and C have multiple breakpoints and therefore a range of phenotypes
Why is identifying important genes for maleness on the Y chromosome difficult?
Deletions are large and tend to involve multiple genes
Genes are often part of multigene families so identifying which member is associated with the phenotype is challenging
Failure to find point mutations
What does the RBM gene family do?
X-Y homologous gene
Many members on the Y chromosome - has been amplified. Present in many species - ancient
Functional copies in the AZFb region
Has an RNA recognition motif, has a function in RNA processing
Expressed in the nucleus of transcriptionally active germ cells, although in mice it isn't expressed until late in development - functional divergence?
What does the DAZ gene family do?
On the Y chromosome in primates - recent autosomal recruits
Polymorphic in copy number and order of repeats
Has an RNA binding motif, may have a role in RNA stability
What are the solutions to the dosage problem?
Dosage problem is the differing number of X chromosomes in male and female - in males the 1 X chromosome has to do double duty. This is solved by:
Increasing expression from X (Drosophila)
Halving expression from both X's (C. elegans)
Inactivating one X at random (placental mammals)
Inactivating the same (paternal) X (marsupials)
What is 'sex chromatin'?
Barr body observed in females and males with Klinefelter syndrome (XXY). Replicates late - suggests that it is in heterochromatin
How is the X chromosome inactivated?
There is an X inactivation centre which spreads silencing (found through X:autosomal translocations and observing propagation of inactivation). Xist is a gene transcribed from the inactivated X chromosome - the RNA is non-coding but is spliced and polyadenylated. It is retained in the nucleus and coats the inactive X chromosome
Why are there many LINE elements on the X chromosome?
Spreading of inactivation is facilitated by the presence of LINE repeat elements, particularly full length LINEs.
What genes on the X chromosome escape inactivation?
PAR genes (present on both X and Y)
Some housekeeping genes that haven't been lost from the Y - therefore females still need 2
Genes on the short arm are more likely to escape inactivation as there is difficulty spreading through the centromere
Why is the X chromosome similar across species?
Once dosage compensation is established, sequences from other chromosomes would find it hard to move to the X as this would affect their dosage.
How have genes involved in ovarian failure been discovered?
Genes needed for ovarian function must be required on both X chromosomes, as people with Turner syndrome (XO) have ovarian failure. Look at people with early menopause and find CNVs that are not known to be common, then examine these regions for candidate genes.
What is the significance of the non-ovarian phenotypes observed in Turner syndrome?
Turner syndrome (XO) is protected against by having either a second X or a Y. Suggests that genes involved in non-ovarian phenotypes are present on both X and Y chromosomes. Could be in the PAR region (there is a homeobox gene there), or in the recent X-Y transposition from an autosomal chromosome, or in the skeletal growth locus.
What are the selective pressures on the X chromosome?
⅔ of all X chromosomes are found in females, causing accelerated selection for female-benefit genes
X is hemizygous in males and functionally hemizygous in females (on a per cell basis) meaning there is more selection against recessive mutations.
Therefore, the X chromosome is enriched for highly selected functions such as brain development and intelligence
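The ⅔ figure can be checked by simple counting. The sketch below assumes an equal sex ratio with every female XX and every male XY (assumptions not stated on the card); the function name is hypothetical.

```python
from fractions import Fraction

def x_fraction_in_females(females: int, males: int) -> Fraction:
    """Fraction of all X chromosomes in a population carried by females.

    Assumes every female is XX (two X) and every male is XY (one X).
    """
    x_in_females = 2 * females
    x_in_males = 1 * males
    return Fraction(x_in_females, x_in_females + x_in_males)

print(x_fraction_in_females(500, 500))  # 2/3
```

With a skewed sex ratio the fraction shifts, so ⅔ is specific to a 1:1 population.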
Why are amplicons so common on the sex chromosomes?
Explanation 1: Allows for hairpin loop formation and gene conversion to repair a damaged gene from a remaining functional copy. Repeat structures help maintain amplified genes. (Also means harmful mutations can propagate if conversion goes the other way). Also doesn't explain X chromosome amplicons, as the X can undergo recombination
Explanation 2: palindromes have a role in gene regulation (are often unregulated in cancer). Sex chromosomes are silenced during meiosis and limited reactivation occurs afterwards, preferentially involving amplicon sequences (Doesn't explain whole reactivation story or the scale of reactivation of some genes)
Explanation 3: Selfish genes are in a runaway arms race - X linked genes distort offspring sex ratio to female, Y linked genes repress distortion. Copy numbers increase in an arms race
What is locus heterogeneity?
Where many genotypes can lead to the same phenotype. Common if different genes are involved in the same pathway or there can be multiple mutations in one gene.
What is pleiotropy/incomplete penetrance/variable expressivity?
One genotype leads to many phenotypes. Is a challenge for genetic mapping
What are the strategies for gene discovery?
Hypothesis driven - look for mutations in a candidate gene
Hypothesis free - mapping and discovering
Hypothesis free - genome wide discovery (looking at karyotyping and CNVs)
With the current strategies, what kinds of genetic disease do we miss? And how can this be improved?
Rare recessive - e.g. if only one family member is affected, can't be mapped
Dominant reproductively lethal - often de novo, no pedigree possible
Diseases with locus heterogeneity and/or lack of phenotypic specificity - no common mutation, may be classed as different diseases
Large collections of patients and their families, cost efficient genome sequencing and comprehensive, accurate functional interpretation of variation can improve this (all of which are starting to appear)
How much mutation in the genome exists?
Loads. Inherit a lot from parents, and develop more as you progress through life - gives a parental age effect. De novo mutations hit ~2 genes, most of which have a functional impact. The mutational load limits the essential content of the genome.
What kinds of mutation are commonly associated with disease?
Missense/nonsense mutations are the most common
Also have high levels of Indels (frame shifting)
Some CNVs and splicing
Very few regulatory and rearrangements
What is targeted sequencing?
Making a shotgun library, then using synthetic oligonucleotides that hybridise to exons to pull down the 'interesting' bits of the genome, which can then be sequenced. Can target the exome or methylation or others
What does exome sequencing miss?
Poorly captured exons
Sequence variants that create new exons
Structural variants (e.g. duplications)
How many patients are needed to find rare diseases?
Not very many IF a single gene is responsible
What is mendelian inheritance?
Traits linked to single genes on chromosomes with each parent contributing one of two alleles
Traits are inherited separately
If genotypes of parents are known, the distribution of phenotypes in the offspring can be determined
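The last point can be illustrated with a Punnett-square style calculation. This is a minimal sketch for a single gene with two alleles, assuming each parent passes on either allele with equal probability and that the uppercase allele is completely dominant; the function names and the "A"/"a" labels are hypothetical.

```python
from collections import Counter
from itertools import product

def offspring_genotypes(parent1: str, parent2: str) -> dict:
    """Expected genotype proportions for a single-gene cross.

    Each parent is a two-letter genotype such as "Aa"; every combination
    of one allele from each parent is treated as equally likely.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}

def offspring_phenotypes(parent1: str, parent2: str, dominant: str = "A") -> dict:
    """Collapse genotypes to phenotypes, assuming complete dominance."""
    result = Counter()
    for genotype, proportion in offspring_genotypes(parent1, parent2).items():
        result["dominant" if dominant in genotype else "recessive"] += proportion
    return dict(result)

print(offspring_genotypes("Aa", "Aa"))   # {'AA': 0.25, 'Aa': 0.5, 'aa': 0.25}
print(offspring_phenotypes("Aa", "Aa"))  # {'dominant': 0.75, 'recessive': 0.25}
```

The 3:1 phenotype ratio from two heterozygous parents only holds under these assumptions; the non-mendelian patterns in the following cards break them.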
What is non-mendelian inheritance?
Any pattern of inheritance in which traits don't segregate in accordance with Mendel's laws
What are the possible reasons for non-mendelian inheritance?
Extra nuclear inheritance
Triplet repeat disorders
Transgenerational transmission of epigenetic traits
What is co-dominance?
When the contributions of both alleles are visible in the phenotype. E.g. blood group, coat colour
Both alleles are expressed equally in a heterozygote
What is allele exclusion?
Only one allele of a gene is expressed, the other is silenced. Could happen at the transcriptional level (one allele is not transcribed) or at the post transcriptional level (post transcriptional and post translational mechanisms leading to the elimination of one of the allele's protein products)
Where is allele exclusion seen in the immune system?
B cells are monospecific - only one of the two IgH alleles is selected (randomly) to make a heavy chain. Only one of the four light chain alleles is ever used.
What are possible mechanisms for allele exclusion?
Imprinting (dependent on parent origin)
X-inactivation and some autosomal genes (random)
What is extranuclear inheritance?
Cytoplasmic inheritance - the transmission of genes that occur outside the nucleus. E.g. eukaryotes inherit cytoplasmic organelles such as mitochondria or chloroplasts. Could also be from cellular parasites (viruses or bacteria)
Where is extranuclear inheritance seen in plants?
Leaf colour inheritance - Mirabilis jalapa. Leaf colour is inherited from the maternal chloroplasts. In variegated plants, the leaf colour is dependent on the chloroplasts in the egg (white, green or a mix).
Describe the inheritance of the poky trait in Neurospora crassa mould
Maternal inheritance. The poky phenotype is slow growing and has abnormal cytochrome expression (no a or b, excess c). Thought to be due to mitochondrial inheritance through the maternal parent.
What is mitochondrial disease?
Clinical symptoms are heterogeneous (depends on how prevalent the diseased mitochondria are and where they are found - muscle and cerebrum is worst)
Dual gene expression from mitochondria and nucleus
Can get homoplasmy (all are diseased) or heteroplasmy (both mutated and wild type)
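The homoplasmy/heteroplasmy distinction comes down to the proportion of mutated mitochondrial genomes in a cell or tissue. A minimal sketch, assuming the mutant and wild type copies have already been counted; the function name and the wording of the labels are hypothetical.

```python
def mitochondrial_state(mutant: int, wild_type: int) -> str:
    """Describe a sample by its mix of mutant and wild type mtDNA copies."""
    total = mutant + wild_type
    if mutant < 0 or wild_type < 0 or total == 0:
        raise ValueError("counts must be non-negative and not both zero")
    if wild_type == 0:
        return "homoplasmy (all mutant)"
    if mutant == 0:
        return "wild type only"
    # A mixture of both types is heteroplasmy; report the mutant share.
    return f"heteroplasmy ({mutant / total:.0%} mutant)"

print(mitochondrial_state(30, 70))   # heteroplasmy (30% mutant)
print(mitochondrial_state(100, 0))   # homoplasmy (all mutant)
```

The mutant share matters because, as noted above, symptoms depend on how prevalent the diseased mitochondria are in a given tissue.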
What is gene conversion?
An event in DNA genetic recombination.
DNA sequence from one helix is transferred to another when a mismatch is recognised and corrected by the cellular machinery by copying the other allele
Occurs in repetitive DNA sequences
What is mosaicism?
2 or more genotypes are present in the cells of one individual
Mutation is acquired after fertilisation, resulting in only some cells being affected
What are some common mosaicisms?
Most chromosome trisomies (Down, Edwards, Patau)
Sex chromosome mosaicism (Turner, Klinefelter)
X inactivation as cells differentiate (e.g. Tortoiseshell cat)
What are triplet repeat disorders?
Disorders caused by the expansion of micro satellite tandem trinucleotide repeats
Units of 3 nucleotides are gained due to slippage during DNA replication. Doesn't cause a frame shift unless a stop codon is put in, so can have no effect, can be really bad or can make the protein better
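Why whole triplets leave the reading frame intact can be seen by splitting a sequence into codons before and after an insertion. This is a minimal sketch using a made-up coding sequence and hypothetical helper names; it only compares codons and does not model translation.

```python
def codons(seq: str) -> list:
    """Split a sequence into complete codons, reading from position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def expand_repeat(seq: str, position: int, unit: str, extra_copies: int) -> str:
    """Insert extra copies of a repeat unit at the given position."""
    return seq[:position] + unit * extra_copies + seq[position:]

# Hypothetical coding sequence: ATG, two CAG repeats, then GGT CCA TAA
original = "ATGCAGCAGGGTCCATAA"
expanded = expand_repeat(original, 9, "CAG", 3)  # three extra whole triplets
shifted = expand_repeat(original, 9, "C", 1)     # a single extra base

print(codons(original)[-3:])  # ['GGT', 'CCA', 'TAA']
print(codons(expanded)[-3:])  # ['GGT', 'CCA', 'TAA']
print(codons(shifted)[-3:])   # ['CGG', 'TCC', 'ATA']
```

The expanded sequence keeps the same downstream codons and simply gains three more CAG codons, whereas the single-base insertion changes every codon after it.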
What are some examples of trinucleotide repeat disorders?
Polyglutamine disorders are the largest - includes Huntington's disease and Spinocerebellar Ataxia (SCA)
Fragile X syndrome
Describe Huntington's disease
PolyQ tract is detrimental to the properties of the HTT protein and compromises cell homeostasis. Affects transcriptional activity, vesicle trafficking, mitochondrial function and proteasome activity.
Describe fragile X syndrome
More than 200 CGG repeats affect the fragile X mental retardation gene (FMR1)
Increased methylation results in reduced expression of the gene
If have 55-200 repeats, have Fragile X tremor/ataxia syndrome
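The repeat thresholds above can be written as a small lookup. This is a sketch only: the function name is hypothetical, treating exactly 200 repeats as part of the 55-200 range is an assumption, and the card gives no category for counts below 55.

```python
def classify_cgg_repeats(repeat_count: int) -> str:
    """Map a CGG repeat count to the ranges given on the card."""
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count > 200:
        return "more than 200 repeats: fragile X syndrome"
    if repeat_count >= 55:
        return "55-200 repeats: fragile X tremor/ataxia syndrome"
    return "fewer than 55 repeats: below the ranges given here"

for count in (30, 80, 250):
    print(count, "->", classify_cgg_repeats(count))
```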
What is Lamarckian inheritance?
The inheritance of characteristics that an organism acquired during its lifetime
What is transgenerational transmission of epigenetic traits?
Inheritance through epigenetics between generations (normally epigenetics don't persist)
READ PAPER doi: 10.1126/science.1255903
What is imprinting?
Genes are expressed in a parent-of-origin specific manner. Genes are expressed from the non-imprinted allele. Genes can be silenced through DNA methylation, histone modifications or regulatory RNAs
What are some examples of epigenetic modifiers?
Nicotine, benzene, arsenic, viruses
Also folic acid and vitamin C
What can affect your epigenetic signature?
Diet, sleep, exercise, stress, behaviour, exposure to epigenetic modifiers.
It is malleable - can be changed and modified
How can genes be epigenetically silenced?
Non coding RNAs
What genes are imprinted and why?
Genes involving foetal growth, placental growth, suckling and nutrient metabolism.
A mechanism to balance parental resource allocation in the offspring
What are imprinted DMRs?
Differentially methylated regions due to imprinting. Methylation of genes is reset during the initial divisions of a zygote, but these regions escape the reset.
Establishes transgenerational epigenetic inheritance
Describe imprinted DMRs
Cytosines are methylated.
Imprinted regions are normally found in clusters. Genes tend to have a CpG rich DMR (differentially methylated region) related to allele repression. Get allelic histone modifications. Often have a high number of tandem repeats and presence of CTCF and YY1 transcription factor binding sites and ncRNA transcriptional units
How did imprinting evolve?
Mainly found in placental (eutherian) animals.
Kinship theory - paternal genes drive foetal growth by extracting maximum resources from the mother whilst maternal genes ensure the mother's survival and equal allocation of nutrients between offspring
What is parthenogenesis?
Reproduction in which an organism develops from an unfertilised germ cell/development of a germ cell without fertilisation.
Komodo dragons can reproduce without males
What is gynecogenesis in humans?
Parthenogenesis in which ovarian teratomas (dermoid cysts) form. All chromosomes come from the mother
What is androgenesis in humans?
Hydatidiform mole - a tumour in the uterus where all chromosomes come from the male partner. Usually an empty ovum is fertilised by sperm and the genome duplicates.
What are the types of epigenetic disease?
Embryo/Placenta (ovarian teratoma, hydatidiform moles)
Brain - behaviour, psychiatric, neurodegenerative
Growth - cancer, nutrient metabolism and many others
What is Prader-Willi syndrome?
An imprinting disorder due to abnormal imprinting at 15q11-13 (opposite is Angelman syndrome). Leads to weak muscle tone and floppiness at birth, poor suckling, obesity (overeating), hypogonadism (immature sexual characteristics), CNS and endocrine gland dysfunction - varying learning disability
What is Angelman syndrome?
An imprinting disorder due to abnormal imprinting at 15q11-13 (opposite is Prader-Willi syndrome). Developmental delay with severe speech impairment, behavioural uniqueness (laughing, smiling, hand flapping), seizures, microcephaly, movement or balance disorder
Describe the chromosome region associated with Prader-Willi/Angelman syndrome
Chromosome 15, q11-13. Single bipartite imprinting control region controls a whole gene cluster
Many are expressed only/preferentially on the paternal chromosome (including small nucleolar RNAs - snoRNAs). 2 genes are expressed on the maternal chromosome
How does Prader-Willi/Angelman syndrome arise?
PWS: loss of the paternal chromosome expression of snoRNAs (and other genes), through deletion, through inheritance of 2 maternal chromosomes and no paternal, or rarely through a translocation. The maternal chromosome is silenced, so genes such as snoRNAs are not expressed
AS: loss of maternal chromosome expression of UBE3A (through deletion or mutation in the gene region, inheritance of 2 paternal chromosomes and no maternal or rarely through translocation). In most tissues of the body, both paternal and maternal UBE3A is expressed, but in certain areas of the brain only the maternal copy is expressed.
What is pseudohypoparathyroidism type 1B?
A localised resistance to parathyroid hormone (PTH) mainly in the renal tissues. Manifests with hypocalcemia (low calcium), hyperphosphatemia (high phosphate) and elevated PTH levels. Hormonal resistance develops if the disease is inherited maternally but not if it is inherited paternally
Why does pseudohypoparathyroidism type 1B develop?
70% of cases have a methylation defect at the GNAS differentially methylated region, including hypomethylation of the maternal germ line allele GNAS exon A/B DMR. This is often caused by a micro deletion of non-imprinted STX16 (220kb upstream of exon A/B)
What are some examples of disease caused by mutation in epigenetic regulators?
Kabuki syndrome - atypical facial and skeletal features caused by defects in MLL2 (a HMT) and KDM6A (a HDMT - removes repressive histone methyl groups)
Rett syndrome - atypical growth and development in females (males not viable - X linked) caused by defects in MECP2 (methyl-CpG binding protein 2 - binds methylated cytosines and represses transcription)
How has the Dutch Hunger Winter (and other similar records) aided the study of epigenetics?
The Dutch Hunger Winter lasted 3 months and good health records were kept. Differences were found between babies that were affected early and late in pregnancy. Those malnourished in the first 3 months had less imprinting on the IGF2 gene and were at an increased risk of adult type 2 diabetes. Grandchildren had obesity and mental health issues - an example of transgenerational epigenetic inheritance
Also used the Swedish town of Överkalix - records of harvest and births. If males had famine years as sperm developed (pre-puberty), healthier grandsons resulted (compared to men with feast years)
How does diet affect epigenetics?
A diet high in nutrients required to make methyl groups (folic acid, B vitamins etc) allows us to rapidly alter gene expression (especially during early development when the epigenome is established) - want parents to have a methyl rich diet when pregnant
How are obesity and epigenetics related?
Sperm cells from obese men have distinct RNA and methylation patterns. These changed after surgery-induced weight loss - suggests that genes controlling appetite are subject to methylation changes
How are psychiatric disorders and epigenetics related?
Low number of genetic changes identified - possible epigenetic influences
Susceptible in the developmental period - risk of illness is modified by epigenetic changes induced by environmental insults
How has the inheritance of epigenetic changes been observed in animal models?
Mice were exposed to acetophenone (strong smelling) and given a foot-shock at the same time. Behavioural sensitivity to acetophenone was observed for the next 2 generations
How is depression (MDD) linked to epigenetics?
Low heritability. Risks involve high/low birth weight, early life adversity, trauma, hormonal fluctuation.
See DNA methylation of MORC1 in early life stress and MDD
How is bipolar disorder linked to epigenetics?
High heritability, still some environmental influence (20%). Risk factors involve caesarean section, altered DNA methylation of various genes:
BDNF - brain derived neurotrophic factor
KCNQ3 - potassium voltage gated channel
Also some miRNA influence
How is schizophrenia linked to epigenetics?
Strong environmental component - 50%
Risk factors include small birth weight, maternal infection, malnutrition, advanced paternal age (methylation differences), hormonal changes, parental loss and substance abuse.
Changes in DNA methylation at RELN (encodes reelin), SOX10 (a TF for development), several HLA genes.
Changes in histone modifications of several genes - increased levels of HMTs, altered levels of H3K9K14 acetylation (altering gene expression of GAD1, HTR2C, PPM1E)
How is Alzheimer's disease linked to epigenetics?
Increased methylation of LINE-1 in Alzheimer's disease along with age dependent drift of a key enzyme regulating methylation of CpG islands. Genome wide disruption of 5hmC and miRNA significantly expressed
What is the effect of assisted reproduction technology on epigenetics?
Children are more at risk of imprinting disorders - embryo manipulation occurs during window of critical epigenetic reprogramming. Could be due to infertile patients having underlying epigenetic defects.