Linkage Analysis Flashcards
What is genetic variation?
• Genetic variation refers to differences in the DNA sequence between individuals in a population
How can variation arise?
• Variation can be inherited or due to environmental factors (e.g. drugs, exposure to radiation)
What effects can genetic variants have?
Alteration of the amino acid sequence (protein) that is encoded by a gene
Changes in gene regulation (where and when a gene is expressed)
Observable traits of an individual (e.g. eye colour, genetic disease risk)
Silent or no apparent effect
Why is genetic variation important?
- Genetic variation underlies phenotypic differences among different individuals
- Genetic variations determine our predisposition to complex diseases and responses to drugs and environmental factors
- Genetic variation reveals clues of ancestral human migration history
What are the 3 mechanisms of genetic variation?
• Mutation/polymorphism: errors in DNA replication. This may affect single nucleotides or larger portions of DNA
Germline mutations: occur in gametes and are passed on from parent to offspring
Somatic mutations: occur in a single cell of the body and are not transmitted to descendants. Depending on the gene affected, they may lead to cancer
de novo mutations: new mutations not inherited from either parent. They occur spontaneously, either in one of the parental gametes or in the fertilized egg during early embryogenesis. Although not inherited themselves, they can subsequently be passed on to the next generation
• Homologous recombination: shuffling of chromosomal segments between partner (homologous) chromosomes of a pair, resulting in new allele combinations. Importantly, this process can be used in linkage analysis to track the inheritance of chromosomal segments and determine the likely location of a disease gene
• Gene flow: the movement of genes from one population to another (e.g. migration) is an important source of genetic variation
Compare a mutation with a polymorphism
- A mutation is a rare change in the DNA sequence that is different to the normal (reference) sequence. The ‘normal’ allele is prevalent in the population and the mutation changes this to a rare ‘abnormal’ variant
- By contrast, a polymorphism is a DNA sequence variant that is common in the population. In this case no single allele is regarded as the ‘normal’ allele. Instead there are two or more equally acceptable alternatives
- The arbitrary cut-off point between a mutation and a polymorphism is a minor allele frequency (MAF) of 1% (i.e. for a variant to be classed as a polymorphism, the least common (minor) allele must be present in ≥1% of the population)
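The 1% cut-off can be illustrated with a short calculation. This is a minimal sketch, assuming a biallelic variant and a simple list of sampled alleles; the function names and the sample counts are made up for illustration.

```python
from collections import Counter

MAF_CUTOFF = 0.01  # the 1% threshold quoted in the card above


def minor_allele_frequency(alleles):
    """Frequency of the less common allele (assumes at most two alleles)."""
    counts = Counter(alleles)
    if len(counts) > 2:
        raise ValueError("this sketch assumes a biallelic variant")
    if len(counts) < 2:
        return 0.0  # only one allele observed, so no minor allele
    return min(counts.values()) / len(alleles)


def classify_variant(alleles):
    """Label a variant using the arbitrary 1% MAF cut-off."""
    maf = minor_allele_frequency(alleles)
    if maf == 0.0:
        return "no variation observed"
    return "polymorphism" if maf >= MAF_CUTOFF else "mutation"


# Hypothetical samples of 1,000 chromosomes each
rare = ["C"] * 995 + ["T"] * 5       # minor allele at 0.5%
common = ["C"] * 900 + ["T"] * 100   # minor allele at 10%

print(classify_variant(rare))    # mutation
print(classify_variant(common))  # polymorphism
```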
When does genetic recombination occur?
What is genetic recombination?
Genetic recombination occurs during prophase I, when the two homologous chromosomes (i.e. maternal and paternal) line up together.
Homologous chromosomes pair with each other and undergo genetic recombination, in which DNA is cut and then repaired, allowing them to exchange some of their genetic information. A subset of recombination events results in crossing over, which creates physical links known as chiasmata between the homologous chromosomes.
These crossing over events result in the production of recombinant chromosomes, which are highly informative and can be utilised for linkage analysis studies.
What is crossing over?
What does it result in?
- Crossing over: reciprocal breaking and re-joining of the homologous chromosomes during meiosis
- Results in exchange of chromosome segments and new allele combinations
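As a rough illustration of how a single cross-over produces new allele combinations, the sketch below treats each homologous chromosome as an ordered list of alleles and swaps everything after one breakpoint. The allele labels and the single-breakpoint model are simplifying assumptions, not a model of real meiosis.

```python
def cross_over(maternal, paternal, breakpoint):
    """Reciprocal exchange of the segments after `breakpoint`.

    Each chromosome is an ordered list of alleles, one per locus.
    Returns the two recombinant chromosomes.
    """
    if len(maternal) != len(paternal):
        raise ValueError("homologous chromosomes must cover the same loci")
    if not 0 <= breakpoint <= len(maternal):
        raise ValueError("breakpoint must lie within the chromosome")
    recombinant_1 = maternal[:breakpoint] + paternal[breakpoint:]
    recombinant_2 = paternal[:breakpoint] + maternal[breakpoint:]
    return recombinant_1, recombinant_2


# Hypothetical alleles at four loci (A to D) on each homologue
maternal = ["A1", "B1", "C1", "D1"]
paternal = ["A2", "B2", "C2", "D2"]

r1, r2 = cross_over(maternal, paternal, 2)
print(r1)  # ['A1', 'B1', 'C2', 'D2']
print(r2)  # ['A2', 'B2', 'C1', 'D1']
```

Loci on the same side of the breakpoint (A and B, or C and D) stay together, which is the behaviour linkage analysis relies on.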
Define genotype, phenotype and alleles
- The genotype is the genetic makeup of an individual
- The phenotype is the physical expression of the genetic makeup
- Genes are found in alternative versions called alleles
Define homozygous, heterozygous, haplotype and locus
- A homozygous genotype has identical alleles
- A heterozygous genotype has two different alleles
- A haplotype is a group of alleles that are inherited together from a single parent
- A locus is any region in the genome
What are the 3 categories for genetic disease?
- For linkage analysis, we will be focused on Mendelian / Monogenic disease. These are most often rare diseases that are highly heritable within families. The term ‘monogenic’ means that the disease is caused by one gene, i.e. a mutation in a single gene is sufficient to cause the disease. ‘Mendelian’ refers to the inheritance patterns observed by Gregor Mendel.
- By contrast, Non-Mendelian / Polygenic diseases require ‘hits’ in multiple different genes. It is the cumulative effect of these multiple hits that leads to the disease.
- Multifactorial diseases result from a combination of genetic and environmental factors, e.g. someone with a genetic predisposition to heart disease may be able to counteract this with a good diet, exercise, low alcohol intake and no smoking, whereas different lifestyle choices (drinking, smoking, poor diet, etc.) would be more likely to cause disease.
Describe the penetrance vs variant frequency graph
On image
At the other end of the spectrum is Polygenic (many genes) / Common complex disease. In this case, common variants each have low penetrance and it is the cumulative effect of multiple variants in different genes that causes the disease.
What is linkage analysis?
What is the main assumption of linkage analysis?
- Linkage analysis is a method used to map the location of a disease gene in the genome
- The term ‘linkage’ refers to the assumption of two things being physically linked to each other
The major assumption in linkage analysis: genetic markers that are in close proximity to the disease gene will be co-inherited with it.
- Therefore, ‘linkage’ refers to physical proximity between two loci
For linkage analysis, what are the 2 types of maps?
For linkage analysis, we use two different types of maps: genetic maps and physical maps
What do genetic maps provide?
Genetic maps tend to provide information about blocks or regions of a chromosome – this is similar to the zones on a tube map:
• We might say that we live in zone 3, for example – this provides some information on distance relative to another zone
• But the exact position of each station within a zone is not so important – this is the same with genetic maps
A genetic map shows the approximate map distance that separates any two loci and the position of these loci relative to all other mapped loci.
What do physical maps provide?
By contrast, physical maps provide more precise information on physical distance – this is similar to the tube stations
• In this case, the exact location of a station (relative to any other station on the line) is important
• We can use physical maps to calculate precise distances between two stations, as we know their exact location
Physical maps indicate the precise location of a specific locus (e.g. gene or genetic marker). Positions can be defined to the individual base pair, or more broadly by Megabase (Mb) positions.
Why can we use recombination frequencies to produce genetic maps of all the loci along a chromosome?
Because the frequency of recombination between two loci is roughly proportional to the chromosomal distance between them, we can use recombination frequencies to produce genetic maps of all the loci along a chromosome and ultimately in the whole genome
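A small worked example of turning a recombination frequency into a map distance. The rule of thumb that 1% recombination corresponds to about 1 centimorgan (cM) holds for short distances; Haldane's map function, which corrects for undetected multiple cross-overs, is included here as an addition that is not part of the original card. The offspring counts are hypothetical.

```python
import math


def recombination_fraction(recombinants, total_offspring):
    """Proportion of offspring that are recombinant between two loci."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return recombinants / total_offspring


def haldane_cm(theta):
    """Haldane map distance in cM: d = -50 * ln(1 - 2*theta)."""
    if not 0 <= theta < 0.5:
        raise ValueError("theta must be in the range [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * theta)


# Hypothetical cross: 10 recombinant offspring out of 100
theta = recombination_fraction(10, 100)

print(f"recombination fraction: {theta:.2f}")
print(f"naive distance:   {theta * 100:.1f} cM")
print(f"Haldane distance: {haldane_cm(theta):.1f} cM")
```

For small recombination fractions the two estimates are close; they diverge as the loci get further apart, which is why the proportionality is only approximate.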
What is genetic linkage?
When are alleles likely to be inherited together?
- Genetic linkage is the tendency for alleles at neighbouring loci to segregate together at meiosis
- Cross-overs are more likely to occur between loci separated by some distance than between loci close together on the chromosome
- Therefore to be linked, two loci must lie very close together
- A haplotype defines multiple alleles at linked loci. These chromosomal segments can be tracked through pedigrees and populations
On image
What are the methods of genetic linkage?
• Genotype multiple genetic markers across the genome
• Genotype multiple family members from families with the genetic trait
• Identify which genetic markers co-segregate with the disease (phenotype), i.e. which haplotypes are the same in all affected family members
• These genetic markers are therefore ‘linked’ to the disease gene, which indicates where in the genome the disease gene is likely to be located
NB: further work is needed to identify the gene and disease-causing mutation!
- Genetic markers are genotyped across the whole genome, for multiple family members (ideally from many different families)
- Using linkage analysis software, we can identify which genetic markers co-segregate with the disease or phenotype. This will be discussed in more detail in Part 2
- By identifying shared haplotypes in affected family members, we can determine where in the genome to search for the disease gene
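The co-segregation step can be sketched in a few lines. This is a deliberately simplified illustration, not what linkage analysis software actually computes: it assumes a fully penetrant dominant trait, so a marker is flagged when some allele is carried by every affected relative and by no unaffected relative. The family, marker names and genotypes are all hypothetical.

```python
def cosegregating_markers(genotypes, affected):
    """Return {marker: [alleles]} shared by all affected and no unaffected.

    genotypes: {marker: {person: (allele, allele)}}
    affected:  {person: True/False}
    """
    hits = {}
    for marker, calls in genotypes.items():
        affected_sets = [set(calls[p]) for p in calls if affected[p]]
        if not affected_sets:
            continue  # no affected individuals genotyped at this marker
        unaffected_alleles = set()
        for person, alleles in calls.items():
            if not affected[person]:
                unaffected_alleles.update(alleles)
        shared = set.intersection(*affected_sets) - unaffected_alleles
        if shared:
            hits[marker] = sorted(shared)
    return hits


# Hypothetical family: father and two children are affected
affected = {"father": True, "mother": False,
            "child1": True, "child2": False, "child3": True}

genotypes = {
    "M1": {"father": (2, 5), "mother": (3, 4),
           "child1": (2, 3), "child2": (5, 4), "child3": (2, 4)},
    "M2": {"father": (1, 2), "mother": (3, 3),
           "child1": (1, 3), "child2": (1, 3), "child3": (2, 3)},
}

print(cosegregating_markers(genotypes, affected))  # {'M1': [2]}
```

Allele 2 at marker M1 tracks with the disease in this toy pedigree, so the region around M1 would be the place to start looking for the disease gene.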
Compare microsatellites and SNPs
On image
What is microsatellite genotyping?
Microsatellite genotyping is a PCR-based method that is used to amplify highly repetitive regions of the genome. PCR primers are located outside of the repetitive element and are used to amplify the full microsatellite region.
Different numbers of repeat units (e.g. CA or GATA) produce PCR products of different lengths, with neighbouring alleles differing by one repeat unit
- For a CA repeat, PCR fragments will differ by 2 nucleotides
- So for locus ‘A’ above, there are 4 possible alleles across individuals #1 and #2
- Individual #1 has genotype 2,5 (allele A2 has 2 repeats – CACA, allele A5 has 5 repeats - CACACACACA)
- Individual #2 has genotype 3,4 (allele A3 has 3 repeats, allele A4 has 4 repeats)
- For a GATA repeat, PCR fragments will differ by 4 nucleotides
- The PCR fragments are then electrophoresed through an acrylamide gel, and the different numbers of repeat units are represented by the difference in size of the PCR bands
- Primers for microsatellite analysis are often fluorescently tagged to allow multiple markers to be electrophoresed at the same time. The different PCR products can then be distinguished by colour (see the fluorescent genotyping card).
This process is still used for DNA analysis – e.g. paternity testing, forensics – as testing of 13 polymorphic microsatellite loci is generally sufficient to identify a specific individual
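The relationship between repeat number and PCR product size can be shown with a quick calculation. This sketch assumes a fixed amount of non-repeat sequence (primers plus flanking DNA) in every product; the value of 100 bp is made up for illustration.

```python
FLANK_BP = 100  # hypothetical: primers plus flanking sequence, in bp


def product_length(repeat_unit, n_repeats, flank_bp=FLANK_BP):
    """Expected PCR product size for a microsatellite allele."""
    return flank_bp + len(repeat_unit) * n_repeats


# Genotypes from the card above, given as repeat counts at a CA repeat
genotypes = {"individual_1": (2, 5), "individual_2": (3, 4)}

sizes = {
    person: tuple(product_length("CA", n) for n in alleles)
    for person, alleles in genotypes.items()
}
print(sizes)  # {'individual_1': (104, 110), 'individual_2': (106, 108)}

# Adjacent alleles differ by one repeat unit: 2 bp for CA, 4 bp for GATA
print(product_length("CA", 3) - product_length("CA", 2))      # 2
print(product_length("GATA", 3) - product_length("GATA", 2))  # 4
```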
What is microsatellite genotyping used for?
- DNA fingerprinting from very small amounts of material
- The standard test uses 13 core loci, making the likelihood of a chance match 1 in three trillion
- Paternity testing
- Linkage analysis for disease gene identification
What is Fluorescent genotyping?
The figure on the left shows an example of microsatellite genotyping using fluorescently-tagged primers for amplification by polymerase chain reaction (PCR).
• The different peaks represent different PCR products, with smaller fragments on the left and larger fragments on the right. Fragment sizes in bp are marked below each peak
• Because they are highly polymorphic (i.e. have many alleles), each microsatellite marker covers a range of fragment sizes (typical range spans 20-40bp)
• Each peak represents one allele: single peaks are homozygous; double peaks are heterozygous for the marker
• By using different coloured fluorescent tags (e.g. blue, green, yellow) and amplifying different sized fragments, PCR products can be pooled for multiplex analysis
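The peak-reading rule (one peak is homozygous, two peaks are heterozygous) can be written as a tiny genotype caller. This is a sketch under the assumption that the peak list has already been cleaned of stutter and background peaks; the fragment sizes are hypothetical.

```python
def call_genotype(peak_sizes_bp):
    """Call a diploid genotype from the allele peaks at one marker.

    Returns ((allele_1, allele_2), zygosity), with alleles as sizes in bp.
    """
    peaks = sorted(set(peak_sizes_bp))
    if len(peaks) == 1:
        return (peaks[0], peaks[0]), "homozygous"
    if len(peaks) == 2:
        return (peaks[0], peaks[1]), "heterozygous"
    raise ValueError("expected one or two allele peaks for a diploid sample")


print(call_genotype([152]))       # ((152, 152), 'homozygous')
print(call_genotype([156, 148]))  # ((148, 156), 'heterozygous')
```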
What is SNP genotyping used for?
• Linkage analysis in families (affected vs unaffected relatives): homozygosity mapping (autosomal recessive) and mapping of Mendelian traits
• GWAS in populations (unrelated cases vs matched controls): non-Mendelian disorders and multifactorial traits