Why is our ability to make accurate predictions on the role of genes in most diseases actually pretty low?
With a few exceptions, it is usually the case that genes do not confer the same risk of disease to all individuals.
Many genes interact (along with environment) to result in disease
We calculate an odds ratio to assist in predicting risk
What is the rationale for finding disease genes?
- provide clues to pathogenic mechanisms
- new approaches to treatment
- inference of environmental risk factors
- disease prevention
In order for personalized medicine to work, what do geneticists have to discover first?
- risk genes for common diseases
- specific risk variants
- high-risk gene combinations
Once the genetic risk of an individual is known, what is the next step in the application of personalized medicine?
Once a person’s genes are known, an accurate DNA-based predictive diagnosis based on their individual genetic risk can be made.
Once a diagnosis based on an individual’s genetic risk is made, what is the next step in personalized medicine?
Use the genetic diagnosis of disease susceptibilities and pharmacogenetic analysis of optimized drugs (optimal efficacy and specificity) to design an individualized treatment and/or prevention plan.
What do Odds Ratios contribute to the landscape of personalized medicine?
Since the odds ratios of “most” genes are low, the accuracy of predicting genetic risk, even once an individual’s genome is known, is actually pretty low.
What is Odds Ratio (OR)?
OR = (odds of disease with a gene variant) / (odds of disease without the gene variant)
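A minimal sketch of how an odds ratio can be computed from a 2x2 case-control table. The counts are invented for illustration and the function name `odds_ratio` is hypothetical, not from the course material.

```python
def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds ratio for carrying a variant, from a 2x2 case-control table."""
    # Odds of being a case among carriers, and among non-carriers
    odds_with_variant = case_carriers / control_carriers
    odds_without_variant = case_noncarriers / control_noncarriers
    return odds_with_variant / odds_without_variant


# Invented counts: 1000 cases and 1000 controls
or_value = odds_ratio(case_carriers=300, case_noncarriers=700,
                      control_carriers=250, control_noncarriers=750)
print(round(or_value, 2))
```

An OR close to 1 like this one means the variant shifts risk only slightly, which is why a single common variant predicts little on its own.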
Describe positional cloning a bit more.
First, PC is the way genes are really mapped today.
Positional cloning is a way to map a gene by focusing your attention on a specific region of the genome. You carry out a systematic analysis of all the genes in the suspected (disease causing) area and look for mutations and/or variants that contribute directly to a disease. (p. 208, blue box)
Explain the difference between Functional Cloning (FC) and Positional Cloning (PC) and what are they used for?
FC and PC are different pathways for finding a disease gene within the genome.
FC: Disease –> Function –> Gene –> Map
basically, you investigate genes that you already know (Dr. Spritz investigated hemoglobin and thalassemia)
PC: Disease –> Map –> Gene –> Function
basically, you now have to look for the gene and deduce its function because you don’t necessarily know the function up front.
What is a polymorphic DNA marker?
A marker can be a SNP or a CNV (of which microsatellites are a subset) at a known genomic position which we can “score”.
In Positional Cloning, polymorphic markers are used as surrogates for disease mutations (which are harder to pinpoint). Due to linkage disequilibrium, knowing which markers a person carries implies which disease genes they are carrying. Sort of a 2-for-1 deal.
What is a microsatellite?
stretches of DNA containing units of 2, 3 or 4 nucleotides repeating
- are multi-allelic
- the number of repeats of the unit will vary from person to person allowing for DNA fingerprinting.
- occur roughly once every 30,000 bp
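A small sketch of what "scoring" a microsatellite amounts to: counting how many times the repeat unit occurs in a row. The sequences and the helper name `longest_repeat_run` are made up for illustration.

```python
import re


def longest_repeat_run(sequence, unit):
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


# Two invented alleles of the same CA-repeat marker, differing only in repeat count
allele_1 = "GGT" + "CA" * 12 + "TTG"
allele_2 = "GGT" + "CA" * 15 + "TTG"
print(longest_repeat_run(allele_1, "CA"), longest_repeat_run(allele_2, "CA"))
```

Because many different repeat counts exist in a population, one marker can have many alleles, which is what makes it informative for fingerprinting.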
What is a VNTR?
variable number tandem repeats = MINI-satellite
is a stretch of 100 to 1000 bp that then repeats in tandem along the DNA
also varies person-to-person in the number of repeats of the 100-1000bp unit.
What is a SNP?
single nucleotide polymorphism: a single-base position that varies between individuals
used for association studies
The occurrence / allele frequencies differ in different ethnic groups / populations
Due to linkage, a particular haplotype of SNPs is likely to be passed on to the next generation unchanged, so SNPs are useful for identifying whether people are related (through many generations) or not.
What is a haplotype?
A combination of alleles (DNA sequence) at adjacent locations on a chromosome that are inherited together.
Can be one locus, several loci or an entire chromosome depending on the number of recombination events that have occurred between a given set of loci.
What did / does the 1000 Genome project do?
Sequenced 1000 genomes from different ethnic groups
catalog human genetic variations based on SNP’s
useful for analyzing sequence-based RARE VARIANTS that may be causal for common diseases
What is the international HapMap project?
The mapping of haplotypes of the human genome.
Why do close polymorphism genotypes carried on the same chromosome cluster into haplotype blocks and how large are these haplotype blocks?
Haplotype blocks of 10-50 kb are inherited as a cluster because recombination turns out to be not completely random
Within a haplotype block, SNP alleles are in ___________.
What does it mean for two (or more) SNP’s to be in linkage disequilibrium with one another?
Due to their “close proximity” within a haplotype block, SNP’s in LD with one another are generally co-inherited. Recombination within a haplotype block is uncommon.
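Co-inheritance can be put into numbers. The sketch below uses two standard LD measures, D and r², which these notes do not name; the haplotype frequencies are invented and the function name is hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) for two bi-allelic SNPs.

    p_ab: frequency of haplotypes carrying allele A at SNP 1 and allele B at SNP 2
    p_a, p_b: frequencies of allele A and of allele B (each strictly between 0 and 1)
    """
    # D is the excess of the A-B haplotype over what independence would predict
    d = p_ab - p_a * p_b
    r_squared = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r_squared


# A and B always travel together: complete LD
d_linked, r2_linked = linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5)
# A and B combine at random: no LD
d_random, r2_random = linkage_disequilibrium(p_ab=0.25, p_a=0.5, p_b=0.5)
print(r2_linked, r2_random)
```

An r² near 1 means scoring one SNP tells you the allele at the other, which is the "2-for-1" idea behind using markers as surrogates.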
How do LD blocks vary among ethnic groups? Specifically African vs. Caucasian or Asian populations?
Since African populations are older than Caucasian or Asian populations, the African LD blocks are smaller (about half the size).
The LD blocks that have been around longer have undergone more recombination events so eventually SNP’s will be separated and no longer co-inherited.
What is a CNV and how do they vary among individuals / ethnic groups?
copy number variant
bi-allelic, multi-allelic or unique
a common genomic deletion, 100’s to 10k’s bp long
occurrence differs in different ethnic groups
individually rare (≤1%), collectively common
may or may not include genes
Not certain how often CNV’s are causal for human disease
What does it mean to “score” a genetic variation?
To assay: to determine the presence or absence of a genetic variation.
Explain the difference between a Mendelian and a Complex disease or trait.
Mendelian = single-gene; one gene is sufficient to cause disease phenotype
Complex = multi-gene; no one gene is sufficient to cause the disease.
What would you rather be doing right now?
Driving to Lake City to take a hike up in the Uncompahgre plateau.
I bet it’s cold up there right now!
What are the two broad categories of how one goes about finding a disease gene?
- Hypothesis-driven approach
- Hypothesis-free approach
What are the main categories of hypothesis driven approaches to finding a disease gene?
- Candidate gene sequencing
uses DNA sequencing to directly study a gene
- Candidate gene association studies
tests the gene - causal relationship indirectly
What are the hypotheses that a candidate gene sequencing study might depend on?
- Biological hypothesis
- positional hypothesis
- a “hit” from a GWAS or other mapping method
basically: you think it’s a particular gene and you test the theory
What sort of diseases is a candidate gene sequencing study useful for? Not useful for and why?
successful for mendelian gene diseases
not successful for complex disorders (unless already positive in a GWAS) because normal individuals will often be positive for the suspected pathogenic variant
What is a candidate gene association study? What does it depend on a priori?
Tests a gene or causal variant indirectly
- relies on a biological or positional hypothesis (candidate)
- most useful for common risk alleles with small to moderate effects (OR)
Unfortunately, most a priori biological hypotheses are wrong and there usually is some fatal flaw in the study that leads to a false positive
In a genetic association study, what sort of statistical analysis might be performed?
Fisher’s exact test
look for p < 0.05, 0.01 etc.
multiple testing corrections applied in multiple variant testing
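A sketch of both steps using only the standard library. The two-sided p-value is computed directly from the hypergeometric distribution, and Bonferroni is used as one common multiple-testing correction (the notes do not say which correction is meant). Function names and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def table_prob(x):
        # Hypergeometric probability of a table with x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = table_prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    return sum(p for p in map(table_prob, range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff when n_tests variants are tested."""
    return alpha / n_tests


# Invented table: rows = cases / controls, columns = carriers / non-carriers
p_value = fisher_exact_two_sided(3, 1, 1, 3)
threshold = bonferroni_threshold(alpha=0.05, n_tests=20)
print(p_value, threshold, p_value < threshold)
```

The more variants are tested, the smaller the per-test cutoff becomes, so a p-value that looks fine at 0.05 can fail once all tests are counted.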
In a genetic association study, what does real association imply?
NOT causation necessarily.
Does imply LD with a causal mutation
What are two fatal flaws in gene-by-gene case-control design?
What may lead to a false positive in such a study?
- Must include all tests that have been done in order to apply true multiple-testing corrections (and all tests may or may not be known to you / published) (e.g. the failures may not have been published!)
- Background genetic variation may vary among populations. So the variation you are observing may be ethnic variation instead of case vs. control variation!
* “Stratification” (occult pop. differences) may lead to false positives
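A sketch of how stratification alone can produce a false positive. The counts are invented: within each population the variant has no effect (OR = 1), but the populations differ in both variant frequency and in how many cases were sampled.

```python
def odds_ratio(table):
    """table = (case_carriers, case_noncarriers, control_carriers, control_noncarriers)"""
    a, b, c, d = table
    return (a * d) / (b * c)


# Population A: variant is common (80%), and most samples are cases
pop_a = (320, 80, 80, 20)
# Population B: variant is rare (20%), and most samples are controls
pop_b = (20, 80, 80, 320)
# Ignoring ancestry and pooling everyone into one case-control table
pooled = tuple(x + y for x, y in zip(pop_a, pop_b))

print(odds_ratio(pop_a), odds_ratio(pop_b), odds_ratio(pooled))
```

The pooled odds ratio comes out well above 1 even though neither population shows any association, so the "signal" is population difference, not disease.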
What percent of published “confirmed” gene-by-gene case-control studies have turned out to be false? Why?
“confirmed” means same results published 3 times independently
*Generally only “positive” studies are published and occult stratification of populations skews results
What is a hypothesis-free genetic disease study?
By doing a linkage study one can study genes indirectly. The hypothesis-free approach does not start out with a particular gene of interest; rather, the gene is uncovered as a result of the linkage study.
What is the basis of study in a linkage study?
multiplex families, followed through many generations, in which there are many instances of a disease
best for mendelian traits (uncommon alleles with strong effects)
What type of study was used to find the neurofibromatosis gene?
a genetic linkage analysis within families that shared the neurofibromatosis disease.
In a linkage analysis test what is the statistical measure of choice?
Explain what the values of this measure mean?
LOD score (log of odds)
LOD ≥ 3 for Mendelian trait
LOD ≥ 3.3 for Polygenic trait
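A LOD score is the base-10 log of the odds that the marker and disease locus are linked at recombination fraction theta versus unlinked (theta = 0.5), so LOD = 3 corresponds to 1000:1 odds in favor of linkage. The sketch below assumes the simplest textbook case of phase-known meioses; the counts and the function name are illustrative.

```python
from math import log10


def lod_score(recombinants, non_recombinants, theta):
    """LOD = log10( L(theta) / L(0.5) ) for phase-known meioses.

    theta is the assumed recombination fraction, 0 <= theta < 0.5
    (theta = 0 is only valid when no recombinants were observed).
    """
    n = recombinants + non_recombinants
    likelihood_linked = theta ** recombinants * (1 - theta) ** non_recombinants
    likelihood_unlinked = 0.5 ** n
    return log10(likelihood_linked / likelihood_unlinked)


# Invented family data: 1 recombinant among 20 informative meioses
lod = lod_score(recombinants=1, non_recombinants=19, theta=0.05)
print(round(lod, 2), lod >= 3)
```

With no recombinants at all, each informative meiosis adds log10(2), about 0.3, so roughly ten are needed to reach LOD 3.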
What is the fundamental question in a GWAS?
Look at the entire genome and ask what is different in people with the disease vs. not the disease…genome-wide.
Study the gene indirectly b/c don’t have a single candidate gene that you are focusing on a priori
How does a GWAS differ from a candidate gene case-control study?
Candidate Gene Case-Control = study gene directly, hypothesis dependent, usually wrong
GWAS = study gene indirectly (like association study), hypothesis independent, studies millions of SNP’s simultaneously