GENETIC ASSOCIATION STUDIES Flashcards
(59 cards)
how did they find disease-causing genes in the pre-genomic era
they used pedigree diagrams of 3 generations or more
how did they find disease-causing genes in the post-genomic era
they ran genome-wide association studies (GWAS), which test variants across the entire genome for association with a disease
findings from the first human genome generated
- it was generated from 13 people
- found that there are 3 billion base pairs in the haploid genome
- around 25000 genes exist
- around 18 million SNPs exist and each person has around 3.3 million SNPs
i.e. there is extensive allelic variation between members of the same species
what are the 4 different types of genetic variants and their frequencies
- SNPs: 18 million of them; occur every 1 kb
- InDels: 200,000 of them; occur every 10 kb
- SSRs: 100,000 of them; occur every 30 kb
- CNVs: 8,600 of them; occur every 3 Mb
explain what a SNP is
it is a single nucleotide polymorphism, i.e. a change in a single nucleotide base from the wild type to a variant type
- more than 1% of the population must have the alternate nucleotide at this position for it to be considered a SNP
- SNPs can be homozygous or heterozygous because we have 2 copies of each chromosome, i.e. 2 copies of the DNA
- always take the top nucleotide on each chromosome
- the top nucleotide on chromosome 1 is how you name the SNP
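The 1% rule and the homozygous/heterozygous distinction can be sketched in a few lines of Python. This is only an illustration: it assumes a biallelic site, genotypes written as two-letter strings such as "AG", and it measures the 1% threshold as an allele frequency across chromosomes (the card words it in terms of people). The function names are made up for this example.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes such as 'AG'."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def is_snp(genotypes, threshold=0.01):
    """True if the rarer allele sits on more than `threshold` of chromosomes."""
    freqs = allele_frequencies(genotypes)
    if len(freqs) < 2:
        return False  # only one allele seen, so the site is not polymorphic
    return min(freqs.values()) > threshold

def zygosity(genotype):
    """Classify one person's genotype at the site."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"

# Toy sample of 100 people (hypothetical data)
sample = ["AA"] * 90 + ["AG"] * 8 + ["GG"] * 2
```

Here the G allele is carried on 12 of 200 chromosomes (6%), so the site passes the 1% threshold.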
how do you figure out what the wild type SNP is
compare the genome to that of the chimp: the chimp allele is taken to be the wild type (ancestral) allele, so any other allele must have arisen after the 2 species diverged
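The chimp comparison can be written as a small lookup. This is a minimal sketch, assuming the chimp allele at the matching position is already known; `polarise` is a name invented for this example.

```python
def polarise(human_alleles, chimp_allele):
    """Label each human allele as wild type (ancestral) or variant (derived),
    using the chimp allele as the reference point."""
    if chimp_allele not in human_alleles:
        return None  # the chimp allele matches neither human allele, so no call
    return {
        allele: "ancestral" if allele == chimp_allele else "derived"
        for allele in human_alleles
    }
```

For example, `polarise({"A", "G"}, "G")` labels G as ancestral and A as derived.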
how do we genotype SNPs
use SNP chips, the genotyping platform behind GWAS
- a GWAS needs the genomes of 1000s of individuals to find SNPs with a high association to a disease
- GWAS is chip based, i.e. microarray based
- the loci on the chip are SNPs
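"High association" for one SNP is usually judged with a statistical test comparing cases and controls. The sketch below uses a simple 2x2 chi-square test on allele counts, which is one common choice and not necessarily what any particular study used. The function name and the counts are hypothetical.

```python
import math

def allelic_chi_square(case_counts, control_counts):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Each argument is (count of allele 1, count of allele 2).
    Assumes no row or column of the table sums to zero.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: 1000 cases and 1000 controls, 2 alleles each
chi2, p_value = allelic_chi_square((600, 1400), (500, 1500))
```

A GWAS repeats a test like this at every SNP on the chip, which is why multiple testing has to be handled afterwards.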
explain how a GWAS works
- generate oligonucleotide probes matching the DNA you wish to genotype
- stick these oligonucleotides to the chip
- then fragment your DNA of interest and wash the fragments over the chip so that each fragment binds to its complementary oligo probe
- note that the oligo probe stops one base short of the base you wish to genotype, so a fluorescently labelled nucleotide that is complementary to the fragment at that position can be added to the end of the oligo
- the label is lit up using a light source
- the signal shows you whether that person is heterozygous or homozygous at that SNP
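The last step, turning fluorescence into a genotype, can be sketched as below. This assumes a two-channel readout with one signal per allele, which the cards do not spell out. The thresholds `min_signal` and `homozygous_fraction` are invented for illustration, not values from any real platform.

```python
def call_genotype(intensity_a, intensity_b, allele_a="A", allele_b="G",
                  min_signal=100.0, homozygous_fraction=0.8):
    """Call a genotype at one SNP from two fluorescence intensities."""
    total = intensity_a + intensity_b
    if total < min_signal:
        return None  # signal too weak to make a call
    fraction_a = intensity_a / total
    if fraction_a >= homozygous_fraction:
        return allele_a + allele_a  # almost all signal from allele A
    if fraction_a <= 1 - homozygous_fraction:
        return allele_b + allele_b  # almost all signal from allele B
    return allele_a + allele_b      # both alleles light up: heterozygous
```

For example, `call_genotype(500, 480)` gives a heterozygous call because both channels are roughly equal.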
in what disease is a SSR found
in Huntington's disease (HD)
- the SSR is a triplet repeat of CAG found in the coding region of the HD gene
- a person who has fewer than 34 CAG repeats in this region of the gene has a normal allele
- a person with more than 42 CAG repeats in this region of the gene will have HD
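Counting the repeat and applying the two cutoffs from this card can be sketched as follows. The function names are made up, and the 34 to 42 range is reported as unclassified because the card does not say what happens there.

```python
import re

def longest_cag_run(sequence):
    """Length, in repeat units, of the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)

def classify_hd_allele(repeats):
    """Apply the cutoffs from the card: under 34 is normal, over 42 is HD."""
    if repeats < 34:
        return "normal"
    if repeats > 42:
        return "HD"
    return "not covered by the card (34-42 repeats)"

# Toy sequence (hypothetical): a 20-unit run, an interruption, then a 3-unit run
toy = "ATG" + "CAG" * 20 + "CCG" + "CAG" * 3
```

With the toy sequence, the longest run is 20 repeats, which falls in the normal range.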
explain what incomplete penetrance is
not every person who has the disease genotype will express the disease phenotype
explain what genetic heterogeneity is
this is when different disease genotypes are responsible for the same disease in different families
explain what polygenic determination is
this is when mutant alleles at more than one locus influence disease expression in one person
explain what complex inheritance is
this is when 2 unlinked disease loci are inherited together to predispose someone to the disease
e.g. breast cancer
- inheritance of the 2 unlinked disease loci BRCA1 and BRCA2 can predispose women to breast cancer
- it is incompletely penetrant because not all women who have mutant BRCA1/BRCA2 develop breast cancer
- note that these 2 genes encode tumour suppressor proteins
what were the aims of the Kirov paper
- to find novel CNVs associated with schizophrenia (SCZ)
- to compare SCZ CNVs with CNVs in other diseases
- to compare novel vs inherited CNVs
- to understand the pathophysiology behind SCZ
what were the cohorts used in the Kirov study
- Bulgarian case-only parent-proband trios
- Icelandic control-only parent-proband trios (note that this group has a controlled gene pool because it is geographically isolated)
- data from an ASD case-control study
- data from publicly available datasets
what is gene ontology
this is the process of determining the function of genes in 3 aspects
1. molecular function
2. cellular components
3. biological processes
- look specifically at the pathways in which these genes are involved, i.e. gene set enrichment analysis (GSEA)
- the GSEA finds pathways that are enriched in the disease of interest
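The idea of a pathway being "enriched" can be illustrated with a simple over-representation test based on the hypergeometric distribution. Note that this is a simpler stand-in, not the rank-based GSEA algorithm itself. It assumes every pathway gene and every hit gene belongs to the background set, and the function name and arguments are hypothetical.

```python
from math import comb

def enrichment_p_value(genes_in_pathway, hit_genes, background_size):
    """Chance of seeing at least the observed overlap between the hit genes
    and the pathway if the hits were drawn at random from the background."""
    pathway = set(genes_in_pathway)
    hits = set(hit_genes)
    overlap = len(pathway & hits)
    pathway_size = len(pathway)
    n_hits = len(hits)
    # Sum hypergeometric probabilities for overlap, overlap + 1, ... up to the maximum.
    favourable = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, n_hits - i)
        for i in range(overlap, min(pathway_size, n_hits) + 1)
    )
    return favourable / comb(background_size, n_hits)
```

A small p-value suggests the disease-associated genes land in that pathway more often than chance would predict.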
define de novo
a de novo genetic variant is one that arises for the first time in a family, due to a mutation in a germ cell of one of the parents
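In a parent-proband trio, this definition turns into a set comparison: a variant is a de novo candidate if the child carries it and neither parent does. The sketch below matches variants by exact identifier, which is a simplification (real CNV calls are usually matched by overlapping coordinates). The identifiers are invented.

```python
def de_novo_variants(child_variants, mother_variants, father_variants):
    """Variants seen in the child but in neither parent."""
    return set(child_variants) - set(mother_variants) - set(father_variants)

# Hypothetical trio
child = {"cnv_1", "cnv_2", "cnv_3"}
mother = {"cnv_1"}
father = {"cnv_3"}
candidates = de_novo_variants(child, mother, father)
```

Here only `cnv_2` is a de novo candidate; the other two were inherited.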
what were the results of the Kirov paper
- they found more rare/de novo CNVs occurring in SCZ cases than in controls
- they found 34 de novo CNVs associated with SCZ
- 8 of these 34 de novo CNVs were found at known SCZ loci; the rest were found at new loci not previously associated with SCZ
- some of the CNVs they found in SCZ were also associated with other disorders
- some CNVs were located in genes that function in the postsynaptic density pathway
what is a haplotype
this is a group of SNPs located on a single chromatid that are statistically associated, i.e. inherited together
what is a tag SNP
a tag SNP is a single SNP that represents a group of SNPs, i.e. a haplotype
- it is in high linkage disequilibrium with the other SNPs
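Linkage disequilibrium between two SNPs is often summarised as r squared, which runs from 0 (independent) to 1 (one SNP perfectly predicts the other). A minimal sketch, assuming phased haplotypes where each SNP is coded 0 or 1 and both SNPs are polymorphic in the sample:

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs.

    `haplotypes` is a list of (allele at SNP A, allele at SNP B) pairs,
    one pair per chromosome, with alleles coded as 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical data: the two SNPs always travel together
linked = [(1, 1)] * 4 + [(0, 0)] * 6
# Hypothetical data: every combination equally common
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
```

A good tag SNP is one whose r squared with the other SNPs in its group is close to 1, so genotyping it alone tells you the rest.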
what is a Manhattan plot
this plot shows the position of every SNP across all the chromosomes (x-axis) against the strength of its association with a disease (y-axis, usually -log10 of the p-value)
- the higher up a SNP sits, the stronger the evidence that it is associated with the disease
- the SNPs sitting at the bottom show no significant association with the disease
- where we see large peaks, the SNPs in the peak all give the same signal because they are associated with one another, i.e. they represent a haplotype; the peak indicates that this haplotype in this genomic region has a high association with the disease
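The height of each point is usually computed as -log10 of the SNP's association p-value, so smaller p-values sit higher. The sketch below prepares the points without drawing them. The 5e-8 cutoff is the conventional genome-wide significance line, which the cards do not mention, and the SNP names and p-values are invented.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # conventional cutoff, an assumption here

def manhattan_points(results):
    """Turn (snp_id, chromosome, position, p_value) tuples into plot points,
    ordered along the genome."""
    points = []
    for snp_id, chrom, pos, p_value in results:
        points.append({
            "snp": snp_id,
            "chrom": chrom,
            "pos": pos,
            "height": -math.log10(p_value),
            "significant": p_value < GENOME_WIDE_THRESHOLD,
        })
    return sorted(points, key=lambda point: (point["chrom"], point["pos"]))

# Hypothetical results
results = [
    ("snp_b", 2, 5000, 0.5),
    ("snp_a", 1, 1200, 1e-9),
    ("snp_c", 2, 100, 1e-3),
]
points = manhattan_points(results)
```

Only `snp_a` clears the threshold here; plotting `height` against genomic order gives the skyline shape the plot is named after.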
how do we deal with multiple testing
- need replication
- make note of the false discovery rate
- use statistical correction
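Two common forms of statistical correction are the Bonferroni correction and the Benjamini-Hochberg procedure, which controls the false discovery rate. The cards do not name a specific method, so these are offered as typical examples rather than the ones intended. The p-values are invented.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a test only if its p-value is at most alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Remember the largest rank whose p-value passes its own threshold.
        if p_values[i] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected

# Hypothetical p-values from 5 tests
p_values = [0.001, 0.012, 0.025, 0.041, 0.6]
```

With these values Bonferroni keeps only the first result, while Benjamini-Hochberg keeps the first three, which shows why Bonferroni is described as the more conservative of the two.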
how do we assess a GWAS publication
- make sure the results have been independently replicated using different methods/sample groups
- ensure a big enough sample size has been used
- make sure quality control has been done
- check for any confounders, i.e. variables besides the disease that may differ between the control and case groups