Module 2 - content Flashcards
(48 cards)
biallelic marker
marker with two alleles, giving 3 possible genotypes (e.g. AA, Aa, aa)
Pleiotropy
one gene influences two or more, often seemingly unrelated, phenotypes
Penetrance
percentage of people carrying an allele who express the associated phenotype
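The penetrance definition is a simple ratio, sketched below. The function name and the counts are hypothetical and only illustrate the arithmetic.

```python
def penetrance(carriers_with_phenotype: int, total_carriers: int) -> float:
    """Return the percentage of allele carriers who express the phenotype."""
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    return 100.0 * carriers_with_phenotype / total_carriers


# Hypothetical counts: 80 of 100 carriers show the phenotype.
print(penetrance(80, 100))
```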
Linkage disequilibrium
non-random association of alleles at 2 or more loci; stronger LD = less recombination between the loci
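A minimal sketch of how linkage disequilibrium between two biallelic loci can be quantified, assuming phased haplotypes are available. It uses the common measures D (observed minus expected haplotype frequency) and r² (D² scaled by the allele frequencies); the allele labels A/a and B/b and the toy data are made up for illustration.

```python
from collections import Counter


def ld_stats(haplotypes):
    """Compute D and r^2 for two biallelic loci.

    haplotypes: list of 2-character strings such as "AB" or "ab", where the
    first character is the allele at locus 1 and the second at locus 2.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_ab = counts["AB"] / n                                          # freq of A-B haplotype
    p_a = sum(c for h, c in counts.items() if h[0] == "A") / n       # freq of allele A
    p_b = sum(c for h, c in counts.items() if h[1] == "B") / n       # freq of allele B
    d = p_ab - p_a * p_b                                             # 0 under random association
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Toy data: alleles always travel together (complete LD) vs. all combinations equally common.
complete_ld = ["AB"] * 50 + ["ab"] * 50
no_ld = ["AB"] * 25 + ["Ab"] * 25 + ["aB"] * 25 + ["ab"] * 25
print(ld_stats(complete_ld))
print(ld_stats(no_ld))
```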
Haplotype
specific combination of alleles occurring together on the same chromosome
Heritability
proportion of phenotypic variance in a population that is explained by genetic variation
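As a rough illustration, heritability can be written as the ratio of genetic variance to total phenotypic variance. The sketch below assumes the simplest model, in which phenotypic variance is just genetic plus environmental variance; the function name and the variance values are hypothetical.

```python
def heritability(genetic_variance: float, environmental_variance: float) -> float:
    """Ratio of genetic variance to total phenotypic variance (V_G / V_P).

    Assumes V_P = V_G + V_E, i.e. no gene-environment interaction or covariance.
    """
    phenotypic_variance = genetic_variance + environmental_variance
    if phenotypic_variance <= 0:
        raise ValueError("total phenotypic variance must be positive")
    return genetic_variance / phenotypic_variance


# Hypothetical variance components: V_G = 2.0, V_E = 3.0.
print(heritability(2.0, 3.0))
```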
Genetic variation associated with disease
can be the sole cause of disease or it can increase the risk of common diseases
GWAS aim
detect associations between genetic variants and phenotypes in a population
GWAS goal
- better understanding of biology
- develop treatments or interventions
GWAS reality
often identifies associated markers rather than causal variants
Steps of a GWAS
- cohort selection
- data collection
- genotyping
- quality control
- imputation
- association testing
- meta-analysis
- replication
- post-GWAS analysis
GWAS cohort selection factors
- population based GWAS
- Family based GWAS
- need to consider age, sex, ethnicity, country, school etc
Population based GWAS selection
- genotyping and phenotyping done on individuals who are randomly chosen from the population
- usually case-control studies = presence or absence of a certain phenotype
- active recruitment of cases and controls can increase statistical power
- cases and controls should be genotyped together on the same chip to increase quality and statistical power
Family based GWAS selection
- needs a greater sample size to achieve the same statistical power as a study of unrelated individuals
- avoids population stratification
Data collection
- can be from cohorts or biobanks
- usually need a large sample size for good statistical power
- biobanks = data from thousands of genotyped individuals who have been phenotyped via questionnaires etc
UK Biobank
- exome and whole genome sequencing
- investigates the contributions of genetic variation and environmental exposure to the development of disease
- participants are healthier and more educated than the general population
- predominantly European ancestry
China Kadoorie Biobank
- investigates the genetic and environmental causes of common chronic diseases in the Chinese population
- has a custom genotyping platform
Genotyping techniques
- microarrays
- next generation sequencing for whole genome or whole exome sequencing
- custom SNP arrays
- commercial SNP arrays
Custom SNP arrays
- expensive
- allow increased genomic coverage for populations other than European
Commercial SNP arrays
- cheaper as they are already made
- usually designed around European populations, so other ancestries are poorly covered
Quality control
- removal of rare variants
- ensure phenotypes are well matched with genetic data (e.g. reported sex vs sex chromosomes)
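The two quality control checks above can be sketched as follows. This is a toy illustration, not how any particular QC tool works: genotypes are assumed to be coded as 0/1/2 copies of the alternate allele, the 1% minor allele frequency cutoff and the 5% X-chromosome heterozygosity cutoff are assumed values, and all variant and sample names are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """genotypes: list of 0/1/2 alternate-allele counts, None for missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))   # each person carries 2 alleles
    return min(alt_freq, 1 - alt_freq)


def filter_rare_variants(variants, maf_threshold=0.01):
    """Keep variants whose minor allele frequency is at least maf_threshold (assumed cutoff)."""
    return {name: g for name, g in variants.items()
            if minor_allele_frequency(g) >= maf_threshold}


def sex_mismatches(samples, male_het_cutoff=0.05):
    """Flag samples whose reported sex disagrees with the sex inferred from X-chromosome calls.

    Males have a single X, so their X genotype calls show almost no heterozygosity.
    male_het_cutoff is an assumed, illustrative threshold.
    """
    flagged = []
    for s in samples:
        inferred = "M" if s["x_heterozygosity"] < male_het_cutoff else "F"
        if inferred != s["reported_sex"]:
            flagged.append(s["id"])
    return flagged


# Hypothetical data.
variants = {"rs_common": [0, 1, 2, 1, 0, 1], "rs_rare": [0] * 199 + [1]}
samples = [
    {"id": "s1", "reported_sex": "M", "x_heterozygosity": 0.01},
    {"id": "s2", "reported_sex": "F", "x_heterozygosity": 0.02},
    {"id": "s3", "reported_sex": "F", "x_heterozygosity": 0.30},
]
print(sorted(filter_rare_variants(variants)))
print(sex_mismatches(samples))
```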
Imputation
- allows the evaluation of genetic markers which are not directly genotyped
- increases the power of GWAS as it includes SNPs which are poorly targeted on chips and may otherwise be missed
- done using a sequenced haplotype reference panel such as the 1000 Genomes Project
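The idea behind imputation can be illustrated with a deliberately simplified sketch: find the reference haplotype that best matches the sample at the genotyped positions, then copy its alleles into the untyped positions. Real imputation software uses probabilistic models over many reference haplotypes; the nearest-match rule, the 0/1 allele coding, and the tiny panel below are all assumptions made for illustration.

```python
def impute(target, reference_panel):
    """Fill untyped positions ('?') in target from the best-matching reference haplotype.

    target: string of '0', '1' or '?' (untyped marker).
    reference_panel: list of fully typed haplotype strings of the same length.
    """
    typed = [i for i, allele in enumerate(target) if allele != "?"]

    def matches(ref):
        # Number of genotyped positions where the reference agrees with the target.
        return sum(ref[i] == target[i] for i in typed)

    best = max(reference_panel, key=matches)
    return "".join(best[i] if allele == "?" else allele
                   for i, allele in enumerate(target))


# Hypothetical reference panel and a sample typed at positions 0, 2 and 4 only.
panel = ["00110", "11001", "01010"]
print(impute("0?1?0", panel))
```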
Limitation of sequenced haplotype reference panels
excludes indigenous populations
Association testing
- linear or logistic regression models
- includes covariates such as age, sex and ancestry
- can include a random effect term (mixed model) to account for relatedness and increase statistical power
- consider linkage disequilibrium
- account for false discoveries from multiple testing (e.g. a genome-wide significance threshold)
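A minimal sketch of a single-SNP association test with a multiple-testing correction. Note the swap: instead of the regression models listed above, it uses a simple allelic chi-square test on a 2x2 table of allele counts, so it cannot adjust for covariates such as age, sex or ancestry. The allele counts are hypothetical, and the Bonferroni correction over one million tests is an assumption chosen because it yields the conventional 5e-8 genome-wide threshold.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi2, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Hypothetical allele counts for one SNP: cases 60 alt / 40 ref, controls 40 alt / 60 ref.
chi2, p = allelic_chi_square(60, 40, 40, 60)
threshold = bonferroni_threshold(0.05, 1_000_000)
print(chi2, p, p < threshold)
```

In this toy example the SNP looks associated at a nominal level but does not pass the genome-wide threshold, which is the point of correcting for the number of tests.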