Association Analysis Flashcards

1
Q

What is genetic association?

A

The presence of a variant allele at a higher frequency in unrelated subjects with a particular disease (cases), compared to those that do not have the disease (controls).

2
Q

What is an allele?

A

One form of a variant in the genome

3
Q

What is a locus?

A

A position in the genome

4
Q

What is a genotype?

A

Both alleles at a locus, e.g. locus 1: 1,4 and locus 2: 1,1

5
Q

What is a haplotype?

A

This is the order of alleles along a chromosome

6
Q

Why are case-control studies used?

A
  • Cases are subjects with the disease of interest e.g. obesity, schizophrenia, hypertension.
  • Definition of the disease must be applied in a rigorous and consistent way
  • Controls must be as well-matched as possible for non-disease traits such as age, sex, ethnicity, location etc.
7
Q

What is case-control association?

A

A gene variant is associated with the disease when it occurs at a different frequency in cases versus controls.

8
Q

Describe how the case control study works

A

There are two groups:

  • Affected cases
  • Unaffected controls

Then measure the genetic loci of interest

Statistical analysis to determine which genetic loci correlate with disease

Identify genomic region associated with disease

9
Q

What is needed in a case-control genetic study?

A
  • Large number of well-defined cases
  • Equal numbers of matched controls
  • Reliable genotyping technology (SNP array)
  • Standard statistical analysis (PLINK)
  • Positive associations should be replicated
10
Q

What is the ideal genetic marker?

A
  • Polymorphic
  • Randomly distributed across the genome
  • Fixed location in genome
  • Frequent in genome
  • Frequent in population
  • Stable with time
  • Easy to assay (genotype)
11
Q

What is a SNP?

A
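  • A single nucleotide polymorphism: a single-base difference in DNA sequence between individuals
  • Generated by errors in mismatch repair during DNA replication (before mitosis)
  • Common in the genome: about 1 in every 300 nucleotides
  • About 12 million common SNPs identified in the human genome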
12
Q

How do SNPs arise?

A
  • DNA strands separate and are replicated before the cell undergoes mitosis.
  • One DNA strand replicates correctly.
  • The other DNA strand replicates, but a mismatched base is incorporated.
  • Usually the mismatch would be corrected by the mismatch repair system.
  • Occasionally, rather than replacing the mismatched base, the mismatch repair system replaces the base opposite it on the original strand.
  • This becomes the SNP, e.g. a T/C SNP.
13
Q

Where are SNPs located?

A

In the Gene coding region:

  • No amino acid change (synonymous)
  • Amino acid change (non-synonymous)
  • New stop codon (nonsense)

In the Gene non-coding region:

  • Promoter - mRNA and protein level changed
  • Terminator - mRNA and protein level changed
  • Splice site - altered mRNA, altered protein

In the intergenic region

14
Q

What is the dbSNP?

A

It is an online database at NCBI of single nucleotide polymorphisms (SNPs) and multiple small-scale variations that include insertions/deletions, microsatellites, and non-polymorphic variants.

15
Q

What is the minor allele?

A

It is the less common allele. Each allele has a frequency in the general population, and the frequency of the minor allele is called the minor allele frequency (MAF).

16
Q

What does the Minor AF + Major AF add up to?

A

1
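
As a small worked sketch of this, the two allele frequencies at a biallelic locus can be counted from genotype counts. The function names and the genotype counts (360 AA, 480 AB, 160 BB) are hypothetical, chosen only for illustration.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Return (freq_A, freq_B) from genotype counts at a biallelic locus."""
    # Every individual carries two alleles at the locus.
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("need at least one genotyped individual")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    freq_b = (2 * n_bb + n_ab) / total_alleles
    return freq_a, freq_b


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """The MAF is the smaller of the two allele frequencies."""
    return min(allele_frequencies(n_aa, n_ab, n_bb))


# Hypothetical genotype counts for 1000 people.
freq_a, freq_b = allele_frequencies(n_aa=360, n_ab=480, n_bb=160)
maf = minor_allele_frequency(360, 480, 160)
print(freq_a, freq_b, maf)
```

With these counts allele A has frequency 0.6 and allele B has frequency 0.4, so B is the minor allele and the two frequencies sum to 1.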

17
Q

What is a genome wide association study (GWAS)?

A

An association study that uses markers across the whole genome, rather than at a few candidate loci.

18
Q

What do SNP microarrays do?

A
  • Look for association between disease and each marker - chi-squared test
  • This has resulted in the detection of large numbers of disease-associated genes
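
The chi-squared test mentioned above can be sketched for a single marker as a 2x2 table of allele counts in cases and controls. This is a minimal illustration using only the standard library, not how PLINK or any particular tool implements it; the function name and the allele counts are hypothetical.

```python
import math


def allelic_chi_squared(case_minor, case_major, control_minor, control_major):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 allele-count table."""
    observed = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_minor + control_minor, case_major + control_major]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column of the table needs a non-zero total")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if allele and disease status were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: minor allele seen 300/1000 times in cases, 200/1000 in controls.
chi2, p_value = allelic_chi_squared(300, 700, 200, 800)
print(round(chi2, 2), p_value)
```

In a GWAS this kind of test is repeated for every marker on the array, which is why the significance thresholds used are much stricter than 0.05.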
19
Q

How is GWAS data presented?

A

It is presented as a single graph called a Manhattan plot.

20
Q

What is the X-axis and Y-axis in a Manhattan plot?

A
  • X-axis is the position of the SNP on the chromosome
  • Y-axis is -log10(p-value) for the association at that SNP
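
A rough sketch of how these two axes could be computed from association results is shown here. The function name, the toy chromosome lengths and the example p-values are all hypothetical; real data would use actual chromosome lengths and a plotting library.

```python
import math


def manhattan_coordinates(results, chromosome_lengths):
    """Turn (chromosome, position, p_value) tuples into (x, y) plot points.

    chromosome_lengths is a dict listed in plotting order; chromosomes are
    laid end to end along the x-axis.
    """
    offsets = {}
    running = 0
    for chrom, length in chromosome_lengths.items():
        offsets[chrom] = running
        running += length
    points = []
    for chrom, position, p_value in results:
        x = offsets[chrom] + position   # genome-wide position
        y = -math.log10(p_value)        # small p-values become tall peaks
        points.append((x, y))
    return points


# Toy example: two short "chromosomes" and two SNPs (p-values must be > 0).
lengths = {"1": 1000, "2": 800}
snps = [("1", 250, 0.01), ("2", 100, 1e-8)]
points = manhattan_coordinates(snps, lengths)
print(points)
```

The -log10 transform is what makes strongly associated SNPs stand out: a p-value of 0.01 plots at a height of about 2, while 1e-8 plots at about 8.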

21
Q

What is a Manhattan plot?

A

A simple way to visualise the markers across the genome associated with the disease.

22
Q

What is the WTCCC?

A

It is the Wellcome Trust Case Control Consortium

  • Contains 1958 Birth Cohort and the UK blood service as the controls.
  • Looks at cases of CAD, Type 1 and 2 diabetes, hypertension, rheumatoid arthritis, Crohn’s disease and bipolar disorder
23
Q

What do the peaks indicate in Manhattan plots?

A

Significant p-values, here p < 5 x 10^-5 (many GWAS use the stricter genome-wide threshold of p < 5 x 10^-8)

24
Q

What are some misconceptions of the peaks in GWAS results?

A
  • The peak does not identify the gene causing the disease
  • The peak identifies the genomic region associated with the disease

25
Q

What is another graph that can be used to show GWAS results?

A

Regional Association plot

26
Q

Advantages and Disadvantages of meta-analysis

A
  • Difficult to do very large studies (>10K cases)
  • Easier to combine smaller studies
  • Pre-experiment - consortium
  • Post-experiment - meta-analysis
  • Meta-analysis allows the statistical combination of results from multiple studies
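
One common way to combine results statistically is an inverse-variance weighted fixed-effect meta-analysis. The card does not say which method is meant, so this is an assumed example; the function name and the per-study effect sizes (log odds ratios) and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-study effect estimates (e.g. log odds ratios).
    standard_errors: the matching standard errors (all > 0).
    Returns (pooled_beta, pooled_se, two_sided_p_value).
    """
    # More precise studies (smaller standard error) get more weight.
    weights = [1 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, p_value


# Two hypothetical studies with the same effect and the same precision.
pooled_beta, pooled_se, pooled_p = fixed_effect_meta([0.2, 0.2], [0.1, 0.1])
_, _, single_p = fixed_effect_meta([0.2], [0.1])
print(pooled_beta, pooled_se, pooled_p, single_p)
```

Combining the two studies leaves the effect estimate unchanged but shrinks its standard error, so the pooled p-value is smaller than either study's alone. This is the sense in which combining smaller studies can stand in for one very large study.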
27
Q

What are the medical complications of obesity?

A
  • Pulmonary disease
  • Idiopathic intracranial hypertension
  • Stroke
  • Cataracts
  • Coronary heart disease
  • Severe pancreatitis
  • Diabetes
  • Cancer
  • Phlebitis
  • Gout
  • Osteoarthritis
  • Gynecologic abnormalities
  • Gall bladder disease
  • Nonalcoholic fatty liver disease
28
Q

What studies are used to investigate the genetic components of common obesity?

A
  • Twin studies
  • Adoption studies
  • Family Studies
29
Q

What did an obesity GWAS show?

A

Genes were found to be associated with waist size, extremes, fat mass and BMI, and the sets of associated genes overlapped.

30
Q

What is the problem with GWAS?

A
  • It has identified associations that are statistically strong and reproducible but their contribution to the genetic component of disease is estimated to be low (less than 5%)
  • For example, disease may in fact be caused by other things such as:
    • Common SNPs of small effect
    • Rare SNPs
    • Copy Number Variation
    • Epigenetic variation
    • Heritability is overestimated