How to handle the data from studies of complex disease Flashcards

1
Q

What does parametric linkage analysis determine?

A

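The chromosomal location of genetic determinants of disease (disease loci), found by testing whether the disease co-segregates with marker alleles within families.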

2
Q

How is a parametric linkage analysis set up?

A

Ascertain (a small set of) large families (pedigrees) each containing a number of affected individuals

Use a genotyping technique to measure the alleles (genotype) at one or more loci, in as many individuals as are available

Examine the co-segregation (co-transmission) of disease phenotype and alleles at the genetic marker loci

3
Q

What is genetic distance measured in?

A

Morgans (M) or centimorgans (cM), where 1 M = 100 cM.

4
Q

What is the connection between Morgans and recombination?

A

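The frequency of recombination between alleles at two loci is closely related to the physical distance between them. Over short distances, 1 cM corresponds to roughly a 1% chance of recombination per meiosis.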

5
Q

What symbol represents the probability of recombination between loci?

A

θ (Theta)

6
Q

What is the range of θ?

A

0 to 0.5

7
Q

What is the value of θ when the loci lie close together?

A

θ is small (close to 0) and the loci are said to be tightly linked; at θ = 0 they are completely linked.

8
Q

What is the value of θ when the loci are further apart?

A

θ approaches 0.5

Loci are said to be unlinked (alleles at the two loci are transmitted independently)
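
The way θ rises from 0 towards 0.5 as loci move apart can be illustrated with a map function. The sketch below assumes Haldane's map function (no crossover interference), which these cards do not name; the function name `haldane_theta` is made up for illustration.

```python
import math

def haldane_theta(d_morgans: float) -> float:
    """Recombination fraction for a map distance in Morgans,
    assuming Haldane's map function (an assumption, not from the cards)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

# Short distances give theta close to the distance itself;
# long distances give theta approaching (but never exceeding) 0.5.
for d_cm in (1, 10, 50, 200):
    theta = haldane_theta(d_cm / 100.0)  # convert cM to Morgans
    print(f"{d_cm:>4} cM -> theta = {theta:.3f}")
```

Under this assumption θ never exceeds 0.5, which matches the range given on the earlier card.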

9
Q

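What is the likelihood ratio test?

A

A test that compares the likelihood of the observed genotype and phenotype data in a set of families under two competing hypotheses (linkage versus no linkage). In practice a computer program calculates the likelihoods.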

10
Q

What does the likelihood ratio depend on?

A

How well the observations match the assumed model

11
Q

What is a LOD score?

A

The statistic used to test for linkage with a likelihood ratio test: the logarithm (base 10) of the odds, i.e. of the likelihood ratio.

12
Q

What does the LOD score test for?

A

Tests the null hypothesis that the disease locus lies far away from the genotyped marker locus.

13
Q

What is the null hypothesis in a LOD score test?

A

θ = 0.5 (unlinked)

14
Q

How is the likelihood ratio calculated in parametric linkage analysis?

A

LRmax = L(θ̂) / L(0.5)

15
Q

What is L(θ̂)?

A

The likelihood evaluated at θ̂, the value of θ that maximises the likelihood (makes the data ‘most likely’ to have occurred).

16
Q

How is the LOD score calculated from the likelihood ratio?

A

It is the log (base 10) of the likelihood ratio: LOD = log10(LRmax).
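
As a worked illustration, the sketch below computes a two-point LOD score in the simplest textbook setting: phase-known, fully informative meioses, where the likelihood is L(θ) = θ^k (1 − θ)^(n − k) for k recombinants out of n meioses. That setting and the function name `lod_score` are assumptions made for illustration; real pedigree likelihoods are more involved and are computed by dedicated software.

```python
import math

def lod_score(n_recombinant: int, n_meioses: int) -> float:
    """LOD = log10( L(theta_hat) / L(0.5) ) for phase-known meioses (simplifying assumption)."""
    k, n = n_recombinant, n_meioses
    theta_hat = min(k / n, 0.5)  # maximum-likelihood estimate, restricted to [0, 0.5]

    def log10_likelihood(theta: float) -> float:
        # Skip terms with a zero exponent so that theta = 0 is handled safely.
        value = 0.0
        if k > 0:
            value += k * math.log10(theta)
        if n - k > 0:
            value += (n - k) * math.log10(1.0 - theta)
        return value

    return log10_likelihood(theta_hat) - log10_likelihood(0.5)

print(round(lod_score(0, 10), 2))  # ten meioses, no recombinants
print(round(lod_score(1, 10), 2))  # ten meioses, one recombinant
```

With no recombinants, each informative meiosis adds log10(2) ≈ 0.3 to the LOD score, so about ten such meioses are needed to reach a score of 3.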

17
Q

What is considered a “Convincing” LOD score as evidence for linkage?

A

A LOD score of 3 or more.

18
Q

Why is 3 a “Convincing” LOD score?

A

Corresponds to a likelihood ratio of 1000

The data are 1000 times more likely under the alternative hypothesis (linkage) than under the null hypothesis (no linkage).

19
Q

How do we find the max LOD score?

A

Multipoint analysis.

We calculate the likelihood (or likelihood ratio) at different values of θ and take the value that gives the highest LOD score.
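
The idea of evaluating the LOD score at different values of θ can be sketched as a simple grid scan. The example assumes phase-known meioses and independent families, so that per-family LOD scores at a given θ can be added together; the family counts are invented purely for illustration.

```python
import math

# Hypothetical data: (recombinants, informative meioses) for each family.
families = [(0, 6), (1, 8), (2, 7)]

def family_lod(k: int, n: int, theta: float) -> float:
    """log10 of L(theta) / L(0.5) for one family with k recombinants in n meioses."""
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null

# Evaluate the total LOD on a grid theta = 0.01, 0.02, ..., 0.50.
grid = [i / 100 for i in range(1, 51)]
total = {theta: sum(family_lod(k, n, theta) for k, n in families) for theta in grid}

best_theta = max(total, key=total.get)  # grid value with the highest total LOD
max_lod = total[best_theta]
print(f"max LOD = {max_lod:.2f} at theta = {best_theta:.2f}")
```

At θ = 0.5 the total is 0 by construction, since the numerator and denominator likelihoods coincide.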

20
Q

How is multipoint analysis carried out in theory?

A

Use a set of marker loci whose genetic map positions are known, and assess the evidence for the disease locus lying at different positions along the genetic map.

21
Q

What does the LOD score at each position in a multipoint analysis correspond to?

A

The log (base 10) of the likelihood of the data assuming the disease locus lies at that position, divided by the likelihood of the data assuming the disease locus lies far away.

22
Q

How is multipoint analysis carried out in practice?

A

Using a computer program.

23
Q

What kinds of programs carry out multipoint analysis?

A

Merlin (smallish pedigrees, exact calculation)

SIMWALK or MORGAN (larger pedigrees, approximate calculation)

24
Q

What happens once you have your LOD score graph?

A

You narrow the candidate region further and further until you can pinpoint the disease locus.

25
What happens when a disease is genetically heterogeneous?
Only a proportion (α) of families is assumed to show linkage to a given locus.
26
What is the HLOD score?
The heterogeneity LOD score: a LOD score for a genetically heterogeneous disease, in which the proportion of linked families (α) is estimated along with θ by maximum likelihood.
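
A rough sketch of how α and θ might be estimated jointly is given below. It assumes the standard admixture form, in which each family contributes α·LR(θ) + (1 − α) to the likelihood ratio, together with phase-known meioses and a coarse grid search; none of these details come from the cards, and the family counts are invented.

```python
import math

def family_lr(k: int, n: int, theta: float) -> float:
    """Likelihood ratio L(theta) / L(0.5) for k recombinants in n phase-known meioses."""
    return (theta ** k) * ((1.0 - theta) ** (n - k)) / (0.5 ** n)

def hlod(families, alpha: float, theta: float) -> float:
    """Heterogeneity LOD under the admixture model (assumed form):
    a fraction alpha of families is linked, the rest are unlinked."""
    return sum(math.log10(alpha * family_lr(k, n, theta) + (1.0 - alpha))
               for k, n in families)

# Hypothetical families: two look tightly linked, two look unlinked.
families = [(0, 8), (0, 7), (4, 8), (3, 7)]

thetas = [i / 100 for i in range(1, 51)]  # start at 0.01 so every LR stays positive
alphas = [i / 20 for i in range(0, 21)]   # 0.00, 0.05, ..., 1.00

best_hlod, best_alpha, best_theta = max(
    (hlod(families, a, t), a, t) for a in alphas for t in thetas
)
print(f"HLOD = {best_hlod:.2f} at alpha = {best_alpha:.2f}, theta = {best_theta:.2f}")
```

Setting α = 1 recovers the ordinary LOD score, so the maximised HLOD is never smaller than the maximised LOD computed on the same grid.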
27
How successful have parametric linkage analysis studies been for monogenic disease?
Highly successful
28
How successful have parametric linkage analysis studies been for complex disease?
Less successful
29
What is the purpose of non-parametric linkage analysis?
To determine whether members of a family with “similar” trait values tend to share genetic material inherited from their common ancestors.
29
What are the aims of association studies?
Directly examine the association (correlation) between alleles present at a genetic locus and a phenotype of interest.
29
What is the most popular type of association study?
Case/control study (unrelated individuals)
30
How are association studies set up?
Collect a sample of affected individuals (cases) and unaffected individuals (controls).
Examine the correlation between alleles present at a genetic locus and presence/absence of disease by comparing the distribution of genotypes in affected individuals with that seen in controls.
31
Why is parametric linkage analysis more difficult for genetically heterogeneous diseases?
We cannot assume that all families share the same genetic cause, and therefore the same disease locus.
32
How do you test for association (correlation) between genotype and presence/absence of disease in a case/control study?
Use a standard χ² test for independence on 2 degrees of freedom (df), applied to the 2 × 3 table of case/control status by genotype.
33
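What is the χ² test for independence?
χ² = Σ (Observed − Expected)² / Expected, summed over every cell of the genotype-by-status table. The statistic is then compared with a χ² distribution on the appropriate df to obtain a p value.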
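
The χ² calculation can be sketched in plain Python for a 2 × 3 table of case/control status by genotype. The counts below are invented for illustration, and the p value uses the closed form exp(−χ²/2), which holds only for 2 df.

```python
import math

def chi2_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # expected count under independence
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical genotype counts (AA, Aa, aa) in cases and in controls.
counts = [[30, 50, 20],
          [50, 40, 10]]

stat, df = chi2_independence(counts)
p_value = math.exp(-stat / 2.0)  # survival function of chi-square, valid for df = 2 only
print(f"chi2 = {stat:.2f} on {df} df, p = {p_value:.4f}")
```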
34
What are more sophisticated ways to perform an association test?
Rearrange your data to test specifically for dominant or recessive effects.
Use linear regression for quantitative outcomes.
Use an x variable defined according to genotype (e.g. the number of copies of a given allele: 0, 1 or 2).
35
What is the null hypothesis for linear regression of an association test?
Slope = 0
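
A minimal sketch of the regression approach is shown below, assuming an additive coding in which x is the number of copies of one allele (0, 1 or 2) and y is a quantitative trait. The data and the function name `slope_test` are invented for illustration. The t statistic would be compared with a t distribution on n − 2 df to test the null hypothesis that the slope is 0.

```python
import math

def slope_test(x, y):
    """Least-squares slope of y on x and its t statistic for H0: slope = 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used to estimate the standard error of the slope.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se_slope

# Hypothetical data: allele counts and trait values for eight individuals.
genotype = [0, 0, 0, 1, 1, 1, 2, 2]
trait = [1.0, 1.2, 0.9, 1.6, 1.4, 1.5, 2.1, 1.9]

slope, t_stat = slope_test(genotype, trait)
print(f"slope = {slope:.3f}, t = {t_stat:.2f}")
```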
36
What are FBATs?
Family-based association tests
37
What is the TDT?
The transmission disequilibrium test.
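
For reference, the usual form of the TDT statistic can be sketched as follows. It counts how often heterozygous parents transmit a given allele to affected offspring (b) versus the other allele (c), and compares (b − c)² / (b + c) with a χ² distribution on 1 df. The counts are invented, and the function name `tdt` is only illustrative.

```python
import math

def tdt(transmitted: int, untransmitted: int):
    """TDT statistic and p value (chi-square on 1 df) from transmission counts
    of one allele by heterozygous parents to affected offspring."""
    b, c = transmitted, untransmitted
    stat = (b - c) ** 2 / (b + c)
    # For 1 df, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

stat, p_value = tdt(60, 40)  # hypothetical counts
print(f"TDT chi2 = {stat:.2f}, p = {p_value:.4f}")
```

Under the null hypothesis each allele is transmitted with probability 0.5, so b and c are expected to be roughly equal.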
38
What are LMMs?
Linear mixed models
39
How to analyse family based data?
Use family-based association tests (FBATs).
40
What kinds of family-based association tests (FBATs) are there?
The transmission disequilibrium test (TDT)
Linear mixed models (LMMs)
41
What software is used to analyse GWAS data?
PLINK, SNPTEST, GCTA
42
Why are stringent significance levels required in GWAS?
To overcome the multiple testing problem incurred when we test many SNPs throughout the genome; a commonly used genome-wide threshold is p < 5 × 10⁻⁸.
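
The scale of the multiple testing problem can be illustrated with a Bonferroni-style correction. The sketch assumes one million independent tests and a family-wise error rate of 0.05; both numbers are illustrative choices, not values taken from the cards.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

n_snps = 1_000_000  # assumed number of independent tests
threshold = bonferroni_threshold(0.05, n_snps)

# Without correction, about alpha * n_snps SNPs would pass by chance alone.
expected_false_positives = 0.05 * n_snps

print(f"per-SNP threshold: {threshold:.1e}")
print(f"expected false positives at p < 0.05: {expected_false_positives:.0f}")
```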
43
What quality control is required when carrying out a GWAS?
Discard samples (people) deemed unreliable.
Discard data from SNPs deemed unreliable.
44
What could make a sample be deemed unreliable?
Low genotype call rates (unsuccessful genotyping)
Excess heterozygosity (mix of samples)
Sex or ethnicity discrepancies (genotype data inconsistent with the recorded sex or ancestry)
45
What could make a SNP be deemed unreliable?
Low genotype call rates
Mendelian misinheritances
Hardy-Weinberg disequilibrium
Low minor allele frequency (MAF): such SNPs are excluded because, with few copies of the minor allele, genotype frequencies are hard to compare reliably between cases and controls.
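
A per-SNP quality control filter along these lines might look like the sketch below. The thresholds (call rate 0.95, MAF 0.01, Hardy-Weinberg p value 1e-6) and the function name `snp_qc` are illustrative assumptions only, and the Hardy-Weinberg check uses a simple 1 df χ² approximation on genotype counts.

```python
import math

def snp_qc(n_aa, n_ab, n_bb, n_missing,
           min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-6):
    """Return the reasons a SNP fails QC (empty list if it passes).
    Thresholds are illustrative assumptions. Assumes at least one called genotype."""
    failures = []
    n_called = n_aa + n_ab + n_bb

    call_rate = n_called / (n_called + n_missing)
    if call_rate < min_call_rate:
        failures.append("low call rate")

    p = (2 * n_aa + n_ab) / (2 * n_called)  # frequency of allele A
    maf = min(p, 1.0 - p)
    if maf < min_maf:
        failures.append("low MAF")

    if 0.0 < p < 1.0:
        # Expected genotype counts under Hardy-Weinberg equilibrium.
        expected = [n_called * p * p,
                    2 * n_called * p * (1.0 - p),
                    n_called * (1.0 - p) ** 2]
        chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
        hwe_p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 df
        if hwe_p < min_hwe_p:
            failures.append("HWE departure")

    return failures

print(snp_qc(250, 500, 250, 0))    # passes every check
print(snp_qc(500, 0, 500, 0))      # no heterozygotes at all
print(snp_qc(200, 400, 200, 200))  # many missing genotypes
```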