Finding Disease Genes Flashcards Preview

Flashcards in Finding Disease Genes Deck (71):
1

Candidate Gene Association Study

Hypothesis driven approach

Uses markers to test gene/causal variant indirectly

Depends on a priori biological or positional hypothesis (almost always wrong!)

Fatal flaws lead to false positives

2

Genome Wide Association Study

Hypothesis Free approach

Rather than look gene by gene (candidate gene association study) we could do the whole genome at one time!

Search for SNPs with significantly different allele frequencies in cases versus controls

3

Genetic linkage study

Hypothesis free approach

Search the genome for segments disproportionately co-inherited along with disease in multiplex families

Assumes affected relatives within a family share disease susceptibility genes "identical by descent"

4

Exome/Genome Sequencing Study

Sequence
Compare to reference
Look at common anomalies

5

Single gene sequencing

Sequence hypothesized gene

Most hypotheses wrong

6

What does genome mapping require?

Polymorphic DNA markers

7

Do we sequence the entire genome?

No... still too expensive

8

What do polymorphic DNA markers do for us?

They provide "Sign posts" where we can look at differences

9

Polymorphic DNA markers are surrogates for what?

Potential disease mutations

10

What are three commonly used marker types?

Microsatellites
SNPs
CNVs

11

Gene Mapping: what are physical maps?

Maps that tell us absolute positions - this is here and this is here

12

Gene Mapping: what are genetic maps?

Relative maps based on recombination - across a whole population roughly how far apart are these two things/sequences from each other

13

Microsatellites

Simply a repeat sequence in the genome for which the copy number varies
Simple sequence repeats
Used in forensics
Multi-allelic

14

Single nucleotide polymorphisms

Bi-allelic
Used for association studies

Occurrence/allele frequencies differ in ethnic groups/populations

SNPs occur in local context (haplotype) of surrounding SNPs

15

How frequent are SNPs?

About 1 per 50-300 bp

16

SNP haplotypes

Recombination breaks macro-pattern of polymorphic genotypes on the same chromosome into blocks in which SNP alleles are in linkage disequilibrium (markers within blocks tend to be co-inherited because recombination within blocks is uncommon)

17

If you genotype enough SNPs to identify a haplotype you can impute other variation that wasn't genotyped and use this to infer?

That the causal variation lies on this haplotype, even though the genotyped SNP may not be the causal variant

18

Copy Number variants

Common genomic deletions
Bi-allelic
Multi-allelic
Unique
Most not causal for human disease

19

If we have a common disease allele that has a small effect, what studies are best suited to hunt for the disease gene?

Association
Candidate gene or GWAS

20

If we have a rare disease allele that has a large effect, which studies are best suited to hunt for the disease gene?

Linkage
Sequencing
(track genes through families using linkage)

21

To track things that are common but have relatively little effect we use which type of studies?

Association

22

To track big effect genes that are relatively rare we use which kind of study?

Linkage in families

23

Hypothesis Driven Studies

Candidate DNA Sequencing
Candidate Gene association

24

Candidate gene DNA sequencing
Where do we come up with our candidate?

biological or positional
"hit" from GWAS or other mapping method

25

When do Candidate DNA studies work?

Single gene Mendelian diseases

26

Candidate DNA sequencing, are most hypotheses correct?

NO! most are wrong!

27

Which type of study uses markers to test gene/causal variant indirectly?

Candidate gene association studies

28

Which genetic study is the most common?

Candidate gene association

29

What do candidate gene association studies depend on?

A prior hypothesis

30

What are fatal flaws of candidate gene association studies and what do they lead to?

1. Proper multiple-testing correction is impossible
2. Ethnic matching of cases and controls is impossible
Both lead to false positives!

31

Concept behind Candidate Gene Association study?

Causal disease variation in candidate gene is tagged by local haplotype of polymorphic DNA markers in Linkage Disequilibrium

Depends on Linkage disequilibrium

32

Candidate gene association studies depend on linkage disequilibrium - in that?

DNA sequence variations close together on the same piece of DNA will tend to not be separated by recombination over long periods and so will be non-randomly co-inherited
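
As a rough illustration of what "non-randomly co-inherited" means numerically, the sketch below computes two common linkage disequilibrium summaries, D and r-squared, for a pair of bi-allelic markers. The function name, argument names, and any frequencies passed in are hypothetical and are not taken from the deck.

```python
def linkage_disequilibrium(p_ab_hap, p_a, p_b):
    """Return (D, r_squared) for two bi-allelic markers.

    p_ab_hap: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: population frequencies of allele A and allele B
    """
    # D is the excess of the observed haplotype frequency over what
    # independent (random) co-inheritance of the two alleles would give.
    d = p_ab_hap - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic")
    return d, d * d / denom
```

With these definitions, D = 0 means the two alleles are inherited independently, and r-squared = 1 means one marker perfectly predicts the other.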

33

Candidate gene association studies - approach
What kind of study design?

Case control

34

Candidate gene association studies - (2)

1. Genotype marker in candidate gene in cases and controls
2. Compare allele frequencies in cases and controls

35

G.A.S.
Study size?

Hundreds

36

G.A.S. stats?

Uses simple stats (chi-square, Fisher's exact) p
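
A minimal sketch of the allele-frequency comparison, assuming a 2x2 table of allele counts (two alleles, cases vs. controls) and a Pearson chi-square test with 1 degree of freedom and no continuity correction. The function name and the counts you pass in are hypothetical; only the Python standard library is used.

```python
import math


def allele_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test on a 2x2 table of allele counts.

    Rows are cases and controls; columns are the two alleles.
    Returns (chi_square_statistic, p_value).
    """
    table = [[case_a, case_b], [ctrl_a, ctrl_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_a + ctrl_a, case_b + ctrl_b]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if allele frequency were the same in both groups.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For small counts, Fisher's exact test is usually preferred over the chi-square approximation.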

37

Genetic association studies - because we test multiple variants - what must we do?

We must apply multiple-testing correction

38

G.A.S.
What does an association imply?

Not causation but does imply at least linkage disequilibrium with causal mutation

39

What is the issue with multiple-testing correction in G.A.S.?

You have to take into account every variant of every study ever done, which is unrealistic
Take that number and divide your p-value threshold by the number of variants to get the new significance threshold, which will be much, much lower.

40

G.A.S.
2 Fatal Flaws...

1. True multiple testing correction must include all tests, even those done by others and perhaps never published

2. Must ethnically match cases and controls; otherwise, observed differences in allele frequencies may reflect different genetic backgrounds of cases and controls, not true disease association. Perfect matching is not possible to achieve

41

Why can't we ethnically match cases and controls in G.A.S.?

Because even in a homogeneous population, occult population differences (stratification) can lead to false positives

42

What percent of published (3x confirmed) genetic association studies ultimately appear to be false positives due to stratification and publication bias?

96%

43

Is genetic linkage analysis hypothesis free?

Yes!
Search genome for segments disproportionately co-inherited along with disease in "multiplex" families

44

What is the underlying assumption in genetic linkage analyses?

Affected relatives within a family share disease susceptibility genes
"identical by descent"

45

What traits are best suited for genetic linkage analysis?

Mendelian (uncommon alleles with strong effects)

46

Genetic linkage analysis for complex traits?

Less powerful

47

What can be said to be a search across the genome for marker(s) that co-segregate with disease in families?

Genetic linkage analysis

48

What does genetic linkage depend on?

Principle depends on recombination - Loci close to each other (marker and gene) on a chromosome tend not to be separated by recombination vs. loci far apart

49

What is the unit of genetic linkage/recombination?

centiMorgan (cM)
1 cM = 1% recombination between two loci per meiosis

50

What is the statistical measure of linkage in genetic linkage analysis?

LOD (log of odds) score

51

LOD =

Log10 (likelihood of data if loci linked at _cM / likelihood of data if loci unlinked)
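
The ratio above can be sketched for the simplest textbook case. This assumes phase-known, fully informative meioses in which each one can be scored as recombinant or non-recombinant, so the likelihood at recombination fraction theta is theta^k * (1 - theta)^(n - k), and the unlinked likelihood uses theta = 0.5. The function and parameter names are hypothetical; real pedigree analysis is considerably more involved.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score for `recombinants` out of `meioses` at recombination fraction theta.

    theta = 0.5 corresponds to unlinked loci; for short distances
    theta = 0.01 corresponds to roughly 1 cM.
    """
    if not 0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")

    non_recombinants = meioses - recombinants
    likelihood_linked = (theta ** recombinants) * ((1 - theta) ** non_recombinants)
    likelihood_unlinked = 0.5 ** meioses

    if likelihood_linked == 0:
        # The data are impossible at this theta (e.g. recombinants seen at theta = 0).
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)
```

Under these assumptions, ten meioses with no recombinants evaluated at theta = 0 give a LOD of 10 * log10(2), which is just above 3.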

52

What is the significance level for genetic linkage analysis for LOD score?

A LOD score ≥ 3 is considered proof of linkage/gene localization

53

How do we localize a gene using genetic linkage analysis?

We can follow ancestral haplotypes of linked marker alleles in each family

Through the generations, recombination events prune the haplotype --> Localizing the gene

54

What are you looking for in genetic linkage analysis?

You are looking for a region of the genome where there is something that seems to be shared among affected relatives that you assume has been inherited from a common ancestor given the family structure

55

What kind of studies are GWAS?

Case-control

56

What do GWAS do?

Test hundreds of thousands / millions of markers (SNPs) across the entire genome

57

What are we looking for in GWAS?

SNPs with significantly different allele frequencies in cases vs. controls

58

Do we still need to match cases and controls ethnically?

Yes, and we are met with the same stratification problem; however, now we can accurately measure and correct for population stratification (using whole-genome data)

59

GWAS are we still faced with multiple testing correction issue?

No! We know the number of tests performed genome-wide, so we can perform appropriate multiple-testing corrections (usually assume one million tests, so p
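
A minimal sketch of a Bonferroni-style correction, which divides the per-study significance level by the number of tests. The starting level of 0.05 is an assumption chosen for illustration and is not stated on the card; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumed family-wise level of 0.05 spread over one million tests.
genome_wide_threshold = bonferroni_threshold(0.05, 1_000_000)
```

Under that assumption the per-SNP threshold comes out at about 5e-8, which shows why a known, fixed number of tests makes the correction tractable.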

60

Because we have a huge multiple-testing correction in GWAS, how big must our study be?

Usually at least 1000 cases and controls

61

What happens if we find a significant association in a GWAS study?

We require confirmation by independent replication by follow up association study of specific SNPs

62

When are GWAS most effective?

Common alleles with moderate effect sizes (odds ratios of 1.15 to 1.5)
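
To make the effect-size scale concrete, here is a small sketch of an allelic odds ratio computed from a 2x2 table of allele counts. The function name and the example counts are hypothetical and chosen only to land inside the range quoted above.

```python
def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other):
    """Odds of carrying the risk allele in cases divided by the odds in controls."""
    if case_other == 0 or ctrl_risk == 0:
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    return (case_risk * ctrl_other) / (case_other * ctrl_risk)


# Hypothetical counts: risk allele frequency 23% in cases vs. 20% in controls.
example_or = allelic_odds_ratio(230, 770, 200, 800)
```

A frequency difference of only a few percentage points already corresponds to an odds ratio in this range, which is part of why large samples are needed.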

63

What limits GWAS?

Sample size

64

What is the hope of GWAS?

That we will be able to determine genetic architecture of disease - infer the biological pathway - and then separate/recategorize disease based on pathway it follows - which could significantly increase our odds ratio - because it would no longer be diluted by irrelevant pathways that cause same disease

We are trying to tease out genes specific to pathways

65

Are most GWAS findings coding?

No, most are regulatory in nature, which is good because it may be easier to target and treat

66

Which investigation combines hypothesis based and hypothesis free approaches?

Deep re-sequencing

67

What is Deep re-sequencing?

High throughput DNA sequencing
- of Biological candidate genes
- from GWAS signals
- Full genome or exome

68

What is a problem of Deep Sequencing?

Difficult to distinguish potentially causal variants from non-pathologic
- Prioritize for follow-up functional analysis

Variants of unknown significance

69

Exome Genome Sequencing
How it works?

Pull down (capture) the exome or genome
Sequence
Compare to reference
See what's different

70

Is noise an issue with Exome/Genome Sequencing?

Yes! There is a lot of noise

71

Exome/Genome Sequencing Filtering Schemes

Make the assumption that affected individuals share similar things -
Sequence affected patients / affected family

Look for something rare

Find something? Look at catalogs

Do LOD score
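
The first filtering steps above can be sketched as simple set operations: keep variants shared by every affected individual, then drop anything that a catalog lists as common. The function name, the variant identifiers, the catalog structure (a dict of population frequencies), and the 1% frequency cutoff are all assumptions made for illustration.

```python
def shared_rare_variants(affected_variants, catalog_freq, max_freq=0.01):
    """Variants present in all affected individuals and rare in the catalog.

    affected_variants: list of iterables of variant IDs, one per affected person
    catalog_freq: dict mapping variant ID -> population allele frequency
    max_freq: assumed cutoff for calling a variant "rare"
    """
    if not affected_variants:
        raise ValueError("need at least one affected individual")

    # Step 1: assume affected individuals share the causal variant.
    shared = set.intersection(*(set(v) for v in affected_variants))

    # Step 2: keep only rare variants; variants missing from the catalog
    # are treated as novel (frequency 0), which is itself an assumption.
    return sorted(v for v in shared if catalog_freq.get(v, 0.0) <= max_freq)
```

Candidates that survive this filter would then go on to the catalog lookup and linkage (LOD) steps listed above.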