complex diseases Flashcards

Question

Twin studies

Answer 1

DZ = dizygotic twins (fraternal, non-identical; genetically as similar as ordinary siblings). MZ = monozygotic (identical) twins. If a trait is purely genetic, it should always be the same in MZ twins.

Answer 2

Concordant twins: both affected (+ / +) or both unaffected (- / -)
Discordant twins: 1 affected, 1 unaffected (+ / -)
```
concordance ratio (r) = concordance in MZ / concordance in DZ
r > 1: genetics play a role
```
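A minimal sketch of the concordance ratio, assuming concordance is measured as the fraction of twin pairs that are concordant. The function names and the pair counts are hypothetical and only illustrate the arithmetic.

```python
def concordance_rate(concordant_pairs: int, discordant_pairs: int) -> float:
    """Fraction of twin pairs that are concordant (assumed definition)."""
    total = concordant_pairs + discordant_pairs
    if total == 0:
        raise ValueError("need at least one twin pair")
    return concordant_pairs / total


def concordance_ratio(mz_rate: float, dz_rate: float) -> float:
    """r = concordance in MZ / concordance in DZ."""
    if dz_rate == 0:
        raise ValueError("DZ concordance must be non-zero")
    return mz_rate / dz_rate


# Hypothetical counts: 50 of 100 MZ pairs and 25 of 100 DZ pairs are concordant.
mz_rate = concordance_rate(50, 50)
dz_rate = concordance_rate(25, 75)
r = concordance_ratio(mz_rate, dz_rate)
print(r)  # r > 1 suggests that genetics play a role
```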

Answer 3

Limitations of twin studies: DZ twins can be of different sex; MZ twins may share more environmental factors; there are also epigenetic changes accumulated over life, X-chromosome inactivation, post-zygotic somatic mutations, etc.

Answer 4

Two approaches:
• Find adopted people who suffer from a particular disease known to run in families and ask whether it runs in their biological or adoptive family
• Find affected parents whose children have been adopted away from the family and ask whether being adopted saved the children from the family disease
Main obstacles: lack of information about the biological family, when the adoption happened, intrauterine factors, and selective placement

Answer 5

Linkage is a property of loci. Used to identify the biological mechanism for transmission of a trait. Requires family pedigrees. Uses polymorphic markers.

Answer 6

Association is a property of alleles. Used to identify an association between an allele and a phenotype. Fine mapping (<1 cM). Case-control or family approach. Usually bi-allelic SNPs.

Answer 7

Affected sibling pair analysis: when affected siblings share a chromosome region more often than expected by chance, that region is likely involved in causing the disease.
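"Expected by chance" can be made concrete. Under Mendelian inheritance, a sibling pair shares 0, 1 or 2 alleles identical by descent (IBD) at a locus with probabilities 1/4, 1/2 and 1/4. The sketch below compares observed sharing with that expectation using a simple chi-square goodness-of-fit statistic. This is an illustrative choice, not necessarily the test the card has in mind, and the counts are hypothetical.

```python
def asp_sharing_chi_square(n_share0: int, n_share1: int, n_share2: int) -> float:
    """Chi-square statistic for IBD sharing in affected sibling pairs.

    Compares observed counts of pairs sharing 0, 1 or 2 alleles IBD
    with the 1/4 : 1/2 : 1/4 expectation for a region unlinked to disease.
    """
    observed = (n_share0, n_share1, n_share2)
    total = sum(observed)
    if total == 0:
        raise ValueError("need at least one sibling pair")
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 affected sibling pairs typed at one marker.
print(asp_sharing_chi_square(25, 50, 25))  # sharing exactly as expected by chance
print(asp_sharing_chi_square(10, 50, 40))  # excess sharing of 2 alleles IBD
```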

Answer 8

Even for a risk ratio of 4 (high), you would need a lot of families (sibling pairs) to do a linkage analysis. For anything less than 4, the number of families needed increases drastically.

Answer 9

1991: Linkage analysis identified the proximal long arm of chromosome 19
• Apolipoprotein E (APOE)
• ε2 decreases risk
• ε4 increases risk
• 15-25% of the population carry 1 copy of ε4, 2-4% carry 2 copies
• ε4 drives earlier and more abundant amyloid pathology in the brains of carriers

Answer 10

1. Functionally important DNA sequences are the minority of our genome.
2. Genetic redundancy: nucleotide substitutions that don’t change the amino acid, or gene duplication.
3. Functionally unimportant amino acid or nucleotide positions within proteins or within functionally important noncoding sequences.

Answer 11

Chromosomal segments can exist as a block that is only rarely broken up by recombination: because the loci are so close together, they rarely recombine.
• Linkage disequilibrium (LD): the nonrandom association of alleles at different loci. Some combinations of alleles occur more often than expected by chance.

Answer 12

D = frequency of the haplotype (AB, Ab, aB, ab) minus the product of the frequencies of its individual alleles.
If there is no LD, the haplotype frequency equals the individual allele frequencies multiplied together (D = 0).
D' = 1: complete LD (no recombination between the loci).
D' > 0.33: threshold used to determine LD.
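A minimal sketch of the calculation for two bi-allelic loci, using the standard definitions of D and D'. The r² measure is added here as a commonly reported companion statistic; the function name and the example frequencies are hypothetical.

```python
def ld_stats(p_hap_ab: float, p_a: float, p_b: float) -> dict:
    """Return D, D' and r^2 for two bi-allelic loci.

    p_hap_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    # D: observed haplotype frequency minus the frequency expected without LD.
    d = p_hap_ab - p_a * p_b
    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return {"D": d, "D_prime": d_prime, "r2": r2}


# Only AB and ab haplotypes exist (0.5 each): complete LD.
print(ld_stats(0.5, 0.5, 0.5))
# All four haplotypes at 0.25: no LD.
print(ld_stats(0.25, 0.5, 0.5))
```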

Answer 13

Haplotype blocks: sets of nearby SNPs on the same chromosome that are inherited as a block. Haplotype blocks represent ancestral chromosome segments that have been transmitted intact through many generations.
- On an LD plot, the darker the block, the stronger the LD.
- The more generations the SNPs have been transmitted together since they arose, the more consistent the haplotype blocks are going to be.

Answer 14

Populations share similar ancestry; early differences in mutations then gave rise to different haplotypes, so the frequency of haplotypes depends on the population.

Answer 15

Recombination is concentrated in 1-2 kb hotspots. We have ~30,000 hotspots, roughly one every 50-100 kb. Where LD is low between haplotype blocks, we find recombination hotspots. Hotspots are marked by an epigenetic histone methylation mark.

Answer 16

Tag SNPs reduce the number of SNPs required to examine the entire genome for association with a phenotype. If SNPs are in LD, a few tag SNPs represent all the SNPs in that block: by genotyping the tag SNPs we can identify the genotypes of the other SNPs around them.

Answer 17

Phasing: the process of inferring haplotypes from genotype data, assigning alleles to the maternal or paternal chromosome. Alleles on the same chromosome are in cis; alleles on different homologous chromosomes are in trans. Genotype data without this assignment are unphased.

Answer 18

Using knowledge of linkage disequilibrium to fill in genotypes at loci that were not part of the original experiment.

Answer 19

Let's say you have 6 SNPs.
- Assume SNPs 1 and 2 are linked (i.e. D' = 1)
- SNPs 3 and 5 are linked
- SNPs 6 and 4 are linked
We can then use just SNPs 1, 3 and 6 for single-SNP tests.
- Or, if, say, A at SNP 1 and G at SNP 3 always go together with a particular allele at SNP 6, we can infer SNP 6 from SNPs 1 and 3.

Answer 20

Looks for co-occurrence (association) of alleles and phenotypes. We use candidate gene studies (individual genes; require biological insight) and GWAS.

Answer 21

Looks for co-occurrence (association) of alleles and phenotypes, comparing cases and controls.
E.g. we have two alleles, T and C:
- in cases, 62% have allele C and 38% have allele T
- in controls, 49% have allele C and 51% have allele T
Use the odds ratio, (a x d) / (b x c), to calculate the association.
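The example above can be worked through as a 2x2 table, assuming the percentages are treated as counts per 100 alleles: a = C in cases, b = T in cases, c = C in controls, d = T in controls. The function name is hypothetical.

```python
def odds_ratio(a: float, b: float, c: float, d: float) -> float:
    """Odds ratio (a x d) / (b x c) for a 2x2 allele-by-status table.

    a: allele of interest in cases      b: other allele in cases
    c: allele of interest in controls   d: other allele in controls
    """
    if b == 0 or c == 0:
        raise ValueError("b and c must be non-zero")
    return (a * d) / (b * c)


# Allele C: 62% of case alleles, 49% of control alleles (from the card).
or_c = odds_ratio(a=62, b=38, c=49, d=51)
print(round(or_c, 2))  # 1.7
```

An odds ratio above 1 means allele C is more common in cases than in controls; whether the difference is statistically significant still has to be tested.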

Answer 22

A rare mutation in this gene is strongly correlated with high cardiovascular disease risk. It was found by linkage analysis followed by animal studies. The mutated protein binds to the LDL receptor, leading to lysosomal degradation of the receptor. The degraded receptor can no longer bind LDL --> high LDL, which leads to clogging of arteries. Trials aim to lower LDL cholesterol by targeting the mutation with siRNA, which leads to degradation of the mutant mRNA.

Answer 23

• Inadequate matching of controls (not accounting for other factors)
• Insufficient correction for multiple testing (Bonferroni)
• Underpowered studies leading to lack of replication

Answer 24

• Direct causation
• Epistatic effect
• Population stratification
• Linkage disequilibrium

Answer 25

new biological insights -> clinical advances - therapeutic targets - biomarkers - prevention

Answer 26

Candidate gene study: few SNPs + hypothesis | GWAS: millions of SNPs and no hypothesis

Answer 27

```
A hypothesis-free method
• Uses large sample sizes, or cases versus controls
• Identifies regions of the human genome that are associated with a phenotype
• Based on allele frequencies at hundreds of thousands of tag SNPs
• Association is usually confirmed through replication in independent datasets and/or GWAS meta-analyses
• Requires fine mapping through linkage disequilibrium to identify specific variants
```

Answer 28

```
SNP arrays vs WGS
- looks at tag SNPs vs looks at the sequence of the whole genome
- inexpensive vs expensive
- reliable vs less accurate
```

Answer 29

- data collection
- genotyping (via SNP arrays and NGS)
- quality control (look into different populations)
- imputation (tag SNPs)
- association testing (Manhattan plot)
- meta-analysis or replication

Answer 30

It is dependent on a number of important factors, such as:
• (un)relatedness of individuals (if they share DNA there will be an unwanted association)
• genetic architecture (quality control)
• population stratification (quality control)
• genetic model

Answer 31

If we assume P < 0.05 is significant:
• In 100 comparisons, 5 associations are expected to be false positives
• Need to use a multiple comparison adjustment (e.g. Bonferroni)
• In a GWAS we do 1 million tests (or more!): 1,000,000 x 0.05 = 50,000 false positives
It is estimated that P (for most GWAS) should be < 5 x 10^-8 for common variants with MAF > 5% and LD r² = 0.8

Answer 32

0.05 divided by the number of comparisons made | i.e. 1 million tests = 0.05/1,000,000 = 5 x 10^-8
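The arithmetic can be checked with a short sketch. The function names are hypothetical; the numbers are the ones on the cards (alpha = 0.05, 1 million tests).

```python
import math


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of false positives if every test uses threshold alpha."""
    return alpha * n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


n_tests = 1_000_000
print(expected_false_positives(0.05, n_tests))  # about 50,000 without correction
print(bonferroni_threshold(0.05, n_tests))      # about 5 x 10^-8
```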

Answer 33

Threshold: red line (normally 5 x 10^-8).
y-axis: -log10 of the p-value for association.
x-axis: chromosome number.
Each dot represents a SNP, plotted by its p-value for association.
The higher the dot on the plot (i.e. the smaller its p-value), the more significant the association.
Each dot above the threshold line marks a SNP on that chromosome associated with the disease of interest.
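A small sketch of how a dot's height is derived and how SNPs above the line are picked out, assuming the usual -log10(p) scale and the 5 x 10^-8 threshold. The SNP names and p-values are made up for illustration.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # the usual red line


def manhattan_height(p_value: float) -> float:
    """Height of a SNP on the plot: -log10 of its p-value."""
    return -math.log10(p_value)


def significant_snps(results, threshold=GENOME_WIDE_THRESHOLD):
    """Return names of SNPs whose p-value is below the threshold."""
    return [name for name, _chromosome, p_value in results if p_value < threshold]


# Hypothetical association results: (SNP name, chromosome, p-value).
results = [("snp_a", 1, 3e-9), ("snp_b", 1, 0.02), ("snp_c", 19, 1e-12)]

for name, chromosome, p_value in results:
    print(name, chromosome, round(manhattan_height(p_value), 2))
print(significant_snps(results))
```

Smaller p-values give taller dots, so `snp_c` sits highest and `snp_b` stays well below the line.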

Answer 34

Sex chromosomes | this is now starting to improve

Answer 35

Monogenic diseases: few alleles, each with a large impact | the more complex the disease, the smaller the effect of each allele but the greater the number of alleles

Answer 36

UK Biobank: 500,000 participants | there are many biobanks in Europe, America and Asia, but few in Africa and other regions. This is a demographic problem.

Answer 37

top 697 variants explain 20% of heritability | top 10k variants explain 30% of Vp

Answer 38

Heritability is estimated at 30-70%

Answer 39

1. Due to rare variants with BIG effect
2. Due to gene-gene and gene-environment interactions
3. Due to epigenetic effects
4. No missing heritability; family studies overestimate heritability
5. GWAS underestimates heritability because tag SNPs do not reliably detect variants
6. Much heritability is due to common variants with very small effects

Answer 40

- Whole-genome sequencing of large cohorts to find rare and uncommon variants
- Interpreting the role of risk SNPs

Answer 41

can identify individuals at risk of common complex diseases

Answer 42

- Single-value estimate of an individual's genetic liability to a phenotype
- Sum of the genome-wide genotypes, weighted by genotype effect size (odds ratio) derived from GWAS summary statistics

Answer 43

GWAS: many variants with small effects, low penetrance
Mendelian: few variants with large effects, high penetrance
The missing alleles could be those of intermediate penetrance

Answer 44

- Identify SNPs associated with disease using GWAS
- Estimate SNP-based heritability and build candidate predictors
- Build polygenic risk scores
- Use the composite score for personalised risk prediction

Answer 45

```
they identified 4 alleles at 4 loci with different effects
A: +1.5
C: -0.5
T: +2.0
A: -1.5
```
Individual 1 has genotypes AT CG TT CC
1.5 (1 x A) - 0.5 (1 x C) + 4.0 (2 x T) - 0.0 (0 x A) = 5.0
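The same calculation as a sketch: count the copies of each effect allele in the genotype (0, 1 or 2) and sum the counts weighted by effect size. The weights and genotypes are the ones from the card; the function and variable names are hypothetical.

```python
# (effect allele, weight) for each of the 4 loci, in order.
WEIGHTS = [("A", 1.5), ("C", -0.5), ("T", 2.0), ("A", -1.5)]


def polygenic_risk_score(genotypes, weights=WEIGHTS):
    """Sum of effect-allele counts weighted by effect size, one term per locus."""
    if len(genotypes) != len(weights):
        raise ValueError("need one genotype per locus")
    score = 0.0
    for genotype, (effect_allele, weight) in zip(genotypes, weights):
        dosage = genotype.count(effect_allele)  # 0, 1 or 2 copies
        score += dosage * weight
    return score


individual_1 = ["AT", "CG", "TT", "CC"]
print(polygenic_risk_score(individual_1))  # 5.0
```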

Answer 46

• For risk calculation in European populations = LIMITATION
• Conditions with proven preventative measures
• The risk of disease outweighs the psychological impact of knowing you are at high genetic risk of disease

Answer 47

causal variant genotyped = direct association | causal variant in LD with other genotyped variants = indirect association

Answer 48

Variants are merely associated with a trait. We can use further genomic analysis tools to determine:
• Coding vs regulatory variants
• Fine mapping
• Gene expression
Future: in vitro studies, animal studies, and clinical trials


(74 cards)