association and linkage Flashcards

1
Q

identifying human disease genes

A

locate the genetic variants presumed to be biologically causal for a disease
genetic variants in DNA sequences (insertions, deletions, RFLPs + SNPs) may have a direct impact on disease and phenotypic differences (direct association)
genetic variants in DNA sequences may be indirectly associated- the typed allele itself is not involved but it is correlated with a nearby variant that affects the phenotype

2
Q

how genetic variation is maintained and studied

A

mutation, independent assortment and recombination (crossing over) generate genetic variation, the raw material for evolution
agents of evolutionary change- mutation, non random mating, gene flow, finite population size (genetic drift) + natural selection

3
Q

genetic variation

A

association between a genetic variant and a phenotype can be direct or indirect
direct means that the associated genetic variant is functional, thought to be affecting a biological mechanism and causing the phenotype
indirect means that the associated allele itself is not involved but is correlated with a nearby variant that affects the phenotype

4
Q

allele frequency, populations and gene pools

A

reproduction and evolution
population= localised group of interbreeding individuals which produce fertile offspring
gene pool is collection of alleles in the population
allele frequency = how common that allele is in a population (copies of the allele of interest / total no. of allele copies at that locus in the population)
a locus is fixed if all individuals in a population are homozygous for the same allele
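
The allele frequency definition above can be sketched in a few lines of Python. This is only an illustration: the two-character genotype strings and the function names `allele_frequencies` and `is_fixed` are assumptions made for the example, not part of the original notes.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} at one locus from diploid genotypes.

    Each genotype is a 2-character string such as "AG"
    (a hypothetical encoding chosen for this sketch).
    """
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())  # two allele copies per diploid individual
    return {allele: n / total for allele, n in counts.items()}

def is_fixed(genotypes):
    """A locus is fixed when only one allele is present in the population."""
    return len(allele_frequencies(genotypes)) == 1

sample = ["AA", "AG", "GG", "AG", "AA"]   # 5 individuals = 10 allele copies
freqs = allele_frequencies(sample)
print(freqs)             # A is 6 of 10 copies, G is 4 of 10 copies
print(is_fixed(sample))  # False: both A and G are present
```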

5
Q

common dna polymorphisms

A

SNPs (1bp)- allele example A/G
repeating elements- STR- 2-13bp
interspersed polymorphisms (insertion + deletions, indels eg Alu)- allele example I/D , +/-

DNA polymorphisms are analysed by changes in the nucleotide sequence or size- allelic identity

6
Q

spectrum of disease allele effects

A

disease associations are often conceptualised in two dimensions: allele frequency and effect size
highly penetrant alleles for mendelian disorders are extremely rare with large effect size, but most gwas findings are associations of common SNPs with small effect sizes

7
Q

gene mapping methods

A

linkage analysis- follows meiotic events through families for co-segregation of disease and particular genetic variants- based on recombination frequency
association analysis- detects association between genetic variants and disease across families or unrelated individuals in a population- based on linkage disequilibrium

8
Q

linkage studies

A

aim to identify a marker that co-segregates with the gene of interest so it can be used to track the gene within a family without actually knowing the mutation
use the inheritance of markers within families to identify chromosomal regions where disease genes may lie

9
Q

pedigree analysis

A

can look for mutations within pedigrees so can see which allele it is linked to

Mendel's 2nd law (independent assortment)
segregation of alleles for one gene occurs independently of that of any other gene
alleles for different genes are inherited independently of each other
but it is not always accurate and is violated when genes are linked (lie close together on the same chromosome)

10
Q

linkage analysis

A

key to linkage analysis: the smaller the amount of recombination observed between genes, ie the more tightly linked they are, the closer we can infer that they lie on a chromosome
goal is to place genetic markers along chromosomes, order them and assign genetic map distances
genetic markers are sequences of DNA, often of unknown function, that are easily recognisable as landmarks

11
Q

recombination fraction

A

recombination fraction θ (theta) = recombinants / total offspring (x100 to express it as a %)
θ between 2 loci is the proportion of meioses in which a recombination occurs between the 2 loci
θ is a non-linear function of the physical distance separating the loci on the chromosome
θ = 0: complete linkage (no recombinants)
θ < 0.5: the 2 loci are linked and are not inherited independently
θ = 0.5: no linkage, the loci assort independently

recombination is useful as it can be used to build linkage maps (chromosomal maps)
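
A minimal sketch of the calculation, assuming recombinant and total offspring counts are already known. The function names and the counts in the example are hypothetical.

```python
def recombination_fraction(recombinants, total_offspring):
    """theta = recombinant offspring / total offspring (a proportion, not a %)."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinants <= total_offspring:
        raise ValueError("recombinants must be between 0 and total_offspring")
    return recombinants / total_offspring

def interpret(theta):
    """Read theta the way the card does: below 0.5 means the loci are linked."""
    if theta == 0:
        return "complete linkage"
    if theta < 0.5:
        return "linked"
    return "unlinked"  # around 0.5 the loci assort independently

theta = recombination_fraction(12, 100)  # hypothetical cross: 12 recombinants in 100
print(theta, interpret(theta))
```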

12
Q

features of linkage analysis

A

must have family data with multiple affected individuals
uses relatively few markers(400-800) for whole genome analysis
successful for mendelian disorders, less so for complex
can find potential disease loci located far from marker

13
Q

gene mapping

A

these methods use recombination frequencies between alleles to determine relative distances between them
recombination frequencies between genes are approximately proportional to their distance apart (for closely spaced genes)
gene mapping determines the order of genes and relative distances between them in map units (centimorgans, cM); 1 cM indicates a 1% chance that 2 genes are separated by crossing over in a single meiosis
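
As a rough illustration of turning a recombination fraction into a map distance: the direct conversion (1% recombination = 1 cM) is what the card describes. Haldane's map function is included as an assumption of this sketch; it is one standard way to correct for multiple crossovers over longer distances and is not mentioned in the notes.

```python
import math

def map_distance_simple_cm(theta):
    """Card definition: 1% recombination = 1 cM (reasonable for small theta)."""
    return 100.0 * theta

def map_distance_haldane_cm(theta):
    """Haldane's map function, d = -50 * ln(1 - 2*theta) in cM.

    Assumes crossovers occur independently (no interference).
    """
    if not 0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * theta)

for theta in (0.01, 0.10, 0.30):
    print(theta, map_distance_simple_cm(theta), round(map_distance_haldane_cm(theta), 2))
```

The two estimates agree closely for small θ and diverge as θ approaches 0.5, which is why recombination fraction is not a linear measure of distance.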

14
Q

lod score calculation

A

LOD score is a statistical estimate of whether 2 genes, or a marker and a disease gene, are likely to be located near each other on a chromosome and will therefore be inherited together
computes values of the likelihood function under the null (no linkage, θ = 0.5) and alternative (linkage, θ < 0.5) hypotheses
LOD score is the log10 of the ratio between the 2 odds
ratio of odds = likelihood of the data given linkage / likelihood of the data given no linkage; LOD score (Z) = log10 of this ratio
LOD score of +3 or more indicates linkage
LOD score of -2 or less indicates no linkage
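
A minimal sketch of a two-point LOD score. It assumes the simplest textbook case of phase-known, fully informative meioses, where the likelihood is θ^r (1-θ)^(n-r). Real pedigree software handles unknown phase, missing data and penetrance, none of which is modelled here, and the function names are hypothetical.

```python
import math

def lod_score(recombinants, meioses, theta):
    """log10 of L(data | theta) / L(data | theta = 0.5) for phase-known meioses."""
    if not 0 <= theta <= 0.5:
        raise ValueError("theta must be between 0 and 0.5")
    non_recombinants = meioses - recombinants
    likelihood_linked = (theta ** recombinants) * ((1 - theta) ** non_recombinants)
    likelihood_unlinked = 0.5 ** meioses
    if likelihood_linked == 0:
        return float("-inf")  # theta = 0 is impossible once a recombinant is seen
    return math.log10(likelihood_linked / likelihood_unlinked)

def max_lod(recombinants, meioses, steps=50):
    """Scan theta over a grid from 0 to 0.5 and return (best LOD, best theta)."""
    grid = [i * 0.5 / steps for i in range(steps + 1)]
    return max((lod_score(recombinants, meioses, t), t) for t in grid)

print(round(lod_score(0, 10, 0.0), 2))  # 10 meioses, no recombinants: just above +3
print(max_lod(1, 10))                   # peaks at theta = 0.1 for 1 recombinant in 10
```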

15
Q

linkage procedure

A

decide if linkage analysis is reasonable
collect appropriate families
measure phenotypes and demographic data - family relationships to build pedigree
genotype markers- at strategic intervals across the genome, or at a locus containing a candidate region
run computer analysis for lod score calculation

16
Q

linkage results

A

approximate location of disease gene
placement of disease gene relative to multiple other loci
exclusion of a genome region
many diseases mapped using this- CF,HD, DMD etc

complicating factors
- reduced penetrance - not all with risk allele will develop disease
- phenocopies - some without risk allele will develop disease
- gene - gene interaction - homozygosity at another gene may be required
- gene - environment interaction

17
Q

genotypes

A

the use of genotype information can be limited
in large sequencing projects, genotypes (rather than haplotypes) are collected due to cost considerations
genotypes only tell us the alleles at each individual locus, dont know connection of alleles at different loci
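
To illustrate why unphased genotypes do not reveal how alleles at different loci are connected, the sketch below lists every pair of haplotypes consistent with one individual's genotype. The data layout (one 2-allele tuple per locus) and the function name are assumptions made for this example.

```python
from itertools import product

def possible_haplotype_pairs(genotype):
    """List the haplotype pairs consistent with an unphased genotype.

    genotype: list of (allele, allele) tuples, one per locus, order unknown.
    """
    pairs = set()
    # For each locus, choose which of the two alleles goes on the first chromosome.
    for choice in product((0, 1), repeat=len(genotype)):
        hap1 = "".join(locus[c] for locus, c in zip(genotype, choice))
        hap2 = "".join(locus[1 - c] for locus, c in zip(genotype, choice))
        pairs.add(tuple(sorted((hap1, hap2))))  # chromosome order is irrelevant
    return sorted(pairs)

double_het = [("A", "G"), ("C", "T")]        # heterozygous at both loci
print(possible_haplotype_pairs(double_het))  # AC/GT or AT/GC: phase is ambiguous

single_het = [("A", "A"), ("C", "T")]        # homozygous at the first locus
print(possible_haplotype_pairs(single_het))  # only AC/AT is possible
```

An individual heterozygous at k loci has 2^(k-1) possible haplotype pairs, which is why haplotypes have to be deduced by other means.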

18
Q

haplotype

A

set of dna variations that are usually inherited together
group of genes, genetic regions or markers within an organism that were inherited together from a single parent - combinations of alleles at different loci which segregate together
refers to only one chromosome of a homologous pair rather than the entire genetic makeup (genotype)
used for association analysis, as it shows how alleles at different loci are associated

deducing haplotypes- molecular haplotyping, genetic analysis, population inference

19
Q

molecular haplotyping

A

begins by isolating single molecules or populations of identical molecules of DNA by cloning, molecular biology or physical manipulation
each molecule is then partially or completely sequenced

20
Q

genetic analysis

A

infers haplotypes by applying the principles of genetic inheritance to genotype data in the context of a pedigree

21
Q

population inference

A

assigns haplotypes from a database to an individual's genome and then might infer the haplotype on the homologous chromosome by exclusion

22
Q

snps as markers

A

SNPs make good markers for haplotype analysis and disease association
due to LD (non-random association between alleles at different loci), it is not necessary to sequence and type all these SNPs
within high LD regions, allelic dependence yields redundancy among markers and improves the chances of establishing the approximate location of a disease mutation

23
Q

genetic association studies

A

studies test for a correlation between disease status and genetic variation
look for an altered frequency of a SNP allele or haplotype in a series of individuals affected with a disease
SNPs are most widely used test markers
association studies are a major tool for identifying genes conferring susceptibility to complex disorders

24
Q

association studies

A

detect association between genetic variants and disease- exploits linkage disequilibrium
a wide range of family-based association tests have been proposed in genetic studies; they require genotyping of affected individuals + their parents

25
Q

linkage disequilibrium
aka allelic association

A

refers to the statistical association between pairs of genetic loci, used routinely in localising disease genes, detecting natural selection and studying population history
LD exists when 2 loci are linked and associations between variants exist
when a mutation arises in a population, high LD between the mutation and other variants on the same chromosome may occur
over generations, LD dissipates between mutations and loci far away via recombination
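
A minimal sketch of how LD between two biallelic loci is usually quantified, and of how it decays through recombination. The measures D, D' and r² and the decay formula D_t = D_0 (1 - θ)^t are standard population genetics results but are not named in the notes. The decay formula assumes random mating, and the function names are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for alleles A and B at two loci.

    p_ab: frequency of the A-B haplotype; p_a, p_b: allele frequencies.
    """
    d = p_ab - p_a * p_b  # zero when the alleles are independent
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

def ld_after_generations(d0, theta, generations):
    """Expected D after t generations of random mating: D0 * (1 - theta)^t."""
    return d0 * (1 - theta) ** generations

print(ld_stats(0.5, 0.5, 0.5))    # A always travels with B: complete LD
print(ld_stats(0.25, 0.5, 0.5))   # independent alleles: no LD
print(ld_after_generations(0.25, 0.01, 100))  # tightly linked loci: LD decays slowly
print(ld_after_generations(0.25, 0.5, 10))    # unlinked loci: LD is almost gone
```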

26
Q

association analysis

A

LD is the basis of association analysis (AA)
two loci are associated if the alleles at one locus are not independent of the alleles at another locus
for AA we observe a trait and a marker locus (usually not disease specific locus)
test association between marker and trait
null hyp is no association between marker and trait - rejection implies DSL is in LD with the marker
spurious association occurs when the 2 loci are not linked, eg an association between 2 loci that are on different chromosomes
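
One simple way to test the null hypothesis of no association is a chi-square test on allele counts in cases and controls. This is a sketch under assumptions: the notes do not say which test is used, the counts are made up, and no continuity correction or adjustment for population structure is applied.

```python
import math

def allelic_association_test(case_risk, case_other, control_risk, control_other):
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi_square, p_value, odds_ratio).
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function of chi-square, 1 df
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio

# Hypothetical counts: the marker allele is seen 60/100 times in cases, 40/100 in controls.
chi2, p_value, odds_ratio = allelic_association_test(60, 40, 40, 60)
print(chi2, p_value, odds_ratio)
```

A small p-value rejects the null, but as the card notes, the association may still be indirect or spurious rather than causal.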

27
Q

types of association

A

direct- mutant or susceptible polymorphism, allele of interest is involved in phenotype
indirect- allele itself is not involved but is correlated with a nearby variant that changes phenotype
spurious- apparent association not related to genetic aetiology

28
Q

causes of linkage disequilibrium

A

linkage
mutation
selection
inbreeding
genetic drift
gene flow

29
Q

GWAS

A

genome wide association studies
way for scientists to identify inherited genetic variants associated with risk of disease or a particular trait
surveys the entire genome for genetic polymorphisms, typically SNPs, that occur more frequently in cases ie disease than in controls

30
Q

conducting gwas

A

data collected
genotyping - using microarrays to capture common variants
quality control
imputation- genotypes can be phased and untyped genotypes imputed
association testing - run for each genetic variant, null hyp made
meta analysis - results from multiple smaller cohorts are combined using standardised statistical pipelines
replication - results can be replicated using internal or external replication
post-GWAS analysis - SNP to gene mapping
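
The meta-analysis step can be illustrated with an inverse-variance weighted fixed-effect model for a single variant. This is an assumption of the sketch: it is one common way to combine per-cohort effect estimates, the notes do not specify a method, and the effect sizes below are made up.

```python
import math

def fixed_effect_meta(estimates):
    """Combine per-cohort (beta, standard_error) pairs for one variant.

    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]  # precise cohorts weigh more
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return pooled_beta, pooled_se, z, p_value

cohorts = [(0.10, 0.05), (0.20, 0.05)]  # hypothetical betas and standard errors
pooled_beta, pooled_se, z, p_value = fixed_effect_meta(cohorts)
print(pooled_beta, pooled_se, z, p_value)
```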

31
Q

manhattan plot

A

shows significance of each variant’s association with a phenotype
each dot represents a SNP, with SNPs ordered on the x axis according to their genomic position, y axis represents strength of their association

quantile-quantile (Q-Q) plot shows the distribution of expected p values under a null model of no association vs observed p values
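
A sketch of the numbers behind both plots, without any plotting library. It assumes the usual conventions, which the notes do not spell out: strength of association is plotted as -log10(p), genome-wide significance is taken as p < 5e-8, and expected p values under the null are uniform quantiles. The tuple layout and function names are hypothetical.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # conventional cutoff; an assumption, not from the notes

def manhattan_points(results):
    """results: list of (chromosome, position, p_value) tuples.

    Returns (chromosome, position, -log10 p) ordered by genomic position.
    """
    ordered = sorted(results, key=lambda r: (r[0], r[1]))
    return [(chrom, pos, -math.log10(p)) for chrom, pos, p in ordered]

def qq_points(p_values):
    """Pair expected -log10 p (uniform null) with observed -log10 p."""
    n = len(p_values)
    observed = sorted(p_values)
    expected = [(i + 0.5) / n for i in range(n)]  # uniform quantiles under the null
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]

results = [(2, 100, 1e-3), (1, 500, 1e-9), (1, 200, 0.5)]  # made-up SNP results
points = manhattan_points(results)
hits = [pt for pt in points if pt[2] > -math.log10(GENOME_WIDE_THRESHOLD)]
print(points)
print(hits)  # only the SNP with p = 1e-9 passes the threshold
print(qq_points([p for _, _, p in results]))
```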

32
Q

association analysis and haplotypes

A

association methods based on LD offer a promising approach for detecting genetic variations that are responsible for complex human diseases
individual SNPs may lead to significant findings
methods based on haplotypes comprising multiple SNPs on the same inherited chromosome may provide additional power for mapping disease genes