Lecture 5 Flashcards

Question 1

Q

Simple genetic disorders: Autosomal dominant

Answer

A

Only one copy/allele required for the disease
Most affected people only have 1 disease allele
Equally common in both sexes
Offspring of affected people have 50% probability of inheriting the disease

Question 2

Q

Simple genetic disorders: Autosomal recessive

Answer

A

Two alleles required for the disease
Equally common in both sexes
Offspring of two carriers have a 25% probability of inheriting the disease
Disease alleles are ‘masked’ in heterozygous carriers
The disease can skip generations: unaffected parents can have an affected child, but only if both parents are carriers.
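The 25% figure can be checked by enumerating the four equally likely allele combinations. This is a minimal sketch; the function name `cross` and the allele labels ("A" normal, "a" disease) are illustrative assumptions, not part of the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return offspring genotype probabilities for two parental genotypes.

    Genotypes are two-character strings such as "Aa". Each parent passes on
    one of its two alleles with equal probability.
    """
    # Sort the two alleles so that "aA" and "Aa" count as the same genotype.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Hypothetical labels: "A" = normal allele, "a" = recessive disease allele.
for genotype, prob in cross("Aa", "Aa").items():
    print(genotype, prob)
```

The same function covers the autosomal dominant case: `cross("Dd", "dd")`, with "D" as a hypothetical dominant disease allele, gives half the offspring the "Dd" genotype.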

Question 3

Q

Simple genetic disorders: X-linked recessive

Answer

A

Females require 2 disease alleles, males only 1
More common in males
Sons of carrier females have 50% chance of disease
Sons of affected males are unaffected, because sons inherit their father’s Y chromosome, not his X
The hardest of these inheritance patterns to distinguish
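The X-linked rules above follow from which parent supplies each child's X chromosome. The sketch below enumerates the outcomes; the function name and the labels ("X" normal, "x" disease allele) are assumptions made for illustration.

```python
from fractions import Fraction


def x_linked_offspring(mother, father_x):
    """Enumerate children of a mother (two X alleles) and a father (one X allele).

    Hypothetical labels: "X" = normal allele, "x" = recessive disease allele.
    Returns status -> probability, separately for sons and daughters.
    """
    half = Fraction(1, 2)
    sons = {}
    daughters = {}
    for maternal in mother:
        # Sons receive the father's Y, so only the maternal X decides their status.
        status = "affected" if maternal == "x" else "unaffected"
        sons[status] = sons.get(status, 0) + half
        # Daughters receive the father's X plus one maternal X.
        pair = maternal + father_x
        if pair == "xx":
            status = "affected"
        elif "x" in pair:
            status = "carrier"
        else:
            status = "unaffected"
        daughters[status] = daughters.get(status, 0) + half
    return {"sons": sons, "daughters": daughters}


print(x_linked_offspring("Xx", "X"))  # carrier mother, unaffected father
print(x_linked_offspring("XX", "x"))  # non-carrier mother, affected father
```

In the second cross every son is unaffected and every daughter is a carrier, which is why the disease never passes directly from father to son.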

Question 4

Q

Mapping Mendelian Traits

Answer

A

Non-recombinants - NR(8/10): Offspring of affected people that inherit Allele 2 tend to get the disease and offspring that don’t inherit allele 2 are unaffected

Recombinants - R (2/10): Offspring with Allele 2 but not the disease, or the disease but not Allele 2
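The counts on this card (8 non-recombinants, 2 recombinants) give an estimated recombination fraction of 2/10 = 0.2. As a rough sketch, the standard two-point LOD formula can then be evaluated for those counts. It assumes every offspring can be scored unambiguously as recombinant or non-recombinant, which the card does not state; the function name is illustrative.

```python
import math


def lod_score(non_recombinants, recombinants, theta):
    """LOD for recombination fraction theta versus no linkage (theta = 0.5)."""
    n = non_recombinants + recombinants
    likelihood_linked = (1 - theta) ** non_recombinants * theta ** recombinants
    likelihood_unlinked = 0.5 ** n
    return math.log10(likelihood_linked / likelihood_unlinked)


nr, r = 8, 2                 # counts from the card above
theta_hat = r / (nr + r)     # estimated recombination fraction
print("theta =", theta_hat)
print("LOD   =", round(lod_score(nr, r, theta_hat), 2))
```

With only ten offspring the LOD stays below 1, which illustrates why evidence from many families is usually combined before linkage is accepted.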

Question 5

Q

Linkage mapping to find traits

Answer

A

Mendelian Traits are typically rare
Usually mapped by following the co-segregation of markers and phenotypes in affected families
Results often expressed as a LOD score (Logarithm of Odds)
Markers with the highest LOD scores are closest to the gene.
A LOD of 3 means the odds of linkage between a marker and a gene are 10^3:1, i.e. linkage is 1000 times more likely than non-linkage
Online Mendelian Inheritance in Man (OMIM) is an online database with descriptions of genes, literature, phenotypes etc related to each disease

Question 6

Q

What’s a LOD score?

Answer

A

The log10 of the odds that the marker is linked to the causal gene relative to the marker not being linked. Each step up in LOD score (1, 2, 3, 4 etc.) is a tenfold increase in those odds.
The odds are 10 raised to the power of the LOD score: a LOD score of 3 would be 10^3, i.e. 1000:1
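A short sketch of the conversion between LOD scores and odds; the function names are illustrative.

```python
import math


def lod_to_odds(lod):
    """Odds in favour of linkage (as X:1) for a given LOD score."""
    return 10 ** lod


def odds_to_lod(odds):
    """LOD score for given odds in favour of linkage."""
    return math.log10(odds)


for lod in (1, 2, 3, 4):
    print(f"LOD {lod}: odds of linkage {lod_to_odds(lod):,}:1")
```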

Question 7

Q

Examples

Answer

A

Most of these disease-causing alleles are very rare.

Many of them vary in frequency between populations because of genetic drift. Even when the alleles are rare, they have a big effect on the phenotype, so we can find the genes responsible

Question 8

Q

Mapping complex diseases and the two approaches

Answer

A

Common diseases e.g. heart disease, cancers, dementia, susceptibility to malaria etc are typically complex and involve a mixture of genetic and environmental causes

Two popular approaches trying to find genes responsible for these traits:

Exome capture - just sequence the coding bits of the genome (Lecture 2)
Genome-wide association studies (GWAS, the more common approach) - typically use SNP chips

Question 9

Q

Concept behind GWAS

Answer

A

A new mutation arises that causes or contributes to a disease
Initially most of the linked SNPs will be in linkage disequilibrium i.e. statistically associated with it
But over time, recombination will break up these associations. Only the most closely linked loci will remain in LD
The new mutation arises on an ancestral chromosome
- Chromosomes in modern-day descendants who inherit the mutation will have allele 2 at the marker locus significantly more often than in the general population
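The breakdown of LD over time can be sketched with the standard decay relationship D_t = D_0 (1 - r)^t, where r is the recombination fraction between the two loci and t is the number of generations. The recombination fractions and the 100-generation horizon below are made-up values for illustration.

```python
def ld_after(d0, recomb_fraction, generations):
    """Expected LD after some generations of random mating: D_t = D_0 * (1 - r)^t."""
    return d0 * (1 - recomb_fraction) ** generations


# Hypothetical recombination fractions: tightly linked, moderate, loosely linked.
for r in (0.0001, 0.01, 0.1):
    remaining = ld_after(1.0, r, 100)
    print(f"r = {r}: {remaining:.3f} of the initial LD left after 100 generations")
```

Only the tightly linked pair keeps most of its association, which is why a marker still in LD with the disease points to a nearby causal variant.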

Question 10

Q

GWAS – plotting the results

Answer

A

A GWAS typically involves typing a million SNPs in cases and controls
Every SNP is tested for an association with the trait/ phenotype of interest
- The results are shown in a Manhattan plot, so called because you’re looking for ‘skyscrapers’
- Each SNP gets a P value. Taking log10 of it and reversing the sign gives the statistic -log10(P); the higher it is, the stronger the evidence of association
Usually the expected and observed test statistics (results from chi squared tests) are plotted against each other
If the observed values are higher than expected, there could be a risk of false positives due to population structure
Any SNP above the threshold line is strongly suggestive of being associated with your trait. The line is at roughly 0.00007.
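A minimal sketch of the per-SNP test, assuming a chi-squared test on a 2x2 table of allele counts in cases and controls (the card mentions chi-squared tests but not the exact table layout). The allele counts are made up for illustration.

```python
import math


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-squared statistic (1 d.f.) for a 2x2 table of allele counts."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
                   * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail P value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def neg_log10_p(p):
    """The y-axis statistic of a Manhattan plot."""
    return -math.log10(p)


# Hypothetical allele counts for one SNP.
chi2 = allelic_chi2(case_alt=600, case_ref=1400, ctrl_alt=500, ctrl_ref=1500)
p = p_value_1df(chi2)
print(f"chi2 = {chi2:.2f}, P = {p:.2e}, -log10(P) = {neg_log10_p(p):.2f}")
```

Repeating this for every SNP and plotting -log10(P) against genomic position gives the Manhattan plot.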

Question 11

Q

QQ plots – detecting structure

Answer

A

Each point is a SNP
X axis is expected -log10 P values and Y axis is observed -log10 P values
If the line of points lies above X=Y, P values across the whole genome are more significant than expected under the null hypothesis. This suggests we could get false positives in a GWAS.
Most likely cause is population structure- allele frequencies differ between different populations and they can cause false positives in a GWAS
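The points of a QQ plot can be sketched as follows. Under the null hypothesis P values are uniform between 0 and 1, so the i-th smallest of n is expected near (i - 0.5)/n; that plotting position is one common convention and is an assumption here, as are the function name and the example P values.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10 P pairs, one per SNP."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Expected values under the null: uniform P values at positions (i - 0.5) / n.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Hypothetical P values that are all smaller than expected under the null.
inflated = [0.002, 0.03, 0.1, 0.2, 0.35]
for expected, observed in qq_points(inflated):
    flag = "above X=Y" if observed > expected else "on or below X=Y"
    print(f"expected {expected:.2f}  observed {observed:.2f}  {flag}")
```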

Question 12

Q

False positives in GWAS studies

Answer

A

If there is genetic structure in a population, then false associations between a marker and a phenotype can arise by chance
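One way to see how this happens is to pool two subpopulations that differ in both allele frequency and disease prevalence. In the sketch below the allele has no effect on disease within either population, yet the pooled cases carry it more often than the pooled controls. All numbers and names are invented for illustration.

```python
def pooled_allele_freqs(populations):
    """Allele frequency among pooled cases and pooled controls.

    Each population is a dict with keys "size", "allele_freq" and "prevalence".
    Within a population the allele is independent of disease by construction.
    """
    case_alleles = case_total = ctrl_alleles = ctrl_total = 0.0
    for pop in populations:
        cases = pop["size"] * pop["prevalence"]
        controls = pop["size"] - cases
        case_alleles += cases * pop["allele_freq"]
        ctrl_alleles += controls * pop["allele_freq"]
        case_total += cases
        ctrl_total += controls
    return case_alleles / case_total, ctrl_alleles / ctrl_total


# Hypothetical subpopulations: A has both more of the allele and more disease.
pops = [
    {"size": 1000, "allele_freq": 0.8, "prevalence": 0.20},
    {"size": 1000, "allele_freq": 0.2, "prevalence": 0.05},
]
case_freq, ctrl_freq = pooled_allele_freqs(pops)
print(f"allele frequency in cases {case_freq:.2f}, in controls {ctrl_freq:.2f}")
```

The difference comes entirely from cases being drawn mostly from population A, not from any effect of the allele.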

Question 13

Q

Studies of genetic structure

Answer

A

Observation that genetic structure influences GWAS results is important
It means that SNP chips that are good for finding disease variants in one population might not be so good in another population – motivation for HapMap projects (Lecture 2)
We need to understand human population genetic structure, and this can tell us about our history

Question 14

Q

Estimating and displaying human structure: two main approaches

Answer

A

Clustering: Idea is to group individuals into K different clusters, where individuals within a cluster are more similar to each other than individuals outside of it. Can use an a priori number for K or K can be estimated from the data. Best known program/method is Jonathan Pritchard’s STRUCTURE. Each individual is given a membership coefficient which tells us how well it fits its cluster and whether it contains genes from >1 cluster.

The aim is to work out how many distinct genetic clusters there are

Multivariate approaches: Best known approach is Principal Component Analysis (PCA) which uses allele frequencies from many markers

Question 15

Q

Human Population Structure

Answer

A

Used microsatellite markers (forerunners of SNPs), so there was not as much genetic variation to work with.
93-95% of genetic variation is found within populations, not between them.
However, there are some subtle differences, so it is still possible to identify different genetic clusters
A K value of 2 splits East Asia, Oceania and the Americas from Eurasia and Africa
Plots like these known as ‘Structure plots’ (after the program first written to identify clusters).
Each colour is a distinct genetic cluster
The number of colours will represent the value of K- trying to work out how many genetic clusters there are. Beginning to identify slightly different genetic structures.

Question 16

Q

Example of a PCA approach

Answer

A

> 3000 Europeans typed at 500K SNPs
Restricted analysis to people with all 4 grandparents from the same location - tried to work out how the genetic variation was partitioned.
First two principal components separate out discrete populations
PC1 roughly SE-NW
PC2 roughly SW-NE
PC1 and PC2 only explain ~2% of the variation, but this is still enough to reveal structure: if you rotate the axes, the plot captures a map of European geography very well. Most of the variation is found within populations
Subtle differences in allele frequency tell us something about Europe

Question 17

Q

Origins of the UK population

Answer

A

Typed 2039 people from a Wellcome Trust study on multiple sclerosis (controls)- usually motivated by a gene mapping project
Same samples were part of the People of the British Isles (POBI) project- carefully selected so they came from the location all of their grandparents were from
Selected people with all 4 grandparents born within 80km of each other
Mean birth date 1885 – samples represent UK population before greater movement of 20th century
Typed at ~500K SNP markers
Another 6029 samples from 10 European countries (to provide context)

Question 18

Q

Fine-scale structure in the UK

Answer

A

17 discrete clusters, that are geographically separated

Even close locations form discrete clusters (e.g. Cornwall and Wales)

FST between clusters is very low – 0.002

There is not one single ‘Celtic’ population

English cluster (red) is very large, perhaps because of fewer geographical / geo-political boundaries

The peripheral populations had the most distinct colours. The genetic differentiation between these clusters is very low; they are almost identical in genetic structure. Genetic drift is more prominent in the more remote areas because of their smaller population sizes
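To give a feel for how small an FST of 0.002 is, here is a sketch of FST at a single biallelic locus, computed as (HT - HS)/HT from expected heterozygosities. It assumes equally sized populations, and the allele frequencies are invented; the study's own estimate comes from many SNPs.

```python
def fst(freqs):
    """FST at one biallelic locus from per-population allele frequencies.

    Assumes equally sized populations. HT is the expected heterozygosity of the
    pooled population and HS is the mean expected heterozygosity within populations.
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in two clusters.
print(round(fst([0.50, 0.54]), 4))  # small frequency difference, very low FST
print(round(fst([0.10, 0.90]), 4))  # large frequency difference, high FST
```

A frequency difference of only a few percentage points between two clusters is enough to give an FST of about 0.002.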

Question 19

Q

Inferring the origin of UK clusters

Answer

A

Lots of variation - people in North Wales had a big input from FRA17 etc.

Refer to previous slide

Groups that contribute to all UK clusters (e.g. GER6, BEL11 and FRA14) probably represent the earliest (post-ice-age) migration events into the UK.
Some European clusters probably correspond to known historical events
e.g. NOR53 and NOR90 (Norwegian) groups contributed to Orkney clusters. Probably reflects Norse/Viking invasions
GER3 and DEN18 probably represent contributions from early Saxon migration events around 700 AD.
Data are consistent with the idea that Saxons contributed to current genetic variation rather than completely replacing ice-age settler genotypes

Question 20

Q

Geographical variation in complex diseases

Answer

A

Two obvious questions

Environmental or Genetic causes …. or both?
Why do disease associated alleles persist?

Type 2 Diabetes (T2D) shows a broadly similar geographical pattern to obesity.

It is very common, and it varies between populations. Why?

Thrifty genotype hypothesis (Neel 1962). In our ancestors, a rapid release of insulin in response to elevated blood sugar was useful. It enabled the build up of fat stores, which could be used in times of hardship i.e. diabetes associated alleles were once advantageous

Drifty genotype hypothesis (Speakman 2008). In our ancestors, mutations in lipid storage genes were neutral, because people didn’t have a fatty diet. Population differences in allele frequencies arose through drift. With modern high-fat diets the effects of the mutations are more obvious. More like a null hypothesis.

Little support for the thrifty genotype hypothesis; harder to test the drifty hypothesis
