Human genetics 2 Flashcards
(11 cards)
1
Q
How tag SNPs identify a haplotype (HapMap)
A
- a series of adjacent SNPs form a haplotype
- block of DNA which tends to get inherited together
- the closer SNPs are, the less recombination happens between them
- SNPs are useful markers for gene identification
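A minimal sketch of the tag-SNP idea, assuming a small set of made-up haplotypes: because the SNPs in a block are inherited together, a few well-chosen positions are enough to tell the haplotypes apart. The haplotype strings, the names `haplotypes` and `find_tag_snps`, and the brute-force search are illustrative assumptions, not the HapMap method.

```python
from itertools import combinations

# Hypothetical haplotypes in one block: each string lists the alleles at
# six adjacent SNP positions (illustrative data, not taken from HapMap).
haplotypes = {
    "hap_A": "ACGTAC",
    "hap_B": "ACGTGT",
    "hap_C": "GTATAC",
    "hap_D": "GTGCGT",
}

def find_tag_snps(haps):
    """Return the smallest set of SNP positions that distinguishes every haplotype."""
    seqs = list(haps.values())
    n_sites = len(seqs[0])
    n_distinct = len(set(seqs))
    # Try subsets of increasing size so the first hit is a minimal one.
    for k in range(1, n_sites + 1):
        for sites in combinations(range(n_sites), k):
            signatures = {tuple(s[i] for i in sites) for s in seqs}
            if len(signatures) == n_distinct:
                return sites
    return tuple(range(n_sites))

tag_sites = find_tag_snps(haplotypes)
print(tag_sites)
```

Genotyping only the returned positions would identify which of the four haplotypes a chromosome carries; the remaining SNPs are then inferred rather than measured.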
2
Q
How recombination frequency is related to genetic distance
A
A typical genome
- differs from the reference at 4.1-5M sites
- > 99.9% of variants consist of SNPs and short insertions and deletions
- structural variants (although fewer) affect more bases
- copy number variation (CNV)
- 2 100 to 2 500 structural variants in a typical genome
- affecting about 20 million bases of sequence
3
Q
What is CNV
A
- DNA sequences longer than 1000 bp
- the number of copies of these sequences varies between individuals
- tandem or interspersed
- gene dosage and breakpoints may affect the phenotype
4
Q
CNV mechanism
A
- similar to deletion/duplication events
- may be caused by NAHR (non-allelic homologous recombination)
- improper repair after double-strand breaks
5
Q
1000 genomes project
A
- more variants in African populations
- more recent populations have fewer
- measured as the number of variants per genome and the number of singleton (unique) variants per genome
- older populations tend to have more of both types of variants
6
Q
How the function of a gene can be determined
A
- biochemical approaches are laborious
- express DNA sequence, isolate protein, test for function
- Functional genomics
- predict protein function from genetic information using an algorithm
- search for homology in the same and different species
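As a rough illustration of predicting function from homology, the sketch below assigns a query sequence the annotation of its most similar known sequence. It uses the standard-library `difflib.SequenceMatcher` as a stand-in for a real alignment tool; the sequences, the `annotated` table and the `predict_function` helper are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical annotated protein fragments (made-up sequences and labels).
annotated = {
    "kinase": "MKTAYIAKQR",
    "transporter": "MLLPVVGGSA",
}

def predict_function(query, known):
    """Return (label, similarity) of the known sequence most similar to the query."""
    scored = (
        (label, SequenceMatcher(None, query, seq).ratio())
        for label, seq in known.items()
    )
    # The best-scoring annotated sequence supplies the predicted function.
    return max(scored, key=lambda pair: pair[1])

label, score = predict_function("MKTAYIAKQK", annotated)
print(label, round(score, 2))
```

A real homology search would use a proper alignment and a significance threshold; the point here is only that the prediction comes from sequence similarity, not from a biochemical test.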
7
Q
How microarrays can determine gene expression
A
- can perform a variety of experiments but all rely on complementarity
- 1000s to millions of DNA spots
- each spot contains specific DNA probes
- 1000s to millions of probe molecules in each spot
- affixed to glass slides
- labelled DNA or cDNA is hybridized to the probes
- hybridization detected optically
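The reliance on complementarity can be sketched as follows, assuming perfect base pairing only: a probe gives a signal when the labelled sample contains its reverse complement. The probe names, the sequences and the `hybridizes` helper are made up for illustration; real hybridization tolerates some mismatches and depends on temperature and salt conditions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the strand that would base-pair with seq, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def hybridizes(probe, target):
    """True if target contains a perfect complementary match to the probe."""
    return reverse_complement(probe) in target

# Hypothetical probes fixed to two spots, and one labelled cDNA from the sample.
probes = {"gene_1": "ATGGCC", "gene_2": "TTTAAACC"}
labelled_cdna = "GGGGCCATGG"

# A spot lights up only if its probe is complementary to the labelled cDNA.
signal = {name: hybridizes(seq, labelled_cdna) for name, seq in probes.items()}
print(signal)
```

Here only the `gene_1` spot would give a signal, which is read as that gene being expressed in the sample.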
8
Q
What are the limitations of microarrays compared with RNA sequencing
A
- microarrays require prior knowledge of gene sequences to design probes
- similar sequences may hybridize to the same probe
- quantification is difficult
- RNA sequencing (RNA-seq) does not have these limits
- with low-cost next-gen sequencing, this has become more popular
9
Q
What is ENCODE?
A
- attempts to describe all “functional” areas of the genome
- a functional element is a discrete genome segment that encodes a defined product
- protein or non-coding RNA
- or displays a reproducible biochemical signature
- protein binding, specific chromatin structure
10
Q
What were the results of ENCODE?
A
- 20 687 protein-coding genes
- on average 6.3 alternatively spliced transcripts per gene
- 75% of the genome is transcribed
- 62% of the genome is represented in RNA >200 bp
- most of that RNA not translated to protein
- also mapped histone modifications, DNA methylation and chromosome-interaction regions
11
Q
What was the take home message of this lecture?
A
80.4% of the genome is functional