Lecture 14: Genomics Flashcards
(24 cards)
define a genome
the total digital information contained within the DNA sequences of an organism’s chromosomes
the human genome is ____ nucleotides long
approximately 3 billion
define genomics
the branch of biology dedicated to the study of whole genomes (both human and non-human)
define bioinformatics
the science of using computational methods to analyse biological information
define bioprospecting
harnessing the function of newly discovered genes, or identifying more targets for therapy
why do we study genomes?
- allows genes and mutations of interest to be identified much more rapidly and easily
- opens up prospects for larger-scale, more complete understanding of how genes interact for biological functions (systems biology)
how was the human genome sequenced?
through the publicly funded Human Genome Project (HGP), alongside a parallel private effort by Celera Genomics
which two people led the sequencing of the human genome, and which methods did they use?
- Francis Collins (publicly funded HGP): hierarchical shotgun
- J. Craig Venter (Celera Genomics, parallel private effort): whole-genome shotgun
hierarchical shotgun
- first determine the physical location of large pieces of DNA (large-insert clones)
- shotgun sequence each of these pieces and assemble them using their known locations
whole-genome shotgun
- fragment the genome into many small pieces
- sequence all fragments to get short reads
- assemble overlapping reads into contigs
- use paired-end reads to link contigs across gaps
- build scaffolds from connected contigs
- use paired reads to help resolve repeats and place contigs in the correct order and orientation
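The "assemble overlapping reads into contigs" step above can be sketched with a toy greedy assembler. This is only an illustration under strong assumptions (error-free reads, one strand, no repeats, no paired-end information); the function names `overlap` and `greedy_assemble` and the example reads are made up for this sketch and are not how the HGP or Celera assemblers actually worked.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            # nothing left to merge: what remains are the contigs
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# hypothetical short reads drawn from the made-up sequence ATGCGTACGTTAGC
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Reads that share no overlap of at least `min_len` bases stay as separate contigs, which is where paired-end links and scaffolding come in.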
what were the achievements that resulted from the human genome project?
- sequencing the human genome
- advancements to sequencing technologies
Moore’s Law
the number of transistors on a chip (and so, roughly, computing power) doubles about every 18 months to 2 years
how did the HGP relate to Moore’s law?
advances in sequencing technology (resulting in lowered cost) outpaced Moore’s Law
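To get a feel for the benchmark that sequencing costs beat, here is a small sketch of what "doubling every 18 months" implies over time. The function names are made up for this example, and the starting cost of 1.0 is a unitless placeholder, not a real sequencing price.

```python
def doubling_factor(years: float, doubling_months: float = 18.0) -> float:
    """Fold improvement after `years` if capability doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


def cost_if_tracking_moore(start_cost: float, years: float,
                           doubling_months: float = 18.0) -> float:
    """Cost after `years` if cost simply halved on the Moore's Law schedule."""
    return start_cost / doubling_factor(years, doubling_months)


print(doubling_factor(15))              # 1024.0 (ten doublings in 15 years)
print(cost_if_tracking_moore(1.0, 15))  # 0.0009765625 (1/1024 of the start cost)
```

"Outpacing Moore's Law" means the real sequencing cost curve fell below what `cost_if_tracking_moore` would predict for the same period.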
what 8 things did we learn from the HGP?
- the human genome consists of approximately 3.1 billion base pairs
- the genome is approximately 99.9% identical between individuals, regardless of nationality
- single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) account for much of the genome diversity identified between humans
- less than 2% of the genome codes for proteins
- the vast majority of our DNA is non-protein coding, and repetitive DNA sequences account for at least 50% of the noncoding DNA
- the genome contains ~20,000 protein-coding genes
- nearly 50% of genes do not yet have a known function
- high degree of conservation between species
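The percentages in this list are easier to remember as absolute numbers. The back-of-envelope arithmetic below uses only the approximate figures quoted on the card; the variable names are made up, and the results are rough bounds rather than measured values.

```python
# approximate figures quoted on the card above
GENOME_BP = 3_100_000_000      # ~3.1 billion base pairs
PROTEIN_CODING_GENES = 20_000  # ~20,000 protein-coding genes

# 99.9% identical -> about 1 base in 1,000 differs between two people
differing_bp = GENOME_BP // 1000

# "less than 2%" protein-coding -> upper bound on coding sequence
coding_bp_upper = GENOME_BP * 2 // 100

# the rest is non-coding, and at least half of that is repetitive
noncoding_bp_lower = GENOME_BP - coding_bp_upper
repetitive_bp_lower = noncoding_bp_lower // 2

# crude upper bound on average coding sequence per gene
coding_bp_per_gene_upper = coding_bp_upper // PROTEIN_CODING_GENES

print(differing_bp)              # 3100000
print(coding_bp_upper)           # 62000000
print(repetitive_bp_lower)       # 1519000000
print(coding_bp_per_gene_upper)  # 3100
```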
why are species relatedness and genome conservation between humans and other vertebrates important?
because it means we can use other animals as model organisms
how can we overcome the limitations associated with relying on pedigrees?
one can do association studies without paying attention to pedigrees, just by treating groups of individuals that have a trait as though they are related for that trait
define a genome-wide association study, or GWAS
the application of SNP genotyping to large populations of people for the purpose of discovering genetic associations between particular SNPs and traits
what has GWAS led to?
SNPs being identified as tightly linked to, or playing causative roles in, a range of common diseases such as breast cancer, diabetes and Crohn’s disease
what are the advantages of GWAS over pedigree analysis?
- GWAS are more broadly applicable and provide greater power and resolution than traditional pedigree analysis
- GWAS do not depend on the analysis of closely related family members, and there is no limit to the number of people that can be included in a GWAS test population
- direct comparative studies between affected and unaffected individuals can be performed, and trait-associated genes can be mapped and identified whatever their pattern of inheritance, simple or complex
describe the gene for body size in dogs found by GWAS
IGF1 encodes a hormone involved in juvenile growth in mammals and is the major contributor to the difference in size between small and large breeds of dogs
GWAS catalogue contains:
- 7255 studies
- 808,580 unique SNP-trait associations
HGP -> SNPs -> Ancestry
fragments of genomes carried by our distant ancestors can be observed as blocks of DNA called haplotypes that are shared between many ‘unrelated’ people who are in fact distant relatives
what is an application of the HGP?
we have the power to develop personalised drugs for someone with a particular genetic disease or drugs that alter the genome
describe how a GWAS would be carried out
- take a group of patients with a disease (cases), and a control group of people who don’t have the disease
- take their DNA, genotype SNPs across the genome, and detect disease-associated SNPs by comparing allele frequencies between the two groups
- use linkage disequilibrium (a higher-than-expected frequency of co-segregation of marker and trait) to identify the locus of interest
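The case/control comparison above can be sketched for a single SNP as a chi-square test on allele counts, followed by a common measure of linkage disequilibrium between two loci (r squared). This is a minimal sketch: the function names and all counts and frequencies are invented for illustration, and real GWAS pipelines test many SNPs at once and correct for multiple testing and population structure.

```python
def allelic_chi_square(case_alt: int, case_ref: int,
                       ctrl_alt: int, ctrl_ref: int) -> float:
    """Chi-square statistic (1 degree of freedom) for a 2x2 table of allele counts."""
    observed = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one allele")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two loci, from the frequency of haplotype AB (p_ab)
    and the frequencies of allele A (p_a) and allele B (p_b)."""
    d = p_ab - p_a * p_b  # D: deviation from independent assortment
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# hypothetical allele counts: the alternate allele is more common in cases
print(allelic_chi_square(60, 40, 40, 60))  # 8.0, above 3.84 (p < 0.05 at 1 df)

# hypothetical frequencies: alleles A and B always travel together
print(ld_r_squared(0.5, 0.5, 0.5))   # 1.0 (complete LD)
print(ld_r_squared(0.25, 0.5, 0.5))  # 0.0 (no LD)
```

A SNP with a large chi-square value may itself be causal, or it may simply be in strong LD with a nearby causal variant, which is why an association points to a locus rather than directly to a gene.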