Week 3.2: The Human Genome Flashcards
(17 cards)
Human Genome
features (6)
- 3.05 billion bases
- linear nuclear genome
- circular mitochondrial genome (16.5 kb)
- average gene size = 27,000 bp
- 19,969 protein-coding genes (<2% of genome)
- 43,525 noncoding genes (count still inconclusive)
Telomere to Telomere Consortium
- reported the first complete sequence of a human genome (XX, no Y chromosome) in 2022
- published a complete human Y chromosome from the HG002 genome, correcting multiple errors in the GRCh38 (hg38) Y chromosome and adding over 30 million bp of sequence to the reference
- potentially 0.3% errors
Telomere to Telomere Consortium
CHM13
special cell line almost completely homozygous for all chromosomes
Human genome
Difficult sequencing areas
- centromeres and telomeres (heterochromatin)
- AT/GC rich regions
- palindromic sequences
- hairpin structures
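A minimal sketch of how two of the properties above could be checked on a short sequence: GC fraction (for AT/GC rich regions) and whether a sequence equals its own reverse complement (a palindromic sequence, which can also fold into a hairpin). The function names are illustrative, not from the course material, and the code assumes sequences contain only A, C, G and T.

```python
# Complement table for the four DNA bases (assumes no N or ambiguity codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


print(gc_content("ATGC"))        # 0.5
print(is_palindromic("GAATTC"))  # True
```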
diploid
- An organism with 2 copies of each chromosome
- Human genome: about 3.2 billion bases per haploid set, about 6.4 billion per diploid cell
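The haploid and diploid figures differ only by the number of chromosome copies. A small sketch of that arithmetic, using the rounded 3.2 billion figure from this card (the helper name is hypothetical):

```python
def total_bases(haploid_bases: int, ploidy: int = 2) -> int:
    """Bases per cell = bases in one chromosome set x number of sets."""
    return haploid_bases * ploidy


HAPLOID_BASES = 3_200_000_000  # rounded figure from the card above

print(total_bases(HAPLOID_BASES))     # 6400000000 (diploid)
print(total_bases(HAPLOID_BASES, 1))  # 3200000000 (haploid)
```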
Repeat sequences
sources
- retroviruses
- transposable elements (e.g. Alu = 11% of genome)
- simple repeat sequences
transposable elements
- pieces of DNA that make self copies and insert elsewhere in genome
- can sometimes cause disease
- many have been present in genome for millions of years
- originally considered junk DNA; some are now known to act as regulatory elements
genomic variant types
types (3)
- single nucleotide variants (SNV)
- insertions and deletions (indels)
- structural variants (SV)
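One way to see how the three types differ is by size. The sketch below sorts a variant into a type from its reference and alternate alleles. It assumes a 50-nucleotide cutoff between indels and structural variants (taken from the "typically < 50 nucleotides" indel card) and a hypothetical `classify_variant` helper; rearrangements that do not change length, such as inversions, are not handled.

```python
SV_MIN_LENGTH = 50  # assumed cutoff between indels and structural variants


def classify_variant(ref: str, alt: str) -> str:
    """Label a variant as SNV, indel, or SV from its allele lengths."""
    if ref == alt:
        raise ValueError("ref and alt are identical, so there is no variant")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"  # one nucleotide changed at one position
    size_change = abs(len(alt) - len(ref))
    if size_change == 0:
        return "other"  # e.g. a multi-nucleotide substitution, not on the cards
    return "indel" if size_change < SV_MIN_LENGTH else "SV"


print(classify_variant("A", "G"))             # SNV
print(classify_variant("A", "ATT"))           # indel
print(classify_variant("A", "A" + "T" * 60))  # SV
```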
single nucleotide variant
(SNV)
- smallest and most common genomic variant
- one nucleotide change at specific location in genome
- includes SNPs and rare single nucleotide differences
- may have SNP on one or both homologous chromosomes
single nucleotide polymorphism
(SNP)
SNV that is present in at least 1% of the human population
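The 1% threshold can be applied directly to allele counts. A minimal sketch, assuming the frequency is estimated from a sample of chromosomes (the function name and the counts are made up for illustration):

```python
SNP_MIN_FREQUENCY = 0.01  # the 1% threshold from the card above


def label_snv(allele_count: int, total_alleles: int) -> str:
    """Call an SNV a SNP if its sample frequency reaches the threshold."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    frequency = allele_count / total_alleles
    return "SNP" if frequency >= SNP_MIN_FREQUENCY else "rare SNV"


print(label_snv(50, 1000))  # SNP (5%)
print(label_snv(1, 1000))   # rare SNV (0.1%)
```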
insertions and deletions
(indels)
- extra or missing DNA nucleotides in a genome
- typically < 50 nucleotides
- sometimes have larger impact on health and disease
- tandem repeats are a common type
indels
tandem repeats
- aka microsatellites
- short stretches of nucleotides repeated multiple times
- highly variable in size (unit repeated from 2-3 times up to 100 times)
- number of repeated units differs between people and can be used for personal / relationship ID
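Counting repeat units is what makes tandem repeats usable for identification. The sketch below finds the longest run of back-to-back copies of a unit in a sequence. The `GATA` unit and both sequences are invented for illustration, and the helper name is hypothetical.

```python
import re


def longest_tandem_run(seq: str, unit: str) -> int:
    """Largest number of back-to-back copies of `unit` found in `seq`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq)]
    return max(runs, default=0)


# Two made-up alleles of the same locus that differ only in repeat count.
person_a = "TT" + "GATA" * 7 + "CC"
person_b = "TT" + "GATA" * 12 + "CC"

print(longest_tandem_run(person_a, "GATA"))  # 7
print(longest_tandem_run(person_b, "GATA"))  # 12
```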
phased variants
Variants on the same chromosome copy are linked together and inherited from one parent
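In VCF files, phasing is recorded in the genotype field: `0|1` is phased and `0/1` is unphased. A rough sketch of checking whether two variants sit on the same chromosome copy, assuming both genotypes belong to the same phase set and contain no missing alleles (the function names are illustrative):

```python
from typing import Optional, Tuple


def parse_genotype(gt: str) -> Tuple[Tuple[int, ...], bool]:
    """Split a VCF-style genotype into allele indices and a phased flag."""
    phased = "|" in gt
    alleles = tuple(int(a) for a in gt.replace("|", "/").split("/"))
    return alleles, phased


def on_same_chromosome_copy(gt_a: str, gt_b: str) -> Optional[bool]:
    """True if both variants carry a non-reference allele on the same copy.

    Returns None when either genotype is unphased, since the copy is unknown.
    """
    alleles_a, phased_a = parse_genotype(gt_a)
    alleles_b, phased_b = parse_genotype(gt_b)
    if not (phased_a and phased_b):
        return None
    return any(a > 0 and b > 0 for a, b in zip(alleles_a, alleles_b))


print(on_same_chromosome_copy("0|1", "0|1"))  # True
print(on_same_chromosome_copy("0|1", "1|0"))  # False
print(on_same_chromosome_copy("0/1", "0|1"))  # None
```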
structural variants
(SV)
types (5)
- large tandem repeats (repeated unit >15 bp)
- copy number variants (CNV)
- inversions
- insertions
- translocations
structural variants
inversion
segment is reversed end to end within its chromosome
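Read along one strand, an inverted segment appears as the reverse complement of the original. A minimal sketch on a toy sequence, assuming 0-based, end-exclusive coordinates and a hypothetical `invert_segment` helper:

```python
# Complement table for the four DNA bases (assumes no N or ambiguity codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def invert_segment(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] with its reverse complement."""
    if not 0 <= start <= end <= len(seq):
        raise ValueError("start and end must satisfy 0 <= start <= end <= len(seq)")
    inverted = seq[start:end].translate(_COMPLEMENT)[::-1]
    return seq[:start] + inverted + seq[end:]


print(invert_segment("TTACGGAA", 2, 6))  # TTCCGTAA
```

The total length is unchanged, which is one reason inversions are harder to spot than insertions or deletions.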
structural variants
insertion
segment deleted from one chromosome and added to a different chromosome
translocation
segments are exchanged (swapped) between different chromosomes
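A toy sketch of a swap between two chromosomes, modeled as exchanging everything after one breakpoint on each. The sequences, breakpoints, and function name are invented for illustration; real translocations can be more complex than a single clean exchange.

```python
from typing import Tuple


def reciprocal_translocation(
    chrom_a: str, chrom_b: str, break_a: int, break_b: int
) -> Tuple[str, str]:
    """Swap the segments that follow each breakpoint between two chromosomes."""
    new_a = chrom_a[:break_a] + chrom_b[break_b:]
    new_b = chrom_b[:break_b] + chrom_a[break_a:]
    return new_a, new_b


new_a, new_b = reciprocal_translocation("AAAAAA", "CCCC", 4, 1)
print(new_a)  # AAAACCC
print(new_b)  # CAA
```

No bases are gained or lost overall: the two derived chromosomes together contain the same 10 bases as the originals.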