Genome Organization Flashcards

1
Q

What are the components of chromatin?

A

DNA + histones + non-histone proteins (acidic)

2
Q

How many base pairs are in the haploid human genome sequence?

A

About 3 × 10^9 (3 billion) base pairs

3
Q

What were some of the findings from the first human genome sequence?

A
  • Human genome is not static
  • There is no “one” human genome; there are many.
  • Genome is not organized in a random manner
4
Q

Give three examples of how the human genome is not static.

A
  1. ~30 new mutations occur in each individual
  2. Shuffling of regions occurs at each meiosis due to recombination
  3. Somatic DNA changes can be produced as well as germ-line changes
5
Q

What is a single nucleotide polymorphism (SNP)?

A

A difference at a single base-pair position between the genomes being compared

6
Q

If there is an average of 1 SNP for every 1000 base pairs between any two randomly chosen human genomes, approximately how many differences are there?

A

About 3 million (3 × 10^9 bp ÷ 1,000 bp per SNP), even though the two genomes are 99.9% identical
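
The arithmetic behind this answer can be sketched in a few lines of Python. The function names are hypothetical and chosen only for illustration; the genome size and SNP spacing are the figures given on these cards.

```python
GENOME_BP = 3_000_000_000  # haploid genome size from the earlier card
BP_PER_SNP = 1000          # one SNP per 1000 bp, as stated in the question


def expected_snp_differences(genome_bp: int, bp_per_snp: int) -> int:
    """Expected number of single-base differences between two genomes."""
    return genome_bp // bp_per_snp


def percent_identity(bp_per_snp: int) -> float:
    """Percent of positions that match when one in bp_per_snp differs."""
    return 100 * (bp_per_snp - 1) / bp_per_snp


print(expected_snp_differences(GENOME_BP, BP_PER_SNP))  # 3000000
print(percent_identity(BP_PER_SNP))                     # 99.9
```

A 0.1% difference sounds negligible, but applied to 3 billion positions it still leaves millions of sites that differ.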

7
Q

Which chromosome is classified as “gene-rich”?

A

Chromosome 19

8
Q

Which chromosomes are classified as “gene-poor”?

A

Chromosomes 13, 18, 21

9
Q

What are the differences between euchromatic and heterochromatic regions of the genome?

A

Euchromatic regions are more relaxed and are the focus of the genome sequencing effort; they still contain many unsequenced gaps and make up most of the genome

Heterochromatic regions are more condensed and contain more repeats; they are essentially unsequenced, tend to lie near centromeres, and make up less of the genome

10
Q

What is the general genome composition?

A

1.5% is translated
20-25% is represented by genes
50% “single copy” sequences
40-50% consists of classes of “repetitive DNA”, repeated hundreds to millions of times
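
To give these percentages a sense of scale, the sketch below converts each one into megabases, assuming the 3 × 10^9 bp haploid genome size quoted on an earlier card. The dictionary and function names are hypothetical, and the categories overlap, so the figures are not meant to add up to the whole genome.

```python
GENOME_BP = 3_000_000_000  # haploid genome size used elsewhere in these cards

# Percentages copied from the card; where the card gives a range,
# both ends are kept as (low, high).
COMPOSITION_PERCENT = {
    "translated": (1.5, 1.5),
    "genes": (20, 25),
    "single copy": (50, 50),
    "repetitive DNA": (40, 50),
}


def percent_to_megabases(percent, genome_bp=GENOME_BP):
    """Convert a percentage of the genome into megabases (Mb)."""
    return percent * genome_bp / 100 / 1_000_000


for label, (low, high) in COMPOSITION_PERCENT.items():
    print(label, percent_to_megabases(low), percent_to_megabases(high))
```

Under these assumptions the translated fraction comes to roughly 45 Mb, against about 1,500 Mb of single-copy sequence.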

11
Q

Acknowledging that GC-rich and AT-rich regions are not random, what percent of the genome is GC- and AT-rich?

A

38% GC-rich

54% AT-rich
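
Whether a region counts as GC-rich or AT-rich is usually judged from its GC fraction. A minimal sketch of that calculation follows; the function names and the fixed, non-overlapping window scheme are assumptions made for illustration, not a method taken from these cards.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(b in "GC" for b in bases) / len(bases)


def windowed_gc(seq: str, window: int) -> list:
    """GC fraction of consecutive non-overlapping windows of a given size."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


print(gc_fraction("ATGC"))          # 0.5
print(windowed_gc("GGCCATAT", 4))   # [1.0, 0.0]
```

Scanning a chromosome in windows like this shows GC-rich and AT-rich stretches clustering together rather than being scattered at random.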

12
Q

What are the two classes of repetitive DNAs?

A
  1. Tandem repeats (“satellite DNAs”)
  2. Dispersed repetitive elements

13
Q

What are two of the locations of certain tandem repetitive DNA?

A

A particular pentanucleotide sequence is found as part of heterochromatic regions on the long arms of chromosomes 1, 9, 16, and Y, which are hotspots

“Alpha-satellite” repeats are 171 bp repeats found near the centromeric region of every chromosome; they may be important for chromosome segregation during mitosis and meiosis

14
Q

Give an example of a short interspersed repetitive element.

A

Alu family
Related members of ~300 base pairs each
500,000 copies in the genome

15
Q

Give an example of a long interspersed repetitive element.

A

L1 family
Related members of ~6 kilobase pairs each
100,000 copies in the genome
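
A rough sense of how much of the genome the Alu and L1 families could occupy comes from multiplying the member length by the copy number quoted on these cards. The sketch below assumes every copy is full length, which overstates the total because many copies are truncated, so the results are upper bounds only. The function name is hypothetical.

```python
GENOME_BP = 3_000_000_000  # haploid genome size used elsewhere in these cards


def max_genome_percent(member_bp: int, copies: int, genome_bp: int = GENOME_BP) -> float:
    """Upper-bound percent of the genome covered, assuming full-length copies."""
    return 100 * member_bp * copies / genome_bp


# Lengths and copy numbers as quoted on the Alu and L1 cards.
print(max_genome_percent(300, 500_000))    # Alu: 5.0
print(max_genome_percent(6_000, 100_000))  # L1: 20.0
```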

16
Q

What are dispersed repeats in the genome and how can they be medically relevant?

A

They are retrotransposable elements that can copy their own sequences into other locations in the genome
=> retrotransposition of a copy into the middle of another gene may inactivate that gene
=> NAHR leading to disease

17
Q

What is NAHR (non-allelic homologous recombination)?

A

Aberrant recombination between different (non-allelic) copies of a dispersed repeat, facilitated by their sequence similarity; the resulting deletions or duplications can lead to disease

18
Q

What are the types of DNA variation that occur between genomes?

A

Insertion-deletion polymorphisms
SNPs (single nucleotide polymorphisms)
CNVs (copy number variations)

Other:
Chromosomal variation, larger scale variation, rearrangements, translocations
Silent variants

19
Q

What are the two types of insertion-deletion polymorphisms?

A

Minisatellites:
tandemly repeated 10-100 bp blocks of DNA = highly variable number
VNTR (variable number of tandem repeats) = can be used for genetic fingerprinting

Microsatellites:
di-, tri-, and tetra- nucleotide repeats
more than 5e4 per genome
aka STRPs (short tandem repeat polymorphisms)
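
As an illustration of what a microsatellite looks like in sequence data, the sketch below scans a DNA string for di-, tri-, and tetranucleotide motifs repeated in tandem. The function name, the default threshold of four repeats, and the decision to skip single-base runs are all assumptions made for this example, not a standard definition.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 4) -> list:
    """Return (start, motif, repeat_count) for tandem repeats of 2-4 bp motifs."""
    # Capture a 2-4 base motif (shortest first), then require it to recur
    # at least min_repeats - 1 more times immediately afterwards.
    pattern = re.compile(
        r"([ACGT]{2,4}?)\1{" + str(min_repeats - 1) + r",}"
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


print(find_microsatellites("GGCACACACACATT"))  # [(2, 'CA', 5)]
```

The repeat count reported for a locus is the value that varies between individuals, which is what makes these loci useful as polymorphic markers.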

20
Q

Why can SNPs be used for genetic fingerprinting?

A

SNPs can be detected with PCR-based assays
They are easy to score
They are widely distributed

21
Q

What is copy number variation?

A

Variation between individuals in the number of copies of a particular gene or DNA segment
The segments range from 200 bp to 2 million bp

22
Q

What is a gene family and how did they arise?

A

Gene families are genes that have high sequence similarity, over 85%, that may carry out similar but distinct functions

They arise through gene duplication; when a gene duplicates it frees up one copy to vary while the other copy continues to carry out a critical function

23
Q

What is structural variation of the human genome?

A

Not all changes in the genome are due to single base pair substitutions
=> CNV (copy number variations) is the primary type of structural variation

24
Q

What are the characteristics and implications of CNV (copy number variations)?

A
  • CNV loci are the primary type of structural variation and may cover 12% of the genome
  • CNV is implicated in an increasingly large number of diseases
  • CNV regions are involved in rapid/recent evolutionary change
25
Q

CNV regions involved in recent evolutionary change are often enriched for what?

A
  • human specific gene duplications
  • genome sequence gaps
  • recurrent human diseases
26
Q

What are the limitations of NextGen DNA sequencing?

A
  • No mammalian genome has been completely sequenced/assembled
  • NextGen relies on short read sequences => complex regions go unexamined, and these can be regions implicated in many diseases (e.g., 1q21)
27
Q

What are the limitations of GWAS (genome-wide association studies)?

A
  • Many large scale studies implicate loci that account for only a fraction of the expected genetic contribution = ‘missing heritability’
  • Many regions of the genome are unexamined by available ‘genome-wide’ screening technologies
28
Q

What makes up the histone octamer?

A

Two copies each of four core histones:

H2A, H2B, H3 & H4.

29
Q

Human cells have hundreds to thousands of mitochondria. What are the features of the mitochondrial genome?

A
  • Resides in the cytoplasm => inheritance is exclusively maternal (allows tracing of maternal lineages over human evolutionary history)
  • 16 kb genome; a few dozen genes; uses a slightly different triplet code; the vast majority of proteins that function in mitochondria are encoded by nuclear genes
  • Several genetic diseases caused by mutations in mitochondrial genes.
30
Q

What is the estimated number of human genes, and what are the different types of genes?

A

25,000-30,000 genes

▪ protein-encoding genes
▪ RNA-encoding genes
▪ “pseudogenes” = non-functional but homologous copies of existing genes.