Understanding our genome Flashcards Preview

Flashcards in Understanding our genome Deck (86)
1

What is the difference between a genomic and a cDNA library?

A genomic library contains DNA fragments that represent the entire genome of an organism, whereas a cDNA library includes clones that correspond to the mRNA sequences from an organism or from specific cells of an organism.

2

What is the function of oligo dT chromatography?

to separate mRNA from the other RNA in the cell

3

What percentage of RNA in a eukaryotic cell is rRNA?

~ 90%

4

What percentage of RNA in a eukaryotic cell is tRNA?

~ 6%

5

What percentage of RNA in a eukaryotic cell is mRNA?

~ 2-4%

6

What is found at the 3' end of eukaryotic mRNAs?

a poly A tail (added after the mRNA is formed)

7

How does oligo dT chromatography work?

- oligo dT affinity column
- mRNA poly A tail hydrogen bonds to oligo dT
- rRNA and tRNA cannot bind to the column
- mRNA is eluted using low-salt buffer, which destabilises the A=T hydrogen bonds (binding is done in high salt)
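
The selection principle above can be sketched in code. This is only an illustration, assuming that a run of at least 20 A residues at the 3' end counts as a poly A tail; the threshold, function names and toy sequences are invented for the example and are not part of the deck.

```python
def has_poly_a_tail(rna: str, min_tail: int = 20) -> bool:
    """Return True if the 3' end of the RNA is a run of at least min_tail A residues."""
    rna = rna.upper()
    # Count how many A residues sit at the very end of the sequence.
    tail_length = len(rna) - len(rna.rstrip("A"))
    return tail_length >= min_tail


def oligo_dt_select(rnas: dict, min_tail: int = 20) -> dict:
    """Keep only the RNAs that would be retained on the column (poly A tail present)."""
    return {name: seq for name, seq in rnas.items() if has_poly_a_tail(seq, min_tail)}


# Toy sequences, invented for illustration only.
sample = {
    "mRNA_1": "AUGGCCUUUGA" + "A" * 30,
    "rRNA_1": "GGCUACCUGGUUGAUCCUGCC",
    "tRNA_1": "GCGGAUUUAGCUCAGUUGGGACCA",
}
selected = oligo_dt_select(sample)
print(sorted(selected))
```

Only the sequence ending in a long run of A is kept; the toy rRNA and tRNA pass through, as on the column.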

8

How is double-stranded cDNA formed from mRNA?

- mRNA is copied to cDNA using reverse transcriptase, dNTPs, oligo dT primer
- RNA phosphodiester bond cleavage by ribonuclease H
- RNA is replaced by DNA by DNA polymerase
- DNA ligase is used to repair the phosphodiester backbone
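
As a rough sketch of the base-pairing logic behind these steps (not of the enzymology), the first cDNA strand is the reverse complement of the mRNA written in DNA letters, and the second strand is the reverse complement of the first. The function names and the toy mRNA are assumptions made for illustration.

```python
# Complement tables: RNA template to DNA copy, and DNA to DNA.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """First cDNA strand (5'->3'), as reverse transcriptase would copy from the mRNA."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


def second_strand_cdna(first_strand: str) -> str:
    """Second strand (5'->3'), as DNA polymerase would copy from the first strand."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first_strand.upper()))


# Toy mRNA with a short poly A tail, invented for illustration.
mrna = "AUGGCCUUUG" + "A" * 8
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)
print(first)
print(second)
```

The first strand begins with a run of T, which corresponds to the oligo dT primer paired with the poly A tail; the second strand matches the mRNA with U replaced by T.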

9

What does the 'H' in ribonuclease H stand for?

hybrid

10

In sequencing of the human genome, why is only a small amount of enzyme used?

so that the genome is not cut at every restriction site

11

What is a BAC?

Bacterial artificial chromosome

12

What are BACs used for?

to create libraries with large fragments (insert size ~ 100 000 bp)

13

Outline the method of using BACs

- white blood cells are mixed with agarose and placed in a mould
- cell membrane is ruptured in the agarose
- restriction enzyme is added to digest DNA in the agarose mould
- each mould is placed in a well of agarose gel
- gel is run and viewed under UV light and DNA of 100 000 bp is excised from the gel
- DNA is eluted from the excised agarose
- DNA is mixed with a plasmid vector cut with the same restriction enzyme
- treated with DNA ligase
- bacteria are transformed
- transformed bacteria are picked into 384-well plates
- BAC DNA is isolated from the bacteria for sequencing reactions

14

Why are white blood cells used?

- easy to take a blood sample (non-invasive)
- no associated moral concerns

15

What is the function of agarose?

to protect the large DNA fragments from mechanical shear

16

What is the size of the mitochondrial genome?

16.6 kb

17

How many genes does the mitochondrial genome encode?

37

18

What percentage of the cell's DNA is made up of mitochondrial DNA?

up to 0.5% due to the hundreds of mitochondrial genomes found in the cell

19

What genes does the mitochondrial genome encode?

2 rRNA genes
22 tRNA genes
13 polypeptide-encoding genes for oxidative phosphorylation

20

What is the H strand?

the heavy strand; G-rich

21

What is the L strand?

the light strand; C-rich

22

How many genes does the H strand encode?

28

23

How many genes does the L strand encode?

9

24

Describe some features of the mitochondrial genome

- circular genome
- genes contain no introns
- genes do not overlap
- the whole strand is transcribed and then cleaved

25

How much of the genome encodes RNA?

10%

26

What types of RNA are encoded?

mRNA, rRNA, tRNA, snRNA, snoRNA, other RNAs eg. telomere RNA, micro RNA

27

What is snRNA?

small nuclear RNA

28

What is snoRNA?

small nucleolar RNA

29

Which genes have no introns?

tRNA, histones, α-interferons

30

What is the advantage of histones having no introns?

During the S-phase of the cell cycle, a vast quantity of histones is needed for the formation of the newly synthesised chromatin. The intronless organisation of histone genes may facilitate highly efficient histone synthesis.

31

What is the longest human gene?

dystrophin (~2.4 Mb)

32

How long does it take to transcribe dystrophin?

16 hours

33

What is the implication of the long transcription time of dystrophin?

There is a large amount of time in which mutation can occur

34

What percentage of dystrophin DNA is protein-coding?

0.6%
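
A quick arithmetic check ties the dystrophin cards together. The gene length of about 2.4 Mb used here is an assumption (a commonly quoted figure); the 0.6% coding fraction and the 16-hour transcription time come from the cards.

```python
GENE_LENGTH_BP = 2_400_000   # assumed gene length of about 2.4 Mb
CODING_FRACTION = 0.006      # 0.6% protein-coding, from the card
TRANSCRIPTION_HOURS = 16     # transcription time, from the card

# Length of protein-coding sequence implied by the coding fraction.
coding_bp = GENE_LENGTH_BP * CODING_FRACTION

# Average elongation rate implied by the transcription time.
rate_bp_per_second = GENE_LENGTH_BP / (TRANSCRIPTION_HOURS * 3600)

print(f"coding sequence: about {coding_bp / 1000:.1f} kb")
print(f"implied transcription rate: about {rate_bp_per_second:.0f} bp per second")
```

Under these assumptions the coding sequence comes to roughly 14 kb and the implied rate to roughly 40 bp per second.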

35

Which genes is the globin family comprised of?

alpha globin gene cluster on chromosome 16
(embryo, fetus and adult)
beta globin gene cluster on chromosome 11
(embryo, fetus, adult)

36

How do gene families arise?

due to gene duplication

37

Why is the number of genes important?

in order that the right amount of protein is synthesised

38

Give an example of the importance of gene number

- α-thalassemia is caused by a deficiency of α-globin genes
- the alpha globin cluster is found on chromosome 16 and encodes two alpha globins for fetus and adult
- this makes a total of four alpha globins for fetus and adult in each cell, since an individual inherits one copy from each parent
3α = α-thalassemia trait
2α = mild anaemia
0α = hydrops fetalis

39

Why does the baby die at birth?

The intact genes for embryonic alpha globin mean that the embryo can survive in the womb, but the baby dies soon after birth

40

How many histone clusters are there in humans?

11 clusters

41

How many histone genes are there in humans?

60 genes

42

Over how many chromosomes are the histone genes spread in humans?

over 7 chromosomes

43

What is the relationship between members of a gene family?

each member encodes the identical protein (highly conserved)

44

What would happen without histones?

Without histones, DNA could not be compacted into the cell nucleus and would not fit into the cell. The compacted molecule is 40 000 times shorter than the unpacked molecule.

45

What percentage of the genome is made up of non-coding DNA?

90%

46

What does 'non-coding' mean?

does not code for RNA (except micro RNA)

47

What types of non-coding DNA exist in the genome?

- introns, regulatory regions
- pseudogenes - redundant, produced as part of the evolution of genes
- gene fragments - also produced as part of the evolutionary process
- micro RNAs - regulate gene expression

48

Give an example of a pseudogene

two Ψ genes on the alpha globin cluster on chromosome 16
one Ψ gene on the beta globin cluster on chromosome 11

49

Why are pseudogenes non-coding?

Pseudogenes are genes that have picked up mutations and lost their function, since they are now unable to bind RNA polymerase to be transcribed.

50

What are processed pseudogenes?

- DNA underwent transcription, splicing and polyadenylation to form mRNA
- mRNA was converted back into DNA by reverse transcription
- the DNA was re-integrated into the host genome to form a pseudogene
- due to viral infection during evolutionary history

51

Where are processed pseudogenes found?

often found on a different chromosome to the functional gene

52

What is a transposable element?

also known as a transposon; a DNA sequence that can change its position within the genome

53

What are Alu elements?

- the most abundant transposable elements in the human genome
- primate-specific
- do not occur in exon sequence

54

How many Alu elements are estimated to be interspersed throughout the human genome?

over one million

55

What percentage of the human genome is estimated to consist of Alu elements?

10.7%

56

How big are Alu elements?

~ 280-300 bp in length
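
The copy number, element length and genome percentage on these cards can be cross-checked with simple arithmetic. The haploid genome size of about 3.1 Gb is an assumption not stated in the deck, and exactly one million copies is used as a lower bound for "over one million".

```python
GENOME_SIZE_BP = 3_100_000_000  # assumed haploid human genome size (~3.1 Gb)
ALU_COPIES = 1_000_000          # lower bound: the card says "over one million"


def alu_fraction(copies: int, element_bp: int, genome_bp: int = GENOME_SIZE_BP) -> float:
    """Fraction of the genome occupied by `copies` elements of `element_bp` each."""
    return copies * element_bp / genome_bp


low = alu_fraction(ALU_COPIES, 280)
high = alu_fraction(ALU_COPIES, 300)
print(f"Alu share of genome: {low:.1%} to {high:.1%}")
```

With these inputs the estimate falls a little under 10%, which is in line with the 10.7% figure once the copy number exceeds one million.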

57

How can Alu elements be detected?

by digestion with restriction endonuclease AluI
one main resulting band = one prevalent sequence

58

What is the relationship between Alu elements and mutation?

Alu elements are a common source of mutation in humans, but these are often confined to non-coding regions

59

What is the structure of an Alu element?

- identical target site duplication (TSD) sites on either side of the Alu dimer
- dimer comprised of two similar but distinct monomers (left and right arms) joined by an A-rich linker

60

How is the structure of the Alu element believed to have come about?

dimer emerged from the fusion of two distinct monomers over 100 million years ago

61

How long is the polyA tail?

the length of the polyA tail varies between Alu families

62

Where are Alu elements thought to be derived from?

Alu elements are thought to be derived from the small cytoplasmic 7SL RNA, the RNA component of the signal recognition particle, a universally conserved ribonucleoprotein that directs the traffic of proteins within the cell and allows them to be secreted.

63

Is the Alu repeat a SINE or a LINE?

SINE

64

What does SINE stand for?

short interspersed nuclear element

65

What does LINE stand for?

long interspersed nuclear element

66

What does TSD stand for?

target site duplication

67

Why are the TSD sequences identical?

- the DNA was originally circular, with the TSD sequences hydrogen bonded to one another
- circular sequence cut with restriction enzyme and overhang blunted by the addition of free nucleotides to form two identical sequences, one on each end

68

What percentage of our genome is made up of LINEs?

17-20%

69

What is the structure of a typical LINE?

consists of two non-overlapping open reading frames (ORF), which are flanked by UTR and target site duplications

70

What is encoded by the first open reading frame?

an RNA-binding protein of 500 amino acids that functions as a chaperone

71

What is encoded by the second open reading frame?

a protein-complex that has endonuclease and reverse transcriptase activity

72

How do LINEs promote their own transcription?

- promoter for transcription
- reverse transcriptase to copy RNA into DNA
- endonuclease to cleave target DNA
- RNase H for RNA removal (can digest RNA hydrogen bonded to DNA)

73

Outline the insertion of LINEs into the genome

- transcription of LINE mRNA
- translation of ORF1 and ORF2 proteins, which bind to LINE mRNA at the polyA tail
- mRNA binds to target DNA at AT-rich sequences to form RNA-DNA hybrid
- LINE endonuclease cleaves target DNA
- LINE reverse transcriptase copies LINE mRNA into cDNA
- LINE-encoded RNase H degrades the RNA strand
- second-strand synthesis and repair occurs

74

What happens to LINEs over time?

- LINEs shorten as they 'age'
- most are truncated at the 5' end to remove the promoter
- RNA polymerase cannot bind and truncated LINEs cannot be transcribed

75

What are some consequences of LINEs being inserted into the human genome?

- disruption of gene transcription
- insertion into a promoter can silence a gene
- insertion into an intron can slow down transcription

76

What is the consequence of LINEs being inserted into an intron?

- the slowed rate of transcription enables a higher frequency of mutation
- proteins may be synthesised too slowly to meet the demands of the cell

77

How does LINE insertion differ in humans?

in number and position in the genome

78

What is the frequency of LINEs in somatic cells and in the germline?

LINEs are rare in somatic cells and more abundant in the germline

79

How does the cell protect against LINEs?

heavy methylation of LINEs to silence them by preventing the binding of RNA polymerase and other proteins required for transcription

80

What percentage of cells in our body are microbes?

90%

81

What percentage of functional genes in our body are microbial?

99%

82

How do commensal microbes help the body?

- form us eg. help immune system development
- feed us eg. process food, provide vitamins
- protect us eg. fight undesirable pathogenic bacteria

83

What genetic information is targeted in sequencing the microbiome?

16S rRNA
- must specifically target bacterial RNA
- small subunit of the ribosome
- common to all bacteria
- present in one or more copies

84

What regions do the bacterial 16S rRNA genes have?

variable (v) regions and conserved regions (common between species)
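
One simple way to picture the distinction is to call an alignment column conserved when every sequence has the same base there, and variable otherwise. The sketch below assumes the sequences are already aligned to equal length; the function name and the toy alignment are invented for illustration and are not real 16S data.

```python
def conserved_columns(aligned: list) -> list:
    """Return one boolean per alignment column: True if all sequences agree there."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*aligned) walks the alignment column by column.
    return [len(set(column)) == 1 for column in zip(*aligned)]


# Toy alignment of three sequences, invented for illustration.
alignment = [
    "ACGTTGCA",
    "ACGATGCA",
    "ACGCTGCA",
]
flags = conserved_columns(alignment)
print("".join("C" if is_conserved else "v" for is_conserved in flags))
```

In the toy alignment only the fourth column differs between sequences, so it is the only one marked variable.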

85

Outline one method of bacteria identification

1. isolate DNA from faecal sample
2. amplify bacterial 16S rRNA gene with primers that encompass variable regions
3. sequence the amplified 16S rRNA gene product covering the variable regions
4. data analysis and processing of gene sequence
5. taxonomic classification using reference databases
6. relative abundance of species within sample
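
The last step can be sketched as a simple count. This assumes each sequenced read has already been given a taxon label in step 5; the function name and the read labels below are invented for illustration, not real data.

```python
from collections import Counter


def relative_abundance(assignments: list) -> dict:
    """Convert per-read taxon labels into the fraction of reads per taxon."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in counts.items()}


# Toy classification result: one taxon label per read, invented for illustration.
reads = ["Bacteroides"] * 5 + ["Faecalibacterium"] * 3 + ["Escherichia"] * 2
abundance = relative_abundance(reads)
for taxon, fraction in sorted(abundance.items(), key=lambda item: -item[1]):
    print(f"{taxon}: {fraction:.0%}")
```

Read fractions are only a proxy for species abundance, since the card on 16S rRNA notes that the gene can be present in one or more copies per genome.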

86

Outline another method of bacteria identification

1. isolate DNA from faecal sample
2. sequence all DNA using next generation sequencing method
3. data analysis and processing of bacterial rRNA gene sequence