Week 6 Flashcards

1
Q

What are the key features of prokaryotic DNA?

A

Very economical
Compact Organisation
Operon structure
Lack of repeat elements
Strong correlation between genome size and gene number

2
Q

What are the key features of eukaryotic DNA?

A

Large
Large number of non-coding introns (25% of DNA in humans)
Small number of exons (1.5% in humans)
Large number of repetitive elements
Poor correlation between genome size and protein coding gene numbers

3
Q

Why is there a strong correlation between genome size and number of genes in prokaryotes?

A

There are very few empty regions which are quite small. Nearly all space is occupied by a gene. There is linear relationship between genome size and number of genes

4
Q

What is contained within eukaryotic genomes?

A

Large proportion of transposable elements
Large proportion of other repetitive elements
Exons in genes represent a small proportion of the genome

5
Q

What are the 3 main regions of a eukaryotic chromosome?

A

Telomeres
Long/Short arm
Centromere

6
Q

What ploidy are most eukaryotes?

A

They are diploid - two sets of chromosomes

7
Q

What are polyploid organisms?

A

They have more than 2 sets of chromosomes (can be specified further, e.g. tetraploid = 4 sets)

8
Q

What proportion of angiosperms are polyploid?

A

70%

9
Q

Where does polyploidy come from?

A

From a process of whole genome duplication

10
Q

What are examples of animal polyploidy?

A

Salmon 4x
Frogs 4x

11
Q

What are examples of crop polyploidy?

A

Potato 4x
Strawberry 8x
Rice 12x
Wheat 6x (bread wheat) or 4x (durum wheat)

12
Q

What is autopolyploidy?

A

Polyploids with multiple chromosome sets derived from within a single species, often as a result of a meiotic error in which gametes fail to reduce

13
Q

What is the process of autopolyploidy?

A

The parent species undergoes a meiotic error, so the gametes it produces carry the full chromosome number rather than half. If two such unreduced gametes fuse, for example through self-fertilisation, they form a 4n zygote.
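
The ploidy arithmetic behind this card can be sketched in a few lines. This is only an illustration, assuming a diploid (2n) parent; the function names `gamete_ploidy` and `zygote_ploidy` are hypothetical and not from the course material.

```python
def gamete_ploidy(parent_ploidy: int, reduced: bool = True) -> int:
    """Return the number of chromosome sets carried by a gamete.

    Normal meiosis halves the parental set count; a meiotic error
    (failure to reduce) leaves the gamete with the full parental count.
    """
    if not reduced:
        return parent_ploidy
    if parent_ploidy % 2 != 0:
        raise ValueError("parent ploidy must be even for a balanced reduction")
    return parent_ploidy // 2


def zygote_ploidy(gamete_a: int, gamete_b: int) -> int:
    """Fertilisation adds together the chromosome sets of the two gametes."""
    return gamete_a + gamete_b


# Normal case: two reduced (n) gametes from a diploid parent.
normal = zygote_ploidy(gamete_ploidy(2), gamete_ploidy(2))

# Autopolyploid case: two unreduced (2n) gametes from the same species.
auto = zygote_ploidy(gamete_ploidy(2, reduced=False),
                     gamete_ploidy(2, reduced=False))

print(normal, auto)
```

Under these assumptions the normal cross stays diploid, while the fusion of two unreduced gametes gives four chromosome sets.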

14
Q

What is allopolyploidy?

A

Polyploids that result from a hybridisation event
These end up with two sub-genomes, one from each of the progenitor species
A potential way for new species to arise

15
Q

What is the process of allopolyploidy?

A

One species undergoes a meiotic error and forms an unreduced gamete. The unreduced gamete then takes part in a hybridisation event with a gamete from another species, which often contributes only a single copy of each chromosome. The resulting hybrid, with its odd number of chromosome sets, then crosses with a progenitor species so that the chromosomes from each original species are present as 2n. This produces viable offspring that are a hybrid of the two species.

16
Q

What is the 2R hypothesis?

A

It proposes that the early vertebrate lineage underwent two complete genome duplications
1R: when jawed fish split from jawless vertebrates
2R: when bony jawed vertebrates split from cartilaginous fish

17
Q

What is FSGD?

A

Fish-Specific Genome Duplication, otherwise called the 3R hypothesis, in teleosts (a type of bony fish)

18
Q

What is the benefit of genome duplication events?

A

Both vertebrates and teleosts have demonstrated huge adaptive radiations, which may be linked to genome duplication

19
Q

What happens when an animal undergoes genome duplication?

A

There is huge selective pressure to rediploidise. The genome duplication leaves scars in the form of pseudogenes, which mirror genes but don't have a promoter.

20
Q

What is duplication chromosomal rearrangement?

A

Increase in copy number of a chromosomal region (segmental) or single gene (local)

21
Q

What is inversion chromosomal rearrangement?

A

Chromosomal segment is inverted due to breakage and rejoining
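
An inversion can be sketched on a short sequence string. This is a toy illustration, assuming one strand written 5' to 3' and a hypothetical helper `invert_segment`; because the two DNA strands are antiparallel, the rejoined segment reads as the reverse complement of the original.

```python
def invert_segment(chromosome: str, start: int, end: int) -> str:
    """Invert chromosome[start:end] (0-based, end exclusive).

    The segment between the two breakpoints is flipped and rejoined,
    so on the strand we are reading it appears as its reverse complement.
    """
    complement = str.maketrans("ACGT", "TGCA")
    segment = chromosome[start:end]
    inverted = segment.translate(complement)[::-1]
    return chromosome[:start] + inverted + chromosome[end:]


original = "AAAACCGTTTTT"
rearranged = invert_segment(original, 4, 7)
print(original)
print(rearranged)
```

Applying the same inversion a second time restores the original sequence, and the chromosome length never changes; only the order and orientation of the segment do.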

22
Q

What is translocational chromosomal rearrangement?

A

A mutation that moves one portion of a chromosome to a different part of the same chromosome or onto a different chromosome. Types: reciprocal, non-reciprocal and Robertsonian (whole chromosome, where 2 chromosomes join together)

23
Q

What is transposition chromosomal rearrangement?

A

Movement of a short DNA segment around the genome

24
Q

What are the uses of comparative genomics?

A

Help identify:
Conserved protein coding regions
Conserved transcription control sequences
Mechanisms of chromosomal evolution

25
Q

What are the similarities between the mouse and human genomes?

A

Almost all genes found in one species are found in the other
Protein-coding regions of the mouse and human genomes are 85% identical
Around 217 conserved syntenic blocks have been found between the human and mouse genomes

26
Q

What is the difference in chromosome count between humans and great apes?

A

Humans have 2n of 46
Great apes have 2n of 48

27
Q

Why do humans have 2 fewer chromosomes than apes?

A

Human chromosome 2 formed by the fusion of two smaller chromosomes, which correspond to chimp chromosomes 12 and 13

28
Q

What is a difference between human chromosome 3 and its orangutan homologue?

A

There is a chromosomal inversion in the orangutan homologue of chromosome 3

29
Q

How long is Human DNA?

A

Each diploid human cell contains around 2 m of DNA
The human body contains roughly 50 trillion cells, giving enough DNA to stretch from the Earth to the Sun and back roughly 300 times
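
The arithmetic can be checked quickly. This is a rough sketch using the two figures on this card plus the mean Earth-Sun distance of about 1.496 x 10^11 m, which is an added assumption and not part of the card.

```python
DNA_PER_CELL_M = 2.0              # from the card: about 2 m of DNA per diploid cell
CELLS_IN_BODY = 50e12             # from the card: roughly 50 trillion cells
EARTH_SUN_DISTANCE_M = 1.496e11   # assumed: mean Earth-Sun distance (1 au)

total_dna_m = DNA_PER_CELL_M * CELLS_IN_BODY
one_way_trips = total_dna_m / EARTH_SUN_DISTANCE_M
round_trips = one_way_trips / 2

print(f"Total DNA length: {total_dna_m:.2e} m")
print(f"One-way Earth-Sun trips: {one_way_trips:.0f}")
print(f"Earth-Sun round trips: {round_trips:.0f}")
```

With these inputs the "300 times" figure matches the number of round trips; counted one way, the same length of DNA covers the distance roughly twice as many times.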

30
Q

How is DNA packaged?

A

DNA is complexed with positively charged histone proteins to generate chromatin
Histone sequences are highly conserved in eukaryote genomes

31
Q

What are the two types of chromatin?

A

Heterochromatin is tightly packed - often where noncoding regions are
Euchromatin is more loosely packed - This allows for transcription to happen

32
Q

What marks heterochromatin?

A

It is marked by histone-modifying enzymes

33
Q

What is the structure of the human centromere?

A

Highly repetitive
AT-rich alpha satellite monomers (171 bp)
Satellites are tandemly repeated into higher-order repeats
Kinetochores assemble during cell division to link the centromere to spindle fibres
Pericentric heterochromatin forms around the centromere, creating silent compartments of the genome

34
Q

What is the role of telomeres?

A

Essential for maintenance of linear chromosomes
In the absence of telomeres, chromosomes would shorten with each replication cycle
Prevent DNA repair systems from mistaking end of chromosome for a double stranded break

35
Q

What is the length of telomeres?

A

They are comprised of 250-1500 TTAGGG repeats
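
The repeat counts convert to base pairs by simple multiplication. This is a small sketch using only the figures on this card; the helper names are hypothetical.

```python
TELOMERE_REPEAT = "TTAGGG"  # the 6 bp repeat unit named on the card


def telomere_length_bp(n_repeats: int) -> int:
    """Length in base pairs of a telomere made of n_repeats repeat units."""
    return n_repeats * len(TELOMERE_REPEAT)


def build_telomere(n_repeats: int) -> str:
    """Return the repeat tract itself as a sequence string."""
    return TELOMERE_REPEAT * n_repeats


shortest = telomere_length_bp(250)
longest = telomere_length_bp(1500)
print(shortest, longest)
print(build_telomere(3))
```

On these numbers, 250 to 1500 repeats corresponds to roughly 1.5 kb to 9 kb of telomeric sequence.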

36
Q

Why would human chromosomes shorten with each replication without telomeres?

A

DNA polymerase cannot fully copy the 3’ end of the template strand, so the 5’ end of each new lagging strand is left incomplete

37
Q

What is included under repetitive elements?

A

Structural repeats (centromeres, telomeres etc)
Pseudogenes
Simple sequence repeats/microsatellites (2-5 bp in length)
Transposable elements (c. 45% of the human genome, though only a small proportion, less than 0.05%, are active)

38
Q

Who identified Jumping genes?

A

Barbara McClintock

39
Q

What was the work of Barbara McClintock?

A

Identified two dominant genetic loci named Dissociation and Activator
She noticed that Dissociation caused chromosomes to break and affected neighbouring genes when in the presence of Activator
She noticed that both loci could change position on chromosomes
She found that Activator controlled the transposition of Dissociation and that when Dissociation moved, the chromosome broke

40
Q

How did Barbara McClintock identify jumping genes?

A

She observed the effects of their movement through changing colour patterns in maize kernels across generations and controlled crosses

41
Q

What type of jumping genes did McClintock observe?

A

McClintock observed Type II transposons

42
Q

What are type 2 transposons?

A

These use a “cut and paste” mechanism to get around
Produce mutations and target sequence duplications when inserted into genes
These are able to replicate during S phase of the cell cycle when a donor site has been replicated but the target has not

43
Q

What are type 1 transposons?

A

They use a “copy and paste” mechanism of replication
These have similar characteristics to retroviruses such as HIV
Produce mutations when they insert into genes
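
The difference between the two transposon types can be sketched with a genome modelled as an ordered list of elements. This is a deliberately simplified illustration; the function names and the toy genome are hypothetical, and real transposition also involves target site duplications and, for Type I, an RNA intermediate.

```python
def cut_and_paste(genome: list[str], element: str, target: int) -> list[str]:
    """Type II style: excise the element from its donor site, then reinsert it.

    `target` is the index in the genome after excision.
    """
    moved = list(genome)
    moved.remove(element)          # raises ValueError if the element is absent
    moved.insert(target, element)
    return moved


def copy_and_paste(genome: list[str], element: str, target: int) -> list[str]:
    """Type I style: the donor copy stays put and a new copy is inserted."""
    if element not in genome:
        raise ValueError(f"{element} not found in genome")
    copied = list(genome)
    copied.insert(target, element)
    return copied


genome = ["geneA", "TE", "geneB", "geneC"]
after_cut = cut_and_paste(genome, "TE", 2)
after_copy = copy_and_paste(genome, "TE", 3)
print(after_cut)
print(after_copy)
```

In this sketch cut and paste keeps the copy number at one, while copy and paste increases it, which is one reason copy-and-paste elements can come to occupy a large share of a genome.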

44
Q

What caused humans and apes to lose their tails?

A

They lost their tails because the insertion of an Alu element (a transposable element) into an intron of the TBXT gene led to a hominoid-specific alternative splicing event

45
Q

How many protein coding genes are there in humans?

A

20,000 to 25,000
Around 100,000 predicted
These genes generate the proteome which is more complex than lower eukaryotes

46
Q

How homologous is the human proteome to those of other model organisms?

A

61% for D. melanogaster
43% for C. elegans
46% for yeast

47
Q

What do non-protein-coding genes include?

A

tRNAs and rRNAs for translation
snRNAs for intron removal
microRNAs which have a role in control of gene expression

48
Q

How much of the human genome is made of introns?

A

20%

49
Q

What are the 3 important motifs within the intron?

A

Donor site (5’ splice site)
Branch site
Acceptor site (3’ splice site)

50
Q

What catalyses the removal of introns?

A

The spliceosome makes an incision at the donor and acceptor splice sites. The intron is then looped together and removed so it can't interfere with the mature mRNA

51
Q

What can be seen with RNA splicing with the Human dystrophin gene?

A

Human dystrophin is spread over 2.5 Mb
Primary transcript contains over 80 introns
Mature RNA is only 14,000 bases long

52
Q

What is alternative splicing?

A

Alternative splicing joins exons in various combinations; this is an economical way to create protein diversity
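
The combinatorial gain can be illustrated by enumerating isoforms. This is a minimal sketch that assumes the simplest model, in which the first and last exons are always kept and each internal "cassette" exon is independently included or skipped; the exon names and the `splice_isoforms` helper are hypothetical.

```python
from itertools import product


def splice_isoforms(first_exon: str, cassette_exons: list[str],
                    last_exon: str) -> list[list[str]]:
    """List every isoform obtained by including or skipping each cassette exon.

    Exon order is always preserved; only inclusion varies.
    """
    isoforms = []
    for choices in product([False, True], repeat=len(cassette_exons)):
        included = [exon for exon, keep in zip(cassette_exons, choices) if keep]
        isoforms.append([first_exon, *included, last_exon])
    return isoforms


isoforms = splice_isoforms("E1", ["E2", "E3", "E4"], "E5")
for isoform in isoforms:
    print("-".join(isoform))
print(len(isoforms), "isoforms from one gene")
```

Under this model n optional exons give 2^n possible transcripts from a single gene, although real genes use only a subset and not every variant is necessarily functional.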

53
Q

How many human genes can be alternatively spliced?

A

65-70% of human genes

54
Q

What is an example of alternative splicing in humans?

A

Multiple promoters for each of the three human Neurexin genes
There are five exons for which alternative splicing can occur
It isn't known how many of the variants are functional

55
Q

Why is genome sequencing useful?

A

Helps us understand variation between individuals (eg SNPs and INDELs)
Important for understanding genetic diseases
Characterise differences in strains/varieties/populations
Develop diagnostic assays for pathogens
Marker assisted selection in plants and animals

56
Q

What can be determined by genome sequencing?

A

Identify protein coding and non-protein coding genes
Map gene regulatory elements
Study genome organisation and function
Understand mechanisms of genomic function

57
Q

How might DNA size negatively impact genome sequencing?

A

DNA molecules are large
Bacterial genomes c. 4 million bases (4 Mb)
Human genome 3,000 million bases (3 Gb)
Wheat genome 17 Gb

58
Q

How can DNA sequence negatively impact genome sequencing?

A

Bias in GC/AT content (GC base pairs are more stable than AT base pairs)
Repeat element content (some genomes can have >80% repeats)
Paralogs
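
GC bias is usually quantified as the fraction of bases that are G or C. This is a minimal sketch with a hypothetical `gc_content` helper and made-up example sequences.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    gc = sequence.count("G") + sequence.count("C")
    return gc / len(sequence)


for name, seq in [("AT-rich", "ATATTAAT"),
                  ("balanced", "ATGCATGC"),
                  ("GC-rich", "GGCCGCGC")]:
    print(f"{name}: {gc_content(seq):.2f}")
```

Regions with extreme values at either end tend to be under-represented in sequencing data, which is why this statistic is commonly checked.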

59
Q

How can DNA sampling impact genome sequencing?

A

Pure samples are best- not too hard with fresh samples
Ancient DNA- much harder

60
Q

What does Sanger sequencing require?

A

Reactions contain: template DNA, a primer, deoxynucleotide triphosphates (dATP, dTTP, dGTP, dCTP) and DNA polymerase, like normal PCR
Also di-deoxynucleotide triphosphates (ddNTPs), which lack a 3’ OH group so can't form a phosphodiester bond; when one of these is incorporated, the reaction stops

61
Q

How would you run Sanger sequencing?

A

Add your PCR reaction to four different tubes, each with a low concentration of one of ddATP, ddTTP, ddGTP or ddCTP. Then run them on an electrophoresis gel. At each position you will be able to identify which nucleotide is present, and so read off the sequence
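
The logic of reading a four-lane gel can be sketched as follows. This is a toy simulation that assumes idealised chemistry (every possible terminated fragment is produced, and the gel resolves single-base differences); the function names are hypothetical.

```python
def sanger_fragments(synthesised: str) -> dict[str, list[int]]:
    """For each ddNTP tube, list the fragment lengths where chains terminate.

    A chain in the ddATP tube can only stop where the new strand has an A,
    and likewise for the other three tubes.
    """
    tubes: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesised, start=1):
        tubes[base].append(length)
    return tubes


def read_gel(tubes: dict[str, list[int]]) -> str:
    """Read the bands from shortest to longest fragment to recover the sequence."""
    bands = [(length, base)
             for base, lengths in tubes.items()
             for length in lengths]
    return "".join(base for _, base in sorted(bands))


strand = "GATTACA"
tubes = sanger_fragments(strand)
print(tubes)
print(read_gel(tubes))
```

Each fragment length appears in exactly one lane, so sorting the bands by size and noting the lane of each band spells out the newly synthesised strand.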

62
Q

What is the problem with Sanger sequencing?

A

It is inefficient: if you are lucky, with both forward and backward reactions you may get 300 bp
It needs to be run 4 times, so it takes 1 day's work for 300 bp and requires lots of DNA

63
Q

How did they advance Sanger sequencing?

A

Fluorescent chain terminators meant the whole reaction could happen in a single tube
Each of the 4 ddNTPs is labelled with a different fluorescent dye which emits light at a different wavelength
Massively increasing efficiency

64
Q

How does Capillary electrophoresis work?

A

Samples are separated by size using a long, thin capillary instead of an electrophoresis gel
A sample is injected and forced through the capillary
Lasers then shine through the capillary, causing the colour tags to fluoresce, which is detected by a camera

65
Q

What is the disadvantage of capillary electrophoresis?

A

It is only good for 300 bp when using both forward and backward reactions

66
Q

What is the overview of Sanger sequencing?

A

Basic dideoxynucleotide termination chemistry was developed in 1977
First automated in 1990
If you run both directions you can get a 300 bp sequence
Samples must be clean and you can't multiplex samples
Error rate of sequencing is <0.1% but will incorporate PCR error which is ~4% over 30 cycles

67
Q

When did genome sequencing begin?

A

1990, when Sanger sequencing became automated

68
Q

When was Drosophila melanogaster genome sequenced?

A

2000

69
Q

When was E. coli first sequenced?

A

1997

70
Q

When was the first draft of human genome sequenced?

A

2001

71
Q

What are the two strategies for organising DNA sequences?

A

Hierarchical
Shotgun

72
Q

What is Hierarchical strategy?

A

Start with the whole genome and break it into large fragments. These are put into bacteria as artificial chromosomes. You can make a linkage map of the genome to understand where the large fragments sit relative to each other. You can then shotgun each large fragment and reassemble the reads to work out the sequence of the large fragments

73
Q

What is the shotgun strategy?

A

You take the DNA you are investigating and break it down into large numbers of overlapping fragments. You look at where they overlap and, after matching up large numbers of them many times, you can recreate the overall finished DNA sequence.
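
The overlap-matching idea can be sketched with a greedy merge. This is a toy illustration and not how production assemblers work: it assumes error-free reads from one strand, no repeats, and no read wholly contained in another, and the function names and example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that matches a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more: leave separate contigs
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]
    return contigs


reads = ["GCGTACC", "ATGGCGT", "TACCTGA"]
print(greedy_assemble(reads))
```

Every pair of reads is compared on every pass, which hints at why shotgun assembly of a whole genome needs so much computing power.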

74
Q

What is the disadvantage of using the shotgun strategy?

A

It requires a large amount of computing power, and in the early days it wasn't viable.

75
Q

What are linkage maps?

A

Maps that define the order of DNA markers along the chromosome

76
Q

Where does the DNA in the hierarchical strategy come from to form the linkage map?

A

Amplified genomes are sheared into large chunks (50-200 kb) and cloned into a bacterial host to make bacterial artificial chromosomes (BACs)
The genomic chunks are sheared randomly so will have overlapping ends

77
Q

How can the orientation of BACs be determined?

A

They can be worked out by looking at overlapping sequence tag sites (STSs) or restriction sites

78
Q

What are STSs?

A

STSs are short regions whose exact sequence is unique in the genome

79
Q

What are restriction sites?

A

They are short sequences that are recognised and cut by a given restriction enzyme
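
Locating restriction sites in a sequence is a simple pattern search. This is a minimal sketch using the EcoRI recognition sequence GAATTC as the example enzyme (not named in these cards); the `find_sites` helper and the test sequence are hypothetical.

```python
def find_sites(sequence: str, site: str = "GAATTC") -> list[int]:
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


clone = "TTGAATTCAAGGGAATTCTT"
sites = find_sites(clone)
print(sites)
```

Two clones that share the same pattern of site positions across part of their length are likely to overlap, which is how restriction sites help order BACs on a physical map.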

80
Q

What happens when a physical map of the chromosome is established?

A

The BAC libraries can be prepared for shotgun sequencing to get the final sequence

81
Q

What was the impact of Next Gen sequencing?

A

Increasing scale of data output per run
Introduction of new sequencing platforms

82
Q

How did Illumina start?

A

It started as blue-skies research in the Department of Chemistry at the University of Cambridge

83
Q

What sparked the idea which led to major productivity increases?

A

Discussions in 1997 sparked ideas of using clonal arrays and massively parallel sequencing of short reads, with solid-phase sequencing by reversible terminators

84
Q

When was Solexa formed?

A

Solexa was formed in 1998 for R&D

85
Q

How did they increase fidelity and accuracy of gene calling?

A

In 2004 they bought into molecular clustering technology and found that amplifying single DNA molecules into clusters enhanced fidelity and accuracy

86
Q

In 2005 how many bases could they sequence in a single run?

A

In 2005 they sequenced the complete genome of a bacteriophage and were able to deliver 3 million bases in a single run

87
Q

When was the first Solexa sequencer launched?

A

In 2006 the first Solexa sequencer was launched, delivering 1 Gb in a single run

88
Q

When was Solexa bought out by Illumina?

A

In 2007 Illumina bought out Solexa. Sequencing technology had outpaced Moore's law, more than doubling in output each year

89
Q

What is the process for library prep?

A

Target DNA is randomly sheared to c. 500 bp with no overhangs (transposases are used by Illumina)
An A tail is added onto the 3’ end
A T adapter is then added to the sequence, joining the A tail
This is repeated on the 5’ end
You can add two different indices on the adapters; this allows you to multiplex, as reads can be separated by their indices afterwards
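
Separating pooled reads by index (demultiplexing) can be sketched as a dictionary lookup. This is a simplified illustration assuming a single index per read and exact matches only; the index sequences, sample names and `demultiplex` helper are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads: list[tuple[str, str]],
                sample_indices: dict[str, str]) -> dict[str, list[str]]:
    """Group read sequences by the sample that their index belongs to.

    `reads` holds (index_sequence, read_sequence) pairs; reads whose index
    is not in `sample_indices` are collected under "undetermined".
    """
    by_sample: dict[str, list[str]] = defaultdict(list)
    for index_seq, read_seq in reads:
        sample = sample_indices.get(index_seq, "undetermined")
        by_sample[sample].append(read_seq)
    return dict(by_sample)


sample_indices = {"ACGT": "sample_1", "GGCA": "sample_2"}
reads = [("ACGT", "TTGACC"), ("GGCA", "AACCGG"),
         ("ACGT", "CCGTTA"), ("TTTT", "GGGAAA")]
grouped = demultiplex(reads, sample_indices)
print(grouped)
```

Using two indices per fragment, as the card describes, works the same way with a pair of index sequences as the lookup key.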

90
Q

What happens to DNA created in the library prep?

A

It is loaded onto a flow cell, where each piece of DNA is copied multiple times; the more copies you have, the more likely you are to spot an error later on.

91
Q

What happens to the DNA bound to the flow cell?

A

Polymerases duplicate this piece of DNA so that it is now bound to the flow cell. It then forms a bridge structure: the forward strand curls over, latching onto the reverse strand primers. Polymerase works its way down, meaning you have a forward and a reverse copy of the target DNA. This process happens many times, creating a clustering effect from the large amounts of DNA formed.

92
Q

What happens to the forward and reverse strand on the flow cell?

A

The reverse strand is washed away. A blocker is added to prevent bridges forming between these molecules

93
Q

How does the forward sequence on the flow cell get sequenced?

A

A sequencing primer anneals to the loose end of the target DNA.
Rather than PCR, there are lots of free-floating nucleotides that have fluorescent tags attached to them.
When they bind to the target DNA they fluoresce, which is read by a tiny computer in real time
Each different nucleotide has a different colour.
All the clonal strands are read by the computer at the same time, requiring 24 to 48 hours
The index is read back, giving you an idea of where those strands came from

94
Q

What happens when the forward strand is sequenced?

A

The reverse strand is sequenced

95
Q

How is the reverse strand sequenced?

A

The bridge like structure is formed again and the Index 2 is read
Polymerases create the reverse strand and the forward strand is washed away
The same process occurs again, allowing the reverse strand's sequence to be read

96
Q

What happens when both strands are fully sequenced?

A

You get a series of DNA fragment sequences. Using analytical methods, similar fragments can be grouped together and aligned correctly to reconstruct a fairly decent-sized portion of the genome

97
Q

What is the overview for Illumina NovaSeq?

A

Runs over 13-44 hours
Produces up to 20 billion reads per run
Max read length is 2 x 250 bp
Illumina chemistry is variable but around 0.1% error rate
You can multiplex samples (run several samples at the same time)

98
Q

What is third generation sequencing?

A

Long range sequencing
Two main providers are Pacific Biosciences (PacBio) and Oxford Nanopore
PacBio was founded in 2004- first sequencing products released in 2010
Oxford Nanopore founded in 2005- first product MinION released in 2015

99
Q

How does PacBio work?

A

The target DNA strand is circularised by placing an adapter on each end, and a polymerase is added to one end.
Within the Sequel instrument is a SMRT cell, which is covered with lots of tiny pores, each holding a single DNA molecule and its polymerase
As the polymerase builds the opposing strand, a fluorescent signal is given off every time a nucleotide is added

100
Q

What is the advantage of using PacBio?

A

The circularisation of the DNA means you can do much bigger bits as the DNA is more stable. This means long range sequencing is possible

101
Q

How does MinION work?

A

They created a pore-like structure. Around the pore there is a small electric charge, and every time a molecule enters it slightly disrupts the electrochemical charge. By reading the changes to that charge they are able to deduce which nucleotide has passed through the pore at that time

102
Q

How was MinION revolutionary?

A

It is the first sequencer that can be taken into the field. It can be plugged into your laptop with the USB connector allowing you to sequence anywhere

103
Q

What is the overview of PacBio?

A

Typically 5-60Kb length reads (can be over 100Kb)
13-15% error in normal Sequel cells
However with HiFi more like 0.1%

104
Q

What is the overview of Nanopore?

A

Typically 10-30Kb in length
Record of 2.3Mb
1% error rate but can be smaller

105
Q

What is the advantage of long reads?

A

Easier to understand repetitive elements (transposable elements, tandem elements)
Centromeres are easier to understand (they are highly repetitive)
Paralogs (from genome duplication)
All of these make genome assembly very hard when there is no reference genome, but long reads push through repeat regions and give evidence for the placement of duplicate genes

106
Q

What are the applications of long reads?

A

Specific loci in the genome
Looking at environmental DNA
Metagenomics
Resequencing (population wide)
Large genomic chunks
Whole genome sequencing

107
Q

What is needed when deciding a sequencing strategy?

A

Application
Sample size
Depth/ breadth of coverage needed
Is there already a reference? (Are you working with a model species?)
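
Depth of coverage can be estimated before choosing a platform. This is a minimal sketch assuming the usual approximation depth = (number of reads x read length) / genome size; the function names and the read counts in the example are hypothetical, and only the 3 Gb human genome size comes from these cards.

```python
import math


def mean_depth(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average number of times each base is covered, ignoring uneven coverage."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_depth: float, read_length_bp: int,
                 genome_size_bp: int) -> int:
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


HUMAN_GENOME_BP = 3_000_000_000  # 3 Gb, as on the genome size card

depth = mean_depth(600_000_000, 150, HUMAN_GENOME_BP)
needed = reads_needed(30, 150, HUMAN_GENOME_BP)
print(f"Mean depth: {depth:.0f}x")
print(f"Reads needed for 30x: {needed:,}")
```

This is an average only; repeats and GC-biased regions usually end up with less coverage than the figure suggests.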

108
Q

What strategy can be used when looking at small single loci in a few samples?

A

Sanger is more than likely fine

109
Q

What strategy is best when looking at metagenomics, eDNA, or a single- or multi-locus sample?

A

Massively parallel short-range sequencing (Illumina)

110
Q

What strategy can be used if you want to resequence a genome across a population if there is a reference?

A

Illumina can be fine

111
Q

What strategy is used with whole genome sequencing without a reference?

A

It gets messy so long range techniques are best

112
Q

When was the first telomere to telomere human genome published?

A

2021

113
Q

When scientists sequenced a bird of paradise, what did they use?

A

5 sequencing techniques with 4 DNA assemblies