Molecular genetics 13-18 Flashcards

(146 cards)

1
Q

What is a gene?

A

A DNA sequence (or RNA in some viruses) that is transcribed into RNA along with all the sequences to control its expression

2
Q

Features of prokaryotic genes

A
  • No nucleus
  • Usually circular dsDNA
  • Genes in operons (several open reading frames encoded on one mRNA)
  • Simple regulation
3
Q

What is an example of regulation by inhibition in prokaryotes? How many proteins are involved?

A

Trp operon for tryptophan biosynthesis - linear pathway with 5 different proteins carrying out three different enzymatic reactions

4
Q

What happens in the linear pathway producing tryptophan when there is a lot of tryptophan present?

A

The tryptophan inhibits the production of the second enzyme

5
Q

What is an example of feedback inhibition in the tryptophan biosynthesis pathway?

A

Accumulation of tryptophan slows down the rate of catalysis of the first enzyme complex (trp D/E), so reduces the overall rate of production

6
Q

What is transcriptional regulation?

A

The presence of high tryptophan concentration reduces transcription of the operon

7
Q

Why is it important that tryptophan concentrations are not allowed to get too high?

A

Tryptophan is toxic at high concentrations

8
Q

When tryptophan is low/absent:

A
  • trpR (trp repressor protein) inactive as no tryptophan
  • trpR can’t bind to the operator in the operon’s promoter region
  • transcription not blocked
  • trp operon is expressed
9
Q

When tryptophan is present:

A
  • trpR is constitutively expressed
  • trpR protein binds to tryptophan (co-repressor) and forms an active repressor
  • Active repressor blocks transcription of trp operon
  • No pathway expression
10
Q

What is an example of an inducible operon in prokaryotes?

A

The lac operon - contains genes that code for enzymes used in the hydrolysis and metabolism of lactose

11
Q

Is the lac repressor active or inactive by itself?

A

Active

12
Q

What inactivates the lac repressor?

A

A molecule called an inducer (allolactose)

13
Q

What is lacY responsible for producing?

A

Lactose permease

14
Q

What is lacZ responsible for producing?

A

β-galactosidase

15
Q

What is lacA responsible for producing?

A

Acetyl transferase

16
Q

When lactose is absent:

A

The lac repressor is active and switches the lac operon off

17
Q

When lactose is present:

A

The repressor is inactive as it forms a complex with allolactose (inducer), preventing repression and allowing expression of the genes

18
Q

Is the lac operon usually completely repressed?

A

No, often there is not enough lacI for complete repression, so there is leaky expression

19
Q

What is lacIq?

A

A mutation in the lacI promoter region causing increased transcription and so higher levels of lacI protein, so the lacZ/Y/A promoter is more strongly repressed

20
Q

What is lacI?

A

The regulatory gene that produces the repressor protein, which prevents the lac operon from being transcribed

21
Q

What is the other condition needed for the breakdown of lactose, other than lactose being present?

A

Only occurs if glucose is absent

22
Q

What further regulation is needed so the lac operon is only transcribed if glucose is absent?

A

Carbon catabolite repression (carbon catabolism regulation)

23
Q

What is the link between cAMP levels and glucose levels?

A

Cyclic AMP is present in low levels if glucose concentrations are high.
Cyclic AMP is present in high levels if glucose concentrations are low

24
Q

What does CRP stand for?

A

Cyclic AMP Receptor Protein

25
How does CRP affect the transcription of the lac operon?
When cAMP accumulates at low glucose levels, it binds to and activates the CRP protein. Active CRP helps RNA polymerase bind to the promoter, so the operon is transcribed.
26
What are the ideal conditions for the lac operon to be transcribed?
Lactose present | Glucose absent
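The two conditions above can be written out as a small truth table. This is only a minimal sketch: the function name and the labels "off (leaky)", "low" and "high" are illustrative assumptions, not terms from the cards.

```python
def lac_operon_expression(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative lac operon expression for one combination of sugars."""
    # Allolactose (made when lactose is present) inactivates the repressor.
    repressor_active = not lactose_present
    # Low glucose -> high cAMP -> active CRP, which helps RNA polymerase bind.
    crp_active = not glucose_present

    if repressor_active:
        return "off (leaky)"   # repression is rarely complete
    if crp_active:
        return "high"          # lactose present, glucose absent
    return "low"               # repressor released, but no CRP help


for lactose in (False, True):
    for glucose in (False, True):
        print(f"lactose={lactose}, glucose={glucose}: "
              f"{lac_operon_expression(lactose, glucose)}")
```

Only the lactose-present, glucose-absent row gives "high", matching the card.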
27
What are sigma factors?
Transcription activators that enable specific binding of RNA polymerase to gene promoters
28
Why is the lac operon repressed by default?
Lactose may only rarely be present, and is a second-choice carbon source
29
What do sigma factors do?
  • Help RNA polymerase to bind to promoters
  • Dictate the transcription start site
  • Activate/amplify transcription
  • Housekeeping rpoD (sigma-70) for constitutive genes
30
What is a housekeeping gene?
A constitutive gene that is transcribed at a relatively constant level, required for the maintenance of basic cellular function and expressed in all cells of an organism under normal conditions
31
What is a constitutive gene?
A gene that is transcribed continually as opposed to a facultative gene, which is only transcribed when needed
32
What are pathway-specific sigma factors?
They activate gene families for effective expression
33
What genes are activated by a specific sigma factor when a bacterium is given a heat shock?
```
rpoH: heat-shock genes
fecI: iron uptake
rpoS: starvation/stationary phase
rpoN: nitrogen starvation
rpoF: flagellar genes for motility
```
34
What is quorum sensing?
A way of coordinating gene expression between individual bacteria, which communicate using chemical signals
35
What is AHL?
Acyl homoserine lactone signal
36
How does Quorum sensing work?
  • Constitutively produce AHL
  • When concentration is high, receptor protein activated, switches on transcription of all virulence genes
  • This plays a major role in disease
37
What are LuxR-type R proteins?
AHL receptor (R) proteins, named after LuxR, which switches on light (luciferase) production - easy to measure as the light is visible to scientists
38
When does translation start in bacteria?
Before transcription has finished
39
What is polarity in bacterial expression?
Usually more protein of the first ORF made than the later ORFs for operons, due to translation starting before transcription has finished
40
What eukaryotic gene properties are not shared with bacteria?
  • Chromatin
  • mRNA processing: introns, 5’ cap, 3’ poly A tail
  • Transport of mRNA out of nucleus
  • Uncoupled transcription and translation
  • miRNA/silencing
41
What factors change the rate of overall expression?
1) Rate of transcription
2) Rate of mRNA degradation
3) Rate of translation
4) Rate of protein degradation
5) Chromatin accessibility
42
What features affect the rate of transcription?
  • Each ORF has its own promoter
  • Genes usually not clustered by function/pathway
  • Often on different chromosomes
  • Eukaryotic genes have enhancers (upstream, downstream or within the coding region)
  • PolyA tail dictates how far back mRNA gets processed
  • Promoter elements can be immediately next to the gene or several thousand bases away from it
  • Repressors can bind at various places along the DNA strand, blocking transcription
  • Activators are expressed when repressors are not present. The activators bind to the enhancer region, recruit DNA-bending proteins, recruit general transcription factors and recruit mediators
43
What is the polyA tail also known as?
The ‘terminator region’
44
Is the polyA tail encoded in the genome?
No, it is added enzymatically
45
What are general transcription factors?
Essential for the transcription of all protein-coding genes
46
What are specific transcription factors?
High levels of transcription of particular genes depend on control elements interacting with specific transcription factors
47
What are wide domain areas that specific transcription factors can affect?
Carbon, nitrogen, pH
48
What are narrow-domain areas that specific transcription factors can affect?
Specific metabolic pathways
49
What prevents mRNA from degradation?
  • 5’ cap and 3’ polyA tail stabilise the mRNA, reducing degradation
  • 3’ polyA tail helps transport the mRNA out of the nucleus
  • 5’ UnTranslated Region (UTR) and 3’ UTR help define stability
50
Why is it important to have stable RNA?
More stable RNA gets translated more
51
What is miRNA?
  • Micro RNA
  • Short, non-coding RNA molecules
  • Bind to specific mRNA (complementary)
  • Recruit RNA endonuclease enzymes
  • Digest specific mRNA (or several related mRNAs)
  • Part of the gene silencing pathway using DICER and RISC
52
How is the rate of translation altered through initiation of translation?
Different mRNAs have different 5’ UnTranslated Regions (UTRs) before the coding sequence. These different UTRs have different affinities for ribosome binding, causing either high or low levels of translation
53
What is a Kozak sequence?
A varying sequence around the start codon which plays a major role in the initiation of translation. Certain bases are more likely to appear in the sequence and lead to a higher rate of translation, such as A or G at the -3 position and C at the -1 position
54
What does translation rate depend on?
  • Ribosome binding
  • Translation enhancers
  • Codon usage
55
What is codon usage?
The use of genetic redundancy to allow the control of translation. Different codons for the same amino acid are used in different frequencies. For optimal expression use the common codons and avoid the rare ones
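To see which codons are common or rare in a coding sequence, the codons can simply be counted. A minimal sketch, assuming the input is an in-frame DNA coding sequence; the function name `codon_counts` and the example sequence are made up for illustration.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count each codon in an in-frame coding sequence (DNA, 5' to 3')."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Hypothetical sequence: ATG CTG CTG CTA AAA TAA
counts = codon_counts("ATGCTGCTGCTAAAATAA")
for codon, n in counts.most_common():
    print(codon, n)
```

Here the leucine codon CTG is used twice and CTA once, so CTG would be the more common choice for leucine in this toy sequence.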
56
How are proteins targeted for destruction?
  • They are linked with ubiquitin (a ubiquitous protein)
  • Ubiquitin is cross-linked to the targeted protein - many ubiquitins needed
  • Directs its movement to the proteasome (protein complex involved in hydrolysis of a protein)
  • Triggers its digestion
  • Ubiquitin gets recycled
57
What can make DNA inaccessible for transcription?
Condensed chromatin
58
What is histone acetylation?
Acetyl groups are attached to an amino acid in a histone tail. This appears to open up the chromatin structure, thereby promoting the initiation of transcription
59
What is histone methylation?
Adding methyl groups to amino acids. This can condense chromatin and reduce transcription
60
How does acetylation of lysine residues affect transcription?
Causes chromatin to be looser, so transcription increases
61
How does methylation of histones affect transcription?
DNA more condensed, less transcription
62
What is the group of enzymes that acetylate lysine amino acids?
Histone acetyltransferases (HATs)
63
What is the class of enzymes that remove acetyl groups from lysine?
Histone deacetylases (HDACs)
64
What is the group of enzymes that methylate histones?
Histone methyl transferases
65
What is the group of enzymes that remove methyl groups from histones?
Demethylases
66
What can happen when DNA methylation goes wrong?
Cancer
67
What is epigenetics?
Heritable inactivation of genes
68
What does SAHA stand for and what does it do?
suberoylanilide hydroxamic acid (SAHA) | Inhibits HDAC - chromatin stays acetylated longer so maintains expression
69
What does 5AC stand for and what does it do?
5-azacytidine | Inhibits DNA methyltransferase, leaving the DNA less methylated and less condensed so maintaining expression
70
How to prevent RNA from being degraded by RNases as you work
  • RNase-free solutions and disposables
  • Work clean, fast (if exposed to enzymes, no time for degradation) and cold (enzymes not at optimum temperature)
  • RNase-free DNase to remove any remaining DNA
  • Chaotropic salts - disrupt protein structure so RNase enzymes not active
  • Wear gloves
71
What does TOTAL RNA include?
mRNA, rRNA, tRNA, snRNA, miRNA etc.
72
What is a Southern Blot?
Run DNA on gel, shows size of fragment
73
What is a Northern Blot?
Run RNA on gel, blot and probe for gene of interest, because no discrete lines produced on gel, just a smear. Shows size and abundance of fragment (although can only run one known gene at a time)
74
How to separate the mRNA from all the other RNAs?
  • It has a polyA tail (AAAA)
  • Add beads of oligo(dT) (TTTTT sequence) to the RNA mixture; mRNA will stick to the beads whereas other RNAs will not
  • The mRNA can then be eluted by changing the salt concentration, as this changes its affinity for the beads
75
What is the name of the process which converts mRNA to cDNA?
Reverse transcription
76
Where do you find reverse transcriptase activity?
In retroviruses
77
How to convert mRNA to cDNA (copyDNA)
Prime the mRNA with oligo(dT) which will bind with polyA tail and the entire population of mRNA will undergo reverse transcription
78
How is the cDNA cloned after it is produced?
Using adapters
79
What is an example of how the cDNA is modified after it has been cloned?
Using site-directed mutagenesis
80
How can you amplify RNA?
You can’t use PCR to directly amplify RNA. First the mRNA population must undergo reverse transcription into cDNA, then PCR can amplify the DNA of interest using gene-specific primers
81
What is qPCR?
Quantitative PCR | Measures the amount of product per cycle
82
What dye is usually used for qPCR?
SYBR Green - fluoresces when bound to dsDNA | Fluorescence measured after each cycle of amplification
83
What is the qPCR expressed as?
2^(DeltaCT)
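A short worked example of this calculation is sketched below. It rests on two assumptions that are not stated on the card: DeltaCT is taken as CT(reference gene) minus CT(target gene), and the product is assumed to double exactly every cycle. Sign conventions vary between protocols, so check which one your course uses. The function name is made up for illustration.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of a target gene relative to a reference gene.

    Assumes DeltaCT = CT(reference) - CT(target) and perfect doubling
    of product in every PCR cycle.
    """
    delta_ct = ct_reference - ct_target
    return 2.0 ** delta_ct


# Target crosses the threshold 2 cycles earlier than the reference,
# so under the doubling assumption it is 2^2 = 4 times more abundant.
print(relative_expression(ct_target=20.0, ct_reference=22.0))
```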
84
More effective version of Northern Blot - testing all genes at once - RNA quantification by hybridisation of an array
  • Put bits of every gene on a support and probe with labelled RNA
  • Chips are hybridised to the labelled transcripts
  • Signal from each spot is measured
  • Shows transcript abundance for every gene on the array
85
What is the sequencing-based method that has superseded RNA quantification by hybridisation?
  • Prepare cDNA from chosen condition
  • Sequence lots of individual molecules
  • Assess what is expressed and in what abundance
86
cDNA cloning - the old fashioned method
```
Clone cDNA into plasmids, sequence it clone by clone
Called ESTs (Expressed Sequence Tags) which represent portions of expressed genes
```
87
What type of sequencing is used to sequence RNA directly?
Next-generation sequencing
88
Features of GFP
  • From jellyfish Aequorea victoria
  • Simple barrel shaped protein
  • Excited by 385 or 480nm
  • Emits at 509nm
  • Needs no other substrate except oxygen
89
What is a reporter gene?
A gene that researchers attach to a regulatory sequence of another gene, to determine its rate of expression
90
Examples of commonly used reporter genes
  • β-galactosidase (lacZ)
  • Glucuronidase (gusA)
  • Luciferase (luc)
  • Green fluorescent protein (GFP)
91
What is promoter bashing?
Analysis of possible control elements by deletion
92
How to determine where a protein will go?
  • Sequence tags within peptides target proteins to specific organelles
  • “Leader sequence” directs protein - N terminus first 15-30 amino acids direct protein to secretion machinery
  • To get protein to the ER there is a KDEL/HDEL sequence at the C terminus
  • To get protein to the nucleus there are 5 positive basic residues
93
What is a Western Blot?
Using a specific antibody to quantify a specific PROTEIN
94
What is SDS?
  • Sodium dodecyl sulphate
  • A strong detergent that denatures proteins so they are linear
  • They can then undergo electrophoresis
95
What is Poly Acrylamide Gel Electrophoresis?
Separates proteins by molecular weight | Different mobility if modified by glycosylation, phosphorylation and acetylation
96
What are the limitations of Poly Acrylamide Gel Electrophoresis?
  • Only suitable for soluble proteins
  • Cysteine-cysteine bonds may require reducing
  • Limited by antibody availability
97
What is another way of separating proteins by size that is not Western Blotting? (1)
  • SDS-PAGE gel
  • Separate proteins by pH gradient with electric charge across it. Protein will move until it is at a pH where it has no charge - depends on size
  • Uses isoelectric focussing
98
What is another way of separating proteins by size that is not Western Blotting? (2)
Protein Mass Spectrometry
  • Separate proteins by size or hydrophobicity through gel electrophoresis
  • Feed into mass spectrometer
  • Find accurate mass of each protein (to 5 dp)
  • Determine its sequence identity from its mass
99
Why would you add 6 histidine to the start or end of a protein?
  • To use them as a hook to purify specifically this protein
  • The 6-histidine tag binds to metal ions such as Zn2+
  • This acts as an EPITOPE
100
What is an epitope?
The part of an antigen molecule to which an antibody attaches itself
101
What are the benefits of sequencing a genome?
  • Co-located genes may form pathways
  • Compare genomes and see different mutations
  • Identify candidate genes close to a genetic marker associated with a trait
102
When was the first genome sequenced and by whom? How many bases did it have?
In 1977 by Sanger and his colleagues. | It consisted of 5375 nucleotides
103
What organism was the first to have its genome sequenced?
Phage phi X 174 (bacteriophage)
104
How many genes could be sequenced per year by one person in the early 1990s?
10 genes | One person could only sequence 1500 bases per day
105
When was automated Sanger sequencing developed and how many bases could then be sequenced per day?
Late 1990s | 240,000 bases per day
106
Features of the Human Genome Sequencing Project
  • 1990-2003
  • 3.5 billion bases
  • Cost more than $3 billion
  • Factory-scale sequencing
107
What type of sequencing was developed in 2006 and how many bases can it sequence per run?
Next generation sequencing | Can sequence 1000 billion bases per run
108
What is the other name for next generation sequencing?
Illumina sequencing
109
Outline the traditional genome sequencing approach (What is its other name?)
c. 2001
Called hierarchical shotgun sequencing
1. Genome DNA cut into large fragments, producing a BAC library (each 300 kb fragment like small extra genome)
2. Using radioactive hybridisation of these clones they are organised into large clone contigs. You can work out which fragments overlap with each other
3. Select the BAC to be sequenced
4. Break up selected BAC into smaller pieces - a ‘shotgun clone’
5. Reassemble sub-fragments back into order by working out the sequence of each shotgun clone and which other sequences it overlaps with
110
BAC meaning?
Bacterial Artificial Chromosome
111
What is a contig?
A set of overlapping DNA segments that together represent a consensus region of DNA
112
What is the downside of hierarchical shotgun sequencing (traditional sequencing approach)
Highly labour intensive
113
Outline next generation genome sequencing
1. Fragment genome - sonicate into random overlapping fragments
2. Sequence fragments and assemble
3. There may be gaps with low coverage, but 99.9% high coverage
114
What is K-mer based assembly?
All the sequences created in next generation sequencing must then have their ends compared to see if there is any overlap between sequences. Illumina sequences the 100 bases on either end of the sequence to find overlaps. It would take too long to compare all 100 bases from the end of one sequence to the 100 from all the other sequences present, so the computer breaks up these sequences into smaller fragments (k-mers). The computer puts each fragment in a particular memory address and finds overlaps between short fragments going all the way along the 100 bases. E.g. k=25 looks for overlaps of k-1=24.
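The idea of breaking reads into k-mers and linking those that overlap by k-1 bases can be sketched as a small graph. This is a simplified illustration, assuming error-free reads and a de Bruijn-style graph (a data structure the card does not name); the function name and the example reads are made up. The Python dictionary plays the role of the "memory address" lookup.

```python
from collections import defaultdict


def kmer_overlap_graph(reads: list[str], k: int) -> dict[str, set[str]]:
    """Link (k-1)-mers that are joined by a k-mer seen in any read."""
    graph: dict[str, set[str]] = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # The first k-1 bases lead to the last k-1 bases of the k-mer.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two short hypothetical reads that overlap by "GTA"; k=4 means overlaps of 3.
graph = kmer_overlap_graph(["ACGTA", "GTACC"], k=4)
for prefix, suffixes in graph.items():
    print(prefix, "->", sorted(suffixes))
```

Following the links ACG, CGT, GTA, TAC, ACC spells out ACGTACC, the sequence both reads came from.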
115
The problem of repeats in shotgun sequencing in eukaryotes
Applied particularly to next generation sequencing as there are no large BAC clones to help order sequences. The repeats between genes can be so long that the computer does not know what gene comes first after the repeat
116
Solution to the problem of repeats in next generation sequencing (1)
Illumina mate-pair libraries
  • Several kilobases long, can be used to span repeats
  • Only the ends are sequenced
  • Difficult to make mate-pairs
117
Solution to the problem of repeats in next generation sequencing (2)
Oxford Nanopore | Sequences are typically several kb in length, and the entire sequence can span a repeat
118
What size can eukaryotic genomes range to?
16 Gb
119
How do we find the open reading frame within a DNA sequence?
Feed the sequence into a computer, which will translate the sequence in all six reading frames (3 forward and 3 reverse). The gene will be found between a methionine start codon and a STOP codon.
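A six-frame ORF scan can be sketched in a few lines. This is a minimal illustration, assuming a DNA sequence containing only A, C, G and T, the standard stop codons, and no introns; `find_orfs`, `min_codons` and the example sequence are hypothetical names and values.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[str, int, str]]:
    """Return (strand, frame, sequence) for each ATG...STOP stretch."""
    seq = seq.upper()
    orfs = []
    for strand, dna in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(dna) - 2, 3):
                codon = dna[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first start codon in frame
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, dna[start:i + 3]))
                    start = None                   # look for the next ORF
    return orfs


orfs = find_orfs("CCATGAAATAGCC")
print(orfs)
```

In this toy sequence the only complete ORF is ATG AAA TAG on the forward strand, in the third reading frame.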
120
What type of DNA does ORF finding work well for and why?
Prokaryotic DNA, as they have no introns or repeats
121
What type of DNA does ORF finding work less well for and why?
Eukaryotic DNA, because genes are interspersed by non-transcribed gaps, repeats and introns. Introns break up coding sequences
122
How can codon usage help identify the real open reading frames?
Some codons are more commonly used to encode a specific amino acid in a gene than others. Reading frames with the more commonly used codons for a particular amino acid are more likely to be found within a coding region, whereas non-coding DNA will use all codons equally
123
What other feature of eukaryotic DNA allows identification of genes?
Eukaryotic genes have conserved splice sites. | Eukaryotic introns tend to start with AGGTAAGT and end with YYYYYYNCAG (Y = pyrimidine C/T, N = any base)
124
How to confirm ORF expression after finding the ORFs
Use RNAseq
1. Extract nucleic acids from sample
2. Use oligo dT to extract mRNA
3. Use Illumina to sequence mRNA
4. Gene expression profiling: use computer to map RNA reads back onto the genome
5. Align RNA to a reference and count expression levels
125
What does BLAST stand for?
Basic Local Alignment Search Tool
126
What is BLAST?
A search tool that compares a query sequence against a database of known, previously sequenced genes
127
How can you use BLAST to find out more about the gene you have identified?
The sequence you have found can be compared to the database and the gene name/function can be worked out through similarities to other known genes
128
What is BLASTN?
A specific type of BLAST tool for comparing DNA sequences with other DNA sequences
  • Query: nucleotide
  • Database: nucleotide
129
What is BLASTX?
A specific type of BLAST tool for comparing a translated nucleotide to known proteins
  • Query: translated nucleotide
  • Database: protein
130
How to build BLAST hits:
1. Start with one word match (a word is 11 nucleotides by default or 3 amino acids) - a ‘seed’
2. If possible, extend the alignment either side of the word match. If there are no matches, find another seed and start again
3. If there are enough hits to pass the threshold value, return an alignment to the user
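The seed-and-extend steps above can be illustrated with a much simplified sketch. It assumes exact matching only: the seed is extended while bases are identical, and the "threshold" is just a minimum alignment length. Real BLAST uses scoring matrices and statistics, so this is not how the actual tool is implemented. All names and the example sequences are hypothetical.

```python
def seed_and_extend(query: str, subject: str,
                    word_size: int = 11, min_length: int = 15):
    """Return (query_start, subject_start, length) for extended seed matches."""
    hits, seen = [], set()
    for q in range(len(query) - word_size + 1):
        word = query[q:q + word_size]              # step 1: the seed word
        s = subject.find(word)
        while s != -1:
            left = 0                               # step 2: extend to the left
            while (q - left > 0 and s - left > 0
                   and query[q - left - 1] == subject[s - left - 1]):
                left += 1
            right = word_size                      # ...and to the right
            while (q + right < len(query) and s + right < len(subject)
                   and query[q + right] == subject[s + right]):
                right += 1
            hit = (q - left, s - left, left + right)
            if hit[2] >= min_length and hit not in seen:   # step 3: threshold
                seen.add(hit)
                hits.append(hit)
            s = subject.find(word, s + 1)
    return hits


# Small word size so the toy example fits on one line.
hits = seed_and_extend("TTACGTACGGA", "GGACGTACGCC", word_size=4, min_length=6)
print(hits)
```

Here several different seeds extend to the same 7-base match (ACGTACG), which is reported once.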
131
When interpreting BLAST results, what is an E-value?
The number of matches as good as or better than this one that would be expected by chance - shorter sequences are likely to have a larger E-value
132
What must you beware when analysing BLAST results?
  • E-value cut off at 10
  • E-values greater than 0.00001 are not considered reliable
  • Sequence similarity doesn’t prove functional homology
133
Why is BLASTX so useful?
1. Finding a protein match helps to confirm that the DNA you have sequenced is expressed
2. Matches to proteins can show up possible introns in genomic sequence
3. Protein sequences are more likely to have useful annotation than DNA sequences
134
Why is it beneficial to search within a certain subset of organisms when using BLAST to compare your gene?
  • Speeds up the search as there are fewer sequences to search through
  • Increases sensitivity - reduces likelihood that the same pattern has been found due to chance
135
What is genomics?
The study of genomes
136
What is metagenomics?
The study of multiple genomes in complex (usually environmental) samples
137
What is the problem with growing microbes in labs?
  • <1% will grow in culture or they grow really slowly over years
  • Bacteria are also so small that it can be near impossible to determine the species under a microscope
138
In metagenomics, how does one determine what species are present?
Sequence a marker gene
139
What gene is usually sequenced to determine what bacteria and archaea are present and how are the variable regions amplified?
  • Partial 16S ribosomal RNA gene (16S = 16 Svedberg)
  • V1, V2 and V3 are not conserved so are variable between species
  • There are conserved regions between the variable regions, so you can synthesise primers complementary to the conserved regions
140
What is a 16S rRNA pipeline?
1. Extract DNA from sample e.g. blood, faeces, soil, slime, dust etc.
2. Target a region of the 16S gene that can identify which bacteria are present, using forward and reverse primers
3. PCR carried out, which amplifies the variable regions from all the 16S genes present
4. Each sample contains multiple genomes in a complex mixture - amplicon pool
141
How is Illumina sequencing made cost-effective when sequencing an amplicon pool?
Barcoding and multiplexing
1. Add a unique barcode to sample 1 amplicon primers
2. Amplify 16S rRNA gene from each sample and include a sample-specific barcode in the forward primer
3. Up to 96 barcoded samples can be run at once
4. Output: 400 million sequences (4 million sequences per sample) which gives a good snapshot of microbial diversity in that sample
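After a multiplexed run, the reads have to be sorted back into samples by their barcodes. The sketch below shows the idea under simplifying assumptions: every barcode has the same length, sits at the very start of the read, and must match exactly. The function name, barcodes and reads are made up for illustration.

```python
from collections import defaultdict


def demultiplex(reads: list[str], barcodes: dict[str, str]) -> dict[str, list[str]]:
    """Group reads by sample using the barcode at the start of each read.

    barcodes maps barcode sequence -> sample name (all barcodes equal length).
    Reads whose barcode is not recognised are dropped.
    """
    barcode_length = len(next(iter(barcodes)))
    by_sample: dict[str, list[str]] = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:barcode_length])
        if sample is not None:
            by_sample[sample].append(read[barcode_length:])  # strip the barcode
    return dict(by_sample)


barcodes = {"AAGG": "soil", "CCTT": "gut"}
reads = ["AAGGACGT", "CCTTTTGA", "AAGGGGCC", "GGGGACGT"]
samples = demultiplex(reads, barcodes)
print(samples)

# Rough depth per sample from the card's figures: 400 million reads, 96 samples
print(400_000_000 // 96)
```

The last line gives about 4.2 million reads per sample, in line with the "4 million sequences per sample" on the card.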
142
How to process 16S rRNA data
  • Start with millions of sequences
  • Cluster sequences together to make OTUs (Operational Taxonomic Units)
  • Assign OTUs to domain/class/order/genus/species using a BLAST search
  • Different taxonomic levels come back, as not all sequences can be assigned to a specific species
143
How to discover what the species in the sample are doing?
Sequence genomic DNA using shotgun sequencing
144
Outline the process of shotgun sequencing to sequence genomic DNA from sample
1. Extract DNA from sample. Each sample contains multiple genomes in a complex mixture
2. Fragment DNA into 500bp fragments and sequence with Illumina
3. The fragmented pieces are assembled into genes (contigs) and the number of sequences in each contig is noted
4. Identify the contigs: BLAST search them to a database of known proteins
145
Why can only 4 DNA samples be multiplexed at once as opposed to 96 RNA samples?
DNA genomes are much larger, and more data is needed to cover multiple whole genomes
146
What is 16S rRNA sequencing useful for?
  • Taxonomic composition of samples
  • Overall diversity
  • Differences between samples due to factors of interest