Transcriptional control of gene expression Flashcards
(58 cards)
DNA structure in solution
Right-handed, 10.5bp per turn, close to B-form (grooves roughly equal depth, bp sit on helical axis). Bases attach to 2’deoxyribose by N-glycosidic bond, exist primarily in amino+ keto tautomer forms. Helps transcription start by allowing seq-specific recognition of dsDNA.
DNA structure in solution stability determined by
- base pairing(4 Watson-Crick base pairings isomorphic, consistent with DNA’s uniform structure- H bonding->stability and specificity),
- stacking (stability via hydrophobic effect (entropy) and favourable electrostatic/vdW interactions). Stacking maximised by propeller twist of 16-18°. Favours extended DNA conformation as bending reduces stacking. Optimal in 5’-purine-pyrimidine-3’ bp steps (e.g., 5’-G-C-3’) and weakest in pyrimidine-purine steps (e.g., 5’-T-A-3’), hence sequences like TATA most flexible: minimal H bonding and minimal stacking.
- electrostatic repulsion between -ve phosphates. Inter-strand repulsion destabilises double helix (can be reduced with higher cation concentration). Intra-strand repulsion favours extended conformation (countered by bound proteins that facilitate bending). Follows Coulomb’s law: F = k·q1·q2/r².
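A rough worked example of the Coulomb's law point above. It treats two phosphates as single point charges of -e in vacuum; the two distances are illustrative assumptions, and counter-ion screening is ignored, so only the inverse-square scaling should be read from it.

```python
# Coulomb's law: F = k * q1 * q2 / r^2 (like charges give a positive, repulsive force)
K = 8.9875e9   # Coulomb constant, N m^2 C^-2
E = 1.602e-19  # elementary charge, C

def coulomb_force(q1, q2, r_m, rel_permittivity=1.0):
    """Signed force in newtons; positive means repulsion."""
    return K * q1 * q2 / (rel_permittivity * r_m ** 2)

# Two phosphates (charge -e each); both distances are assumed for illustration
near = coulomb_force(-E, -E, 0.7e-9)
far = coulomb_force(-E, -E, 1.4e-9)
ratio = near / far  # doubling the distance cuts the repulsion four-fold
print(f"near={near:.2e} N, far={far:.2e} N, ratio={ratio:.1f}")
```

The four-fold drop on doubling the distance is why an extended conformation, which keeps phosphates apart, is favoured.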
B-form varies in structure (variable geometry of base-pair steps) (shown by X-ray):
Roll (opening along bp axis, +20°/-10°); Twist (rotation per base pair, ~36°); Slide (displacement along bp axis, +2 to -1 Å; high propeller twist limits degree of slide)
Also seq dependent variation in dimension of groove: minor groove narrower and major groove wider in AT rich seq.
Protein-DNA recognition
initial docking can involve local variation w/ seq-dependent twist/roll/slide. True specific recognition-> sampling bases in the grooves.
Major and minor grooves formed by base pair displacement from helical axis: 120°/240° angle between glycosidic bonds in each bp (not rotationally symmetrical around long axis of helix)-> groove width. Extent of bp displacement from helical axis-> groove depth.
Major groove accessible to amino acid chains and info-rich: each bp has unique profile of H-bond donors and acceptors, methyls and non-polar Hs.
Other DNA forms
B ID’d when prepared under high humidity and A under low humidity for X-ray.
* Z form can occur with purine-pyrimidine repeats (CGCGCG e.g.), left handed, alternating anti/syn positioning creates zigzag (purine N9-C1’ bonds syn)- creates torsional stress, bases not planar. Dynamic and less stable than B, more energetically costly to maintain in physio conditions. Major groove v shallow, wider, fewer specific interactions.
* A form right handed, broader and more compact than B, 11 bp/turn, 0.26nm rise/bp, major groove v. deep and much more narrow, inaccessible to amino acid sidechains, minor groove shallow and info-poor (helical axis in major groove). W-C bp RNA has A form- hard to recognise, cellular RNA rarely found like this. In mammalian cells, long dsRNA often recognised as foreign, triggers interferon response.
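A small sketch relating the helical parameters quoted on these cards. The bp/turn values and the A-form rise come from the cards; the B-form rise of 0.34 nm is an assumption added for comparison.

```python
def twist_per_bp(bp_per_turn):
    """Average helical twist in degrees per base pair."""
    return 360.0 / bp_per_turn

def pitch_nm(bp_per_turn, rise_nm):
    """Helix length per full turn in nanometres."""
    return bp_per_turn * rise_nm

b_twist = twist_per_bp(10.5)     # B-like DNA in solution
a_twist = twist_per_bp(11)       # A form
a_pitch = pitch_nm(11, 0.26)     # A form, rise from the card
b_pitch = pitch_nm(10.5, 0.34)   # B form, rise is an assumed value

print(f"B twist {b_twist:.1f} deg/bp, A twist {a_twist:.1f} deg/bp")
print(f"A pitch {a_pitch:.2f} nm, B pitch {b_pitch:.2f} nm")
```

The shorter pitch of the A form despite more bp per turn reflects its more compact helix.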
RNA special features
Can have non-canonical bps, e.g., G=U wobble bp. 27 different bps involving 2 H bonds (+GC w/ 3H bonds)- many observed in r/tRNAs, riboswitches, ribozymes, spliceosomes, etc, allow more versatile conformations such as G quadruplex structures. Remember only WC bps are isomorphic.
Special features: structured RNA can be catalytic (e.g., ribosome, spliceosome) or bind small molecule ligands (riboswitches- can have 1 of 2 conformations depending on bound ligand); Mg2+/proteins often needed for tertiary fold+ catalytic activity; 2o structure formation usually via W-C (canonical) bps; X-ray structures show numerous structural motifs involved in tertiary folds.
RNA structural motifs (5)
- Base-triples, e.g., one A binds U on W-C face and another U on Hoogsteen face (also found on G) to form U:A:U triple. Base triples-> triple helices, e.g., expression and nuclear retention elements (ENE) in viral RNAs+ cellular non-coding RNAs. 3’ A-tract forms triple helix with 2 internal U tracts- v. stable. ENE stabilises RNA+ cause nuclear retention (no export to cytoplasm)
- Pseudoknots: base pairing of loop seq w/ complementary seq outside the stem closing the loop, stabilised by co-axial stacking of 2 helices
- Complex 3D folds, similar to globular protein, made of multiple stem-loops, e.g., bacterial 16S rRNA.
- Helix-turn-helix(HTH): common for seq specific binding for DNA binding dimers/tetramers; usually recognise “half sites” (inverted repeats) with 1 turn separation, bind as homodimers; recognition helix R fits major groove seq-specifically, Stabilisation/positioning helix P increases affinity+ stabilises R helix, sits across major groove. Helices at right angles to cover all angles of DNA spiral.
- Handshake motif
RNA recognition
by non-bp paired regions (seq-specific) or by shape. Fully WC bp dsRNA forms uniform A-form helix: major groove inaccessible, long fully dsRNA usually recognised as foreign.
Investigating gene expression mechanisms by…
clone+ seq gene (can use genomic clone (for studying transcription/splicing) or cDNA clone (to study translation)); set up assay system in vivo or in vitro; ID cis-acting seqs (essential, typically act as binding sites for trans-acting factors (protein or RNA)) by finding consensus seqs/investigating effects of mutations; ID trans-acting factors; investigate how cis+ trans elements combine to ctrl function.
Assays in vivo vs in vitro
can be in vivo (physiological conditions but little ctrl over variables, hard to monitor reaction intermediates, can be unphysiological if test gene/expressed RNA too abundant) or in vitro (precise variable ctrl, can detect activities then purify trans factors, often inefficient, unphysiological). Also in silico approaches (deep seq, new hypothesis generation for testing in vivo or vitro).
Prokaryotic transcription general points
RNA transcribed from template/antisense/non-coding strand. Initiation: rate-limiting as no primer; promoter specifies TSS, consists of cis elements, usually shortly upstream of TSS (+1), recognised by RNA Pol +/- TFs. Most RNAs start w/ a purine (A/G), then synthesis/elongation 5’-3’ using NTP substrates (NMP incorporated, PPi released) at ~45nt/s, error rate 1/10000.
NB: NAD+ cap stabilises prokaryotic mRNA (is a mechanism for exp ctrl): @ some promoters with TSS+1=A, NAD+ can be incorporated at +1 by RNAP. Contains ADP-ribose-nicotinamide, only occurs at some promoters (seq of promoter is important)
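A quick back-of-envelope calculation using the elongation rate (~45 nt/s) and error rate (1/10000) quoted above. The 1000 nt transcript length is a hypothetical example, not a value from the cards.

```python
RATE_NT_PER_S = 45        # elongation rate from the card
ERROR_RATE = 1 / 10_000   # misincorporations per nucleotide, from the card

def transcription_estimates(length_nt):
    """Return (seconds to transcribe, expected number of errors)."""
    seconds = length_nt / RATE_NT_PER_S
    expected_errors = length_nt * ERROR_RATE
    return seconds, expected_errors

# Hypothetical 1000 nt transcript
seconds, errors = transcription_estimates(1000)
print(f"{seconds:.1f} s, {errors:.2f} expected errors per transcript")
```

On these numbers, roughly one in ten transcripts of that length would carry a single error.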
Bacterial core promoter elements
Asymmetric. In downstream order: UP element at highly active rRNA promoters, ~20 bp AT-rich element recognised by C-terminal domains of alpha subunits (bind via minor groove); -35 box (TTGACA) recognised by sigma region 4, down mutations inhibit initial RNAP binding; -10/Pribnow box (TATAAT), melted non-template strand recognised by sigma region 2, down mutations inhibit promoter melting, optimal distance from -35 box 16-18bp.
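A minimal sketch of how the -35/-10 consensus and the 16-18 bp spacing could be used to scan a sequence. The scoring scheme (count of matching positions out of 12) and the test sequences are assumptions for illustration, not a validated promoter predictor.

```python
MINUS35 = "TTGACA"  # -35 consensus
MINUS10 = "TATAAT"  # -10 (Pribnow) consensus

def matches(site, consensus):
    """Count positions where site agrees with consensus."""
    return sum(a == b for a, b in zip(site, consensus))

def best_promoter_hit(seq, spacers=(16, 17, 18)):
    """Return (score, start of -35 box, spacer) for the best match, or None if seq is too short."""
    seq = seq.upper()
    best = None
    for spacer in spacers:
        total_len = 6 + spacer + 6
        for i in range(len(seq) - total_len + 1):
            box35 = seq[i:i + 6]
            box10 = seq[i + 6 + spacer:i + 12 + spacer]
            score = matches(box35, MINUS35) + matches(box10, MINUS10)
            if best is None or score > best[0]:
                best = (score, i, spacer)
    return best

# Hypothetical sequences: a perfect consensus promoter and a weaker variant
strong = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GG"
weak = "GG" + "TTTACA" + "C" * 18 + "TATGTT" + "GG"
print(best_promoter_hit(strong))
print(best_promoter_hit(weak))
```

A higher score means a closer match to consensus, which the cards associate with a stronger promoter.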
Bacterial RNAP
single core (alpha x2+ omega (assembly), beta+ beta’ (active site)) transcribes all m/r/tRNA. Core more abundant than sigma subunits (promoter recognition), so both core+ holoenzyme present.
Alpha N-terminal and C-terminal domains connected by flexible linker. Active site in cleft at base of beta, beta’ claw-like pincers: downstream DNA enters cleft between pincers (mobile, tightly bind ~20 bp of downstream DNA when RNAP transcribing). Beta=flap, beta’= upper jaw.
Core binds non-specifically (“loose”), non-specific initiation from multiple locations on both strands. Holoenzyme non-specific binding reduced 1000-10000-fold, specific promoter affinity increased 1000 fold, accurate initiation.
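The fold changes above can be combined into one discrimination figure. This sketch assumes the two effects simply multiply, which is a simplification.

```python
def specificity_gain(specific_gain, nonspecific_reduction):
    """Fold improvement in the promoter vs non-specific affinity ratio."""
    return specific_gain * nonspecific_reduction

low = specificity_gain(1000, 1000)     # lower bound from the card's ranges
high = specificity_gain(1000, 10000)   # upper bound from the card's ranges
print(f"sigma improves discrimination {low:.0e} to {high:.0e} fold")
```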
Bacterial RNAP sigma subunit domains
2+4 have helix-turn-helix motifs, recognise -10+-35;
Domain 1.1=-ve DNA mimic, suppresses inappropriate DNA binding by free sigma (interacts with sigma 4, stops DNA binding) and by holoenzyme (occupies downstream DNA binding cleft, reducing non-specific binding; displaced upon promoter binding).
3.2 linker region occupies RNA exit channel in holoenzyme beneath flap
Sequence during initiation of prokaryotic transcription
Holoenzyme binds at -35 box (reversible); promoter binding ejects sigma1.1, pincers clamp around downstream DNA->
* closed binary complex (-55 to -10)-> promoter melting (irreversible) at -10 box from -11 to +3 begins with base flipping: bases A-11 and T-7 flipped out (no ATP needed) into specific binding pockets on sigma2, forming 14nt bubble (detected by KMnO4 probing)-> open binary complex (-55 to +20)->
* Initial transcribing complex: first NTP binds with low affinity by bp to template, subsequent NTPs bind with 10x higher affinity due to bp and stacking interactions. Phosphodiester bonds form for RNAs of 2-9 nt, aligned by RNAP active site, and initiation bubble expands to 23nt in cycles of abortive initiation where short RNA transcripts are released (sigma 3.2 still blocks RNA exit channel). Productive transcription then begins with promoter escape after 8-9nt:
* Sigma released, bubble collapses from 5’ end to 14nt to form ternary elongation complex (35bp footprint)- highly stable and processive. Sigma 3.2 no longer blocks RNA exit channel.
NB: promoter escape not easy because these contacts need to be broken: sigma2/-10 and sigma4/-35 (and optional alpha-CTD) contacts with DNA; 3.2 linker/RNA exit channel, sigma4/beta and sigma2+sigma3/beta+beta’ contacts with RNAP. New interactions stabilise elongation complex: 9bp RNA:DNA hybrid, clamp around 20nt downstream DNA, ssRNA (6-10nt) contact with exit channel.
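Promoter coordinates have no position 0 (-1 is followed directly by +1), which matters when checking the sizes quoted above. A minimal helper, assuming the bubble runs from -11 to +3:

```python
def span_length(start, end):
    """Positions from start to end inclusive on a scale with no 0 (..., -2, -1, +1, +2, ...)."""
    if start == 0 or end == 0 or start > end:
        raise ValueError("coordinates must be non-zero with start <= end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # skip the non-existent position 0
    return length

bubble = span_length(-11, 3)     # transcription bubble
closed = span_length(-55, -10)   # closed complex footprint
opened = span_length(-55, 20)    # open complex footprint
print(bubble, closed, opened)
```

A span of -11 to +3 gives the 14 nt bubble, and the footprint grows by about 30 bp on going from the closed to the open complex.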
IDing and characterising promoters:
- For specific bases/short seqs: Find consensus seqs by alignment to TSS e.g., -10/Pribnow box or -35 box (closer match= stronger promoter- permanent up/down tuning)
- Examine natural promoter mutations- affect mRNA quantity but not seq. Can be UP/DOWN, latter more common, UP more likely for weaker promoters.
- Generate targeted mutations guided by info on consensus seq.
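The consensus approach in the list above can be sketched as a column-wise majority vote over aligned sites. The five hexamers below are hypothetical -10 boxes invented for illustration, not real promoter data.

```python
from collections import Counter

def consensus(aligned):
    """Most common base at each column of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)

def match_score(site, cons):
    """Number of positions at which a site matches the consensus."""
    return sum(a == b for a, b in zip(site, cons))

# Hypothetical -10 hexamers, already aligned relative to the TSS
hexamers = ["TATAAT", "TATGAT", "TACAAT", "TATAAT", "GATAAT"]
cons = consensus(hexamers)
scores = [match_score(h, cons) for h in hexamers]
print(cons, scores)
```

Sites scoring closer to 6/6 would be predicted to be stronger, in line with the "closer match = stronger promoter" rule.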
For ID of more extensive regions of DNA, biochemical mapping: DNA melting
KMnO4 oxidation of pyrimidines in ssDNA regions. KMnO4 reacts preferentially with unpaired thymine, oxidizing C5-C6 double bond+ adding OH to both Cs-> add alkali to cleave phosphodiester backbone @ modified positions. Alternative: primer extension, which stops at modified Ts, to map their positions. Sensitivity to KMnO4 shows where DNA unwinds (e.g., promoter melting during initiation).
Biochemical mapping of DNA-protein interactions (also for RNA-protein):
EMSA/ gel-shift assay (in vitro) procedure
mix end-labelled DNA with pure protein/cell extract, run native gel, then image. Seq specificity demonstrated by titrating in xs unlabelled DNA of same/random/mutated seq. Variant on this: supershift, for when you think you know the binding protein (DNA probe incubated w/ cell extract, test for candidate protein with antibodies- DNA:antibody:protein complex migrates even slower than DNA:protein complex on gel).
Footprinting (in vitro)
Shows if and where protein binds. One strand of DNA end-labelled, incubate with and without protein, digest mildly (single hit)- protein protects DNA where bound. Run high-res denaturing urea gel, image and compare lanes with+ without protein (gaps in the +protein ladder where protein bound).
Modification interference
Modification interference: find where chemical modification of DNA prevents protein binding. 1 strand of DNA end-labelled; DNA modified (average 1 mod/strand w/ ENU (ethylates phosphates) or DMS (methylates purines)); incubate with protein; separate bound+ unbound DNA (e.g., EMSA); purify DNA from both fractions and cleave at modified sites; run urea gel: bands missing from the bound fraction (compared to input/unbound DNA) mark positions where modification prevented protein binding.
ChIP-seq
ChIP(-seq) (chromatin immunoprecipitation): which DNA sites specific proteins bind to in vivo, mostly used for eukaryotes (TFs, RNAPs, histone post-translational mods). Proteins bind genomic DNA in vivo; in vivo crosslinking with formaldehyde, then cell lysis and DNA fragmentation into ~200bp fragments (sonication or MNase); immunoprecipitation with antibody for query protein; reverse crosslinks (heat), purify DNA; NGS seq and map assembly, then seq analysis and motif ID (computational) to ~100bp resolution (low res but genome-wide).
Prokaryotic Transcription Regulation
Achieved in 2 ways:
(1) Classes of genes can be co-ordinately ctrled by switching sigma factors (7 types in E. coli), e.g., in response to heat shock:
Sigma70=general, TTGACA -35 box and TATAAT -10 box with 16-18bp separation.
SigmaN=nitrogen starvation, CTGGNA -24 box and TTGCA -12 box with 6bp separation.
(2) Regulation by activator/repressor proteins: induction by small-molecule substrate inducer (metabolising enzymes switched on). Repression by availability of a nutrient co-repressor (biosynthetic enzymes switched off). Inducers and co-repressors are allosteric regulators of activator/repressor proteins (bind at site remote from DNA binding site), which have 2 conformations: 1 stabilised by inducer/co-repressor and 1 that binds DNA with higher affinity. Often bind as dimer/tetramer to palindromic binding sites.
Lac operon general rules/qualities
Lac operon shows +ve and -ve regulation in response to lactose/glucose presence, to reduce waste of biosynthetic capacity by unnecessary enzyme production. LacZ, Y+A fully switched on by AND-gate logic: active when lactose+ and glucose- (cAMP elevated).
Low basal levels of transcription (leaky promoter)-> enough LacY product (permease) to allow Lac uptake into cell when available. Side reaction of beta-galactosidase-> allolactose (inducer that binds Lac repressor).
Regulated promoter has suboptimal -10 and -35-> max levels of transcription require CAP protein (activator). Binding site for CAP upstream of -35 but not overlapping, centre of the two (A/T)GTGA half sites are 10bp/1 helical turn apart.
Upstream of -35 not AT rich, not a great binding site for CTD binding (no UP element).
Overall, Lac -vely regulated by Lac repressor (bound to operator when no lactose, released by inducer (allolactose) binding) and +vely regulated by CAP (binds DNA when cAMP bound, promotes RNAP binding to promoter- note that cAMP accumulates in glucose absence). If repressor and activator both not bound, transcription only at around 2% of max bc RNAP binding v. weak (no UP element).
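The regulation summarised above can be written as a small truth table. This is a qualitative sketch: the state labels are assumptions, and real expression levels are continuous rather than three discrete states.

```python
def lac_state(lactose, glucose):
    """Qualitative lac operon output for given sugar availability."""
    repressor_bound = not lactose  # inducer (from lactose) releases the repressor
    cap_bound = not glucose        # cAMP accumulates without glucose, so CAP binds DNA
    if repressor_bound:
        return "basal (repressed)"
    if cap_bound:
        return "max"
    return "low (~2% of max)"

for lactose in (False, True):
    for glucose in (False, True):
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {lac_state(lactose, glucose)}")
```

Only the lactose+/glucose- combination gives full expression, which is the AND-gate behaviour.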
In vivo assay evidence for Lac operon regulation
When repressor mutated, lac active whenever lactose present;
When operator mutated (repressor binds weakly), more leakage when both lactose and glucose present;
When CAP mutated, lac activity lowered when lactose present; when -35 mutated, lac activity lowered regardless of conditions