Tuesday, March 18, 2008

Testing whether the CRP-S sequence confers Sxy-dependence.

I am currently studying how CRP binds to and activates transcription from promoters with CRP-S sites. These promoters are unique in H. influenzae for two reasons: 1. They depend on the Sxy protein for transcription, and 2. The CRP-binding sites (called “CRP-S”) in these promoters differ at the sequence level from other CRP sites (called “CRP-N”) in the genome.

I previously tested what happens if I mutate a CRP-S site to resemble a CRP-N site. The mutations are deleterious to the promoter but allow for some transcription activation in the absence of Sxy. This result is currently in a manuscript in preparation, but the manuscript suffers from not having the reciprocal experiment of converting a CRP-N (Sxy-independent) site to resemble a CRP-S site. I attempted this experiment twice previously, first by mutating the H. influenzae mglB promoter and then by mutating the E. coli lacZ promoter; both of these promoters have CRP-N sites.

mglB presented all sorts of cloning problems until I discovered that cloning even just the first 10 amino acids of the gene was highly toxic to cells, i.e. I could not recover plasmids that had not undergone large rearrangements. I did eventually clone and mutate the promoter, but never used it because I couldn’t conduct real time PCR to measure transcription. However, it may be worth going back, cloning the mglB promoter adjacent to lacZ, and using beta-galactosidase activity to measure promoter activity.

Next I mutated the CRP site in the lacZ promoter; this was easy because our standard H. influenzae cloning vector, pSU20, already carries the lacZ promoter and lacZa gene. However, it turns out that lacZa is constitutively transcribed at very high levels in H. influenzae, even in a crp- background. Thus, the lacZ promoter is completely CRP-independent in H. influenzae and is useless for my experiments. I then cloned the lacZ promoter mutant in E. coli, but because Sxy induction experiments aren’t very clean in E. coli, the data was never very compelling. Nonetheless, the data did suggest that giving lacZ a CRP-S site reduced its stimulation by CRP, as expected, but transcription was unaffected by the presence/absence of Sxy.

The lacZ promoter mutant may be useful in future when we better fine-tune Sxy activity in E. coli, but in the meantime I really want to find out what happens when a CRP-N site is converted to a CRP-S site in H. influenzae. Today I am planning the steps involved in cloning and mutating the ansB promoter/gene, which has a good CRP-N site and is strongly induced in MIV even in sxy- cells. I will also consider using beta-galactosidase activity instead of real time PCR to measure promoter activity.

Saturday, January 12, 2008

Why CRP-S sites are special

I’m trying to improve my ability to verbally express the significance of our CRP-S work. Here I outline a discussion I had with a former supervisory committee member who asked something along the lines of “Aren’t CRP-S sites simply low affinity CRP sites?” He asked this because many bacterial promoters use regulatory mechanisms that require a transcription factor to bind first to high affinity DNA sites and then, once the good sites are all full, less favourable DNA sites will begin to be occupied.

A classic example is the OmpR system (Figure 1). When only a few OmpR molecules are active in DNA binding (OmpR~P), the high-affinity site in the ompF promoter is bound and ompF is transcribed. When more OmpR molecules are phosphorylated, low-affinity binding sites become occupied, which turns off the ompF promoter and turns on the ompC promoter.

Below I describe the CRP-S model as it is coming to light in H. influenzae. This is a simple model that will improve as we gain a better understanding of Sxy and CRP-S function in E. coli.

Key point: “CRP-S” sites are low-affinity but highly specific CRP sites. CRP binding to CRP-S sites is mechanistically different from CRP binding to canonical low-affinity sites. Canonical low-affinity sites are low-affinity because CRP has low specificity for them; in general, protein affinity and specificity go hand in hand.

Our discovery: CRP sites in E. coli, H. influenzae, and very likely many bacteria, fall into two sub-populations: canonical “CRP-N” sites and unusual “CRP-S” sites. CRP-N and CRP-S are names that Rosie and I coined. Previously studied CRP sites in E. coli belong to the CRP-N group; for example, E. coli’s lacZ promoter has a CRP-N site.

Background: E. coli CRP binds as a homodimer, specifically to symmetrical 22 bp DNA sites with the consensus half site 5’-A1A2A3T4G5T6G7A8T9C10T11. The protein makes direct contact with base pairs G:C5, G:C7, and A:T8 in the highly conserved core motif T4G5T6G7A8, and binding induces a localized kink of 43º between positions 6 and 7, wrapping the DNA around CRP and strengthening the association. Though base pair T:A6 is not directly contacted by CRP, it is recognized indirectly because kink formation strongly favours T:A6 over other base pairs. For example, replacement of T:A6 in a consensus CRP site with C:G6 causes an 80-fold reduction in CRP affinity by increasing the free energy required to bend the DNA.
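To put the 80-fold figure in energetic terms, here is a minimal sketch that converts a fold-change in affinity into a free-energy penalty using the standard relation ΔΔG = RT ln(fold). The temperature (25 °C) and the function name are assumptions for illustration; they are not taken from the original measurements.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def ddg_from_fold_change(fold, temp_k=298.15):
    """Free-energy penalty (kcal/mol) implied by a `fold` reduction in affinity.

    Uses ddG = R * T * ln(fold). temp_k defaults to 25 C, which is an
    assumption, not a value from the post.
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return R_KCAL * temp_k * math.log(fold)

penalty = ddg_from_fold_change(80)
print(f"{penalty:.1f} kcal/mol")  # prints "2.6 kcal/mol"
```

Under these assumptions, the T:A6 to C:G6 substitution costs roughly 2.6 kcal/mol of binding free energy.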

Despite the extensive variation among CRP sites, until our work no significance had been attached to which positions vary. Instead, the degree of similarity of any CRP site to the consensus was proposed to generate an adaptive hierarchy that allows genes with better sites to be preferentially activated at low cAMP concentrations. Similar hierarchies must exist for all DNA-binding proteins, i.e. good sites are preferentially bound at low protein concentrations. Many regulatory mechanisms exploit this hierarchy, such as OmpR’s ability to act both as an activator and a repressor. When a sufficient number of OmpR molecules are active in DNA binding, both high- and low-affinity OmpR sites are occupied; the degree of occupancy has a profound influence on the activities of the ompF and ompC promoters.

The CRP-S story is different: CRP-N and CRP-S sites are each best described by a different consensus sequence: CRP-N sites have the core sequence TGTGA whereas CRP-S have TGCGA. DNA sequences in one population differ significantly (in a statistical sense) from sequences in the other population – and we can even detect this by eye.
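The core-sequence distinction can be sketched in a few lines of Python. This is only an illustration: it assumes a 22 bp site numbered as in the Background paragraph (core at positions 4-8 on the left half and, read on the opposite strand, positions 15-19 on the right half), and the two example sites are hypothetical sequences built from the consensus half site, not real promoters. Real sites are far more variable, so an exact-match test like this is much cruder than the statistical comparison described above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def classify_core(core):
    """Label a 5 bp core as CRP-N (TGTGA), CRP-S (TGCGA), or other."""
    core = core.upper()
    if core == "TGTGA":
        return "CRP-N"
    if core == "TGCGA":
        return "CRP-S"
    return "other"

def classify_site(site):
    """Classify both half-site cores of a 22 bp CRP site.

    Left core = positions 4-8; right core = reverse complement of
    positions 15-19 (so position 17 pairs with position 6).
    """
    site = site.upper()
    if len(site) != 22:
        raise ValueError("expected a 22 bp site")
    left = site[3:8]
    right = reverse_complement(site[14:19])
    return classify_core(left), classify_core(right)

# Hypothetical example sites (consensus half site and its C6/G17 variant).
consensus_n = "AAATGTGATCTAGATCACATTT"
variant_s = "AAATGCGATCTAGATCGCATTT"

print(classify_site(consensus_n))  # ('CRP-N', 'CRP-N')
print(classify_site(variant_s))    # ('CRP-S', 'CRP-S')
```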

I have found that CRP-S sites are low-affinity binding sites for CRP in vitro, but in a special way unlike low-affinity CRP-N sites. Low-affinity CRP-N sites have multiple non-consensus bases at key positions; these prevent specific contacts from forming between the protein and DNA site. CRP-S sites are different because in H. influenzae they have all the correct bases required for binding by CRP, except that the non-consensus bases at positions 6 and 17 (see “Background” above) prevent the protein from forming a stable interaction. In other words, all of the more than ten specific contacts between amino acids in CRP and bases and phosphates in a CRP-S site can form, but the DNA site won’t kink because of bases C6 and G17.

Thus, CRP-S sites are a special, mechanistically distinct type of low-affinity site: CRP has high specificity for these sites, but critical base substitutions make them poor binding sites. Consequently, CRP-S sites don’t properly fit in the classic hierarchy of high- to low-affinity CRP sites. Instead, CRP-S sites appear to occupy an alternate “binding site landscape” that is accessible to CRP only when Sxy is present. Figure 2 plots a generic CRP-S site relative to other CRP sites (not real data) and shows that when Sxy is present, CRP-S sites will be bound by CRP even when CRP levels are low. We know this to be true because competence promoters can be turned on even when there is little active CRP in the cell, as we see in the sxy-1 hypercompetent mutant during exponential growth. The pil-N promoter (highlighted red) is a CRP-S site that has been mutated to have the canonical bases T6 and A17; it is a high-affinity CRP site.

Our working model is that Sxy is required for CRP to form stable interactions at CRP-S sites. However, I suspect that Sxy doesn’t operate through a simple recruitment mechanism. If Sxy were to recruit CRP to CRP-S sites, we would expect a strong Sxy-binding site and a weak motif at the CRP-S site. However, CRP-S sites are the only apparent binding sites in their promoters. Also, the high specificity of CRP for CRP-S sites predicts that CRP-S sites will be occupied even at low CRP concentrations (provided Sxy is present to help with DNA kinking).

Other distinguishing features: The C6 and G17 bases in CRP-S sites are important for promoter function. Changing them to canonical bases T6 and A17 makes a weaker promoter. This is opposite to other CRP sites, where changing bases to match the consensus yields a stronger promoter. This finding suggests that when CRP is able to bind alone to a mutated CRP-S site, it is not as good a transcriptional activator as a CRP-Sxy complex bound to a wildtype CRP-S site.

Conclusion: It is possible that my models and interpretations of experimental data are wrong. Instead, CRP-S sites may have C6 and G17 to make them binding sites for Sxy as well as for CRP. If this alternate interpretation is true, i.e. that CRP-S sites are targeted independently by two separate proteins, it is still completely novel and distinguishes CRP-S sites from all the other CRP sites.

Friday, December 28, 2007

The complexities of gene regulation in E. coli

We are currently identifying genes that belong to the Sxy regulon in E. coli. The only other well-characterized Sxy regulon was identified by our work in H. influenzae (link). E. coli’s genome is more than twice the size of H. influenzae’s and, not surprisingly, the E. coli Sxy regulon contains more genes. The E. coli regulon has an additional level of complexity because many of the Sxy-regulated genes are likely to have additional protein regulators (i.e. Sxy-regulated genes also belong to other regulons).

This additional complexity is a consequence of lifestyle. E. coli is a more versatile organism than H. influenzae: it can make most of its organic molecules from scratch (i.e. from simple sugars plus a few inorganic nutrients) and it can survive in a variety of environments.

Because a bacterium must at all times satisfy multiple metabolic requirements, it needs to continuously balance its internal functions while exploiting a potentially ever-changing external environment. Bacteria that inhabit very stable niches (such as H. influenzae, which lives in the cavities of a human host) have a small number of transcription factors, whereas bacteria in more complex environments employ a much larger number of regulators. This relationship has been shown to scale as a power-law in which the number of transcription factors doubles twice as fast as does the total number of genes in a genome (van Nimwegen,E., 2003, Trends Genet., 19, 479), indicating that large bacterial genomes employ disproportionately more complex regulatory networks.
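As a small worked example of this scaling, the sketch below reads "doubles twice as fast" as a power law with an exponent of about 2 (number of transcription factors proportional to gene number squared). The exponent value and the function name are assumptions made for illustration; the fitted exponent in the cited paper should be used for anything quantitative.

```python
def tf_fold_change(gene_fold, exponent=2.0):
    """Fold change in transcription factor count for a given fold change
    in total gene count, assuming TFs scale as genes ** exponent.

    exponent=2.0 is an assumed round value, not a fitted parameter.
    """
    if gene_fold <= 0:
        raise ValueError("gene_fold must be positive")
    return gene_fold ** exponent

# Doubling the gene count quadruples the expected number of regulators.
print(tf_fold_change(2.0))  # 4.0
```

On this reading, a genome with about 2.5 times as many genes would be expected to carry roughly six times as many transcription factors.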

Thus, E. coli’s lifestyle necessitates more sensory and response systems than does that of H. influenzae. Consequently, the genes in E. coli’s Sxy regulon are much more likely to belong to multiple (possibly non-overlapping) regulatory networks in order to fine-tune their expression. This unfortunately makes studying the E. coli Sxy regulon more complicated; we can’t be confident that overexpression of the Sxy protein results in induction of all Sxy-regulated genes.

For H. influenzae, gene expression data coupled with bioinformatic analysis revealed that most genes in the Sxy regulon require only CRP and Sxy for transcription activation in standard culture conditions. In E. coli, some Sxy regulon genes will likely be repressed during growth in standard culture conditions, regardless of whether CRP and Sxy are trying to turn them on. Perhaps conducting E. coli gene induction studies in minimal medium will reduce the activities of repressor proteins and so improve our ability to detect some members of the Sxy regulon.

Fortunately, a substantial body of knowledge surrounds the regulation of some of the genes in E. coli’s Sxy regulon. Thus, although we will have trouble identifying genes that can’t be induced by Sxy in standard lab conditions, we will at least be able to integrate our Sxy regulon data with other regulatory networks.

Monday, December 10, 2007

Latest ICAP results

I have now quantified EcCRP and HiCRP affinity for ICAP and four designer variants, and am currently replicating the experiments. The data looks good and supports my hypotheses, but some additional interesting features of CRP-DNA interactions have revealed themselves.

For example, CRP binding causes DNA to assume a very sharp bend of around 90º, which is achieved through two major kinks near the centre of the DNA site (each ~40º) and lesser kinks at each edge of the binding site (each ~10º). Two of the ICAP variants were designed (in part) to address the importance of the small secondary kinks for HiCRP affinity. As I predicted, HiCRP appears to need favourable interactions with a longer stretch of DNA than does EcCRP, possibly to stabilize kinking, but what I didn’t expect is that I can readily detect variations in the degree to which DNA is bent by CRP. The second surprise is that when multiple CRP molecules bind to bait DNA at high protein concentrations in bandshift reactions, HiCRP appears to bind in a stepwise fashion: first one protein and then two (possibly in a cooperative manner). EcCRP, on the other hand, goes very quickly from having only one protein bound to a stage where more than two proteins bind to the same piece of DNA, seemingly in a more haphazard fashion. I suspect this is consistent with my model in which HiCRP is highly selective for DNA sites, whereas EcCRP is much less choosy and will bind all sorts of less favourable sites when the good ones are saturated.

These are interesting results, but they require much more thinking before I make good sense of them. Also, they beg for more experiments and my plate is looking pretty full considering the number of different DNA species (ie. different natural promoters) I still want to test in "simple" affinity experiments.

Friday, November 30, 2007

Success with equations

I have solved the issue of excess versus limiting DNA in bandshift reactions. It turns out that my confusion arose because the equation for the equilibrium binding constant can be derived in two distinctly different ways: the common (overly simplified) equation for calculating Kd (the dissociation constant, which reflects a protein’s affinity for a DNA site) ignores the important qualifiers of the protein and DNA concentrations in the reaction. Unfortunately, I can’t easily write equations in a Blogger post, so I will add the equation (and its derivatives) tomorrow when I have time to draw them out and edit this post.
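In the meantime, here is a minimal Python sketch of the general textbook distinction for a simple 1:1 binding equilibrium; it may not be the exact derivation referred to above. The simplified form assumes the DNA concentration is far below Kd, so that free protein is approximately total protein. The exact form solves the quadratic that accounts for both total protein (Pt) and total DNA (Dt). The names Pt, Dt, and both function names are illustrative assumptions, and all concentrations must share the same units.

```python
import math

def fraction_bound_simple(pt, kd):
    """Fraction of DNA bound, assuming [DNA] << Kd (free protein ~ total)."""
    return pt / (kd + pt)

def fraction_bound_exact(pt, dt, kd):
    """Fraction of DNA bound for 1:1 binding, with no limiting assumption.

    Solves [PD]^2 - (Pt + Dt + Kd)[PD] + Pt*Dt = 0 for the physical root.
    """
    b = pt + dt + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * pt * dt)) / 2.0
    return complex_conc / dt

kd = 1.0  # arbitrary units; protein and DNA are expressed relative to Kd

# Limiting DNA: the two forms agree (half the DNA is bound at Pt = Kd).
print(fraction_bound_simple(1.0, kd))
print(fraction_bound_exact(1.0, 1e-6, kd))

# Excess DNA (Dt = 10 x Kd): the simplified form badly overestimates binding.
print(fraction_bound_exact(1.0, 10.0, kd))
```

The practical point of the sketch is that fitting the simplified equation to data collected with DNA near or above Kd gives a misleading Kd, whereas the quadratic form stays valid at any ratio of protein to DNA.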

The bottom line is that all of the bandshift experiments I have conducted have been informative, and now I can confidently proceed to the next step of measuring accurate Kd values for CRP binding to different CRP sites. Also, tomorrow we will test a second prep of His-tagged Sxy to see if once again CRP has been co-purified along with Sxy. If positive (ie. CRP was pulled down with His-Sxy), we can easily test how much salt needs to be added to His-Sxy to prevent the co-purification of CRP; the amount of salt needed to block the interaction will give us a sense of the strength of affinity between the two proteins.

Thursday, November 22, 2007

ICAP bandshifts going well

I am using the perfect CRP binding site, ICAP, to measure both the amount of protein active in DNA binding in my protein preps and to calculate EcCRP and HiCRP’s affinities for the perfect binding site and derivatives of this site. My first affinity measurements (which are expressed as equilibrium binding constants, Kobs) indicated that I had lots of active CRP molecules, but EcCRP’s affinity for ICAP was ~100-fold less than that observed by the researchers who first developed ICAP. I am the first to work with HiCRP, so there are no precedents with which to compare HiCRP binding constants.

Initially I thought that I was skewing my Kobs measurements by using too much bait DNA in binding reactions, so I started experimenting with lower bait DNA concentrations. Changing bait DNA concentrations had no effect on Kobs measurements, which was heartening: I have nice replicate measurements, and the result confirms that my binding reactions are resistant to perturbations. However, this didn’t explain my low Kobs measurements.

Binding reactions are set up such that CRP is presented with a great excess of non-specific DNA; for this I use poly-dIdC, an unnatural DNA molecule that doesn’t have any CRP binding sites. Using non-specific competitor DNA ensures that non-specific DNA binding by CRP won’t contribute to the bandshifts that I am using to measure protein-DNA affinity. This is important because DNA binding proteins are attracted to DNA and so spend a lot of time interacting non-specifically with DNA; when a protein finds a specific binding site, more bonds are formed between it and the DNA so the interaction persists for a longer period. I like to think that including a great excess of non-specific DNA in bandshift reactions is the most biologically relevant approach to studying protein-DNA interactions because in a cell, the vast majority of the chromosome does not have a CRP binding site. Further, I have read and have been told that affinity constants can only be reliably measured in the presence of excess non-specific DNA.

Thus, I was surprised to discover this week as I was re-reading some ICAP papers that the ICAP gang was/is using CRP in excess over ICAP bait DNA, without any non-specific competitor! My first step was to repeat my ICAP bandshift yesterday with low CRP concentrations, but with even lower DNA concentrations (and no competitor DNA). The result is very clear: in the absence of cold competitor, the Kobs value increases ~100-fold to a value similar to previously published values. Thus, in the next few days I will delve deeper into understanding the calculation of affinity constants and will revisit those wise biochemists in the Biochem department.

No matter which approach I take with my bandshifts, I am very pleased with the quality of the data and I’m only a week away from measuring all the ICAP variants. The data will be very informative and will make a great figure for the manuscript that I think is improving by leaps and bounds.

Sunday, November 11, 2007

More results from Sunita and Andrew

Our results that suggest Sxy binds to DNA are exciting but suspicious; all Sxy-DNA binding data can be explained by the presence of contaminating CRP in the Sxy protein preps. This is because the Sxy-DNA binding data is identical to CRP-DNA data. First, EcSxy binds DNA but HiSxy does not. Second, EcSxy greatly prefers the pilA-N (CRP-N mutant) site over the wildtype pilA CRP-S site. Third, when EcSxy and EcCRP are mixed together, only one protein binds to a DNA molecule, suggesting that both proteins target the same site (this is consistent with the pilA-N data).

The next set of experiments is clear: 1) Test whether EcSxy can bind DNA in the absence of cAMP (EcCRP cannot), 2) Test whether EcSxy binds to a pilA promoter that lacks its CRP site, 3) Use western blots to probe for EcCRP in the Sxy preps, and 4) Test DNA binding by EcSxy that has been isolated from a crp- expression strain.

However, several arguments can still be made that EcSxy does in fact bind DNA. First, EcSxy and HiSxy were isolated from the same E. coli strain using the same procedure, thus we would expect EcCRP to contaminate the HiSxy preps as well (which clearly has not happened because HiSxy preps don’t bind DNA). Second, far-western analysis has not detected EcCRP in the Sxy preps.

If tomorrow’s experiments show that EcSxy binds DNA in the absence of cAMP, two new hypotheses need to be addressed: 1) Does EcSxy prefer the pilA-N promoter not because it binds the CRP-N site, but because the CRP-N site makes DNA more bendable than the wildtype CRP-S promoter? 2) Does HiSxy fail to bind DNA because bandshift reaction conditions are not favourable for H. influenzae proteins?