Saturday, January 12, 2008

Why CRP-S sites are special

I’m trying to improve my ability to verbally express the significance of our CRP-S work. Here I outline a discussion I had with a former supervisory committee member who asked something along the lines of “Aren’t CRP-S sites simply low-affinity CRP sites?” He asked this because many bacterial promoters use regulatory mechanisms in which a transcription factor binds first to high-affinity DNA sites and then, once the good sites are all full, begins to occupy less favourable DNA sites.

A classic example is the OmpR system (Figure 1). When only a few OmpR molecules are active in DNA binding (OmpR~P), the high-affinity site in the ompF promoter is bound and ompF is transcribed. When more OmpR molecules are phosphorylated, low-affinity binding sites become occupied, which turns off the ompF promoter and turns on the ompC promoter.

Below I describe the CRP-S model as it is coming to light in H. influenzae. This is a simple model that will improve as we gain a better understanding of Sxy and CRP-S function in E. coli.

Key point: “CRP-S” sites are low-affinity but highly specific CRP sites. CRP binding to CRP-S sites is mechanistically different from binding to canonical low-affinity sites. Canonical low-affinity sites are low-affinity because CRP has low specificity for them; in general, protein affinity and specificity go hand in hand.

Our discovery: CRP sites in E. coli, H. influenzae, and very likely many bacteria, fall into two sub-populations: canonical “CRP-N” sites and unusual “CRP-S” sites. CRP-N and CRP-S are names that Rosie and I coined. Previously studied CRP sites in E. coli belong to the CRP-N group; for example, E. coli’s lacZ promoter has a CRP-N site.

Background: E. coli CRP binds as a homodimer, specifically to symmetrical 22 bp DNA sites with the consensus half site 5’-A1A2A3T4G5T6G7A8T9C10T11. The protein makes direct contact with base pairs G:C5, G:C7, and A:T8 in the highly conserved core motif T4G5T6G7A8, and binding induces a localized kink of 43° between positions 6 and 7, wrapping the DNA around CRP and strengthening the association. Though base pair T:A6 is not directly contacted by CRP, it is recognized indirectly because kink formation strongly favours T:A6 over other base pairs. For example, replacement of T:A6 in a consensus CRP site with C:G6 causes an 80-fold reduction in CRP affinity by increasing the free energy required to bend the DNA.
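
To make the numbering concrete, here is a minimal Python sketch that builds the full 22 bp consensus from the half site above and reads off positions 6 and 17. It also converts the 80-fold affinity drop into a free-energy difference using ΔΔG = RT ln(fold change). The function names are my own, and the temperature (25 °C) is an assumption, as is treating the 80-fold figure as a ratio of equilibrium binding constants.

```python
import math

# Consensus half site, positions 1-11, as given in the text.
HALF_SITE = "AAATGTGATCT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def full_site(half):
    """Join a half site to its reverse complement to give a symmetrical site."""
    return half + reverse_complement(half)


def base_at(site, position):
    """Base at a 1-indexed position, matching the numbering used in the text."""
    return site[position - 1]


def ddg_kcal_per_mol(fold_reduction, temp_kelvin=298.15):
    """Free-energy cost of a fold reduction in affinity (temperature assumed)."""
    gas_constant = 1.987204e-3  # kcal/(mol*K)
    return gas_constant * temp_kelvin * math.log(fold_reduction)


consensus = full_site(HALF_SITE)
print(consensus)                                      # AAATGTGATCTAGATCACATTT
print(base_at(consensus, 6), base_at(consensus, 17))  # T A
print(round(ddg_kcal_per_mol(80), 1))                 # 2.6
```

Under those assumptions the T:A6 to C:G6 substitution costs roughly 2.6 kcal/mol. Because the site is symmetrical, position 17 in the second half mirrors position 6 in the first.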

Despite the extensive variation among CRP sites, until our work no significance had been attached to which positions vary. Instead, the degree of similarity of any CRP site to the consensus was proposed to generate an adaptive hierarchy that allows genes with better sites to be preferentially activated at low cAMP concentrations. Similar hierarchies must exist for all DNA-binding proteins – i.e. good sites are preferentially bound at low protein concentrations. Many regulatory mechanisms exploit this hierarchy, such as OmpR’s ability to act both as an activator and a repressor. When a sufficient number of OmpR molecules are active in DNA binding, both high- and low-affinity OmpR sites are occupied; the degree of occupancy has a profound influence on the activities of the ompF and ompC promoters.
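
The hierarchy idea can be sketched with a simple single-site binding curve, where fractional occupancy is [P] / ([P] + Kd). This is only a toy illustration: the two Kd values below are hypothetical numbers chosen to be 100-fold apart, not measurements for OmpR or CRP, and the model ignores cooperativity and assumes free protein is roughly equal to total active protein.

```python
def occupancy(protein_conc, kd):
    """Fractional occupancy of one site for a simple one-site binding model."""
    return protein_conc / (protein_conc + kd)


# Hypothetical dissociation constants (nM), for illustration only.
SITES_KD_NM = {"high-affinity site": 1.0, "low-affinity site": 100.0}

for conc in (1.0, 10.0, 100.0, 1000.0):
    row = ", ".join(
        f"{name}: {occupancy(conc, kd):.2f}" for name, kd in SITES_KD_NM.items()
    )
    print(f"[active protein] = {conc:g} nM -> {row}")
```

In this sketch the high-affinity site is already half full at 1 nM, while the low-affinity site only reaches half occupancy at 100 nM, which is the ordering a regulatory hierarchy depends on.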

The CRP-S story is different. CRP-N and CRP-S sites are each best described by a different consensus sequence: CRP-N sites have the core sequence TGTGA whereas CRP-S sites have TGCGA. DNA sequences in one population differ significantly (in a statistical sense) from sequences in the other population – and we can even detect this by eye.

I have found that CRP-S sites are low-affinity binding sites for CRP in vitro, but in a special way unlike low-affinity CRP-N sites. Low-affinity CRP-N sites have multiple non-consensus bases at key positions; these prevent specific contacts from forming between the protein and DNA site. CRP-S sites are different because in H. influenzae they have all the correct bases required for binding by CRP, except that the non-consensus bases at positions 6 and 17 (see “Background” above) prevent the protein from forming a stable interaction. In other words, all of the more than ten specific contacts between amino acids in CRP and bases and phosphates in a CRP-S site can form, but the DNA site won’t kink because of bases C6 and G17.
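
As a rough illustration of the distinction, the Python sketch below sorts a 22 bp site by the bases at positions 6 and 17 only. This is a deliberate simplification and not how the two populations were actually defined, which relied on statistical comparison of whole sequences. The function name and the "CRP-S-like" example sequence are hypothetical; the latter is simply the CRP-N consensus with C6 and G17 swapped in, not a real promoter site.

```python
def classify_crp_site(site):
    """Label a 22 bp site by the bases at positions 6 and 17 (1-indexed)."""
    site = site.upper()
    if len(site) != 22:
        raise ValueError("expected a 22 bp site")
    pos6, pos17 = site[5], site[16]
    if (pos6, pos17) == ("T", "A"):
        return "CRP-N-like"   # canonical kink-favouring bases
    if (pos6, pos17) == ("C", "G"):
        return "CRP-S-like"   # bases that disfavour kinking
    return "unclassified"     # mixed or other bases


crp_n_consensus = "AAATGTGATCTAGATCACATTT"   # consensus from the Background
crp_s_example = "AAATGCGATCTAGATCGCATTT"     # illustrative only

print(classify_crp_site(crp_n_consensus))  # CRP-N-like
print(classify_crp_site(crp_s_example))    # CRP-S-like
```

Note that the example keeps every other consensus base intact, which mirrors the point above: the contacts can all form, and only the kink positions differ.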

Thus, CRP-S sites are a special, mechanistically distinct type of low-affinity site: CRP has high specificity for these sites, but critical base substitutions make them poor binding sites. Consequently, CRP-S sites don’t properly fit in the classic hierarchy of high- to low-affinity CRP sites. Instead, CRP-S sites appear to occupy an alternate “binding site landscape” that is accessible to CRP only when Sxy is present. Figure 2 plots a generic CRP-S site relative to other CRP sites (not real data) and shows that when Sxy is present, CRP-S sites will be bound by CRP even when CRP levels are low. We know this to be true because competence promoters can be turned on even when there is little active CRP in the cell – as we see in the sxy-1 hypercompetent mutant during exponential growth. The pil-N promoter (highlighted red) is a CRP-S site that has been mutated to have the canonical bases T6 and A17; it is a high-affinity CRP site.

Our working model is that Sxy is required for CRP to form stable interactions at CRP-S sites. However, I suspect that Sxy doesn’t operate through a simple recruitment mechanism. If Sxy were to recruit CRP to CRP-S sites, we would expect a strong Sxy-binding site and a weak motif at the CRP-S site. However, CRP-S sites are the only apparent binding sites in their promoters. Also, the high specificity of CRP for CRP-S sites predicts that CRP-S sites will be occupied even at low CRP concentrations (provided Sxy is present to help with DNA kinking).

Other distinguishing features: The C6 and G17 bases in CRP-S sites are important for promoter function. Changing them to canonical bases T6 and A17 makes a weaker promoter. This is opposite to other CRP sites, where changing bases to match the consensus yields a stronger promoter. This finding suggests that when CRP is able to bind alone to a mutated CRP-S site, it is not as good a transcriptional activator as a CRP-Sxy complex bound to a wildtype CRP-S site.
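
For readers who want to see exactly which substitutions are meant, here is a small Python sketch of the C6-to-T6 and G17-to-A17 change. The starting sequence is an illustrative stand-in (the consensus with C6 and G17), not the actual sequence of any competence promoter, and the helper name is my own.

```python
# Canonical bases at the two kink positions (1-indexed), as named in the text.
CANONICAL = {6: "T", 17: "A"}


def substitute(site, changes):
    """Return a copy of the site with the given 1-indexed positions replaced."""
    bases = list(site)
    for position, base in changes.items():
        bases[position - 1] = base
    return "".join(bases)


crp_s_like = "AAATGCGATCTAGATCGCATTT"   # illustrative only
mutant = substitute(crp_s_like, CANONICAL)

print(crp_s_like)
print(mutant)  # AAATGTGATCTAGATCACATTT
```

Only two of the 22 bases change, yet according to the experiments described above this is enough to weaken the promoter rather than strengthen it.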

Conclusion: It is possible that my models and interpretations of experimental data are wrong. Alternatively, CRP-S sites may have C6 and G17 to make them binding sites for Sxy as well as for CRP. If this alternate interpretation is true, i.e. that CRP-S sites are targeted independently by two separate proteins, it is still completely novel and distinguishes CRP-S sites from all the other CRP sites.