trp operon

The trp operon is a bacterial operon that was first discovered and characterised in E. coli. It consists of a promoter, an operator and five structural genes (trpE, trpD, trpC, trpB and trpA, in that order), which encode the enzymes needed to synthesise the amino acid tryptophan when it is not available from the environment. Unlike the lac operon, an inducible system that is activated by the presence of (allo)lactose, the trp operon was the first-discovered example of a repressible system: it is transcriptionally active by default, but is inhibited by high levels of tryptophan in the cell.

Located elsewhere on the chromosome, separate from the trp operon, is the trpR gene: it encodes the monomers of a dimeric trpR repressor protein that, although constitutively expressed, is inactive on its own. When tryptophan is present in the cell, it acts as a co-repressor, binding to the trpR repressor and enabling it to bind to the operator sequence of the trp operon, thereby inhibiting transcription of the structural genes. This is an important energy-saving system: it prevents wasteful transcription of the tryptophan synthesis genes when tryptophan is readily available. Because the trpR-tryptophan complex switches transcription off rather than on, its action is an example of negative control.
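The repressor logic described above can be summarised as a small Boolean sketch. It is a deliberately simplified, all-or-nothing model rather than a quantitative one, and the function names are illustrative only.

```python
def repressor_binds_operator(tryptophan_present: bool) -> bool:
    """The repressor is always made, but binds the operator only
    when its co-repressor (tryptophan) is bound to it."""
    repressor_expressed = True  # constitutive expression, as stated in the text
    return repressor_expressed and tryptophan_present


def initiation_allowed(tryptophan_present: bool) -> bool:
    """Negative control: a bound repressor blocks transcription initiation."""
    return not repressor_binds_operator(tryptophan_present)


for trp in (False, True):
    print(f"tryptophan present={trp}: initiation allowed={initiation_allowed(trp)}")
```

In a real cell repression is graded rather than absolute, which is one reason a second control level is useful.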

The trp operon is also under a second level of control called attenuation. This mechanism acts on transcripts that have already been initiated, for example when repression is incomplete. It relies on the fact that, in bacteria, translation of an mRNA begins while the mRNA is still being transcribed. Attenuation takes place in the 5' leader sequence (5' UTR), downstream of the operator but upstream of the five structural genes of the operon. The leader sequence can be divided into four domains (1 through 4), each containing sequence complementary to its neighbours, so that adjacent domains can base-pair to form hairpin loops.
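The base-pairing idea behind hairpin formation can be sketched as follows. Two RNA segments on the same strand can form a stem when one is the reverse complement of the other. The segments below are hypothetical toy sequences, not the real trp leader, and the check considers only perfect Watson-Crick pairs (no G-U wobble pairs or mismatches).

```python
# Watson-Crick pairing rules for RNA
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair with `rna` in an antiparallel stem."""
    return "".join(PAIR[base] for base in reversed(rna))


def can_form_stem(upstream: str, downstream: str) -> bool:
    """True if the two segments are perfectly complementary (simplified check)."""
    return downstream == reverse_complement(upstream)


# Hypothetical toy segments chosen for illustration only
segment_a = "GGCAUC"
segment_b = reverse_complement(segment_a)

print(segment_a, segment_b, can_form_stem(segment_a, segment_b))
```

A domain that is complementary to both of its neighbours can pair with only one of them at a time, which is what makes the alternative hairpins mutually exclusive.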

Domain 1 contains a start codon (AUG) and a small open reading frame coding for a 14-amino acid peptide. The peptide itself has no further role and is quickly hydrolysed back into amino acids; however, two of the codons within this reading frame specify tryptophan. Thus, when the ribosome translates domain 1, it can only pass quickly if tRNA charged with tryptophan is readily available in the cell. At high concentrations of tryptophan the ribosome proceeds quickly through domain 1. By the time RNA polymerase is transcribing domain 3, domain 2 is already occupied by the fast-moving ribosome. This blocks the formation of a hairpin loop between domains 2 and 3. However, domain 4 remains available for base-pairing, so a G/C-rich hairpin loop forms between domains 3 and 4. This hairpin loop is followed by a run of U residues, which form A-U pairs with the DNA template being transcribed. The hairpin loop causes RNA polymerase to pause, and the subsequent A-U pairs, being thermodynamically weak, allow the DNA-RNA duplex to separate, thereby terminating transcription (rho-independent termination). This is how a high tryptophan concentration leads directly to transcriptional termination.

In the alternative situation, where tryptophan is not readily available, the ribosome stalls at the tryptophan codons in domain 1 while it waits for scarce charged tryptophan-tRNAs. RNA polymerase goes ahead to transcribe domains 2 and 3, which this time are free to form a hairpin loop with each other (so that domains 3 and 4 no longer can). The 2-3 hairpin loop does not cause RNA polymerase to pause, and because the 3-4 hairpin cannot form, the run of U residues that follows it is not sufficient on its own to release the transcript; transcription is not terminated. This means that at low tryptophan concentrations transcription of the structural genes is not inhibited.
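The two attenuation outcomes described above can be captured in a minimal decision sketch. It assumes an all-or-nothing model in which the supply of charged tryptophan-tRNA alone decides whether the ribosome stalls; the class and function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AttenuationOutcome:
    ribosome_stalls_in_domain_1: bool
    hairpin_formed: str            # "2-3" (anti-terminator) or "3-4" (terminator)
    transcription_terminated: bool


def attenuate(charged_trp_trna_abundant: bool) -> AttenuationOutcome:
    """Simplified model of the trp leader: which hairpin forms, and the result."""
    # The ribosome stalls at the tryptophan codons only when charged tRNA is scarce.
    stalls = not charged_trp_trna_abundant
    if stalls:
        # Domain 2 is left free and pairs with domain 3 as soon as it is transcribed.
        hairpin = "2-3"
    else:
        # The ribosome covers domain 2, so domain 3 pairs with domain 4 instead.
        hairpin = "3-4"
    # Only the 3-4 hairpin, followed by the run of U residues, terminates transcription.
    return AttenuationOutcome(stalls, hairpin, hairpin == "3-4")


for abundant in (True, False):
    print(f"charged tRNA abundant={abundant}: {attenuate(abundant)}")
```

In this sketch, transcription continues into the structural genes exactly when the ribosome stalls, which mirrors the inverse relationship between tryptophan availability and expression of the operon.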