CYP76M7 Is an ent-Cassadiene C11α-Hydroxylase Defining a Second Multifunctional Diterpenoid Biosynthetic Gene Cluster in Rice

Biosynthetic gene clusters are common in microbial organisms, but rare in plants, raising questions regarding the evolutionary forces that drive their assembly in multicellular eukaryotes. Here, we characterize the biochemical function of a rice (Oryza sativa) cytochrome P450 monooxygenase, CYP76M7, which seems to act in the production of antifungal phytocassanes and defines a second diterpenoid biosynthetic gene cluster in rice. This cluster is uniquely multifunctional, containing enzymatic genes involved in the production of two distinct sets of phytoalexins, the antifungal phytocassanes and antibacterial oryzalides/oryzadiones, with the corresponding genes being subject to distinct transcriptional regulation. The lack of uniform coregulation of the genes within this multifunctional cluster suggests that this was not a primary driving force in its assembly. However, the cluster is dedicated to specialized metabolism, as all genes in the cluster are involved in phytoalexin metabolism. We hypothesize that this dedication to specialized metabolism led to the assembly of the corresponding biosynthetic gene cluster. Consistent with this hypothesis, molecular phylogenetic comparison demonstrates that the two rice diterpenoid biosynthetic gene clusters have undergone independent elaboration to their present-day forms, indicating continued evolutionary pressure for coclustering of enzymatic genes encoding components of related biosynthetic pathways.


INTRODUCTION
The production of small molecule antibiotics is a broadly conserved biological defense mechanism. If these bioactive natural products are separated from primary metabolism by multiple biosynthetic/enzymatic steps, the corresponding genes must be inherited together, along with the appropriate regulatory mechanisms (e.g., cotranscription), to provide positive selection pressure for their retention. Furthermore, if intermediate metabolites have deleterious effects, negative selection is exerted against inheritance of the corresponding subsets of these genes and/or inappropriate suppression of downstream enzymatic activity. This push-pull combination has led to the assembly of biosynthetic gene clusters for multistep specialized metabolism across a wide range of organisms (Fischbach et al., 2008). These include not only microbes, such as bacteria and fungi, in which horizontal gene transfer increases the selective pressure for clustering, but plants as well, where such clustering is much less common, with only a few such cases known (Frey et al., 1997; Qi et al., 2004; Field and Osbourn, 2008). The rarity of biosynthetic gene clusters in plants raises the question of how such clusters were assembled in these multicellular eukaryotic organisms.
Rice (Oryza sativa) produces a complex mixture of antibiotic diterpenoid natural products in response to fungal infection (e.g., by the blast pathogen Magnaporthe grisea), as well as a separate set in response to bacterial infection (e.g., by the leaf blight pathogen Xanthomonas campestris), providing examples of both antifungal and antibacterial phytoalexins (Peters, 2006; Toyomasu, 2008). These phytoalexins all fall into the labdane-related diterpenoid subfamily, the founding members of which are the gibberellin (GA) phytohormones. Biosynthesis of these natural products is characteristically initiated by sequential cyclization reactions catalyzed by mechanistically distinct, yet phylogenetically related, diterpene synthases (Peters, 2006; Toyomasu, 2008). First, cyclization of the acyclic universal diterpenoid precursor [E,E,E]-geranylgeranyl diphosphate (GGPP) by class II diterpene cyclases, typically a labdadienyl/copalyl diphosphate (CPP) synthase (CPS), occurs. Second, this bicyclic intermediate is generally further cyclized by stereospecific class I diterpene synthases, sometimes termed kaurene synthase-like (KSL) because of their similarity to the corresponding enzyme in GA biosynthesis. Oxygen is then typically inserted into the resulting diterpene olefin by heme-thiolate cytochromes P450 (CYP) monooxygenases en route to the production of bioactive natural product(s). While the disparate CYP monooxygenases are all presumed to be, at least distantly, related to each other, they have been divided into numbered families and lettered subfamilies that are clearly homologous (≥40 and >55% amino acid sequence identity, respectively), with the number following the subfamily letter designating individual P450s (Werck-Reichhart and Feyereisen, 2000).
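The identity thresholds used in CYP nomenclature can be illustrated with a minimal sketch. The function names and the choice to compute identity only over ungapped columns of a pre-aligned pair are assumptions made for illustration; in practice, family and subfamily assignments are curated and do not follow the cutoffs mechanically.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: only columns with a residue in both sequences are counted.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def cyp_relationship(identity_pct: float) -> str:
    """Apply the nominal cutoffs: >55% same subfamily, >=40% same family."""
    if identity_pct > 55.0:
        return "same subfamily"
    if identity_pct >= 40.0:
        return "same family"
    return "different family"


# Toy aligned fragments (hypothetical, not real P450 sequences).
identity = percent_identity("MAKK-T", "MAKRGT")
print(identity, cyp_relationship(identity))
```

Under this reading, two P450s such as CYP76M7 and CYP76M8 share both the family number (76) and the subfamily letter (M), so they would be expected to exceed the higher cutoff.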
Given the importance of rice as a staple food crop and the knowledge of its genome sequence (International Rice Genome Sequencing Project, 2005), the metabolism of GA and other labdane-related diterpenoids has been extensively investigated in this model cereal plant (Peters, 2006). Although rice contains four CPS and 10 KSL, mutational analysis demonstrated that only one member of each class of diterpene synthase (i.e., Os-CPS1 and Os-KS1) is involved in GA biosynthesis (Sakamoto et al., 2004). Biochemical characterization of the remaining CPS and KSL (Cho et al., 2004; Nemoto et al., 2004; Otomo et al., 2004a, 2004b; Prisic et al., 2004; Wilderman et al., 2004; Xu et al., 2004, 2007b; Kanno et al., 2006; Morrone et al., 2006) has assigned a unique metabolic function to each (Figure 1). By contrast, while P450 monooxygenases have been implicated in GA biosynthesis (Kato et al., 1995; Shimura et al., 2007), little is known about which of the >350 known rice CYPs are required and/or what their exact roles in diterpenoid phytoalexin biosynthesis are (Peters, 2006).
Intriguingly, the rice genome contains two gene clusters with genes encoding CPS, KSL, and CYP (Sakamoto et al., 2004). The smaller gene cluster on chromosome 4 is dedicated to the production of momilactones and contains the relevant, consecutively acting syn-CPP synthase (Os-CPS4) and syn-pimaradiene synthase (Os-KSL4) genes. In addition, this cluster contains a gene encoding a dehydrogenase that catalyzes the final step in the production of momilactone A (Os-MAS) and two closely related P450s (CYP99A2 and 3), one or both of which are required in an undefined role(s) in momilactone biosynthesis (Shimura et al., 2007). The larger cluster on chromosome 2 contains the gene encoding the phytoalexin-specific ent-CPP synthase Os-CPS2, three ent-CPP-specific KSLs (Os-KSL5-7), and several P450s from the CYP71 and 76 families. Although microarray transcriptional analysis has demonstrated that some of the coclustered CYP genes are coregulated with Os-CPS2 and Os-KSL7 (Okada et al., 2007), and members of the CYP71 and 76 families in other plant species function in terpenoid metabolism (Lupien et al., 1999; Collu et al., 2001; Ralston et al., 2001; Wang et al., 2001), a role for any of these monooxygenases in rice diterpenoid phytoalexin biosynthesis has remained conjectural. Here, we report that one of these coclustered and coregulated P450s, CYP76M7, is an ent-cassadiene-specific C11α-hydroxylase that seems to catalyze an early step in phytocassane biosynthesis and thus defines a second diterpenoid biosynthetic gene cluster, whose unique metabolic multifunctionality provides some insight into biosynthetic gene cluster assembly in plants.

Extension of the Chromosome 2 Diterpenoid Biosynthetic Gene Cluster
It was originally suggested that the putative diterpenoid biosynthetic gene cluster on chromosome 2 contained four P450s (Sakamoto et al., 2004), which we found to correspond to CYP71Z6 and 7 and CYP76M6 and 7 by BLAST searches with the corresponding gene sequences against the cytochrome P450 database. Previous microarray/transcriptional analysis suggested that some of these CYP genes are also coregulated with Os-CPS2 and Os-KSL7, with their transcription being induced ~4 h after elicitation of rice cell cultures with the fungal cell wall component chitin (Okada et al., 2007). A number of other CYP genes exhibit an analogous transcriptional induction pattern, and we found that two of these coregulated CYPs (CYP76M5 and 8) are immediately adjacent to the previously reported chromosome 2 diterpenoid biosynthetic gene cluster, meriting inclusion in this cluster. The resulting cluster then contains 10 genes and spans ~245 kb on chromosome 2 and is comparable to the five-gene, ~170-kb cluster on chromosome 4 (Figure 2). Notably, previous results have demonstrated that the transcription of some genes in the chromosome 2 cluster is not elicited by chitin (Okada et al., 2007). Strikingly, the coregulated genes are not grouped together.

Figure 1 (caption, partial). The cyclases and corresponding reactions are indicated, along with the downstream natural products, where known. Thicker arrows indicate enzymatic reactions specifically involved in GA metabolism; dashed arrows indicate multiple enzymatic reactions.

CYP76M7 Hydroxylates ent-Cassadiene
Given the coclustering and coregulation of four CYPs (CYP71Z7 and CYP76M5, 7, and 8) with the Os-CPS2 and Os-KSL7 known to act in phytocassane biosynthesis (Peters, 2006; Toyomasu, 2008), we hypothesized that these monooxygenases would similarly play a role in the production of this group of antifungal phytoalexins. Genes for all four of these CYPs were identified in the rice full-length cDNA sequencing project at KOME (Kikuchi et al., 2003) and obtained from this source. These were then expressed in the yeast (Saccharomyces cerevisiae) strain WAT11, in which the endogenous NADPH-cytochrome P450 reductase (CPR) had been replaced by a corresponding gene from Arabidopsis thaliana (At-CPR1; Urban et al., 1997). However, no P450 expression or activity was detected. To further assess the role of these CYPs, we attempted to express them in insect cells (Spodoptera frugiperda), which has proven to be a successful alternative method for the heterologous expression of other plant CYPs (Jennewein et al., 2001). Using an insect cell (Sf21)-baculovirus (Autographa californica) expression system, we found that CYP76M7, although not the other three CYPs, reacts with ent-cassadiene (molecular mass = 272 D) fed to either microsomes or intact cells, with the resulting product appearing to be hydroxylated on the basis of its molecular mass (288 D). However, turnover was relatively limited, and it was not possible to obtain sufficient amounts of the hydroxylated product for structural analysis.
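The mass-based inference above rests on simple arithmetic, sketched below. The sketch assumes the standard diterpene olefin formula C20H32 for ent-cassadiene and uses nominal (integer) atomic masses; the helper name is hypothetical.

```python
# Nominal (integer) atomic masses; adequate for unit-resolution MS reasoning.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula: dict) -> int:
    """Sum nominal atomic masses for a formula given as {element: count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


ent_cassadiene = {"C": 20, "H": 32}          # assumed formula of the diterpene olefin
hydroxylated = {"C": 20, "H": 32, "O": 1}    # one oxygen inserted into a C-H bond

print(nominal_mass(ent_cassadiene))  # 272
print(nominal_mass(hydroxylated))    # 288
```

The 16-unit shift from 272 to 288 is what a single oxygen insertion predicts, although mass alone does not locate the hydroxyl group, which is why structural analysis was still needed.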

Identification of Hydroxylated Product as 11α-Hydroxy-ent-Cassadiene
To obtain larger amounts of the hydroxylated product, we employed the modular metabolic engineering system that we developed previously (Cyr et al., 2007) and attempted to coexpress CYP76M7 and a rice CPR (Os-CPR1) along with GGPP, ent-CPP, and ent-cassadiene synthases in Escherichia coli. However, while ent-cassadiene was produced in significant quantities, no hydroxylated derivatives were detected using the full-length CYP76M7. Based on our experience with recombinant expression of the kaurene oxidase CYP701A3 (D. Morrone and R.J. Peters, unpublished data), we modified the N terminus of CYP76M7 for functional bacterial expression. Specifically, we replaced the first 33 amino acids with a 10-amino acid Lys-rich sequence, based on a modification used for bacterial expression of the CYP2B subfamily of mammalian P450s (Scott et al., 2001). With this modified CYP76M7, we were able to detect significant hydroxylation (>50% conversion) of ent-cassadiene (Figure 3), resulting in the same product observed with the unmodified P450 expressed in insect cells. Accordingly, it was possible to produce and purify sufficient amounts of the hydroxylated ent-cassadiene for structural characterization by nuclear magnetic resonance (NMR). This analysis demonstrated that CYP76M7 performs C11α-hydroxylation of ent-cassadiene [forming 11-(S)-hydroxy-ent-cassa-12,15-diene]. In addition, coexpression of the modified CYP76M7 with other diterpene synthases demonstrated that this P450 is specific for ent-cassadiene, with only trace amounts of hydroxylation (<1% conversion) observed for ent-pimaradiene, ent-sandaracopimaradiene, ent-kaurene, ent-isokaurene, syn-pimaradiene, syn-stemarene, or syn-stemodene.

Enzymatic Characterization of Recombinant CYP76M7
It was possible to measure the enzymatic activity of the modified recombinant CYP76M7 in vitro, which enabled more detailed enzymatic characterization. These in vitro assays required CPR supplementation, which was provided by the addition of exogenously produced At-CPR1. Intriguingly, in these assays, the production of an oxo group-containing compound (molecular mass = 286 D) was sometimes observed (see Supplemental Figure 1 online), in place of the expected C11α-hydroxy-ent-cassadiene. This oxo-containing compound also was produced when purified C11α-hydroxy-ent-cassadiene was fed to bacteria coexpressing CYP76M7 and Os-CPR1 (see Supplemental Figure 2 online), demonstrating that it represents further oxidation of the C11α-hydroxyl to a keto group. Uncoupling due to poor or limited association with recombinant CPR may result in the catalytic cycle being aborted after the initial oxidation. Kinetic constants were determined by simply measuring the rate of formation of C11α-hydroxy-ent-cassadiene in reactions where little of the C11-keto-ent-cassadiene product was made, whereupon CYP76M7 was found to exhibit a Km of 39 ± 10 μM and specific activity of 0.13 ± 0.02 μmoles product/mg protein/min (Figure 4). No activity was observed in vitro for the other diterpenes produced by rice (i.e., ent-pimaradiene, ent-sandaracopimaradiene, ent-kaurene, ent-isokaurene, syn-pimaradiene, syn-stemarene, or syn-stemodene).

Figure 2 (caption, partial). Black boxes represent genes induced by chitin elicitation, while white boxes represent those not induced (Okada et al., 2007), with the arrowheads representing the direction of transcription. Note that no other genes appear to be present in these regions.
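As a rough illustration of what the reported Km implies, the sketch below evaluates the Michaelis-Menten rate law at a few substrate concentrations. It assumes substrate concentration is expressed in the same units as Km, and it normalizes the maximal rate to 1.0 rather than equating it with the reported specific activity; the function name and the chosen concentrations are hypothetical.

```python
def michaelis_menten(substrate: float, vmax: float, km: float) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S]); [S] and Km share the same units."""
    if substrate < 0 or km <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate / (km + substrate)


KM = 39.0     # reported Km (value taken from the text above)
VMAX = 1.0    # normalized maximal rate, so results read as a fraction of Vmax

for s in (10.0, 39.0, 100.0, 400.0):  # illustrative substrate concentrations
    fraction = michaelis_menten(s, VMAX, KM)
    print(f"[S] = {s:6.1f}  ->  v/Vmax = {fraction:.2f}")

# At [S] = Km the rate is half of Vmax by definition of Km.
```

Fitting measured initial rates to this equation, as was done for Figure 4, would additionally require a nonlinear least-squares routine, which is omitted here.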

Physiological Relevance of CYP76M7 Activity
The physiological relevance of the C11α-hydroxy-ent-cassadiene and C11-keto-ent-cassadiene products of CYP76M7 was investigated by metabolite analysis of induced and noninduced rice leaves. Detection of C11α-hydroxy-ent-cassadiene and C11-keto-ent-cassadiene by liquid chromatography-tandem mass spectrometry (LC-MS/MS) was optimized using purified compound. Extracts of rice leaves induced with methyl jasmonate, which has been demonstrated to induce the production of antifungal phytoalexins, such as the phytocassanes (Peters, 2006), were found to contain C11α-hydroxy-ent-cassadiene (Table 1), although C11-keto-ent-cassadiene was not detected. By contrast, neither compound was found in noninduced leaf extracts. The relevance of these findings is not entirely clear, as C11α-hydroxy-ent-cassadiene might accumulate as a metabolic dead-end rather than as a biosynthetic intermediate, while C11-keto-ent-cassadiene would not accumulate if the subsequently acting enzyme were not rate limiting. Nevertheless, the identification of at least one of the CYP76M7 products demonstrates that the associated hydroxylation reaction reported here does occur in planta, with the observed accumulation of C11α-hydroxy-ent-cassadiene upon induction consistent with the transcriptional pattern previously reported for CYP76M7. Given the presence of a C11-keto group in all of the identified phytocassanes from rice, CYP76M7 presumably catalyzes an early step in phytocassane biosynthesis (Figure 5), and the timing of induction of both CYP76M7 transcription and activity matches that for the production of phytocassanes.

Molecular Phylogenetic Analysis of the Coclustered Enzymatic Gene Families
The results presented above extend the functionality of the diterpenoid phytoalexin biosynthetic gene cluster on chromosome 2 to include CYP (at least CYP76M7), along with the previously noted sequentially acting CPS and KSL diterpene synthases (Wilderman et al., 2004), just as found in the cluster on chromosome 4 (Okada et al., 2007). Since the work presented above indicates that both of these diterpenoid biosynthetic gene clusters in rice contain sequentially acting CPS, KSL, and CYP (Figure 2), we performed molecular phylogenetic analysis of these shared enzymatic gene families to provide some insight into the assembly of these clusters.

Figure 3 (caption, partial). (A) Gas chromatography (GC)-MS chromatogram of extract from E. coli engineered for production of ent-cassadiene and coexpressing CYP76M7 and rice CPR1 (1, ent-cassadiene; 2, 11α-hydroxy-ent-cassadiene). (B) Mass spectra for peak 1 (ent-cassadiene). (C) Mass spectra for peak 2 (11α-hydroxy-ent-cassadiene).

Figure 4 (caption, partial). The measured initial rates from duplicate assays (data points are the average and error bars represent the SD) and fit to the Michaelis-Menten equation (R² = 0.96) are shown.
As we have previously noted, the single class II diterpene cyclases found in each cluster, Os-CPS2 and Os-CPS4, are more closely related to each other than to the Os-CPS1 required for GA biosynthesis. Indeed, Os-CPS1 is more closely related to a CPS found in Zea mays (Zm-CPS1) than to either Os-CPS2 or Os-CPS4. Thus, for molecular phylogenetic analysis, the CPS from the dicot Arabidopsis (At-CPS) was used as the outgroup sequence (Figure 6A; see Supplemental Figure 3 and Supplemental Data Set 1 online). This analysis demonstrates that there was a gene duplication following the separation of mono- and dicot plants that led to the Os-CPS2 and Os-CPS4 found in the clusters on chromosome 2 and 4, respectively.
Molecular phylogenetic analysis of the class I diterpene synthase KSL family from rice (Os-KSL), using the ent-kaurene synthase (KS) from Arabidopsis (At-KS) as the outgroup sequence, indicates that there are two related groups of Os-KSL ( Figure 6B). One of these groups is composed of Os-KS1, which is involved in GA biosynthesis, along with the ent-cassadiene producing Os-KSL7 found in the chromosome 2 cluster and the Os-KSL4 found in the chromosome 4 cluster. The other group includes the closely related and presumably recently duplicated Os-KSL5/Os-KSL6 pair of class I diterpene synthases, which are also found in the chromosome 2 cluster, indicating a separate evolutionary origin for these relative to the coclustered Os-KSL7.
A previous molecular phylogenetic analysis demonstrated that the CYP99 family falls within the CYP71 family and presumably forms a subfamily within the broader CYP71 family (Nelson et al., 2004). However, this does not necessarily mean that the CYP71Z subfamily members found in the chromosome 2 cluster and the CYP99A subfamily members found in the chromosome 4 cluster are closely related, as the CYP71/99 family is the largest CYP family in rice, being composed of 107 genes (although 31 are thought to be pseudogenes). Indeed, the majority of the CYP71Z subfamily members are scattered elsewhere in the genome. Furthermore, while the CYP71/99 and CYP76 families both fall within the CYP71 clan (Nelson et al., 2004), this relationship reflects an ancient division as both the CYP71/99 and CYP76 families are also found in Arabidopsis. Accordingly, our molecular phylogenetic analysis of the various CYP contained within the rice diterpenoid gene clusters includes not only the other subfamily group members scattered across the rice genome, but also the most closely related CYP from a different subfamily in both rice and Arabidopsis as outgroup sequences for each subfamily of interest and the rice kaurene oxidase (CYP701A6) as the outgroup sequence for the overall analysis ( Figure 6C). This analysis further indicates some complexity in the CYP76M subfamily, as the subfamily members in the chromosome 2 cluster are closely related to the CYP76M14 that is found elsewhere in the rice genome. Examination of the CYP molecular phylogeny also indicates that several CYP genes within the clusters have undergone relatively recent tandem duplication (i.e., CYP99A2 and 3, CYP76M7 and 8, and CYP71Z6 and 7), much as has been hypothesized for Os-KSL5 and 6 (Xu et al., 2007a).

DISCUSSION
The results presented here demonstrate that CYP76M7 is a cytochrome P450 monooxygenase that catalyzes the insertion of oxygen at the C11α position in the labdane-related diterpene ent-cassadiene. The resulting C11α-hydroxy-ent-cassadiene can be found in planta upon methyl jasmonate induction, matching both the previously demonstrated mRNA transcript accumulation of CYP76M7 (Okada et al., 2007) and the inducible nature of phytocassane production (Koga et al., 1995), presumably reflecting a role for CYP76M7 in the biosynthesis of these phytoalexins. Given the typically conserved functionality among CYP subfamilies, these results further suggest that other CYP76M subfamily members are also involved in terpenoid biosynthesis, as has been found for the functionally characterized CYP71D subfamily members (Lupien et al., 1999; Ralston et al., 2001; Wang et al., 2001), which provides some general guidelines for further investigating the remaining nine CYP76M subfamily members in rice, as well as other CYP76M subfamily members identified in other plant species.
Definition of a role for CYP76M7 in phytoalexin diterpenoid biosynthesis also increases the number of CYP families acting in such specialized metabolism. In particular, the only previous angiosperm cytochrome P450 functionally identified as operating in specialized diterpenoid metabolism is the CYP71D16 involved in the production of cembratriendiol by tobacco (Nicotiana tabacum) trichomes (Wang et al., 2001). Interestingly, while both the CYP71 and 76 families fall within the large CYP71 clan, the cytochromes P450 from gymnosperms involved in specialized diterpenoid metabolism, including CYP720B1, which is involved in conifer resin acid biosynthesis (Ro et al., 2005), and the various CYP725 family members involved in taxoid biosynthesis (Jennewein et al., 2001; Schoendorf et al., 2001; Jennewein et al., 2003), fall within the separate CYP85 clan. Nevertheless, the role for CYP76M7 in phytoalexin biosynthesis indicated here adds the CYP76 family to those involved in diterpenoid specialized metabolism. In addition, it seems likely that other CYP families from these broader clans will contain members and/or subfamilies dedicated to specialized diterpenoid metabolism. The strong evidence provided here that CYP76M7 plays a role in phytocassane biosynthesis in rice, by acting upon the ent-cassadiene product of the Os-CPS2 and Os-KSL7 diterpene synthases that are also encoded within the chromosome 2 cluster, indicates that this cluster is functionally composed of consecutively acting CPS and KSL diterpene synthases and at least one CYP, much as previously shown for the diterpenoid biosynthetic gene cluster on chromosome 4 (Shimura et al., 2007). Accordingly, we performed molecular phylogenetic analysis to investigate the evolution of these two diterpenoid biosynthetic gene clusters from the rice genome.

Figure 7 (caption, partial). Each box represents a gene in the cluster. In those cases where duplication has been postulated to occur after cluster assembly, a precursor gene is indicated (i.e., Os-KSL5/6 as the precursor to the current Os-KSL5 and 6, with "x" designating the precursor to the various CYP subfamily members found in the clusters).
Intriguingly, the CPSs are the only genes in the rice diterpenoid biosynthetic gene clusters that are more closely related to those found in the other cluster than to paralogs found elsewhere in the genome (Figure 6). While some argument could be made that the evolutionary relationship between Os-KSL4 and Os-KSL7 reflects early incorporation of a common precursor in an ancestral cluster, both are more closely related to Os-KS1, which is involved in GA biosynthesis and is found elsewhere in the genome, than they are to each other. Furthermore, the closely related and presumably recently duplicated Os-KSL5 and 6 clearly have a separate evolutionary origin. Similarly, while the fact that CYP99 is actually a subfamily of CYP71 might be used to argue for early inclusion of a shared evolutionary precursor to the CYP71Z and CYP99A subfamily members in an ancestral cluster, this is undermined by the large numbers of other CYP71 family members (including the majority of CYP71Z subfamily members) found scattered elsewhere throughout the rice genome. In addition, the CYP76 family is sufficiently different to effectively rule out any recent shared origin for those family members found in the chromosome 2 cluster with either the coclustered CYP71Z or the CYP99A found in the chromosome 4 cluster. Finally, the Os-MAS dehydrogenase found in the chromosome 4 cluster has no paralog in the chromosome 2 cluster. Thus, from our molecular phylogenetic analysis, it seems likely that these two diterpenoid biosynthetic gene clusters were independently assembled, although it is also possible that there was an ancestral cluster containing precursors to Os-CPS2 and 4, Os-KSL4 and 7, and CYP71Z and 99A. However, even if this latter case were true, further assembly would still have been required, with the chromosome 4 cluster picking up the MAS dehydrogenase, while the chromosome 2 cluster would have needed to add the KSL5 and 6 precursor, as well as CYP76M family members (Figure 7). In any case, such differential assembly implies that there was significant selective pressure driving this process.
Regardless of the precise assembly process, the chromosome 2 cluster defined here is a novel example of a multifunctional biosynthetic cluster. Os-KSL6 produces the ent-isokaurene precursor of the antibacterial oryzalides/oryzadiones, while Os-KSL7 and, as now indicated here, CYP76M7 are involved in biosynthesis of the antifungal phytocassanes (Figures 1 and 5). The previously reported difference in transcriptional regulation for these genes (Figure 2) also is consistent with their distinct roles in phytoalexin biosynthesis. This metabolic multifunctionality is unprecedented in the previously identified biosynthetic gene clusters from plants (Frey et al., 1997; Qi et al., 2004; Field and Osbourn, 2008) and helps define the underlying selective pressure leading to the at least partially independent differential assembly of the two rice diterpenoid biosynthetic gene clusters.
In particular, it is notable that Os-CPS2 is the only enzymatic gene from the chromosome 2 cluster that is clearly shared between phytocassane and oryzalide/oryzadione biosynthesis. While Os-CPS2 produces the enantiomeric form of CPP that could be an intermediate in GA phytohormone metabolism (Figure 1), knocking out Os-CPS1 is sufficient to essentially abolish GA production (Sakamoto et al., 2004). This rules out a role for Os-CPS2 in such primary metabolism, and its transcriptional regulatory pattern (Otomo et al., 2004a; Prisic et al., 2004) indicates that Os-CPS2 acts in more specialized (i.e., phytoalexin) metabolism instead. Accordingly, it appears that all the genes in both clusters are devoted to specialized metabolism, as none of the associated KSL enzymes make the ent-kaurene intermediate that is relevant to GA biosynthesis, and none of the characterized members of the CYP71/99 or CYP76 families have been found to be involved in primary metabolism. Thus, the selective force driving the differential assembly of the two rice diterpenoid biosynthetic gene clusters must be related to their role in more specialized metabolism. Indeed, in each of the identified cases, plant biosynthetic gene clusters are involved in specialized rather than primary metabolism.
As presented by Field and Osbourn (2008), the assembly of such specialized biosynthetic gene clusters in plants (as well as other organisms) could be driven by a variety of factors. These include the need for coregulation and inheritance of the full set of enzymatic genes necessary for biosynthesis of a bioactive natural product, as well as avoidance of toxic intermediate accumulation (i.e., from inheritance of only the corresponding subset of enzymatic genes). Notably, the lack of any recognizable regulatory pattern within the chromosome 2 cluster reported here suggests that coregulation was not a primary factor driving cluster assembly in at least this particular case, as does the coclustering of two functionally distinct biosynthetic pathways (i.e., enzymatic genes involved in either antifungal phytocassane or antibacterial oryzalide/oryzadione biosynthesis), which are differentially regulated (i.e., induced in response to either fungal or bacterial infections). Accordingly, we hypothesize that biosynthetic gene cluster assembly in plants is driven by the need for inheritance of complete biosynthetic pathways. In particular, the inheritance of complete biosynthetic pathways is important because the selective pressure for conservation of any enzymatic gene solely involved in specialized metabolism requires coinheritance of all the enzymatic genes necessary for production of the relevant bioactive natural product without the accumulation of toxic intermediates.

General Procedures
Unless otherwise noted, chemicals were purchased from Fisher Scientific, and molecular biology reagents from Invitrogen. Gene mapping was based on the annotated rice genome sequence at GenBank. CYP nomenclature was determined via BLAST searches at the Cytochrome P450 homepage maintained by David Nelson (http://drnelson.utmem.edu/CytochromeP450.html). GC was performed with a Varian 3900 GC with Saturn 2100 ion trap mass spectrometer in electron ionization mode (70 eV). Samples (1 μL) were injected in splitless mode at 50°C and, after holding for 3 min at 50°C, the oven temperature was raised at a rate of 14°C/min to 300°C, where it was held for an additional 3 min. MS data from 90 to 600 m/z were collected starting 12 min after injection until the end of the run. LC was performed with Agilent 1100 series HPLC instruments, with LC-MS/MS analysis performed by a coupled Agilent MSD ion trap mass spectrometer in positive ion mode on an instrument located in the W.M. Keck Metabolomics Research Laboratory at Iowa State University.
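The GC oven program can be worked through numerically. The sketch below takes the program as a 3-min hold at 50 °C, a 14 °C/min ramp to 300 °C, and a final 3-min hold, and assumes an ideal linear ramp; the function names are hypothetical.

```python
def oven_temperature(t_min: float, start_c: float = 50.0, initial_hold_min: float = 3.0,
                     ramp_c_per_min: float = 14.0, final_c: float = 300.0) -> float:
    """Oven temperature (deg C) at t_min after injection, assuming a linear ramp."""
    if t_min <= initial_hold_min:
        return start_c
    # After the initial hold the temperature rises linearly, capped at the final value.
    return min(final_c, start_c + ramp_c_per_min * (t_min - initial_hold_min))


def total_run_time(start_c: float = 50.0, initial_hold_min: float = 3.0,
                   ramp_c_per_min: float = 14.0, final_c: float = 300.0,
                   final_hold_min: float = 3.0) -> float:
    """Total run time in minutes: initial hold + ramp duration + final hold."""
    return initial_hold_min + (final_c - start_c) / ramp_c_per_min + final_hold_min


print(f"total run time: {total_run_time():.1f} min")
print(f"oven temperature when MS collection starts (12 min): {oven_temperature(12.0):.0f} C")
```

Under these assumptions the run lasts just under 24 min, and MS data collection begins while the oven is still partway through the ramp.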

Recombinant Constructs
Genes for the four CYPs that are coclustered and coregulated with Os-CPS2 and Os-KSL7 (CYP71Z7 and CYP76M5, 7, and 8) were obtained from the KOME rice cDNA databank (Kikuchi et al., 2003). These were transferred into the Gateway vector system via PCR amplification, with incorporation of a consensus Kozak sequence and 6xHis tag at the 5′ end and directional topoisomerase-mediated insertion into pENTR/SD/D-TOPO and verified by complete sequencing. The resulting genes were then transferred via directional recombination to the yeast expression vector pYES-DEST52, the insect cell expression vector pDEST8, and the T7-promoter expression vector pDEST14. CYP76M7 was modified for functional bacterial expression in a two-stage PCR process, first removing 33 codons from the 5′ end of the open reading frame and then adding 10 new codons (encoding the amino acid sequence MAKKTSSKGK). The resulting construct was inserted into pENTR/SD/D-TOPO and verified by complete sequencing. For bacterial coexpression of the requisite redox partner, we cloned a rice CPR (Os-CPR1), also obtained from KOME, into the second multiple cloning site of the pCDFDuet dual expression vector using the NdeI and KpnI restriction sites (Novagen). For insertion of the various CYPs, a DEST cassette was then inserted into the first multiple cloning site using the NcoI and NotI restriction sites to enable transfer via directional recombination with Gateway pENTR-derived constructs.
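At the protein level, the N-terminal modification described above amounts to a simple sequence substitution, sketched here. The input sequence is a made-up placeholder, not the CYP76M7 sequence, and the function name is hypothetical; only the leader peptide MAKKTSSKGK and the count of 33 removed residues come from the text.

```python
LEADER = "MAKKTSSKGK"  # 10-residue Lys-rich leader given in the text


def modify_n_terminus(protein: str, n_removed: int = 33, leader: str = LEADER) -> str:
    """Drop the first n_removed residues and prepend the replacement leader."""
    if len(protein) <= n_removed:
        raise ValueError("protein is too short for the requested truncation")
    return leader + protein[n_removed:]


# Placeholder sequence for illustration only (NOT the real CYP76M7 sequence):
# 33 arbitrary N-terminal residues followed by an arbitrary remainder.
placeholder = "M" + "L" * 32 + "PPGPRPLPLIGNLHQLG"
modified = modify_n_terminus(placeholder)

print(modified[:10])                     # the new N-terminal leader
print(len(placeholder), len(modified))   # net change: -33 + 10 = -23 residues
```

In the actual construct the same change was made at the DNA level by two-stage PCR, so the sketch only mirrors the resulting protein sequence, not the cloning procedure.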

Recombinant Expression in Yeast
Yeast CYP expression was performed following the manufacturer's directions for pYES-DEST52. Briefly, the pYES-DEST52 CYP constructs were transformed into the WAT11 strain of yeast (Urban et al., 1997), with selection on SC-Ura media. For expression, overnight cultures were inoculated (1:20) into SC-Ura with 2% (w/v) galactose, and cells were harvested at various time points from 0 to 24 h after inoculation. To examine CYP expression, cell extracts were run on 12% SDS polyacrylamide gels, and the separated proteins transferred to Hybond-P polyvinylidene difluoride membrane (Amersham Biosciences) for protein gel blot analysis using standard procedures (Harlow and Lane, 1988). Dilutions of 1:1500 and 1:2000 were used for the primary rabbit anti-His antiserum and secondary goat anti-rabbit Ig-peroxidase conjugate (Sigma-Aldrich), respectively. Bound antibodies were visualized with the ECL Plus Western Blotting Detection System (Amersham Biosciences). In none of the examined cases was CYP expression detected.

Recombinant Expression in Insect Cells
Sf21 cells (Vaughn et al., 1977) were maintained in TC-100 insect cell medium (Sigma-Aldrich) supplemented with FBS (Gibco-BRL) to a final concentration of 10% and antibiotics (1 unit penicillin/mL and 1 mg streptomycin/mL; Sigma-Aldrich). The cell cultures were maintained at 28°C as monolayers in screw-capped plastic flasks (Falcon).
Recombinant baculoviruses were constructed with the Bac-to-Bac baculovirus expression system (Invitrogen). Briefly, the CYPs were transferred from the pDEST8 vectors by recombination with the bMON14272 baculovirus shuttle vector in Escherichia coli following the manufacturer's instructions, but using 100 μg/mL kanamycin, 14 μg/mL gentamicin, and 20 μg/mL tetracycline for selection. The resulting recombinant bacmid DNA was then isolated and transfected into Sf21 cells. The resulting recombinant viruses were amplified and their titer determined in Sf21 cells using standard procedures (O'Reilly et al., 1992).
Expression of CYP proteins by the recombinant baculoviruses was performed in Sf21 insect cells (Vaughn et al., 1977; Wickham et al., 1992). The cells were seeded into 150-cm² cell culture flasks and infected with recombinant baculoviruses at a multiplicity of infection of 10 plaque-forming units/cell. Cells were harvested 72 h after infection, with CYP expression readily detected by protein gel blot analyses performed as described above.
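As a rough illustration of what a multiplicity of infection of 10 plaque-forming units/cell implies in practice, the inoculum volume can be estimated from the number of cells in the flask and the titer of the virus stock. The sketch below is not part of the reported procedure; the cell count and titer are hypothetical placeholder values, since neither is stated here.

```python
def virus_volume_ml(cells, moi, titer_pfu_per_ml):
    """Volume of virus stock (mL) that delivers `moi` pfu per cell."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return cells * moi / titer_pfu_per_ml


# Hypothetical values for illustration only (not taken from the text).
cells_per_flask = 2.0e7   # assumed number of Sf21 cells in one flask
stock_titer = 1.0e8       # assumed virus stock titer, pfu/mL

volume = virus_volume_ml(cells_per_flask, moi=10, titer_pfu_per_ml=stock_titer)
print(f"Inoculum per flask: {volume:.2f} mL")
```

With these assumed numbers, the calculation gives 2 mL of virus stock per flask; a real experiment would substitute the measured titer and the counted cell number.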
Microsomes from the harvested cells were prepared by differential centrifugation. Briefly, cells were pelleted by centrifugation at 3000g at 4°C for 10 min. The pellets were resuspended in half a cell culture volume of 100 mM sodium phosphate buffer, pH 7.8, and repelleted at 3000g for 10 min, washed in ice-cold cell lysate buffer (100 mM sodium phosphate, pH 7.8, 1.1 mM EDTA, 0.1 mM DTT, 0.5 mM PMSF, 1/1000 v/v Sigma-Aldrich protease inhibitor cocktail, and 20% glycerol), repelleted at 3000g for 10 min, and resuspended in 1/50 cell culture volume of cold cell lysate buffer. The cells were lysed by sonication twice for 30 s on ice and by vortexing for 15 s. The lysate was centrifuged at 10,000g for 20 min at 4°C in a microcentrifuge, and the supernatant was further centrifuged in a Ti-70 rotor at 120,000g for 1 h in a Beckman-Coulter Optima ultracentrifuge to pellet the microsomes. The microsomal pellet was resuspended in 500 μL cold cell lysate buffer and used immediately or flash-frozen in liquid N₂ and stored at −80°C for up to 1 month.
In vitro assays were performed in a 500-μL assay mixture containing the isolated microsomes (500 μg), 0.4 mM NADPH, 300 mM substrate, 2 mM reductase, 1 mM DTT, 5 mM FMN, 5 mM FAD, as well as an NADPH regenerating system composed of 2 mM glucose-6-phosphate and 0.5 units of glucose-6-phosphate dehydrogenase. The reaction mixture was incubated at 28°C for 6 h. The resultant product was extracted thrice with an equal volume of ethyl acetate, dried under a gentle stream of N₂ gas, and dissolved in hexane for GC-MS analysis.
Whole-cell assays were performed by feeding 10 mM ent-cassadiene to CYP-expressing Sf21 cells 2 d after infection. The cells were grown for two more days, the media and cells were sonicated, and diterpenoids were extracted for GC-MS analysis as above.

Recombinant Expression in E. coli
For bacterial expression, we combined coexpression of Os-CPR1 and CYP76M7 via the pCDFDuet vector construct described above with our previously described modular diterpene metabolic engineering system (Cyr et al., 2007). Specifically, we coexpressed these genes with genes encoding GGPP and ent-CPP synthases on the cocompatible pGGeC vector and the ent-cassadiene synthase (Os-KSL7) gene as a glutathione S-transferase fusion expressed from the also cocompatible pDEST15. While expression of the full-length CYP76M7 in this context did not lead to production of any hydroxylated ent-cassadiene, expression with the modified CYP76M7 led to conversion of >50% of the endogenously produced ent-cassadiene to the same hydroxylated derivative observed with insect cell-expressed CYP76M7 fed ent-cassadiene. Substituting other diterpene synthases for Os-KSL7 and/or the ent-CPP synthase (as appropriate; see Cyr et al., 2007) in this bacterial system did not result in significant conversion of the corresponding various other diterpenes, which were all produced in good yield, to any recognizable oxidized derivatives.

Diterpenoid Production
To produce sufficient amounts of the novel enzymatic product for NMR analysis, we increased the isoprenoid precursor supply by incorporating the bottom half of the mevalonate-dependent isoprenoid pathway from yeast in the engineered bacteria, using the previously described pMBI vector and supplementation with 20 mM mevalonolactone (Martin et al., 2003), much as previously described (Morrone et al., 2008). Diterpenoids were extracted from a 3-liter culture (media and cells) with an equal volume of a 50:50 mixture of ethyl acetate and hexanes. These organic extracts were pooled and dried by rotary evaporation. The residue was dissolved in 5 mL 45% methanol/45% acetonitrile/10% deionized water and the diterpenoids purified by HPLC with a Supelcosil LC-18 column (4.6 × 250 mm, 5 μm) and a 0.5 mL/min flow rate. After binding, the column was washed with 20% acetonitrile/water (0 to 5 min) and eluted with 20 to 100% acetonitrile (5 to 15 min), followed by a 100% acetonitrile wash (15 to 30 min). The fraction containing the novel diterpenoid (retention time: 21.5 to 22 min) was dried under a gentle stream of N₂ gas and then dissolved in 0.5 mL deuterated chloroform (CDCl₃; Sigma-Aldrich). This evaporation-resuspension process was repeated two more times to completely remove the protonated acetonitrile solvent, resulting in a final estimated ~2.5 mg of the novel diterpenoid.
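The HPLC elution program above can be written as a simple piecewise function of run time. The sketch below assumes that the 20 to 100% acetonitrile step is a linear gradient, which the text does not state explicitly; the function name is illustrative.

```python
def percent_acetonitrile(t_min):
    """Acetonitrile percentage at run time t_min (minutes).

    0-5 min: 20% wash; 5-15 min: 20 -> 100% (assumed linear);
    15-30 min: 100% wash.
    """
    if not 0 <= t_min <= 30:
        raise ValueError("program runs from 0 to 30 min")
    if t_min <= 5:
        return 20.0
    if t_min <= 15:
        return 20.0 + (t_min - 5) * (100.0 - 20.0) / (15 - 5)
    return 100.0


for t in (0, 5, 10, 15, 21.5, 30):
    print(f"{t:>5} min: {percent_acetonitrile(t):5.1f}% acetonitrile")
```

Under this reading, the fraction collected at 21.5 to 22 min elutes during the 100% acetonitrile wash rather than during the gradient itself.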

Kinetic Analysis
Truncated and modified CYP76M7 was expressed in E. coli C41 using the pCDFDuet vector and Os-CPR1 described above. The E. coli C41 cells used to produce CYP76M7 for in vitro kinetic analysis were also transformed with pGro7, an arabinose-inducible, groES-groEL protein chaperone expressing plasmid (TaKaRa) to improve the yield of active P450. Expression cultures were grown in TB and induced with 1 mM isopropyl β-D-1-thiogalactopyranoside and 2 mg/mL arabinose upon reaching an optical density (A600) of 0.8. Cultures were induced for 40 h at 28°C and supplemented with 1 mM thiamine, 2 mg/L riboflavin, and 75 mg/L δ-aminolevulinic acid at the time of induction. Upon completion of induction, cultures were centrifuged and the cell pellet resuspended in 5% culture volume cold membrane prep buffer (MPB), consisting of 0.1 M Tris-HCl, pH 7.2, 20% glycerol, 0.5 mM EDTA, 1 mM DTT, and 1× protease inhibitor cocktail. Lysozyme was added to 0.2 mg/mL and stirred for 10 min at 4°C. Spheroblasts were pelleted at 5000g for 10 min and subsequently resuspended in 10% culture volume of cold MPB. Very brief sonication was used to open spheroblasts. Concentrated spheroblasts were used immediately. Kinetic assays were performed in vitro by determining the initial velocity for various substrate concentrations following spheroblast preparation. However, the amount of cytochrome P450 present could not be quantified due to the lack of the typical peak at 450 nm in CO binding difference spectra. To ensure a constant supply of reducing equivalents, the NADPH regeneration system described above was used. Additionally, At-CPR1 was added to enhance reducing potential. Concentrated spheroblasts were diluted into 5-mL aliquots of MPB; 1-mL aliquots were removed at the various time points and the reactions halted by the addition of 1 mL 1 M HCl, immediately followed by vortexing, with a 4 mL ethyl acetate overlay added later with additional vortexing to assist extraction. 
The organic extract was removed and the reaction further extracted three times with 5 mL of hexanes. An internal standard of ent-copalol was used to standardize extraction efficiency, and authentic C11α-hydroxy-ent-cassadiene was used to generate a standard curve for quantification. Product identification was confirmed by GC-MS, and quantification was performed by GC-flame ionization detection. The reaction appeared to be in the linear range for the first 15 min. Following precipitation with 1% (v/v) trichloroacetic acid and washing of spheroblasts, protein quantification was performed using the Bio-Rad Protein Assay, which is based on the Bradford method, and 8 ± 4 mg protein was found to be present.
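The quantification described above combines two linear fits: a standard curve relating detector response to the amount of authentic product, and a time course whose slope over the linear range gives the initial velocity. The sketch below shows one way this could be computed. All numbers are hypothetical, and the use of a peak-area ratio to the internal standard is an assumption made for illustration.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical standard curve: nmol of authentic product versus
# GC-FID peak-area ratio (product / internal standard).
std_nmol = [0.0, 1.0, 2.0, 4.0]
std_ratio = [0.0, 0.5, 1.0, 2.0]
cal_slope, cal_intercept = linear_fit(std_nmol, std_ratio)


def ratio_to_nmol(ratio):
    """Convert a measured peak-area ratio to nmol using the standard curve."""
    return (ratio - cal_intercept) / cal_slope


# Hypothetical time course within the first 15 min (the linear range).
time_min = [0, 5, 10, 15]
sample_ratio = [0.0, 0.25, 0.5, 0.75]
product_nmol = [ratio_to_nmol(r) for r in sample_ratio]

v0, _ = linear_fit(time_min, product_nmol)
print(f"Initial velocity: {v0:.3f} nmol/min")
```

Repeating the time-course fit at each substrate concentration would give the set of initial velocities from which kinetic constants are then estimated.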

Metabolite Analysis
Rice plants (Oryza sativa ssp Nipponbare) were cultivated to the 6th leaf stage in growth chambers under cycles of 12 h light at 28°C and 12 h dark at 24°C. For induction, plants were treated with 1 mL 0.2% methyl jasmonate per plant and incubated for 72 h, while control plants were only treated with the 0.1% Tween 20 carrier solution. The leaves were then clipped off (2 g), frozen, and ground to powder in liquid nitrogen. This plant material was then extracted with 50 mL ethyl acetate by stirring overnight at room temperature. The mixture was clarified by centrifugation, and the ethyl acetate extract dried under nitrogen gas and redissolved in 0.1 mL 50% methanol/water, from which 10 μL was injected for LC-MS/MS analysis. The LC-MS/MS analysis was performed using an Agilent ZORBAX Eclipse XDB-C8 column (4.6 × 150 mm, 5 μm) and a 0.5 mL/min flow rate and the same elution program described above for purification. Purified compounds were easily detected and used to optimize selective MS/MS ion monitoring. Specifically, for C11α-hydroxy-ent-cassadiene, the base peak (m/z = 271 [MH−H₂O]⁺) from the molecular ion (m/z = 289 [MH]⁺) was selected for further fragmentation, with the resulting mass spectra compared with that found in the induced rice leaf extract at the same retention time.
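The two monitored ions can be checked against the expected elemental composition. The sketch below assumes the formula C20H32O for the hydroxylated ent-cassadiene (a C20H32 diterpene olefin plus one oxygen), which is inferred rather than stated in this section, and uses standard monoisotopic atomic masses.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646  # mass of H+ (hydrogen atom minus one electron)


def monoisotopic_mass(formula):
    """Sum atomic masses for a formula given as {element: count}."""
    return sum(MASS[element] * count for element, count in formula.items())


# Assumed formula: ent-cassadiene (C20H32) plus one oxygen atom.
hydroxy_cassadiene = {"C": 20, "H": 32, "O": 1}
water = {"H": 2, "O": 1}

neutral_mass = monoisotopic_mass(hydroxy_cassadiene)
mz_mh = neutral_mass + PROTON                           # [MH]+
mz_mh_minus_water = mz_mh - monoisotopic_mass(water)    # [MH-H2O]+

print(f"[MH]+     m/z = {mz_mh:.3f}")
print(f"[MH-H2O]+ m/z = {mz_mh_minus_water:.3f}")
```

Under this assumption the calculated values round to the nominal m/z of 289 and 271 given above, with the 18-unit difference corresponding to loss of water from the protonated alcohol.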

Phylogenetic Analysis
Molecular phylogenetic analysis was performed with the CLC Sequence Viewer (Version 6.2) software package (CLC Bio). The alignments were performed using the very accurate alignment option, with a gap open cost of 10, gap extension cost of 1, and the end gap cost set to cheap. The phylogenetic trees were built using the neighbor-joining algorithm with bootstrap analysis of 1000 replicates, leading to the values shown for each node in Figure 6. For clarity, in Figure 6C, the root was set to the node above the CYP701A6 outgroup.
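Bootstrap analysis of the kind mentioned above rests on resampling alignment columns with replacement and rebuilding a tree from each pseudo-alignment. The sketch below illustrates only the resampling step on a toy alignment; it is not the CLC Sequence Viewer implementation, the sequence names and residues are hypothetical, and the neighbor-joining tree construction is omitted.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement.

    `alignment` maps sequence names to equal-length aligned strings.
    Returns a new alignment of the same size built from random columns.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


# Toy alignment with hypothetical sequences (illustration only).
toy_alignment = {
    "seqA": "MAK-TSS",
    "seqB": "MAKKTSA",
    "seqC": "MSK-TGS",
}

rng = random.Random(0)  # fixed seed so the run is repeatable
replicates = [bootstrap_replicate(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), "pseudo-alignments generated")
```

The support value for a node is then the percentage of the 1000 replicate trees in which the same grouping is recovered.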

Supplemental Data
The following materials are available in the online version of this article.