- © 2013 American Society of Plant Biologists. All rights reserved.

## Abstract

Understanding metabolic acclimation of plants to challenging environmental conditions is essential for dissecting the role of metabolic pathways in growth and survival. As stresses involve simultaneous physiological alterations across all levels of cellular organization, a comprehensive characterization of the role of metabolic pathways in acclimation necessitates integration of genome-scale models with high-throughput data. Here, we present an integrative optimization-based approach, which, by coupling a plant metabolic network model and transcriptomics data, can predict the metabolic pathways affected in a single, carefully controlled experiment. Moreover, we propose three optimization-based indices that characterize different aspects of metabolic pathway behavior in the context of the entire metabolic network. We demonstrate that the proposed approach and indices facilitate quantitative comparisons and characterization of the plant metabolic response under eight different light and/or temperature conditions. The predictions of the metabolic functions involved in metabolic acclimation of *Arabidopsis thaliana* to the changing conditions are in line with experimental evidence and result in a hypothesis about the role of homocysteine-to-Cys interconversion and Asn biosynthesis. The approach can also be used to reveal the role of particular metabolic pathways in other scenarios, while taking into consideration the entirety of characterized plant metabolism.

## INTRODUCTION

As sessile organisms, plants have to adjust to ever-changing environmental conditions to ensure proper development and growth, leading to higher fitness (Guy, 1999; Dodd et al., 2005; Mittler, 2006; Athanasiou et al., 2010; Skirycz and Inzé, 2010; Zhen et al., 2011). If the changes in the environment are unfavorable, causing significant reduction in growth and yield, the conditions are regarded as stresses (Levitt, 1980; Vierling and Kimpel, 1992). Environmental stresses can be divided into abiotic stresses, including inadequate levels of physical determinants, such as temperature, light quality and intensity, water availability, and nutrient supply, and biotic stresses, caused by other organisms. Plants have developed two types of stress response mechanisms to deal with environmental stresses, namely, stress avoidance and stress tolerance (Levitt, 1980). Stress avoidance includes the collection of protective mechanisms that reduce the negative effects of the stress, leading to stable and inherited response patterns termed adaptation. Stress tolerance comprises the set of mechanisms allowing plants to adjust to an adverse condition, referred to as acclimation, which can be reversed if the stress does not persist. The effects of environmental stresses on plants are not confined to single subsystems or levels of cellular organization and involve simultaneous physiological alterations of gene expression (Chinnusamy et al., 2007; Kilian et al., 2007; Krouk et al., 2009), epigenetic regulation (Hauser et al., 2011; Zhu et al., 2012), signaling (Chen and Zhu, 2004; Hirayama and Shinozaki, 2010), as well as primary and secondary metabolism (Cook et al., 2004; Kaplan et al., 2004; Usadel et al., 2008; Caldana et al., 2011; Ramakrishna and Ravishankar, 2011).

Biotic and abiotic stresses account for tremendous losses to agronomic yield worldwide (Skirycz and Inzé, 2010). Given that plant growth and survival are directly linked to metabolism (Stitt et al., 2010), it follows that understanding of the mechanisms employed in metabolic acclimation can lead to improved crop breeding strategies. Prediction of metabolic behaviors involved in or reflecting acclimation is all the more important as this response has been linked to acquired stress tolerance, whereby exposure to moderate stresses can improve the overall stress tolerance of plants (Levitt, 1980; Thomashow, 1999; Gurley, 2000). The systems-wide effects of plant acclimation thus challenge the development of computational methods for in silico characterization of the implicated mechanisms based on genome-scale models. Such an approach has the promise not only of reducing the cost and duration of the labor-intensive experiments but also of providing an initial step toward designing metabolic engineering strategies for the mechanisms (i.e., reactions, pathways) predicted to be involved in acclimation.

Genome-scale models of metabolism in combination with optimization-based methods offer an approach for examining metabolic behaviors by predicting flux distributions which obey physico-chemical constraints (e.g., mass balance and thermodynamics) (Lewis et al., 2012). The surge of computational approaches to dissect the genotype-phenotype relationship has led to predictions of flux distributions in accordance with experimental measurements, particularly in the model bacterium *Escherichia coli* (Varma and Palsson, 1994). Research efforts have also resulted in high-quality reconstructions of genome-scale metabolic models for photosynthetic organisms and their organs, including *Arabidopsis thaliana* (Poolman et al., 2009; de Oliveira Dal’Molin et al., 2010), barley (*Hordeum vulgare*) seeds (Grafahrend-Belau et al., 2009), rape (*Brassica napus*) seeds (Hay and Schwender, 2011; Pilalis et al., 2011), maize (*Zea mays*) (Dal’Molin et al., 2010; Saha et al., 2011), as well as the green algae *Chlamydomonas reinhardtii* (Boyle and Morgan, 2009; Chang et al., 2011) and cyanobacteria (Knoop et al., 2010; Montagud et al., 2010). These models have recently been used to investigate different biological problems, from the relationship between retaining gene duplicates from whole-genome duplications and reactions of high metabolic flux (Bekaert et al., 2011) to the metabolic and evolutionary cost of herbivore defense mechanisms (Bekaert et al., 2012). A drawback of these models is that they only consider a single cell type, suspension cultures, or simply neglect the presence of multiple cell types in plant organs or tissues (Sweetlove and Ratcliffe, 2011). However, a recent genome-scale network reconstruction of *Arabidopsis* (Mintz-Oron et al., 2012) accounts for different tissues and covers primary as well as secondary metabolism. Such extension facilitates genome-scale model–based studies of the versatile roles of the components involved in secondary metabolism (Kliebenstein, 2013).

Feasible flux distributions in these (flux-centric) models are obtained under the assumption that the system operates toward optimizing a suitable, easy-to-measure objective function (e.g., growth rate/biomass production) (Lewis et al., 2012). Nevertheless, assessments of different systemic objectives in combination with fluxes, estimated from labeling studies, have recently suggested the hypothesis that physiological flux distributions may operate in the space defined by competing objectives (Schuetz et al., 2012). The issue of multiple feasible flux distributions is usually addressed either by integrating high-throughput data (e.g., transcriptomics and proteomics) (Shlomi et al., 2008; Colijn et al., 2009; Yizhak et al., 2010) or by considering subsequent optimization steps involving, for example, minimization of total flux or photon use (Holzhütter, 2004; Shastri and Morgan, 2005). Although transcriptomics data do not directly relate to fluxes, due to posttranscriptional and translational modifications, they can effectively constrain the flux space to reflect the physiological state better than the generic model-imposed bounds.

Here, we used a compartmentalized model of *Arabidopsis* (Mintz-Oron et al., 2012) in combination with time-resolved transcriptomics data (Caldana et al., 2011) capturing the acclimation to changing temperature and light conditions to predict metabolic functions (also referred to as pathways) involved in the stress response. Two challenges had to be addressed: (1) growth/biomass are not necessarily primary systemic objectives under changing environmental conditions; and (2) cellular processes downstream of transcription known to affect the distribution of enzymes in the metabolic network (referred to as enzymatic setup) need to be considered. To this end, instead of a single objective function, we considered a collection of metabolic functions (see Supplemental Data Set 1 online). We characterized their capacity profiles under different environmental conditions by integrating transcriptomics data, acting as proxy for the enzymatic setup, with an optimization-based method. The proposed computational method, in combination with an adequately designed null model, allowed us to determine which metabolic pathways, from both primary and secondary metabolism, were significantly affected in the experiments. The statistical analysis based on permutations of transcriptomics-based flux boundaries revealed the existence of two types of metabolic functions, referred to as metabolic modulators and metabolic sustainers. Furthermore, we proposed three optimization-based indices, termed priority, dependency, and efficiency, which characterize different aspects of the metabolic functions in the context of the entire metabolic network. We demonstrate that the proposed indices can readily be used to further dissect the relationship between sustainers and modulators under challenging environments. Furthermore, these indices allow the quantitative comparison of environmental conditions based on the behavior of metabolic functions and as such could be used as rational design strategies for metabolic engineering.

## RESULTS AND DISCUSSION

### Maximization of Metabolic Functions under a Null Model for Distribution of Transcripts

To explore which metabolic functions were significantly affected in response to changing temperature and light intensities, we first determined the capacity profile for each metabolic function with flux boundaries established from the transcriptomics data (see Methods). Our study relied on eight transcriptomics time series covering 23 time points, from both linear and logarithmic scales, capturing combinations of environmental conditions, with four light intensities (ranging from 0 to 300 μE m^{−2} s−^{1}) and three temperature regimes (4, 21, and 32°C) (Caldana et al., 2011). A schematic description of our approach is presented in Figure 1. We employed the E-Flux method (Colijn et al., 2009) to specify lower and upper reaction flux boundaries as a function of the transcriptomics data (see Methods). The resulting time- and condition-specific transcriptomics-constrained metabolic models approximate the enzymatic setup and, thus, the physiological state of the underlying metabolic network. We then investigated the capacity of 167 manually curated metabolic functions, covering pathways from primary and secondary metabolism (see Methods and Supplemental Data Set 1 online). The capacity of a metabolic function *I* at time *t* and condition *p* was obtained by solving the following linear program (LP), as in general flux balance analysis (FBA; see Methods):where *c _{I}* is the vector of coefficients for the metabolic function

*I*,

*v*is the vector of fluxes, and

*S*is the stoichiometric matrix of the network, while

*tr*and

_{min}*tr*are the transcriptomics-based lower and upper flux boundaries for each reaction, respectively. The capacity profile of a metabolic function over

_{max}*N*time points, , is then given by . By casting the problem in the optimization-based framework, this approach overcomes the observed drawback that flux solutions usually involve only reactions from primary metabolism, although the underlying model is genome-scale (Sweetlove and Ratcliffe, 2011). Furthermore, the issue of multiple flux optima does not arise in our setting, since we only considered the value of the LP rather than the associated flux distribution(s). Moreover, it should be noted that the capacity of a metabolic function cannot be regarded as a prediction of an actual flux distribution but rather reflects a maximum theoretical flux value the pathway may carry under the investigated condition.

To evaluate the significance of the average capacity of the metabolic function *I* over the time domain *T*, we compared it to the average capacity obtained from metabolic networks under a null model for distribution of transcripts. To this end, we first divided the reactions of the network into three classes: (1) exchange, (2) irreversible, and (3) reversible reactions. The exchange reactions delineate the boundaries of the model and determine which metabolites are exchanged (i.e., imported/exported) with the environment. The flux boundaries of these reactions remained unaltered to avoid random changes arising from reasons other than the changes in light and temperature. The transcriptomics-based flux boundaries of the remaining reactions were obtained by permutations within each one of the classes. Therefore, this formulation of the null model keeps the physiologically meaningful properties (model environment, thermodynamic constraints, and the enzymatic setup) unchanged. This randomization procedure was repeated *B* = 100 times for each model, allowing *z*-scores to be calculated (under the assumption that the maximum values of metabolic functions are normally distributed):where is the average capacity of the metabolic function *I* over the time domain *T* in condition *p* with transcriptomics-based flux boundaries without randomization, while and are the average capacity and standard deviation of over the time domain *T* in condition *p*, respectively, from *B* randomizations. Therefore, the integration of transcriptomics data and genome-scale models in combination with the proposed null model can be used to make statistically sound statements about the behavior of metabolic functions from a single experiment. This is possible due to the investigation of the behavior of metabolic functions, as groups of enzymatic reactions, in their larger network context. The statistical claims clearly pertain to the situations of random distribution of transcriptomics-based flux bounds and could therefore reveal principles of functional metabolic organization. However, this does not necessarily imply that the value of a function in an investigated condition is up/downregulated, as it is classically performed in the analysis of differential expression. Instead, differential behavior is considered with respect to the systemic distribution of gene expression: It implies that the metabolic function in a given setting operates differently from what is expected to occur by chance.

### Differential Metabolic Functions Act as Either Modulators or Sustainers

With the help of Equations 1 and 2, we defined two types of metabolic functions for each condition: (1) those whose average capacity over the considered time domain is significantly smaller (*z*-scores < −2) than the average capacity from the randomizations, referred to as modulators of the metabolic state, and (2) those whose average capacity over the considered time domain is significantly larger (*z*-scores > 2) than the average capacity from the randomizations, termed sustainers of the overall metabolic objective(s). Thus, a metabolic function that acts as a modulator has an average capacity that is smaller than the average obtained from a random distribution of transcripts. This implies that a metabolic modulator is expected to have tightly regulated fluxes to ensure particular concentrations of downstream metabolites, for example, those acting as signaling molecules. Therefore, the removal of the regulatory mechanisms, reflected in the global gene expression levels, would result in the possibility of carrying larger fluxes. By contrast, a metabolic function that acts as a sustainer has an average capacity that is greater than the average obtained from random distributions. This implies that a metabolic sustainer is expected to take full advantage of a realistic condition; thus, it is regulated toward higher maximum values than those expected to arise from a null model. The absolute value of a *z*-score > 2 corresponds to a P value < 0.02. As no statistically sound statement could be made for the remaining metabolic functions, which exhibited *z*-scores outside of the considered ranges, they were not further considered.

Our analysis resulted in the identification of 37 metabolic functions, from a total of 167 considered, with differential capacity (i.e., absolute value of *z*-score > 2) in at least one but no more than seven conditions, as shown in Table 1. Most importantly, each of the 37 differential metabolic functions acted either as sustainer or modulator across the considered conditions. Thus, if a function was found to be a modulator in one condition, it was not declared to be a sustainer under any of the other conditions. This highlights the strength of this approach for revealing principles of metabolic organization for the considered metabolic functions. A pathway map of these 37 functions is shown in Figure 2, and visualization of their condition-specific profiles is presented in Supplemental Figure 1 online.

### Condition-Independent Modulators and Sustainers

Of the aforementioned 37 metabolic functions, 12 exhibited differential capacity in seven of the investigated conditions with respect to their expected behavior in the null model. Therefore, the behavior of these metabolic functions pointed to a putative role in acclimation under any of the seven conditions. Six of these differential metabolic functions were predicted to be sustainers, with strong experimental evidence supporting their indispensability under the considered conditions. The methylerythritol 4-phosphate pathway synthesizes the precursors for an astonishing diversity of plastid isoprenoids, including the major photosynthetic pigments chlorophylls and carotenoids (Cordoba et al., 2009), which are necessary for sustaining photosynthesis under adverse conditions. The pathway is essential in plants despite the presence of the independent parallel cytosolic mevalonate pathway, which also generates the universal five-carbon precursors of isoprenoids (Bouvier et al., 2005). Similarly, urate degradation to allantoate, and its subsequent breakdown to ureidoglycolate and glyoxylate, are required for recycling the nitrogen resident in the purine ring in plants (Werner et al., 2008), likely via the process of photorespiration, which additionally aids in carbon recycling. In keeping with the assigned role of urate metabolism, ureide accumulation was implicated in abiotic stress responses, particularly in legumes, although the physiological role of these metabolic functions in *Arabidopsis* remains to be investigated in detail (Werner and Witte, 2011). Finally, the homocysteine-to-Cys interconversion was additionally identified as a sustainer. Although Cys-rich proteins are clearly important for phytochelation (Cobbett and Goldsbrough, 2002), the exact role of this pathway under the biological circumstances under study here remains unclear. This observation thus prompts an interesting hypothesis and suggests that further experimentation should be performed to determine the exact role of this pathway in the acclimation to temperature and/or light stress.

Moreover, the remaining six differential metabolic functions were predicted to be modulators. For instance, indole-3-acetyl-amino acid and indole-3-acetic acid, the most abundant, basic auxins naturally occurring in plants (Zhao et al., 2002), modulate the response to abiotic stresses in rice (*Oryza sativa*; Jain and Khurana, 2009). Abiotic stresses induce nitric oxide (NO) production in plants, which in turn alleviates the harmfulness of reactive oxygen species (ROS). It is well established that abiotic stress can result in oxidative stress through the formation of ROS, which are highly destructive to lipids, nucleic acids, and proteins (Mittler, 2002). In animals, NO is principally generated by three nitric oxide synthases (NOS) with different localizations and functions, all of which convert l-Arg to l-citrulline and NO (Qiao and Fan, 2008). A *NOS1* gene was isolated from *Arabidopsis*, encoding a protein localized in mitochondria and associated with NO synthesis (Guo et al., 2003; Guo and Crawford, 2005). The involvement of At*-NOS1* in NO biosynthesis and accumulation is somewhat controversial; At*-NOS1* might function either indirectly or as a regulator of this process (Zemojtel et al., 2006). This observation nonetheless provides support for the role of citrulline biosynthesis as a molecular modulator. Moreover, our findings prompted the hypothesis that Met degradation to both methanethiol and 2-oxbutanoate is involved in stress acclimation under the considered conditions, which could be used as a basis for future experiments under similar conditions.

### Condition-Specific Modulators and Sustainers

The next 17 metabolic functions, including seven sustainers and 10 modulators, showed differential capacity in at least three conditions. The sustainers included *myo*-inositol biosynthesis, β-carotene biosynthesis, β-Ala biosynthesis from two different precursors (i.e., spermidine and spermine), UDP-d-galacturonate biosynthesis, chorismate biosynthesis, and photorespiration. Raffinose family oligosaccharides, such as raffinose, stachyose, and verbascose, accumulate in leaves of plants experiencing cold and heat stresses (Castonguay and Nadeau, 1998; Cook et al., 2004; Kaplan et al., 2004; Usadel et al., 2008). Biosynthesis of raffinose family oligosaccharide is initiated by the formation of galactinol from *myo*-inositol and UDP-Gal by galactinol synthase. The metabolism of galactinol and UDP-Gal is closely linked to that of UDP-d-galacturonate biosynthesis given the close linkages between *myo*-inositol, Gal, and galacturonate metabolism (Gilbert et al., 2009). On the other hand, polyamines are small aliphatic molecules positively charged at cellular pH. Spermidine and spermine are the most common polyamines in land plants and are produced from either Orn or Arg (Krasensky and Jonak, 2012). Various stresses modulate levels of polyamines, and higher levels are positively correlated with stress tolerance (Usadel et al., 2008; Alcázar et al., 2011). Moreover, recent experiments suggest that Ser hydroxymethyltransferase, which is involved in the photorespiratory pathway, plays a critical role in alleviating the cell damage provoked by abiotic stresses, such as high light and salt (Moreno et al., 2005). This supports the notion that photorespiration participates in the dissipative mechanisms of plants to minimize production of ROS at the chloroplast so as to mitigate oxidative damage. β-Carotene and chorismate provide the building blocks for synthesis of abscisic acid (Tuteja, 2007) and salicylic acid (Strawn et al., 2007), respectively, both of which accumulate during cold and heat stress (Clarke et al., 2004; Scott et al., 2004; Tuteja, 2007). Moreover, our predictions suggest that chorismate biosynthesis does not show differential behavior under standard conditions when, indeed, salicylic acid is only present in trace amounts (Strawn et al., 2007).

The modulators consisted of trehalose, Orn, Asn, and Gly biosynthesis as well as the fermentation of pyruvate to form ethanol. In plants, trehalose is synthesized in a two-step process, whereby trehalose-6-phosphate synthase generates trehalose-6-phosphate from UDP-Glc and Glc-6-phosphate followed by dephosphorylation to trehalose by trehalose-6-phosphate phosphatase (Paul et al., 2008). Genetically engineered plants provide evidence that enhanced trehalose metabolism can positively regulate stress tolerance (Avonce et al., 2004; Suzuki et al., 2008; Li et al., 2011). Orn is the intermediate compound in the Arg biosynthetic pathway, where the pathway diverges to the production of compounds, such as Pro and polyamines, which serve osmoprotective functions and improve stress tolerance (Kalamaki et al., 2009). Glycinebetaine is synthesized either by oxidation of choline or by *N*-methylation of Gly (Chen and Murata, 2002). This compound accumulates in *Arabidopsis* expressing *N*-methyltransferase under abiotic stress conditions (Waditee et al., 2005). Glycinebetaine is thought to protect photosystem II, stabilize membranes, and mitigate oxidative damage (Chen and Murata, 2011). The ethanolic fermentation pathway branches off the main glycolytic pathway at pyruvate. In the first step, pyruvate is the substrate of pyruvate decarboxylase, yielding carbon dioxide and acetaldehyde. Subsequently, acetaldehyde is reduced to ethanol with the concomitant oxidation of NADH to NAD^{+} by alcohol dehydrogenase. Ethanolic fermentation has been implicated in stress signaling and response to abiotic stresses other than anoxia (Tadege et al., 1999), as demonstrated in seedlings of *Arabidopsis* (Kürsteiner et al., 2003). Although the experimental support suggests the induction of pyruvate decarboxylase under cold conditions in seedlings, our predictions suggest that this is not the case when cold is combined with darkness in a developed plant. Despite recent weak evidence suggesting that Asn metabolism may be affected under conditions of darkness/starvation (Baena-González and Sheen, 2008), our findings, like those for Cys above, indicate that direct experimentation, to elucidate its role under the conditions tested here, would likely prove highly informative.

### Optimization-Based Indices for Comparison of Metabolic Functions

The proposed null model allowed us to examine whether a single metabolic function may be involved in the metabolic adjustment to changing conditions. However, the predicted capacities of metabolic functions could not be readily compared due to the difference in sizes (i.e., number of reactions) of metabolic functions and the lack of a reference value. To enable a comparative analysis of metabolic capacities, we developed three optimization-based indices that quantify (1) priority (i.e., how optimal a given metabolic function is with respect to the total enzyme capacity); (2) dependency (i.e., how much the metabolic function is affected by the enzymatic setup of the network); and (3) efficiency (i.e., how efficiently the setup of the metabolic function is used). These optimization-based indices were derived by solving modifications of the constraint-based problem in Equation 1, generally captured by the mathematical program in Equation 3:where *I* is a metabolic function, *R* is a given subset of reactions, while v_{min} and v_{max} are the default model boundaries, taking values from the interval [−1000, 1000] or [0, 1000] if the reaction is reversible or irreversible, respectively. Like in FBA with crowding (Beg et al., 2007), *E* denotes the sum of all upper transcriptomics bounds and serves as a proxy for the total enzyme capacity. The time series of the approximated total enzyme capacity for all eight environmental conditions is shown in Figure 3. We observed that high temperature and darkness led to the strongest decrease in total enzyme capacity compared with the remaining conditions, which exhibited largely stable enzyme capacity over time. However, this does not imply that the ratio of the transcripts associated with a metabolic function to the total enzyme capacity decreases with time and thus does not affect the comparative analysis.

Under the assumption that the total enzyme capacity *E* can be freely distributed among the reactions of the network, we considered the following three scenarios for a given metabolic function *I*: (1) if *R* includes all reactions in the network, the program determines the capacity *C ^{I}* with default flux bounds for all reactions; (2) if

*R*= Ø, the program results in the capacity with transcriptomics-based bounds for all reactions; and (3) if

*R*=

*I*, the program yields the capacity in which all reactions, except those in the metabolic function

*I*, have transcriptomics-based bounds. The priority index for the metabolic function

*I*is then given by ; a value closer to one indicates that the metabolic function is used at its maximum, while a value closer to zero denotes that enzyme capacity is used for other metabolic functions. The dependency index is defined by the ratio and captures the effect of the rest of the metabolic network on the metabolic function. A value close to one indicates that either the transcriptomics-based flux bounds of the metabolic function under consideration are close to the default bounds or the network conditioning through the data integration largely determines the capacity of the metabolic function. By contrast, a low value indicates independence of the capacity of the considered metabolic function from the rest of the network and a strong dependence on the flux bounds of the metabolic function itself. Finally, the efficiency index , with , quantifies how well the given enzymatic setup of the metabolic function is used: A value close to one indicates more efficient usage, whereas a value close to zero denotes inefficiency. A schematic representation of the definition of the three different optimality indices is shown in Figure 4.

### Optimization-Based Indices Reveal Temporal Differences between Metabolic Functions

Optimization-based indices were determined for every time point and condition, resulting in condition-specific profiles for each index. To determine if a common trend could be observed in the time-resolved profiles of the indices over the 167 metabolic functions, we used the RV coefficient (Smilde et al., 2009). The RV coefficient relies on the similarity transformation of two compared matrices and can be used to determine (1) if the similarities in metabolic functions with respect to a considered index are preserved between conditions (termed metabolic RV) and (2) if there are similarities in the temporal trend of each optimality index between conditions (termed temporal RV; see Methods). As the average metabolic RV coefficients over all pairs of conditions for each of the indices are significant and larger than 0.94, similarities in metabolic functions (with respect to each index) are preserved between conditions. This highlights the fact that the clustering of metabolic functions, based on the indices, remains largely unchanged between the different treatment groups. However, this does not imply that the temporal trend of each index is preserved between conditions. On the contrary, we obtained values of 0.73, 0.74, and 0.77 for the average temporal RV coefficients over all pairs of conditions for the efficiency, dependency, and priority index, respectively. This indicates that changes in the investigated conditions are reflected in the temporal profiles of the proposed indices. A heat map representation of the RV coefficients for both measures and all three indices is provided in Figure 5.

### Optimization-Based Indices for Modulators and Sustainers

For the three optimization-based indices, we next determined whether their averages over the differential metabolic functions differ from the average over the remaining metabolic functions. In addition, we examined whether there was a difference in the temporal averages of the indices between metabolic modulators and metabolic sustainers under each condition. The 37 metabolic functions showing differential capacity have, on average, smaller priority but larger dependency than the remaining functions under all conditions (P value < 0.05, Welch *t* test). In other words, while, on average, the metabolic functions of differential capacity are not used to their maximum, their capacities are highly dependent on and regulated by the entire network. The latter further supports the assumption that the predicted metabolic functions are key processes involved in the metabolic response to the examined conditions. Moreover, metabolic modulators have, on average, smaller priority and dependency but larger efficiency than those of the metabolic sustainers under all considered conditions (P value < 0.05, Welch *t* test). This supports the assumption that modulators are characterized by tightly regulated fluxes that require more efficient usage of their enzymatic allocation.

### Comparative Analysis of Conditions Based on the Indices

Next, we investigated the temporal behavior for the three optimization-based indices in the considered scenarios. A heat map visualization of the time series of the three indices for the 37 differential metabolic functions is shown in Figure 6.

First, we considered the profiles for the priority index under dark conditions. Closer inspection separated two groups based on their priority under 32°C darkness. The first group showed a strong decrease in priority under this condition and the same, but weaker, behavior for 21°C darkness, while remaining relatively stable under all other conditions. These pathways comprised urate degradation to allantoin, methylerythritol phosphate pathway, homocysteine and Cys interconversion, allantoin degradation, *myo*-inositol biosynthesis, β-carotene biosynthesis, Orn, Asn, Gly, His, and Glu biosynthesis, photorespiration, and 1,4-dihydroxy-2-naphthoate, pantothenate, and choline biosynthesis. Although this group included both sustainers and modulators, it must be noted that eight out of nine differential metabolic functions were sustainers in more than five conditions.

The second group demonstrated the opposite behavior: an increase in priority for both 32°C and 21°C darkness. This group contained Met degradation, citrulline, threhalose, Gly, UDP-d-galacturonate biosynthesis, xylose degradation, and the oxidative branch of the phoshopentose pathway. Here, five of the six metabolic functions that showed differential behavior in more than five conditions were modulators. Furthermore, citrulline biosynthesis showed decrease in priority under both cold stress conditions.

The remaining 13 metabolic functions could be further divided into sub-groups. Pathways involved in auxin synthesis showed a general trend of an initial increase and subsequent decrease under all conditions. All of them were characterized as modulators. The pyruvate fermentation to ethanol pathway showed a cold stress–specific increase in priority. β-Ala biosynthesis pathways showed a darkness-specific decrease in priority, while all three pathways involved in glutathione biosynthesis and degradation exhibited a general increase under all conditions.

Regarding the dependency of the metabolic functions from the network, we found that the grouping based on the dependency index showed partial overlap with the grouping based on the priority index. Three metabolic functions, namely, trehalose, β-carotene biosynthesis, and photorespiration, showed a constant index, with a value of 1, under all conditions. This indicated that the flux through the respective functions is fully constrained by the remainder of the underlying network. Furthermore, any increase of the upper flux boundaries of these metabolic functions would not increase the flux through the respective pathway. Allantoin degradation had a constant value in all experiments; only under 21°C and high-light conditions did we observe a loss of dependency at the late stage of the experiment.

From the group of pathways that showed a decrease in priority under high temperature and darkness, we found eight pathways, namely, methylerythritol phosphate pathways, *myo*-inositol, Asn and His biosynthesis, and 1,4-dihydroxy-2-naphthoate, pantothenate, and choline biosynthesis, which exhibited similar behavior with respect to their dependency. This suggests that a downregulated pathway is likely to become less dependent on the enzymatic setup of the underlying network over time.

From the group of pathways that showed an increase under high temperature and darkness, we observed the same response pattern for the dependency of the Met degradation pathways, Gly biosynthesis pathways, and the oxidative branch to pentose pathway. Furthermore, the cold-specific response of pyruvate fermentation to ethanol was also conserved. Interestingly, Orn, ferulate, and sinaptate biosynthesis all showed a decrease in dependency under all of the environmental conditions studied.

Finally, when we examined the efficiency of the 37 differential metabolic functions, we determined that indole-3-acetyl-amino acid biosynthesis (from indole-3-acetate), Asn, ferulate, and sinaptate, and Glu biosynthesis exhibited a constant efficiency of 1, indicating an optimal usage of the pathways enzyme capacity. For the remaining metabolic functions, we again found that their response patterns were similar to the profiles from the priority index. The profiles based on the efficiency index contain two groups that overlap those based on the priority index. The first group of pathways included urate degradation to allantoin, methylerythritol phosphate pathway, homocysteine and Cys interconversion, allantoin degradation, Orn, Gly, and His biosynthesis, photorespiration, 1,4-dihydroxy-2-naphthoate, pantothenate, and choline biosynthesis, while the second group consisted of citrulline biosynthesis, trehalose biosynthesis, pyruvate fermentation to ethanol, *myo*-inositol biosynthesis, UDP-d-galacturonate biosynthesis, xylose degradation, chorismate biosynthesis, and the oxidative branch of the phoshopentose pathway. Furthermore, the β-Ala biosynthesis pathways and all three pathways involved in glutathione biosynthesis and degradation showed similar priority and efficiency patterns.

Nevertheless, some pathways showed efficiency profiles that were the opposite of their respective priority profiles. While Met degradation and Gly biosynthesis had an increasing priority under high temperature and darkness, they exhibited a lower efficiency toward the end of those time series, indicating that the newly obtained enzymatic setup might not have reached optimal efficiency.

Altogether, the priority, dependency, and efficiency indices under the eight conditions investigated provide the means for a systematic analysis of the relationships between the differential metabolic functions and the remainder of the underlying metabolic network. Moreover, they highlight the multifaceted behaviors of metabolic functions, namely, two functions that exhibit the same behavior regarding one index may strongly diverge with respect to another index. This would not have been revealed solely by investigating the similarities between transcript profiles using classical analyses.

### Future Challenges

Understanding the role of metabolism in acclimation to abiotic stresses is of great importance for genetic/metabolic engineering of plants with improved ability to cope with unfavorable conditions. Moreover, this type of crop improvement is of increasing importance as many regions of the world are experiencing environmental deterioration and increasing demands on arable land (e.g., due to biofuel production). To this end, it is imperative to quantify the extent to which the functioning of particular metabolic pathway(s) may be associated with acclimation not only to a single condition but also to multiple challenging environmental conditions. As metabolic pathways do not operate in isolation under any growth condition, it is necessary to characterize their role in acclimation in the appropriate cellular context. Here, we propose an optimization-based approach, which can be used to determine differential metabolic functions with respect to an adequate null model, based on coupling genome-scale metabolic networks with transcriptomics data. Most importantly, differential metabolic functions are determined based on a single experiment. By focusing on the metabolic functions exhibiting differential behavior under some but not all of the investigated conditions, one could then propose that these functions play a putative role in acclimation.

Interestingly, *z*-scores resulting from the statistical analysis based on the null model translated into two classes of metabolic functions, which we subsequently defined as sustainers and modulators, with respect to stress acclimation. The results obtained under the investigated stress conditions prompted the hypothesis that, due to the wiring of the metabolic network, each metabolic function plays the role of either sustainer or modulator irrespective of the conditions. In other words, if a metabolic function was determined to be a modulator (or a sustainer) under one condition, under the remaining conditions it either showed the same type of differential behavior or was declared nondifferential. Due to the number of conditions we investigated, this finding suggests an interesting aspect of functional metabolic organization as well as interaction between the cellular levels, namely, transcriptional regulation and metabolism, in plants.

To dissect the role of the differential metabolic functions in acclimation, we also proposed three optimization-based indices, which allow for comparative quantitative analyses across the conditions investigated. The analysis of the eight conditions supports the conclusion that complex and intricate interrelations exist between sustainers and modulators, which arise from the network context. These findings would not be possible from analysis of differential expression and clustering of genes, since in isolation they neglect the network context. We believe that our approach adds additional metabolism-related insights to the analysis of large-scale transcriptomics data sets with well-established computational methods (e.g., analyses of differential behavior, overrepresentation of functions, and similarity-based networks) (Bassel et al., 2011, 2012; Blair et al., 2012; Windram et al., 2012). A possible extension to this approach would be to investigate the coupling between data profiles of metabolites participating in the metabolic functions investigated and the findings with respect to metabolic sustainers/modulators together with their respective optimization-based indices. Finally, our network-driven optimization-based approach sets the stage for placing high-throughput data in the network context from which they arise to reveal the role of well-investigated pathways in the functioning of characterized metabolism as a whole.

## METHODS

### Data

Previously published data were employed in this analysis (Caldana et al., 2011). In brief, wild-type *Arabidopsis thaliana* Columbia-0 plants were grown in soil for 6 weeks and then transferred to the following conditions: 21°C, 150 μE; 21°C dark; 4°C 85 μE; 4°C dark; 32°C, 150 μE; 32°C dark; 21°C, 75 μE; and 21°C, 300 μE. For each of the eight different light and temperature conditions, rosettes were harvested and immediately frozen in liquid nitrogen.

### Model

We used a recent network reconstruction of *Arabidopsis* (Mintz-Oron et al., 2012), a compartmentalized model that includes 1078 metabolites and 1363 reactions, of which 1194 are compartmentalized and 772 are transporters. For 1065 of these reactions, the gene–reaction association is included in the model.

### FBA

FBA and its derivatives are state-of-the-art methods to model large-scale metabolic networks. FBA relies solely on the stoichiometry of the involved reactions and an assumed objective function (i.e., a relation of reaction fluxes); thus, this analysis does not require any kinetic information. The stoichiometric relationships of the reactions in a given network are captured in the stoichiometric matrix *S*. The *m* rows of *S* denote the *m* metabolites, and the *n* columns correspond to the *n* reactions in the network. An entry *x* at position (*y,z*) denotes that *x* molecules of metabolite *y* participate in reaction *z*. Positive or negative entries in the matrix denote that the corresponding metabolite is produced or consumed in the reaction, respectively. Under the assumption that the system is in a steady state, the fluxes through the system obey the following relation *S* ⋅ *v* = 0, where *v* is a vector of fluxes of length *n*. As the resulting system of linear equations is usually underdetermined, a flux distribution through the system is calculated under the assumption that the system optimizes a certain objective function (e.g., the maximization of growth for microorganism). Further constraints on the fluxes arise from the given flux boundaries, *v _{min}* and

*v*. For example, if a reaction is irreversible, its lower boundary is set to 0. Usually, all reactions of a model are assigned default boundaries of (−1000, 1000) and (0, 1000) for reversible and irreversible reactions, respectively. Together, these constraints result in the following optimization problem cast in what is referred to as an LP formulation:

_{max}### Transcriptomics Data Integration

The integration of transcriptomics data in the genome-scale metabolic network was performed by applying the E-Flux method (Colijn et al., 2009). This approach constrains the allowed flux boundaries of an FBA problem by integrating gene expression data in a continuous way (i.e., without imposing any thresholds). The underlying assumption of the method is that there is a correlation between the transcript abundance of a certain gene and the maximum flux through the respective reaction. In our approach, we used a linear mapping function between the expression data and the flux boundaries. This implies that if a certain gene is not expressed, the flux through the respective reaction is constrained to zero. On the other hand, if a gene has a measured expression level of value *b*, the flux through the respective reaction is allowed to take any value between [*–b*, *b*] and [0, *b*] for the reversible and irreversible cases, respectively. Furthermore, if a reaction of the network was annotated with several genes, we applied the following rules: we used the *min* operator if the genes are connected by an AND statement and the *max* operator if the genes are connected by an OR statement. Applying this approach, 46% of 1363 reactions could be bounded by transcriptomics data. Reactions for which no gene annotation exists or no gene expression values were measured were assigned the highest/lowest of all transcript-based flux boundaries for the upper/lower bounds.

### Metabolic Functions

We applied our approach to the list of metabolic functions shown in Supplemental Table 3 in Mintz-Oron et al. (2012). The pathways were manually curated, and pathways for which none of the included reactions could be assigned transcript-bounded flux boundaries were excluded from the analysis. A complete list of all metabolic pathways, as well as all reactions included, is presented in Supplemental Data Set 1 online.

### Statistical Analysis

To capture the congruence between (1) metabolic functions with respect to a considered index and (2) the temporal trend of each optimality index between conditions, the RV coefficient was determined for (1) (*X*, *Y*) and (2) (*X ^{T}*,

*Y*), where

^{T}*X*and

*Y*are the matrices of optimization-based indices for two given conditions, in which the rows denote the metabolic functions and the columns indicate the time points.

### Software

The data integration and optimization was performed using Matlab (The MathWorks) and the optimization platform Tomlab 7.6 (Holmström, 1998). All optimization problems were solved using the mathematical programming solver Gurobi (Gurobi Optimization). Computation of the RV coefficients was performed using the statistical computing language R (R Development Core Team, 2011) and the function “coeffRV” from the library “FactoMineR” (Mazet et al., 2013), a package for multivariate exploratory data analysis.

### Accession Numbers

Sequence data from this article can be found in the GenBank/EMBL databases under the following accession number: E-MTAB-375.

### Supplemental Data

The following materials are available in the online version of this article.

**Supplemental Figure 1.**Condition-Specific Profiles of the Metabolic Functions Exhibiting Differential Capacity in at Least One but No More Than Seven of the Investigated Conditions.**Supplemental Data Set 1.**List of 167 Metabolic Functions Investigated.

## AUTHOR CONTRIBUTIONS

N.T. designed research, performed research, contributed new computational methods, and analyzed data. C.C. analyzed data. S.G. contributed new computational methods. L.W. designed research. A.R.F. performed research and analyzed data. Z.N. designed research, performed research, and analyzed data. All authors contributed to writing the article.

## Footnotes

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantcell.org) is: Zoran Nikoloski (nikoloski{at}mpimp-golm.mpg.de).

↵[C] Some figures in this article are displayed in color online but in black and white in the print edition.

↵[W] Online version contains Web-only data.

### Glossary

- LP
- linear program
- NO
- nitric oxide
- ROS
- reactive oxygen species
- FBA
- flux balance analysis

- Received December 22, 2012.
- Revised March 18, 2013.
- Accepted April 5, 2013.
- Published April 23, 2013.