First published online October 24, 2002; 10.1105/tpc.005009
The Plant Cell, Vol. 14, 2863-2882,
November 2002, Copyright © 2002,
American Society of Plant Biologists
Redundant Proteolytic Mechanisms Process Seed Storage Proteins in the Absence of Seed-Type Members of the Vacuolar Processing Enzyme Family of Cysteine Proteases
Darren (Fred) Gruis,
David A. Selinger,
Jill M. Curran and
Rudolf Jung1
Pioneer Hi-Bred International, a DuPont Company, 7300 NW 62nd Avenue, Johnston, Iowa 50131-1004
1 To whom correspondence should be addressed. E-mail rudolf.jung{at}pioneer.com; fax 515-254-2619
Abstract
Seed-type vacuolar processing enzyme (VPE) activity is predicted to be essential for post-translational proteolysis of seed storage proteins in the protein storage vacuole of developing seeds. To test this hypothesis, we examined the protein profiles of developing and germinating seeds from Arabidopsis plants containing transposon-insertional knockout mutations in the genes that encode the two seed-type VPEs in Arabidopsis, βVPE, which was identified previously, and δVPE, which is described here. The effects of these mutations were studied individually in single mutants and together in a double mutant. Surprisingly, we found that most of the seed protein still was processed proteolytically in seed-type VPE mutants. The minor differences observed in polypeptide accumulation between wild-type and VPE mutant seeds were characterized using a two-dimensional gel/mass spectrometric analysis approach. The results showed increased amounts of propolypeptide forms of legumin-type globulins accumulating in mutant seeds. However, the majority of protein (>80%) still was processed to mature α- and β-chains, as observed in wild-type seeds. Furthermore, we identified several legumin-type globulin polypeptides, not corresponding to pro or mature forms, that increased in accumulation in VPE mutant seeds compared with wild-type seeds. Together, these results indicate the existence of both redundant and alternative processing activities in seeds. The latter was substantiated by N-terminal sequencing of a napin-type albumin protein, indicating cleavage consistent with previous in vitro studies using purified aspartic protease. Analysis of genome-wide transcript profiling data sets identified six protease genes (including an aspartic protease gene and βVPE) that shared spatial and temporal expression patterns with seed storage proteins.
From these results, we conclude that seed-type VPEs constitute merely one pathway for processing seed storage protein and that other proteolytic enzymes also can process storage proteins into chains capable of stable accumulation in mature seeds.
INTRODUCTION
Seed germination is a heterotrophic stage in the life cycle of plants during which the emerging seedling relies on stored materials for continued growth and development. One of these essential materials is reduced nitrogen, which is accumulated predominantly in the form of seed storage proteins. The process of storage protein deposition in maturing seeds and mobilization in germinating seeds appears to be highly specialized, involving dedicated compartments termed protein storage vacuoles (PSVs). Only a few proteins have evolved to survive the lytic environment of the PSV (which probably is the case with proteins that accumulate in any type of plant vacuole). This imposes restrictions on the use of vacuoles for the deposition of recombinant protein. Foreign (nonstorage) proteins and genetically modified storage proteins tend to be proteolytically unstable in vacuoles and consequently often fail to accumulate or are fragmented when expressed in plants (Hoffman et al., 1988 ; Jung et al., 1993 ; Kermode et al., 1995 ; Pueyo et al., 1995 ; Jung et al., 1998 ; Frigerio et al., 2000 ).
Arabidopsis seeds contain two predominant classes of seed storage protein: legumin-type globulins (also referred to as 12S globulin or cruciferin in Arabidopsis) (Sjodahl et al., 1991), and napin-type albumins (also referred to as 2S albumins or arabidin in Arabidopsis) (Krebbers et al., 1988; van der Klei et al., 1993). Characterization of these protein types in various dicotyledonous plants has shown that propolypeptides are targeted from the endoplasmic reticulum lumen to the PSV, where they are processed by vacuolar proteases into specific chains. Prolegumin-type globulins are cleaved at a conserved Asn-Gly peptide bond, converting the pro-form into two disulfide-linked mature polypeptides referred to as α- and β-chain (Muntz, 1998). Pronapin-type albumin proteolytic processing appears to be more complex, requiring removal of three propeptide regions (N-terminal processed fragment, internal processed fragment, and C-terminal processed fragment) to obtain the two disulfide-linked mature polypeptides referred to as large and small chains (Krebbers et al., 1988). Several of the napin-type albumin processing steps involve polypeptide cleavage at conserved Asn residues in the P1 position. Although the functional role of processing in the accumulation of napin-type albumin in mature seeds is not understood, studies of Vicia faba bean legumin have demonstrated postendoplasmic reticulum processing to be essential for trimers of prolegumin to obtain the final higher order molecular forms (hexamers) found in mature seeds (Jung et al., 1998).
Work to isolate an asparaginyl-specific endopeptidase responsible for processing storage proteins from maturing seeds of dicots (Hara-Nishimura et al., 1991 , 1993 ) resulted in the identification of a novel class of Cys proteases in plants that are related to a hemoglobinase from Schistosoma mansoni (C13; EC 3.4.22.34). Because of its specificity for Asn in the P1 position of the proteolytic cleavage site and the apparent property of processing prolegumin in vitro, it was called asparaginyl endopeptidase and legumain (Ishii, 1994 ; Hara-Nishimura, 1998 ), respectively, or more commonly vacuolar processing enzyme (VPE) (Hara-Nishimura et al., 1993 ). Further characterization of members of the VPE family has shown VPE to be localized to the vacuolar matrix (Hara-Nishimura et al., 1993 ), self-catalytically activated (Kuroyanagi et al., 2002 ), and capable of cleaving both legumin-type and napin-type storage proteins in vitro (Hara-Nishimura et al., 1991 ; Shimada et al., 1994 ; Hiraiwa et al., 1997 ; Yamada et al., 1999 ), all of which are hallmarks of a seed protein maturase. Because this was a clearly defined subfamily of proteases and because storage protein accumulation in dicot seed PSVs is highly specialized, it was hypothesized and widely accepted that VPE most likely was responsible for the specific polypeptide processing events of seed storage protein in the PSV. However, members of the VPE family also have been identified in a variety of other tissues throughout the plant, including leaves, roots, nucellar cell walls, and cotyledons of germinating seedlings (Becker et al., 1995 ; Kinoshita et al., 1995b ; Linnestad et al., 1998 ; Fischer et al., 2000 ; Hayashi et al., 2001 ). These VPE family members have been associated with functions other than seed storage protein processing, including tissue senescence and storage protein breakdown during germination, although these functions have not been demonstrated in vivo.
Phylogenetic examination of the VPE family identified two distinct subfamilies of VPEs (Kinoshita et al., 1995a). VPEs associated with seed protein maturation constitute one subfamily (referred to as seed-type VPEs), and VPEs associated with processes other than seed protein maturation constitute the second subfamily (referred to as vegetative-type VPEs). In Arabidopsis, the VPE family has been described as having three members (αVPE, βVPE, and γVPE) (Kinoshita et al., 1995a). Using promoter β-glucuronidase fusion constructs, βVPE was demonstrated to be expressed in seeds, whereas the expression of αVPE and γVPE appeared to be limited to vegetative tissues (root and leaf, respectively). These expression patterns are in accord with the phylogenetic grouping of these genes, identifying βVPE as a member of the seed-type VPE subfamily and αVPE and γVPE as members of the vegetative-type VPE subfamily (Kinoshita et al., 1999).
Confirmation in planta of seed- or vegetative-type VPE function has not been established. Muntz and Shutov (2002) attempted to suppress seed-type VPEs by expressing specific antisense DNA in transgenic tobacco seeds, but they detected no clear processing phenotype. However, the complexity of the VPE family in the tobacco genome is unknown; therefore, undetected, functionally redundant, seed-type VPEs could account for the result. An example of apparent redundancy is provided in this report, in which our examination of the Arabidopsis genome identified a fourth, seed-expressed, VPE family member. Another explanation might involve functionally redundant proteolytic enzymes other than VPE homologs. Although disputed in the literature, support for functionally redundant proteolytic enzymes (other than VPE homologs) has been shown in soybean, from which an activity capable of processing legumin was isolated from seeds. This proteolytic activity was associated with a protein of a markedly different molecular mass than VPEs (Scott et al., 1992 ). Additionally, aspartic protease activity purified from developing Brassica seeds is capable of cleaving Arabidopsis napin-type albumin propeptide fragments in vitro (D'Hondt et al., 1993 ), and also has been implicated in the processing of prolectin in barley seeds (Runeberg-Roos et al., 1994 ). Others have argued that aspartic protease activity may play a role only in trimming C-terminal propeptides (Hiraiwa et al., 1997 ).
Using the Arabidopsis model, for which a complete genomic sequence and several gene knockout populations are available, we attempted to confirm the specific role of seed-type VPEs in vivo. Here, we provide evidence that seed-type VPEs do not constitute an exclusive proteolytic activity for seed storage protein maturation. Furthermore, we identify other proteases potentially involved in the processing of storage protein into chains competent for stable accumulation in mature seeds.
RESULTS
Identification of δVPE, a Novel Seed-Type VPE Family Member of Arabidopsis
Sequences of seed-type and vegetative-type members of the VPE gene family were used to query the Arabidopsis genomic and EST databases as well as the DuPont-Pioneer Arabidopsis EST database to identify all members of the VPE gene family in Arabidopsis. In addition to the three previously described VPE genes (Kinoshita et al., 1995a ), this search located a fourth VPE family member on chromosome III, to which we assigned the name VPE (Figure 1A) . Several VPE cDNAs, identified in EST libraries derived from green silique tissue, were sequenced. The deduced polypeptide contains a putative N-terminal signal peptide and active site residues characteristic of VPE. VPE is 48% identical to VPE and 50% identical to VPE. It is 22% identical to putative Arabidopsis gpi-8, a member of the GPI-anchor transamidase family, a protein family that appears to be evolutionarily related to VPEs (Benghezal et al., 1996 ). Examination of the phylogenetic relationship of the VPE protein family indicated that VPE could not be assigned to either the seed- or the vegetative-type VPE subfamily but represents a novel branch of the VPE family in plants (Figure 1B).

Figure 1. Chromosomal Location, Phylogenetic Relationship, and Expression of Arabidopsis Seed-Type VPE Genes.
(A) Chromosomal location of each VPE gene and GPI-8. cM, centimorgan.
(B) Dendrogram of VPE protein sequences. The lightly shaded box highlights the vegetative-type (V-type) VPE gene branch, and the darkly shaded box highlights the seed-type (S-type) VPE gene branch. Bootstrapping values of all internal branches exceeded 95% unless indicated otherwise. Accession numbers for non-Arabidopsis proteins are given.
(C) Semiquantitative multiplexed RT-PCR showing the expression of VPE ( ) and cytosolic ribosomal protein S11 (R) as an internal standard in seeds at 14 DAA (14), seeds at 7 DAA (7), flower (F), leaf (L), stem (S), and root (R) of wild-type plants.
(D) Semiquantitative multiplexed RT-PCR showing the expression of VPE ( ), VPE ( ), and cytosolic ribosomal protein S11 (R) in developing seeds at 6 DAA (6), 9 DAA (9), 12 DAA (12), and 15 DAA (15) and mature seeds (M).
To determine if δVPE is a seed-type VPE gene, expression was examined using semiquantitative reverse transcriptase PCR (RT-PCR). As shown in Figure 1C, δVPE expression was detected primarily in developing seeds at 7 days after anthesis (DAA). A weaker signal was detected in flowers but not in the vegetative tissues examined. A more detailed examination of expression during seed development showed overlapping expression of δVPE (Figure 1D, right) with βVPE (Figure 1D, left), although peak steady state expression levels of δVPE occur earlier, during the cell division phase of seed development (Meinke, 1994). Notwithstanding the phylogenetic relationship, δVPE expression, occurring primarily in developing seeds, allows for the conclusion that δVPE is a seed-type VPE. Furthermore, because its expression overlaps with that of βVPE and because the half-life of δVPE is unknown, we concluded that it has the potential to act as a βVPE-redundant seed protein-processing enzyme.
Identification of dSpm Transposon Insertion Events in VPE and VPE
Arabidopsis plants devoid of functional VPE and VPE were isolated by identifying plants from the Sainsbury Laboratory dSpm mutant collection containing dSpm transposon insertions in the coding sequences of the corresponding genes (Tissier et al., 1999 ). A putative insertion allele of VPE ( VPE::dSpm1) was identified by querying the SINS database of the Sainsbury Laboratory (http://www.jic.bbsrc.ac.uk/Sainsbury-lab/jonathan-jones/SINS-database/database.html). The allele was predicted to be within mutant plant pool 1.14, which we confirmed by PCR screening and sequencing of DNA derived from pool 1.14. We found the dSpm element to be within the beginning of the second exon of VPE. This insertion disrupts the coding sequence near the N terminus of VPE, which was expected to cause the complete inactivation of gene function. Plants homozygous for the VPE::dSpm1 allele were identified using allele-specific PCR by the presence of the VPE::dSpm1 allele and the absence of the wild-type VPE allele. DNA gel blot analysis using sequence of the dSpm element as a probe indicated that VPE::dSpm1 plants have a single dSpm insertion at one locus within their genome (data not shown).
Two independent insertion alleles of VPE (VPE::dSpm1 and VPE::dSpm2) were identified in DNA isolated from mutant plant pools 5.41 and 1.24, respectively, by reverse screening using SLAT (see Methods) blots probed with VPE (Figure 2A). DNA sequences flanking the VPE insertion sites were cloned and sequenced to determine the location of the dSpm elements within VPE (Figure 2B). The allele VPE::dSpm1 (pool 5.41) contains a dSpm element in the first exon, whereas the allele VPE::dSpm2 (pool 1.24) contains a dSpm element in the third intron. The VPE::dSpm1 allele was selected for functional studies because of the higher probability for gene inactivation by exon transposon insertion. Plants homozygous for the VPE::dSpm1 allele were isolated as described for VPE::dSpm1.
To obtain double-mutant plants devoid of seed-type VPE activity, plants homozygous for VPE::dSpm1 and plants homozygous for VPE::dSpm1 were crossed. Homozygous double-mutant ( VPE/ VPE) plants were identified by PCR in F2 progeny after F1 self-pollination.
Homozygous mutants of VPE and VPE as well as double-homozygous VPE/ VPE mutant plants were examined for visible phenotypes under normal growth conditions. In all cases, no effects were observed on germination rate, vegetative growth, seed set, or macroscopic seed morphology (data not shown).
Confirmation of Gene Knockout by RT-PCR and Protein Immunoblot Analysis
Multiplexed RT-PCR was performed using VPE or VPE gene-specific primers downstream of the dSpm insertion site in combination with primers specific for a control transcript (cytosolic ribosomal protein S11). Primers were designed to flank intron segments such that genomic DNA contamination would be recognized as a significant shift in size of the PCR product. As shown in Figure 3A , fragments specific for the cytosolic ribosomal protein S11 transcript were produced in all multiplexed RT-PCR procedures independent of the genotype. By contrast, VPE- or VPE-specific RT-PCR fragments of the expected sizes were amplified from developing wild-type seeds but not from extracts of developing seeds from VPE::dSpm1 or VPE:: dSpm1 homozygous plants. The lack of detectable VPE-specific fragments even after 45 cycles of amplification was a strong indicator of the absence of functional VPE transcripts.
Figure 3B shows an immunoblot analysis of seed extracts from mature seeds of homozygous VPE::dSpm1 mutants and wild-type siblings probed with a monospecific VPE antibody. Two immunoreactive bands, one minor, with an apparent molecular mass of 50 kD, and one major, with an apparent molecular mass of 34 kD, were observed only in wild-type controls and not in homozygous VPE::dSpm1 plants. These bands are consistent with the reported sizes of VPE propolypeptides and mature VPE, respectively, from other plant species (Hara-Nishimura et al., 1991 ; Shimada et al., 1994 ). Together with the RT-PCR data, the immunoblot results strongly support the conclusion that the dSpm insertions in the VPE genes result in an effective knockout of the expression of these genes in seeds of the corresponding homozygous mutant plants.
One-Dimensional Gel Analysis of Polypeptide Content in Seeds during Development and Germination
The protein content of mature and germinating seeds was examined using Tris-Tricine SDS-PAGE. Profiles of seed proteins extracted from VPE::dSpm1 and the wild type are presented in Figure 4. Compared with the wild type, distinct (but minor) alterations in the protein profile were detected in VPE::dSpm1 seeds, and these changes were observed throughout storage protein accumulation (Figure 4B). The most obvious change related to polypeptides of approximate apparent molecular masses of 5, 13, 33, and 47 to 50 kD, which appear slightly increased, and a polypeptide of approximate apparent molecular mass of 32 kD, which appears slightly decreased (Figures 4A and 4B). This subtle protein phenotype segregated 1:3 in F2 progeny after F1 self-pollination of a cross of VPE::dSpm1 homozygous plants to the wild type, consistent with a recessive single-gene mutation (data not shown).
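A segregation claim of this kind is commonly checked with a chi-square goodness-of-fit test against the 3:1 expectation for a recessive single-gene trait. The following is a minimal sketch of that check, not the authors' analysis: the F2 counts are hypothetical (the underlying data are not shown in the text), and the function name is introduced here for illustration only.

```python
def chi_square_3_to_1(wild_type_like, mutant_like):
    """Chi-square statistic for a 3:1 (wild-type-like : mutant-like) expectation."""
    total = wild_type_like + mutant_like
    expected = (0.75 * total, 0.25 * total)
    observed = (wild_type_like, mutant_like)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Chi-square critical value at the 5% level with 1 degree of freedom.
CRITICAL_5_PERCENT_1_DF = 3.841

# Hypothetical F2 counts, for illustration only.
stat = chi_square_3_to_1(wild_type_like=72, mutant_like=28)
consistent_with_recessive = stat < CRITICAL_5_PERCENT_1_DF
```

A statistic below the critical value means the observed counts do not deviate significantly from one mutant-phenotype plant in four.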

Figure 4. One-Dimensional SDS-PAGE Analysis of Seed Protein Content during Development and Germination.
(A) Mature or germinating seed (24 h) protein from the wild type (C) or homozygous VPE::dSpm1 ( ), homozygous VPE::dSpm1 ( ), or homozygous VPE/ VPE double mutants ( / ). Arrowheads indicate observed differences. The approximate gel locations of legumin-type proglobulins (pro), - and -chains, and napin-type large and small chains are shown with brackets.
(B) Seed protein extracted from the wild type (C) or homozygous VPE::dSpm1 ( ) at 6, 9, 12, and 15 DAA and maturity (M). Arrowheads indicate observed differences.
(C) Protein extracted from germinating wild-type (C) or homozygous VPE/ VPE double-mutant ( / ) seeds. After cold treatment at 4°C for 48 h, seeds were incubated at 25°C for 0, 24, and 48 h.
Protein profiles of developing and mature seeds from VPE::dSpm1 and the wild type also were compared. However, in contrast to VPE::dSpm1 seeds, no differences between mutant and wild-type seeds were detected (only mature seeds are shown; Figure 4A). This result suggested no unique role for VPE in seed protein accumulation.
Given the overlapping gene expression patterns of VPE and VPE, it appeared possible that the genes could compensate partially or completely for each other's function during seed development. However, seed protein profiles of the VPE/ VPE double mutant appeared identical to profiles of the VPE::dSpm1 mutant alone (Figure 4A). The identical protein profiles suggest that VPE processing activity, if any, and VPE processing activity are not redundant (or additive) to each other.
One seed-type VPE, VsPB2, which is stored in protein bodies of embryonic axes and cotyledons of Vicia sativa, has been implicated in the mobilization of protein reserves during germination (Schlereth et al., 2001 ). Therefore, we compared seed protein profiles of germinating VPE/ VPE double-mutant seeds with those of the wild type (Figure 4C). We found that protein degradation appeared to progress similarly in germinating wild-type and mutant seeds. Hence, our data do not support a unique role for VPE or VPE in protein degradation during germination.
Comparative Two-Dimensional Gel/Mass Spectrometric Proteome Analysis of VPE::dSpm1 and Wild-Type Seed Protein
One-dimensional gel analysis of seed proteins indicated that removal of seed-type VPE activity did not result in major alterations in the accumulation of the seed protein. At least two distinct hypotheses could account for this result. The first hypothesis is that VPE processes only a small, specific subset of seed protein species. The second hypothesis is that VPE normally processes most seed storage protein species; however, other enzyme activities also are capable of compensating nearly entirely for this function. To address this issue, a comparative two-dimensional gel/mass spectrometric proteomic analysis, capable of specifically quantifying and identifying individual polypeptides, was performed using total protein extracts from mature VPE::dSpm1 seeds and wild-type seeds. Seeds from VPE::dSpm1 plants were not analyzed because no differences in protein pattern were observed in one-dimensional gel analysis of these seeds.
Representative gel images shown in Figure 5 illustrate several differences in seed protein content. The proteins were visualized using a fluorescent dye that binds noncovalently to the SDS moiety attached to each polypeptide (Page et al., 1999). This permitted quantification of protein spots (features) in a wide dynamic range. Based on the cumulative fluorescence intensity of all features, individual feature quantities were calculated as percentage values of the total protein quantity. The gel separation of each sample was replicated three times, enabling quantitative evaluation of digitally captured gel images. This analysis was performed using ROSETTA software (see Methods) and resulted in the detection of 1364 unique reproducible features at a level of >1 ng of protein. Eighty-four of these features were altered consistently between the mutant and wild-type samples by more than twofold (Figure 5C), illustrating the increased sampling depth of the two-dimensional gel analysis compared with the one-dimensional gel analysis.
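The quantification described above (each feature expressed as a percentage of the summed fluorescence, then compared between genotypes with a twofold cutoff) can be sketched as follows. This is an illustration under stated assumptions, not a description of the ROSETTA software: the function names and intensity values are hypothetical, and comparing replicate means is a simplification of "altered consistently".

```python
from statistics import mean


def percent_of_total(intensities):
    """Convert raw feature intensities to percentages of the gel total."""
    total = sum(intensities.values())
    return {fid: 100.0 * value / total for fid, value in intensities.items()}


def differential_features(mutant_gels, wild_type_gels, min_fold=2.0):
    """Return {feature_id: fold_change} for features whose mean percentage
    differs by more than min_fold between the two genotypes.

    Each argument is a list of {feature_id: intensity} dicts, one per
    replicate gel; all gels are assumed to share the same feature ids.
    """
    mutant = [percent_of_total(gel) for gel in mutant_gels]
    wild_type = [percent_of_total(gel) for gel in wild_type_gels]
    changed = {}
    for fid in mutant[0]:
        m = mean(gel[fid] for gel in mutant)
        w = mean(gel[fid] for gel in wild_type)
        low, high = min(m, w), max(m, w)
        fold = float("inf") if low == 0 else high / low
        if fold > min_fold:
            changed[fid] = fold
    return changed


# Hypothetical intensities for three features on three replicate gels each.
mutant_gels = [{"a": 10, "b": 30, "c": 60}] * 3
wild_type_gels = [{"a": 2, "b": 38, "c": 60}] * 3
changed = differential_features(mutant_gels, wild_type_gels)
```

Normalizing to the gel total before comparing makes the result independent of how much protein was loaded on each gel.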

Figure 6. Mass Spectrometric Identification of Protein Features.
Composite gel images showing MS-identified features with either a significant change in accumulation (A) or no significant change in accumulation (B). Labels indicate apparent molecular mass and pI of each feature and refer to results detailed in Tables 1 and 2. Arrowheads with asterisks are provided as an aid for comparison with Figure 5. Features identified as legumin-type globulin-related polypeptides are color coded as follows: red (PID 1628583), yellow (PID 9759513), green (PID 4204298), blue (PID 4204299), and black (PID 9279583). All other features are shown in gray.
Mass spectrometry (MS) identification of two groups of features was attempted. One group was composed of all differentially accumulating features (84 protein spots) detected (Figures 5C and 6A, Table 1). A second group was composed of a selection of abundant (>0.01% of total detected protein), nondifferentially accumulating (<2-fold change between samples) features (73 protein spots) (Figure 6B, Table 2) designed to identify storage protein accumulation unaffected by VPE knockout. A total of 85 protein spots were identified by MS, of which 34 were accumulated differentially and 51 were not changed significantly in accumulation between mutant and wild-type controls (Figure 6, Tables 1 and 2).
Table 1. Mass Spectrometric Identification of Seed Protein Features That Were Changed in Accumulation between Homozygous VPE::dSpm1 and Wild-Type Seeds
Table 2. Mass Spectrometric Identification of Select Seed Protein Features That Did Not Change in Accumulation Between Homozygous VPE::dSpm1 and Wild-Type Seeds
The data shown in Tables 1 and 2 support the concept of VPE being involved specifically with storage protein maturation in seeds. Almost all of the differences (33 of 34) were identified as seed storage proteins (legumin type, napin type, or vicilin type), whereas despite the identification of several nonstorage proteins (17 of 85 in the project), only one nonstorage protein was changed significantly in quantity (relative to total protein) in the mutant (Table 1, feature 11_5.2). The list of identified differences (Table 1) also supports VPE involvement in the deposition of several distinct seed storage proteins. Differentially deposited polypeptide forms of four Arabidopsis legumin-type globulin proteins, a vicilin-type protein, and one napin-type albumin protein were identified. Note that this analysis did not capture polypeptides of approximate apparent molecular mass of <10 kD, complicating the interpretation of changes of low molecular mass storage proteins (e.g., napin-type albumins). To assess net changes in seed storage protein accumulation between VPE mutants and wild-type controls, we considered mostly legumin-type globulins.
Predicted apparent molecular mass and pI (supported by direct MS identification data) were used to identify gel regions corresponding to pro and processed forms of legumin-type globulins. The circles in Figure 5 indicate a gel region of apparent molecular mass and pI predicted to correspond to propolypeptide forms of legumin-type globulin storage proteins. Indeed, all of the differentially accumulating polypeptides in this region were identified by MS as either legumin-type globulins (all four proteins) or a vicilin-like protein (Table 1, features 52_6.9 to 45_6.7, Figure 6A), indicating that knockout of VPE results in the accumulation of storage protein precursors. The sum of the protein detected in this region amounted to 3.7% ± 1.2% and 0.7% ± 0.05% of the total protein quantity detected in the mature seeds of VPE mutants and wild-type controls, respectively. Therefore, this area of the gel shows a threefold to sevenfold increase in detected prolegumin-type globulin protein in VPE mutants compared with the wild type. Similarly, the rectangles in Figure 5 indicate regions of the gel corresponding to the predicted apparent molecular mass and pI of the α- and β-chains of legumin-type globulin proteins. As expected, MS identified many of the predominant proteins in these gel regions as α- and β-chains (Tables 1 and 2, Figure 6). The sum of the protein quantities detected in the α-chain region averaged 36.6% ± 3.8% and 42.8% ± 3.4% for VPE mutant and wild-type controls, respectively. Therefore, our results showed a trend for the reduction of α-chain accumulation in VPE mutant seeds compared with wild-type seeds. The sum of the amount of protein detected in the β-chain region averaged 29.3% ± 3.6% and 29.5% ± 0.6% for the VPE mutant and the wild-type control, respectively, indicating no significant change in β-chain accumulation as defined by this type of analysis.
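The text does not state how the "threefold to sevenfold" range was derived. One reading that reproduces it, offered here as an assumption, is to divide the extremes of the mutant value (mean ± SD) by the opposite extremes of the wild-type value. The function name below is hypothetical.

```python
def fold_change_range(mutant_mean, mutant_sd, wild_type_mean, wild_type_sd):
    """Lowest and highest fold change obtainable from the two mean +/- SD intervals."""
    lowest = (mutant_mean - mutant_sd) / (wild_type_mean + wild_type_sd)
    highest = (mutant_mean + mutant_sd) / (wild_type_mean - wild_type_sd)
    return lowest, highest


# Proglobulin region: 3.7% +/- 1.2% (VPE mutant) versus 0.7% +/- 0.05% (wild type).
lowest, highest = fold_change_range(3.7, 1.2, 0.7, 0.05)
# lowest is about 3.3 and highest is about 7.5
```

The ratio of the two means alone (3.7 / 0.7) is roughly fivefold, which sits inside this range.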
In addition to the propolypeptide forms of legumin-type globulin, we detected several polypeptides of apparent molecular mass and pI not consistent with either pro-forms or mature α- and β-chains (Figure 6, Tables 1 and 2). This point is illustrated by examination of all of the polypeptides derived from a single legumin-type globulin protein (GenBank protein identification number [PID] 1628583). Twenty-seven gel features have been identified as containing derivatives of this protein (Tables 1 and 2). Twenty-three of these 27 gel features were identified as containing exclusively derivatives of this specific protein (the remaining 4 were identified as mixed features or spots, i.e., containing more than one polypeptide type). Each of these 23 gel features was detected in both VPE::dSpm1 and wild-type seeds, indicating that their presence was not unique to either background. One feature was identified by apparent molecular mass and abundance (>2% of total protein detected) as a mature α-chain (Table 2, feature 31_6.6), a second feature was identified as containing mature β-chain (Table 2, feature 19_9.3), and three features could be accounted for by apparent molecular mass as corresponding to propolypeptide forms (Table 1, features 48_7.8, 48_7.6, and 46_6.9). Of the remaining 18 features, 8 were altered significantly in accumulation (4 decreasing and 4 increasing in response to VPE knockout; Table 1) and 10 were not changed significantly in accumulation (Table 2). Assuming equal distribution of the polypeptide quantity of the β-chain gel feature (Table 2, feature 19_9.3), these 18 features accounted for approximately one-fourth of the total quantity of protein identified as PID 1628583 in wild-type seeds. Therefore, most of the protein (75%) could be assigned to the expected pro- and mature forms; however, the remainder appeared to be processed alternatively.
Although the specific post-translational modifications resulting in the observed shifts of apparent molecular mass and pI were not determined, it was observed that several of these polypeptides were reduced in apparent molecular mass compared with the highly abundant mature α- and β-chains. We interpreted this information as indicative of post-translational proteolysis and sought to compare wild-type and VPE mutant seeds. MS coverage data (location of mass matches of the individual polypeptide with respect to the conserved P1 Asn residue of the full-length protein), apparent molecular mass, and polypeptide quantity were used to define legumin-type globulin polypeptides as pro-forms, wild-type mature α- or β-forms, or alternatively processed forms. Alternatively processed forms were defined for each legumin-type globulin protein as having MS coverage data on both sides of the conserved P1 Asn but smaller in apparent molecular mass (>5-kD difference) than the precursor pro-form or having MS coverage data identifying a polypeptide as corresponding to either the α- or β-form but smaller in apparent molecular mass ( 0.5-kD difference) than the respective highly abundant mature form (>2% of total protein detected). Using the entire data set, we found that the MS-identified alternative polypeptide processing derivatives constituted 11.8% ± 1.4% and 7.4% ± 0.8% of the total detected protein in VPE::dSpm1 seeds and wild-type seeds, respectively. The quantities of the remaining mature polypeptide forms of the legumin-type globulins identified by MS were 28.6% ± 2.1% and 35.0% ± 0.8% of the total protein detected in VPE::dSpm1 seeds and wild-type seeds, respectively. Therefore, a significant shift from wild-type processed forms of legumin-type globulins to alternatively processed forms occurred in response to the removal of VPE activity in seeds.
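The classification rule in this paragraph can be written as a small decision function. The sketch below is a simplification under stated assumptions: the function and constant names are hypothetical, the comparison operator for the 0.5-kD margin is not legible in the text and "greater than" is assumed, and the abundance criterion (>2% of total protein) is taken to have been applied already when the caller chooses the mature-chain mass.

```python
PRO_MARGIN_KD = 5.0    # ">5-kD difference" relative to the pro-form
CHAIN_MARGIN_KD = 0.5  # 0.5-kD margin relative to the mature chain; ">" is assumed


def classify_feature(covers_both_sides, mass_kd, pro_mass_kd, mature_chain_mass_kd):
    """Label a legumin-type globulin gel feature as 'pro', 'mature', or 'alternative'.

    covers_both_sides: True if MS mass matches fall on both sides of the
        conserved P1 Asn of the full-length protein.
    mass_kd: apparent molecular mass of the feature.
    pro_mass_kd: apparent mass of the precursor pro-form.
    mature_chain_mass_kd: apparent mass of the abundant mature chain on the
        side of the P1 Asn that the feature's mass matches fall on.
    """
    if covers_both_sides:
        if pro_mass_kd - mass_kd > PRO_MARGIN_KD:
            return "alternative"
        return "pro"
    if mature_chain_mass_kd - mass_kd > CHAIN_MARGIN_KD:
        return "alternative"
    # Simplification: anything not clearly smaller than the mature chain
    # is treated as mature.
    return "mature"


# Hypothetical masses, loosely based on a 48-kD pro-form and a 31-kD mature chain.
labels = [
    classify_feature(True, 40.0, 48.0, 31.0),
    classify_feature(True, 46.0, 48.0, 31.0),
    classify_feature(False, 27.0, 48.0, 31.0),
    classify_feature(False, 31.0, 48.0, 31.0),
]
```

Summing the percentage quantities of all features labeled "alternative" for each genotype would give totals comparable to those reported in the paragraph above.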
N-Terminal Amino Acid Sequence Analysis
Although it is a powerful technology for protein identification, MS is less suited for the determination of N-terminal amino acid sequences. Therefore, Edman degradation (Matsudaira et al., 1993) was performed with two prominent polypeptides accumulated in VPE::dSpm1 seeds (Figure 7). The larger polypeptide (~50 kD) was identified as a legumin-type globulin (PID 4204298) with an N terminus corresponding to amino acids immediately downstream of the predicted signal sequence (Figure 7). Sequence and molecular mass identify this polypeptide as an unprocessed legumin-type proglobulin precursor. It appears to be identical to the two-dimensional gel feature 46_5.4 (Table 1), identified as the same specific legumin-type globulin protein.
The smaller polypeptide (13 kD) was identified as a product of a napin-type albumin (PID 166616) that is not yet processed within the internal processed fragment region to produce the typical large and small albumin subunits found in the wild type. It appears to be identical to two-dimensional gel feature 13_7.9 (Table 1). The N-terminal residues of this polypeptide did not correspond to the residues immediately downstream of the signal peptide sequence, as would be predicted for a propolypeptide precursor; instead, they corresponded to the central portion of the N-terminal processed fragment sequence (Figure 7).
Analysis of Massively Parallel Signature Sequencing Transcript Profiling Data
Results of the analysis of seed-type VPE knockout mutants suggested the existence of both alternative and redundant processing pathways for storage proteins in maturing seeds. Additionally, our results demonstrated that a protease gene (βVPE) with peak levels of transcription occurring during mid seed development (14 DAA) was involved in storage protein processing. The mid seed developmental window also was the stage at which large amounts of storage protein accumulated (Figure 4B). Therefore, to identify other protease genes with expression patterns that correlated with storage protein accumulation, we queried a database of precomputed gene expression clusters (GECs) derived from several Arabidopsis Lynx Therapeutics massively parallel signature sequencing (MPSS) high-resolution gene expression data sets (D.A. Selinger, F. Gruis, and R. Jung, unpublished data). A GEC is a group of genes that share a significantly similar spatial and temporal expression level relationship. Lynx Therapeutics MPSS gene expression data sets are essentially very deep EST sequencing experiments, each of which consists of 1 to 2 million sequences obtained from a single tissue source (Brenner et al., 2000). This depth of EST sequencing provides quantitative and comprehensive gene expression data. The GEC database used here included Lynx Therapeutics MPSS data sets from seeds during early development (cell division phase; 7 DAA), seeds during mid-development (storage protein accumulation phase; 14 DAA), germinating seeds (radicle protrusion phase), whole-plant seedlings (stage 1), leaves, whole roots, and inflorescences.
Queries of the GEC database (which contains 128 unique GECs) using conceptual Lynx Therapeutics MPSS ESTs (see Methods) for Arabidopsis napin-type albumins, legumin-type globulins, and vicilin-like gene sequences resulted in the identification of a single GEC (GEC98) that contained ESTs for 10 of the most abundant seed storage protein genes. Table 3 lists the locus names and accession numbers for the seed storage protein genes used to identify GEC98 as the relevant seed storage protein-specific GEC. Figure 8A illustrates the relative expression of the storage protein genes identified in GEC98 across all tissue types examined. In addition to the 10 unique ESTs corresponding to storage protein gene transcripts, GEC98 contained 230 other MPSS ESTs. To determine if any protease gene transcripts also were identified with GEC98, conceptual MPSS ESTs were determined for all 127 annotated proteases (classified as members of the Cys, Asp, Ser, and metallo-protease gene families) identified in the Arabidopsis genome (NCBI, February 5, 2002) and compared with the 230 MPSS ESTs in GEC98. As expected, the EST corresponding to the βVPE transcript (Table 3) was present in GEC98. Also as expected, we observed no MPSS EST corresponding to the δVPE transcript in GEC98. This finding is in accord with those of RT-PCR experiments (Figure 1) indicating that δVPE had an expression pattern unlike that of βVPE (maximum δVPE expression before maximum βVPE expression in developing seeds). In addition to βVPE, we identified five gene transcripts annotated as proteases (Table 3).
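The comparison of conceptual protease ESTs with the members of a cluster amounts to a membership lookup. The sketch below illustrates that step only; it is not the software used for the analysis, and the function name, the gene names, and the tag strings are hypothetical placeholders rather than real MPSS signatures.

```python
def proteases_in_cluster(cluster_tags, protease_tags):
    """Return the protease genes whose conceptual tag occurs among the cluster's tags.

    cluster_tags:  iterable of MPSS tags belonging to one expression cluster
    protease_tags: dict mapping a protease gene name to its conceptual tag
    """
    members = set(cluster_tags)  # set lookup keeps each membership test cheap
    return sorted(gene for gene, tag in protease_tags.items() if tag in members)


# Hypothetical placeholder data, for illustration only.
cluster = ["TAG_A", "TAG_B", "TAG_C"]
proteases = {"protease_1": "TAG_B", "protease_2": "TAG_X", "protease_3": "TAG_C"}
hits = proteases_in_cluster(cluster, proteases)
```

In this toy example, `hits` contains `protease_1` and `protease_3`, the two genes whose placeholder tags are members of the cluster.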

Figure 8. Expression Profiles of the Seed Storage Protein and Protease Genes Identified in the Seed Protein-Specific Expression Cluster (GEC98).
(A) The expression profile of each gene or of a storage protein gene group (average of four genes; see key) is plotted as a percentage of expression relative to maximum observed expression across all tissues sampled. Tissues sampled were inflorescences (F), seeds at 7 DAA (7), seeds at 14 DAA (14), germinating seeds (G), seedlings (S), leaves (L), and roots (R).
(B) Expression profiles plotted as a log scale of transcript levels in parts per million. Tissues and genes are the same as in (A).
Figure 8B shows MPSS EST levels in parts per million for the seed storage and protease genes identified in the seed storage protein-specific cluster (GEC98). Three of these protease genes were highly expressed: two belong to the papain-type subfamily of Cys proteases, and the other belongs to the aspartic protease family. A Ser carboxypeptidase homolog and the homolog of soybean metalloendoproteinase I also were found in GEC98, but they were expressed at lower levels based on MPSS EST counts (parts per million). The aspartic protease family member shares 80% identity with an aspartic protease cloned previously from Arabidopsis seeds (D'Hondt et al., 1997). The two Cys proteases are not related closely to each other, sharing only 30% identity. One (Cys protease 1) appears to be related more closely to RD21A (43% identity), whereas the other (Cys protease 2) appears to be related more closely to RD19A (51% identity) (Koizumi et al., 1993). The closest homolog of Cys protease 2 that we identified was a soybean protein annotated as the "40-kD seed maturation protein" (Nong et al., 1995), with which it shares 62% identity.
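The two scales used in Figure 8 (percentage of maximum expression across tissues, and log-scaled parts per million) can be derived from raw tag counts as sketched below. This is a minimal illustration and not the pipeline used to produce the figure: the function names are hypothetical, and the floor applied before taking the logarithm is an assumption introduced here so that tissues with zero counts remain plottable.

```python
import math


def to_ppm(count, total_signatures):
    """Convert a raw tag count to parts per million of all signatures in a library."""
    return count * 1_000_000 / total_signatures


def percent_of_max(profile):
    """Scale a {tissue: ppm} profile so that the highest tissue equals 100%."""
    peak = max(profile.values())
    if peak == 0:
        return {tissue: 0.0 for tissue in profile}
    return {tissue: 100.0 * value / peak for tissue, value in profile.items()}


def log10_ppm(profile, floor=1.0):
    """Log10-transform a {tissue: ppm} profile; 'floor' is an assumed lower bound."""
    return {tissue: math.log10(max(value, floor)) for tissue, value in profile.items()}
```

For example, 150 tags in a library of 1.5 million signatures correspond to 100 ppm, and a gene at 50 ppm in 7-DAA seeds and 200 ppm in 14-DAA seeds would be plotted at 25% and 100%, respectively, on the relative scale.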
DISCUSSION
The understanding and control of the cellular mechanisms responsible for protein accumulation and turnover in seeds are of importance to biotechnological efforts to produce either foreign proteins or engineered endogenous proteins in this tissue. It is believed that seed storage protein deposition in seed vacuoles follows a specific path involving restricted proteolytic processing by specialized proteases (Muntz, 1998 ). It has been suggested that proteases involved in seed storage protein processing also may determine the stability of recombinant proteins targeted to PSV in seeds (Jung et al., 1993 , 1998 ). Members of the VPE family of proteases were isolated from seeds and identified as capable of processing legumin-type globulin and napin-type albumin seed storage proteins in dicots on the basis of proteolytic processing assays in vitro (Hara-Nishimura et al., 1993 ). Additionally, seed-type VPEs have been localized to the PSV and shown to be capable of self-catalytic activation in acidic vacuoles (Hara-Nishimura et al., 1993 ; Kuroyanagi et al., 2002 ). Together, this evidence has led to the paradigm that seed-type VPEs are the enzymes responsible for processing seed storage proteins (Hara-Nishimura, 1998 ; Muntz and Shutov, 2002 ). However, to our knowledge, VPE function with regard to seed storage protein processing has never been demonstrated successfully or disproved directly in planta. This may be attributable to functional redundancy provided either from within the VPE gene family or by alternative proteases. To address these challenges, we used a plant model system with a fully sequenced genome (Arabidopsis) to test the validity of the model of VPE function in seeds.
Redundant Proteolytic Mechanisms Compensate Incompletely for the Loss of βVPE Expression in Developing Seeds
Arabidopsis was an especially attractive model system because the VPE family in this species is small, with only one seed-type VPE member (βVPE) described previously. Surprisingly, knockout of βVPE expression did not abolish seed storage protein processing of either the legumin-type globulins or the napin-type albumins. However, in mutant seeds, we observed a consistent accumulation of small amounts of novel storage protein polypeptide derivatives that apparently were absent or minor components in wild-type seeds. During germination, these novel polypeptides as well as the mature storage protein polypeptide chains appeared to be mobilized with kinetics similar to those of wild-type seeds. These observations support the conclusion that βVPE is involved in storage protein accumulation but not in storage protein mobilization. This finding is in agreement with the idea that the enzymes that process storage proteins during seed development are not the same enzymes that degrade them during germination (Muntz, 1996). Furthermore, these observations support the conclusion that the same cellular machinery responsible for mobilizing mature wild-type storage protein polypeptides in germinating seeds also is capable of mobilizing the accumulated novel polypeptides in βVPE knockout seeds.
A Second Seed-Expressed VPE Gene Identified in Arabidopsis (δVPE) Does Not Contribute Significantly to Seed Storage Protein Processing
We predicted that the redundancy of the VPE family might be the explanation for the normal processing of the majority of storage protein in βVPE::dSpm1 seeds. Examination of the genome led to the identification of a second seed-expressed VPE gene (δVPE). Further searches did not identify any additional ESTs or genomic sequence beyond those corresponding to the four VPE genes described, even at homology levels low enough to identify the more distantly related putative gpi-8 gene. Therefore, it is highly unlikely that other expressed VPE genes remain undiscovered in the Arabidopsis genome. Although our results showed that the δVPE transcript is expressed highly in developing seeds, its expression pattern differed from that of βVPE in that the highest levels of δVPE expression are found in seeds before βVPE expression. Nonetheless, δVPE expression clearly is seed preferred and partially overlaps that of βVPE, making it a good candidate for functional redundancy to βVPE. However, examination of mature seed protein profiles of δVPE knockout mutants and βVPE/δVPE double mutants failed to identify any role for δVPE in seed storage protein processing. It remains to be determined whether the increased sampling depth of two-dimensional gel analysis could reveal polypeptide changes in seeds of δVPE::dSpm1 that were not detected by one-dimensional gel analysis.
Recently, two closely related fruit/seed-expressed VPE family members, one from tomato and one from tobacco, were described (Fischer et al., 2000; Muntz and Shutov, 2002). As with δVPE, it has not been possible to clearly assign these family members to either the seed-type or the vegetative-type phylogenetic group, which suggests the possibility of a third functional VPE subclass. To date, no direct function has been identified for this group of VPE genes, yet it has been proposed that the tobacco VPE may have a functional role in early embryogenesis (Fischer et al., 2000; Muntz and Shutov, 2002). Here, we present no evidence that δVPE knockout results in any embryogenic abnormalities, nor do we present evidence of disadvantageous consequences when δVPE knockout plants are propagated under typical laboratory conditions. Therefore, the function of this VPE family member remains unclear.
A second alternative functional role for VPE family members (VsPB2 and proteinase B from V. sativa) relates to storage protein mobilization in germinating seeds (Becker et al., 1995; Schlereth et al., 2001). However, such a function for δVPE appears unlikely, because we did not find a reduction of storage protein mobilization in the βVPE/δVPE double null mutant during germination.
Specific and Direct Involvement of βVPE in Storage Protein Processing
The unexpected result that deleting VPE in Arabidopsis resulted in only a minor change in seed protein accumulation led us to hypothesize that VPE may act on a specific subset of storage proteins or is active in only a specific vacuolar subcompartment within the cells (Jiang et al., 2001 ). A comparative two-dimensional gel/MS analysis to determine all protein accumulation changes between wild-type seeds and VPE::dSpm1 seeds demonstrated that a wide variety of storage proteins are altered in accumulation (both legumin-type globulins and napin-type albumins), whereas only one nonstorage seed protein was determined to be altered in accumulation. These results support the hypothesis of the specific and direct involvement of VPE in storage protein processing. Additionally, we determined that the absence of VPE significantly increased the accumulation of pro-forms of legumin-type globulins and a napin-type albumin in mature seeds. The increase in pro-forms of legumin-type globulins was accompanied by a trend toward reduced accumulation of mature -chains in the VPE mutant. It is important to note that the amount of protein accumulated as a prolegumin-type globulin is only a fraction ( 5 to 10%) of the total amount of legumin-type globulin in the seed; therefore, it is difficult to detect the corresponding reduction in mature - and -chains in the VPE mutant with a high degree of confidence. Nevertheless, these observations are consistent with the conclusion that VPE does perform a function in processing seed storage protein from pro-forms to mature chains, as predicted previously (Kinoshita et al., 1999 ). However, in contrast to previous predictions, we found that other redundant proteolytic mechanisms are capable of compensating for this function almost completely.
Alternative Proteolytic Processing of Arabidopsis Seed Proteins
In addition to detecting pro-forms of legumin-type globulins and mature legumin-type globulin α- and β-chains, our comparative proteomic analysis also detected a number of novel polypeptide-processing derivatives of legumin-type globulins. For each legumin-type globulin gene, several polypeptides accumulated in seeds, many of them differing in apparent molecular mass and pI from the predominantly accumulating α- and β-chains. The decrease in molecular mass of the derivative polypeptides compared with their respective mature α- and β-chains is evidence that the post-translational modifications involved most likely are the result of alternative or additional proteolytic mechanisms. We observed these legumin-type globulin polypeptide derivatives in both wild-type and βVPE::dSpm1 seeds. Individually, these polypeptide derivatives do not appear to account for a large amount of the legumin-type globulin protein from any single gene; however, together, they can constitute as much as one-fourth of the protein from a single gene. We also found that the amounts of a subset of these derivatives, but not all of them, were altered in response to βVPE knockout. Although the accumulated quantity of some derivatives decreased and others increased in βVPE::dSpm1 seeds, a significant overall increase in this pool of legumin-type globulin was observed in βVPE::dSpm1 seeds. We interpret this observation as support for the conclusion that alternative proteolytic processing pathways exist in developing seeds. We hypothesize that more legumin-type globulin protein is processed by these alternative pathways in response to an increase in the amount of pro-forms of storage protein in the PSVs of seeds.
There are few benchmarks in the literature for comparison of our comparative two-dimensional gel/MS analysis of mature Arabidopsis seeds. However, a recent proteomic analysis of Arabidopsis also described alternative forms of legumin-type globulins accumulating in mature wild-type seeds (Gallardo et al., 2001). Although the molecular mass and pI associated with protein features (spots) in the two-dimensional separations in that analysis are technically difficult to compare with the features observed in our analysis, a similar origin of these features is likely. Gallardo's group suggested that these polypeptides were products of proteolysis by proteases that are active predominantly during germination. We find this interpretation tenuous for the following reasons. First, the majority of legumin-type globulin polypeptides identified by Gallardo et al. (2001) (18 features) remained constant in relative quantity and did not increase or decrease during germination, as might be expected if they were transient products of legumin-type globulin mobilization. Second, although Gallardo and colleagues observed an increase in quantity of six legumin-type globulin polypeptides during germination, they also observed a decrease in four legumin-type globulin polypeptides corresponding in molecular mass to pro-forms of the protein. Therefore, the increase in processed polypeptide forms might be accounted for by processing of the pro-forms remaining in mature seeds. Finally, in our experiments using one-dimensional gels, we observed no transient accumulation in wild-type germinating seeds of polypeptides similar in molecular mass to those that accumulated in developing βVPE::dSpm1 seeds (Figure 4). To the contrary, polypeptides found to accumulate in βVPE::dSpm1 seeds diminished during germination at rates similar to mature α- and β-chains.
Aspartic Protease Activity Likely Is Involved in Napin-Type Albumin Processing
In addition to the comparative two-dimensional gel/MS analysis, Edman degradation sequencing was used to determine the N-terminal amino acids of two polypeptides that accumulated in βVPE::dSpm1 seeds. One polypeptide (50 kD) was identified by sequence and apparent molecular mass as a prolegumin-type globulin and provided further supporting evidence that pro-forms can and do accumulate in mature seeds in response to βVPE knockout. A second polypeptide (13 kD) was identified by its sequence and apparent molecular mass as a napin-type albumin that was not cleaved in the internal processed fragment region; cleavage at this site produces the typical mature large and small chains. Interestingly, the N-terminal sequence of this napin-type polypeptide was determined to start immediately after a Phe residue in the middle of the N-terminal processed fragment region. This site is downstream of the predicted signal sequence cleavage site and is consistent in sequence context with cleavage by a member of the aspartic protease gene family (D'Hondt et al., 1993). D'Hondt et al. (1993) demonstrated that aspartic protease activity purified from developing canola seed cleaved a short synthetic polypeptide derived from the middle of the N-terminal processed fragment region in vitro. Here, we report that the cleavage site we observed in planta is the same cleavage site observed previously in the in vitro processing assays, providing evidence of a functional role for aspartic protease activity in seed storage protein processing in planta.
The in planta identification of processed storage proteins with cleavage sites consistent with aspartic protease activity in plants that lack VPE implies a different conclusion than that derived from in vitro processing assays by Hiraiwa et al. (1997). In that report, the authors discounted a possible primary role of aspartic proteases in the processing of seed storage protein and proposed that VPE activity was the primary or initial activity involved in the processing of storage protein. It was further concluded that aspartic protease activity was capable of performing only a secondary function of trimming the C-terminal propeptides after cleavage by VPE into chains. However, contrary to these conclusions, our results suggest a discrete and independent in vivo role (with no primary VPE activity needed) for aspartic proteases in the processing of napin-type albumins. Additionally, we observed aspartic protease-cleaved napin-type albumin at all investigated stages during storage protein accumulation (Figure 4B), suggesting the colocalization of aspartic protease and napin-type albumins in planta throughout seed development.
Candidate Genes for Redundant and Alternative Seed Protein Processing Enzymes in Arabidopsis
Protease genes that might constitute redundant or alternative proteolytic pathways for seed storage proteins were identified by the fact that they shared a spatial and temporal expression relationship with seed storage protein genes. Queries of a gene expression cluster database derived from several Arabidopsis Lynx Therapeutics MPSS expression profiling data sets (D.A. Selinger, F. Gruis, and R. Jung, unpublished data) identified a single cluster of 240 genes that shared a similar expression profile. In addition to containing transcripts from 10 of the most abundantly expressed storage protein genes, the cluster also contained 6 annotated protease genes. The discovery that one of the six genes was VPE and that a second gene was a highly expressed putative aspartic protease gene supports the validity of this approach. This information enables testing of the hypothesis that the specific aspartic protease gene identified encodes the protease involved in the alternative cleavage of napin-type albumins by virtue of gene suppression in a VPE null background. Of the remaining four genes identified in the analysis, two are highly expressed. One is a papain-type Cys proteinase that has homology with a soybean gene identified in developing seeds during the storage protein deposition stage (Nong et al., 1995 ). The second also is a Cys proteinase with greatest homology with RD21A, a drought-inducible Cys proteinase of Arabidopsis (Koizumi et al., 1993 ). Interestingly, a recent report indicated that RD21A and VPE both were accumulated specifically in "endoplasmic reticulum bodies" in leaf epidermal cells of Arabidopsis (Hayashi et al., 2001 ). Although it is unclear whether these two proteases can act synergistically, it is attractive to hypothesize that their colocalization and coexpression are of functional consequence. 
The RD21A homolog we identified is coexpressed in a transient pattern similar to VPE; therefore, we predict that it also may have a potential functional synergy. The remaining two genes expressed at a much lower level include a putative metalloprotease and a putative Ser protease that has homology with carboxypeptidases. Activities related to similar genes have been described in developing buckwheat seeds. Although the precise function of these genes is not known, some indicators suggest a role during germination (Dunaevsky et al., 1989 ; Belozersky et al., 1990 ).
In conclusion, our results show that seed-type VPEs are not the only proteases involved in seed storage protein maturation. Furthermore, we provide evidence to support the existence of both redundant and alternative proteolytic pathways in developing Arabidopsis seeds. Finally, we provide a list of potential protease genes that might be involved, with the caveat that the analysis described here may not identify protease genes upregulated specifically in response to VPE knockout. Using the current model system, it is expected that we will be able to delineate the entire proteolytic process of seed storage protein maturation, commencing with the suppression or knockout of these seed-expressed protease genes.