Plant Cell
Research Article

Cyclic Peptides Arising by Evolutionary Parallelism via Asparaginyl-Endopeptidase–Mediated Biosynthesis

Joshua S. Mylne, Lai Yue Chan, Aurelie H. Chanson, Norelle L. Daly, Hanno Schaefer, Timothy L. Bailey, Philip Nguyencong, Laura Cascales, David J. Craik
Joshua S. Mylne
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
Lai Yue Chan
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
Aurelie H. Chanson
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
Norelle L. Daly
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
Hanno Schaefer
Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138
Timothy L. Bailey
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
Philip Nguyencong
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
Laura Cascales
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
David J. Craik
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland 4072, Australia
  • For correspondence: d.craik@imb.uq.edu.au

Published July 2012. DOI: https://doi.org/10.1105/tpc.112.099085

  • © 2012 American Society of Plant Biologists. All rights reserved.

Abstract

The cyclic miniprotein Momordica cochinchinensis Trypsin Inhibitor II (MCoTI-II) (34 amino acids) is a potent trypsin inhibitor (TI) and a favored scaffold for drug design. We have cloned the corresponding genes and determined that each precursor protein contains a tandem series of cyclic TIs terminating with the more commonly known, and potentially ancestral, acyclic TI. Expression of the precursor protein in Arabidopsis thaliana showed that production of the cyclic TIs, but not the terminal acyclic TI, depends on asparaginyl endopeptidase (AEP) for maturation. The nature of their repetitive sequences and the almost identical structures of emerging TIs suggest these cyclic peptides evolved by internal gene amplification associated with recruitment of AEP for processing between domain repeats. This is the third example of similar AEP-mediated processing of a class of cyclic peptides from unrelated precursor proteins in phylogenetically distant plant families. This suggests that production of cyclic peptides in angiosperms has evolved in parallel using AEP as a constraining evolutionary channel. We believe this is evolutionary evidence that, in addition to its known roles in proteolysis, AEP is especially suited to performing protein cyclization.

INTRODUCTION

Novel proteins arise typically by two processes: divergence of gene duplicates and recombination events that alter DNA sequences for existing proteins (Schmidt and Davies, 2007). One recombination-mediated evolutionary event is the internal expansion of genes to create a string of repeated protein domains. Although 10 to 20% of eukaryotic proteins contain domain repeats, their genesis is not well understood (Marcotte et al., 1999; Björklund et al., 2006; Schmidt and Davies, 2007). Nevertheless, several features seem to be shared by repetitive protein domains. Bioinformatic analyses have shown that protein repeats are often short, and after the first repeat is made, the addition of further repeats is more likely (Marcotte et al., 1999). Also, several domains can duplicate in any one instance, and protein expansion with repeating domains is believed to occur from within the middle of a series of repeats (Björklund et al., 2006).

Knottins are a class of peptides containing a disulfide bond knot that has two adjacent disulfide bonds threaded by a third (Chiche et al., 2004). This knot motif, also referred to as an inhibitor cystine knot, is common in a range of unrelated proteins, including protease inhibitors from plants, animal toxins, antimicrobial peptides, as well as some examples from signaling peptides, such as the agouti peptides (Craik et al., 2001). Many knottins have been isolated from seeds of the angiosperm family Cucurbitaceae (squash family), but only two genes for their precursor proteins have been described in the literature (Ling et al., 1993). These are short and encode an endoplasmic reticulum (ER) signal, a small prodomain, and end with the mature peptide domain (Ling et al., 1993).

The seeds of Momordica cochinchinensis (Cucurbitaceae), a tropical liana also called spiny bitter gourd or gac, contain a typical knottin called Momordica cochinchinensis Trypsin Inhibitor III (MCoTI-III) as well as two unusual knottins, MCoTI-I and MCoTI-II, that are macrocyclic and therefore lack carboxyl and amino termini (Hernandez et al., 2000; Felizmenio-Quimio et al., 2001; Heitz et al., 2001). MCoTI-II has been heavily studied; it is a potent trypsin inhibitor (TI) (Avrutina et al., 2005), is exceptionally stable in plasma assays (Avrutina et al., 2005), is capable of penetrating cells (Greenwood et al., 2007), is structurally able to tolerate substitutions and additions to its loops (Craik et al., 2010), and also can be produced using inteins in Escherichia coli (Camarero et al., 2007) or chemoenzymatically using trypsin columns (Thongyoo et al., 2007). These properties make MCoTI-II ideal as a scaffold that can be used to stabilize peptide drugs. It is also an excellent starting point for the design of novel protease inhibitors (Thongyoo et al., 2009).

Despite having quite different amino acid sequences, the knotted cyclic structure of MCoTI-I and MCoTI-II has caused them to be grouped with members of a large class of plant cyclic peptides called cyclotides (Craik et al., 1999; Göransson et al., 1999) found in the violet (Violaceae), coffee (Rubiaceae), bean (Fabaceae), and petunia (Solanaceae) families. The cyclotide founding member was kalata B1 (Gran, 1970), and these kalata-type cyclic peptides typically comprise 28 to 37 amino acids, have three disulfide bonds, and are typically encoded by dedicated precursor proteins that have an ER signal, a prodomain, one to three mature peptide domains, and end with a hydrophobic tail (Jennings et al., 2001; Dutton et al., 2004; Nguyen et al., 2011; Poth et al., 2011).

MCoTI-I and MCoTI-II also have some similarities to the PawS-derived cyclic peptides of sunflower (Helianthus annuus). In sunflowers, there are two unusual preproalbumins, PawS1 and PawS2, that, in addition to producing napin-like seed storage albumin, also release 12 to 14 amino acid cyclic peptides with a single disulfide bond (Mylne et al., 2011).

For their maturation, both the kalata-type and PawS-derived peptide classes require asparaginyl endopeptidase (AEP, also known as vacuolar processing enzyme or legumain), an endoprotease that cleaves on the C-terminal side of Asn, and to a lesser extent Asp (Hara-Nishimura et al., 1991; Hiraiwa et al., 1999). In Arabidopsis thaliana, there are four AEPs (At2g25940, At1g62710, At4g32940, and At3g20210) that are genetically redundant. In the aep quadruple null, the major phenotype is misprocessing of seed storage proteins, a consequence of the failed cleavage at seed storage protein Asn-Pro bonds (Shimada et al., 2003; Gruis et al., 2004). Our studies of sunflower PawS1 processing in an Arabidopsis aep quadruple null showed that AEP is required for proper cleavage at specific Asn as well as Asp residues (Mylne et al., 2011) and that the Asp residue of the peptide within PawS1 can only be ligated to a Gly residue. How MCoTI-II is biosynthesized is unknown, but its similarity to kalata-type cyclic peptides and the presence of an Asp-Gly and an Asn-Gly in its cyclic sequence suggest two possible ligation points if it shares the same AEP-dependent mechanism of maturation.

Here we describe the discovery of three genes from M. cochinchinensis that all encode the cyclic knottin MCoTI-II. Instead of one peptide per precursor protein, as known for previously characterized acyclic knottins, these genes seem to have undergone extensive internal expansion, with the largest gene encoding eight repeating protein units in tandem, consisting of seven cyclic knottins and a terminal acyclic knottin. We purified and sequenced three novel cyclic knottins and two (acyclic) knottins encoded by these precursor genes. The acyclic knottins are N-terminally pyrolated and have a standard carboxylic acid group at the C terminus. The cyclic knottins are backbone cyclic, meaning they have no amino or carboxyl termini. The similarity of the acyclic knottin and cyclic knottin units suggests that these unusual cyclic knottins in M. cochinchinensis evolved by internal expansion from their terminal knottin. It is unusual that cyclic and acyclic topologies of otherwise structurally identical peptides are formed from a single polypeptide precursor; thus, we named the precursor genes Two Inhibitor Peptide TOPologies (TIPTOP).

RESULTS

Cloning of the Concatemeric TIPTOP Genes

To understand the biosynthetic route for cyclic knottins in M. cochinchinensis, we designed degenerate primers to amplify the gene encoding MCoTI-II. Based on initial sequence data, we designed specific primers for 5′ and 3′ RACE that amplified several fragments with shared 5′ and 3′ untranslated region (UTR) sequences. PCR with primers for these regions amplified three products. Each product was a full-length transcript encoding an ER signal sequence, and each ended with an acyclic knottin domain. However, between these two domains was a repeating series of MCoTI-II or similar peptides flanked by prosequences 16 residues in length (Figure 1A; see Supplemental Figure 1 online). We named them TIPTOP (for Two Inhibitor Peptide TOPologies), because they contained TI-like peptides of cyclic and acyclic topologies. PCR with genomic DNA demonstrated that the TIPTOP1 to TIPTOP3 genes lack introns (see Supplemental Figure 2 online). The same primers amplified TIPTOP2 from genomic DNA of Momordica sphaeroidea, predicted by molecular dating analyses to have shared a common ancestor with M. cochinchinensis ∼3.94 million years ago (Schaefer and Renner, 2010).

Figure 1.

TIPTOP Proteins from M. cochinchinensis.

(A) Schematic of a typical squash TI precursor from the towel gourd (Luffa cylindrica) TGTI-II precursor compared with TIPTOP1-3 from gac (M. cochinchinensis). aa, amino acids.

(B) Predicted sequence of a cyclic knottin domain from TIPTOP3 and its flanks.

(C) Region containing terminal knottin TI-6 from TIPTOP3.

(D) BOXSHADE alignment of six single-unit knottin precursors with TIPTOP1-3. See Methods for full details of the sources for the six single-unit knottin precursor sequences. This alignment shows that all nine predicted proteins share an ER signal sequence (brown, predicted cleavages shown with arrowheads), a conserved prodomain of unknown function (orange), and the terminal knottin domain (green, known cleavages shown with arrowheads).

In TIPTOP proteins, the cyclic knottins are almost identical to the acyclic knottins, but each cyclic knottin is typically flanked by Gly-Gly-Val on its proto–N terminus and Ser-Gly-Ser-Asp on its proto–C terminus (Figures 1B and 1C). This indicates that the cyclization reaction occurs between Gly at the proto–N terminus and Asp at the proto–C terminus, similar to the kalata-type and PawS-derived classes of cyclic peptide. Each cyclic knottin domain is preceded by an Asn residue (Figure 1B); therefore, the cyclic knottins probably use AEP to release both prototermini, as is the case with PawS1 (Mylne et al., 2011).

The TIPTOP proteins were compared with the known knottin precursors (Figure 1D). Apart from their repetitive nature, they are otherwise similar. All share an ER signal and a C-terminal knottin domain as well as a conserved region of unknown function that follows the ER signal (consensus of IELISDG). This suggests that the ancestral TIPTOP protein might have been a single-TI–domain protein that underwent a series of internal gene duplications. Using TIPTOP primers in RACE and with M. cochinchinensis genomic DNA, we could not amplify the gene encoding any single-TI–domain protein similar to those found in other Cucurbitaceae species (see Supplemental Figure 2 online).

Gene Confirmation through Peptide Analysis

To confirm these gene sequences, we examined the peptides deriving from them (Figure 2A). For simplicity, we hereafter use Arabic numerals instead of Roman numerals and drop the MCo prefix (e.g., MCoTI-II becomes TI-2). The number of encoded peptide domains differs for each gene (Figure 1A). TIPTOP1 encodes an array of five peptides, starting with TI-1, followed by three TI-2 peptides, and terminating with acyclic TI-5. TIPTOP2 encodes six peptides, starting with TI-1, followed by three TI-2 units and TI-4, and terminating with acyclic TI-5. TIPTOP3 is the largest of the genes and encodes eight peptides, starting with TI-8, followed by five TI-2 peptides and TI-7, and terminating with acyclic TI-6.
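The domain orders described above can be written down as a small data structure, which makes the counts easy to check. The following is only an illustrative sketch: the names `TIPTOP_DOMAINS`, `ACYCLIC`, and `summarize` are hypothetical, and the domain order is transcribed from the text (with TIPTOP3 arranged 8-2-2-2-2-2-7-6).

```python
from collections import Counter

# Domain order per precursor, transcribed from the text (names are illustrative).
TIPTOP_DOMAINS = {
    "TIPTOP1": ["TI-1", "TI-2", "TI-2", "TI-2", "TI-5"],
    "TIPTOP2": ["TI-1", "TI-2", "TI-2", "TI-2", "TI-4", "TI-5"],
    "TIPTOP3": ["TI-8", "TI-2", "TI-2", "TI-2", "TI-2", "TI-2", "TI-7", "TI-6"],
}

# Terminal acyclic knottins, as stated in the text.
ACYCLIC = {"TI-5", "TI-6"}


def summarize(domains):
    """Return (number of domains, count per peptide, last domain is acyclic?)."""
    return len(domains), Counter(domains), domains[-1] in ACYCLIC


for gene, domains in TIPTOP_DOMAINS.items():
    total, counts, ends_acyclic = summarize(domains)
    print(gene, total, dict(counts), ends_acyclic)
```

In this representation, every precursor ends with an acyclic knottin and carries its TI-2 copies in the middle of the array.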

Figure 2.

TIPTOP-Derived Knottins.

(A) Sequence alignment of new TI sequences (TI-4 to TI-8) with known sequences (TI-1 to TI-3). Asterisks indicate an acyclic peptide. TI-1, TI-2, TI-4, TI-7, and TI-8 are backbone cyclic. The disulfide connectivity determined by NMR for TI-2 and TI-5 is shown below the alignment.

(B) LC-MS profile of M. cochinchinensis peptide extract with sequenced knottins marked. We suspect that the two peaks marked with asterisks contain isomers with the same mass as nearby peaks but with isoaspartyl bonds, a feature of these cyclic knottins observed during their initial characterization (Hernandez et al., 2000).

In addition to encoding TI-1 and TI-2, the TIPTOP genes encode three novel cyclic knottins and two novel knottins (Figure 2A). To confirm their presence in vivo, we examined crude seed extract by liquid chromatography–mass spectrometry (LC-MS). Masses that support all five novel peptides were observed (Figure 2B). To detect the terminating knottins TI-5 and TI-6, we had to adjust the mass for an N-terminal pyroglutamic acid also seen previously for TI-3 (Hernandez et al., 2000). The previously reported TI-3 was not present in any of the TIPTOP genes. TI-5 is identical to TI-3 except for Gly-25, which in TI-3 is Glu. We could purify four of the five novel peptides and sequenced each by tandem mass spectrometry (MS/MS) after reduction, alkylation, and digestion with endoproteinase Glu-C, trypsin, or chymotrypsin (see Supplemental Figures 3 to 6 and Supplemental Table 1 online). We confirmed these four to have the expected sequence; TI-7 abundance was too low to be purified. It is worth noting that, for the new cyclic knottins TI-4 and TI-8, we obtained MS/MS fragmentation that crosses the Asp-Gly ligation point (see Supplemental Figures 3D and 6D online). Only subtle amino acid differences differentiate the new knottins from the three previously known (Hernandez et al., 2000).

Structural Analysis of the Knottin TI-5

Structures are available for TI-2 (1IB9, 1HA9), but not its acyclic relatives. The sequences of TI-5 and TI-2 are similar (Figure 3A). We used NMR to determine the three-dimensional structure of TI-5 and found that, like TI-2, TI-5 is characterized by a cystine knot arrangement of the disulfide bonds and a β-sheet motif as the main element of secondary structure (Figure 3B). An analysis of the three-dimensional structures is provided in Supplemental Table 2 online. TI-5 and TI-2 overlay with a root-mean-square deviation of 0.55 Å over the backbone of residues 2 to 30, highlighting their similarity. The N-terminal residues of TI-5 preceding the first Cys are slightly disordered, consistent with the disorder in loop 6 of TI-2 (Felizmenio-Quimio et al., 2001). This structural analysis of TI-5 confirmed that both peptides are almost identical apart from the Ser-Gly-Ser-Asp joining sequence, suggesting that closing the ring does not require structural rearrangement of the residues flanking the termini of the knottin.
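For readers unfamiliar with the measure, a root-mean-square deviation over matched backbone atoms can be computed as in the minimal sketch below. It assumes the two coordinate sets are already superposed and listed in the same atom order; it does not perform the superposition itself and is not the software used for the analysis reported here. The function name `rmsd` and the example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the structures are already superposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy coordinates (in angstroms), not taken from any deposited structure.
structure_a = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
structure_b = [(3.0, 4.0, 0.0), (4.0, 5.0, 1.0)]
print(rmsd(structure_a, structure_b))
```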

Figure 3.

Sequence and Structural Alignment of Cyclic and Acyclic Knottins.

(A) Sequence of cyclic TI-2 (magenta) and acyclic TI-5 (green). The ligation point in TI-2 is marked with an arrow. Residues that differ between TI-2 and TI-5 are marked with asterisks. The three disulfide linkages are shown by connecting bars.

(B) Overlay of structural models for TI-2 (magenta, 1HA9) and the newly acquired TI-5 structure (green, 2LJS). Aside from the obvious ligating Ser-Gly-Ser-Asp sequence in TI-2, TI-5 has a root mean square deviation of 0.55 Å over the backbone residues 2 to 30. The ligation point in TI-2 is marked on its structure with an arrow. The N-terminal pyrol ring of TI-5 is displayed (pGlu1).

Repeating Units Are Consistent with Gene Expansion

The most striking feature of the TIPTOP genes is their repetitive structure. Previous work has indicated that internal gene amplification originates from within the middle of repeating arrays (Björklund et al., 2006). This also seems to be the case with TIPTOP genes, which encode repeating TI-2 units in their midsections with variation in cyclic knottin sequences seen at the front and rear of the concatemers. We aligned the repeating domains of all three TIPTOP genes (see Supplemental Figure 7A online) and performed a phylogenetic analysis (see Supplemental Figure 7B and Supplemental Data Set 1 online). This analysis reinforced observations made at the peptide-coding level, with the first and last peptide domain of each gene clustering separately in an unrooted phylogram. The remaining repeat domains were so similar that their phylogenetic relationship could not be resolved (see Supplemental Figure 7B online). The similarity between the first peptide domain of TIPTOP1 to TIPTOP3 and the last peptide domain of TIPTOP1 to TIPTOP3 implies that the three TIPTOP genes are the product of gene duplication after our proposed expansion. We also used nucleotide and protein alignments of the repeating protein domains (see Supplemental Figures 1B and 1C online) to compare neighboring repeats (see Supplemental Figures 7C and 7D online). This approach reinforced that, for the first and last repeat, there is a strong link between genes, whereas for most of the middle domains, they are so similar to each other that it is impossible to establish any relationships.

The alignment of TIPTOP proteins with single-knottin precursors (Figure 1D) identifies the region where TIPTOP proteins diverge sharply in their predicted protein sequence. Specifically, this is between the conserved consensus sequence IELLISDG and the first Cys residue of the terminal knottin domain. The domains encoding cyclic and acyclic knottins are similar, and this is mirrored in the DNA that encodes them. The 82-base DNA sequence preceding the stop codon in each TIPTOP gene shares 86.5% identity with the repeat unit before it. This region encodes 27 amino acids that share 92.6% identity with the cyclic peptide unit before it. This similarity strongly suggests the internal units encoding the cyclic knottins arose from a duplicated segment of DNA encoding the single ancestral knottin. For this to be the case, in addition to duplication, the duplicated segment would have had to acquire subsequent deletion or frame shifting of the DNA sequence encoding GVYDEKQRA of the knottin as well as addition of a sequence that encodes SGSD-ALEG at the C terminus of each cyclic knottin. Therefore, there is no single, simple duplication and adjacent placement that can be proposed to explain the appearance of cyclic knottins. The simplest scenario is that a DNA segment encoding the terminal knottin was tandemly duplicated and subjected to rearrangements and sequence additions around the flanks, and then this first cyclic knottin underwent additional internal duplication events.
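Percentage identities of the kind quoted above are simple position-by-position comparisons of two aligned sequences. The sketch below shows the arithmetic under the assumption that the sequences are already aligned, of equal length, and without gaps; `percent_identity` is a hypothetical helper and the example strings are placeholders, not TIPTOP sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Placeholder 27-residue strings that differ at two positions.
unit_a = "A" * 27
unit_b = "A" * 25 + "GG"
print(round(percent_identity(unit_a, unit_b), 1))
```

With 27 aligned residues, 25 identical positions correspond to the 92.6% identity reported for the terminal knottin and the cyclic unit before it.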

TIPTOP Repeat Units Contain Low Folding Free Energy Sequences

Several repeat proteins from other species have been found to contain palindromic elements (Ogata et al., 2000; Claverie and Ogata, 2003). To ascertain whether TIPTOP repeats contained palindromes, we analyzed the DNA sequence of TIPTOP2 using MEME (Bailey and Elkan, 1994), a program designed to detect repetitive DNA motifs and palindromes. Querying MEME to detect palindromes using default settings, we found that the top-scoring palindrome was a series of related 50-mer repeats within all the cyclic knottin domains as well as the terminal acyclic knottin domain (Figure 4A). The 50-mers are each imperfect palindromes that encode four Cys residues from loop 2 until the sixth Cys (Figures 4B and 4C). To establish whether these imperfect palindromes were statistically significant, we generated a histogram of folding free energies from randomly chosen 50-mer open reading frame (ORF) segments of Arabidopsis (see Supplemental Figure 8B online). We compared the folding free energy of the TIPTOP2 50-mers to the histogram (see Supplemental Figure 8C online). The folding free energies appeared low, but because we had forced MEME to find palindromes, which are expected to have low folding free energies, the P values had to be adjusted for multiple testing. After adjustment, the folding free energies of these 50-mers were not statistically significant.
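The idea of an imperfect DNA palindrome can be illustrated by comparing a sequence with its own reverse complement, as is done for the 50-mers in Figure 4B. The sketch below is a simple illustration of that comparison and is not the algorithm MEME uses; the function names and the short example sequences are hypothetical.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def palindrome_score(seq):
    """Fraction of positions at which a sequence matches its reverse complement.

    A perfect palindrome scores 1.0; imperfect palindromes score lower.
    """
    rev_comp = reverse_complement(seq)
    return sum(a == b for a, b in zip(seq, rev_comp)) / len(seq)


print(palindrome_score("GAATTC"))  # perfect palindrome (toy example)
print(palindrome_score("GAATTG"))  # imperfect palindrome (toy example)
```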

Figure 4.

DNA Repeat Analysis Using TIPTOP2 Reveals Imperfect Palindromes and Significant Low Energy Folding Structures.

(A) Reconstruction of the MEME raw output, showing the location of 50 base repeats found on the sense (+) and minus (−) strand. Below the MEME output, the equivalent regions are marked on the TIPTOP2 protein schematic.

(B) Comparison of the regions encoding this sequence with their reverse complements (rev.c.) revealed that the sequences are highly complementary.

(C) Putative DNA hairpins in TI-1 and TI-5 generated using CONTRAfold.

(D) Reconstruction of the MEME raw output when the maximum repeat size was uncapped, showing the location of high-scoring 113-mer repeats.

(E) A histogram displaying the empirical probability of 113-mers with a given folding free energy, estimated using 30,190 random 113-mers extracted from unspliced Arabidopsis mRNA.

(F) A summary of the folding free energies and statistical significance of the 113-mer repeats shown in (D). The P value is the area under the histogram corresponding to free energies greater than or equal to the given value; the adjusted P value (adj p) is adjusted for six multiple tests, because we chose the repeat copy with the lowest free energy.

The palindromic nature of these repeats (albeit not statistically significant) encouraged us to explore whether TIPTOP2 contained low folding free energy sequences. This time, we queried MEME to find repeats without forcing it to find palindromes. MEME returned a series of 113-mers that overlapped with the previous series of palindromic 50-mers (Figure 4D). To establish whether these repeat sequences were significant in structure, we compared their folding free energies to a probability histogram of folding free energies for random 113-mers (Figure 4E). The N-terminal 113-mer repeat has a predicted folding free energy that is statistically significant (P < 0.05) compared with random ORFs of similar length from Arabidopsis (Figure 4F). The folding free energies of the other five repeats are not significant at the 0.05 level; however, there is a clear trend of increasing predicted folding free energy from the N to C terminus.
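The comparison against a histogram of random sequences amounts to an empirical P value followed by a multiple-testing adjustment. The sketch below illustrates the arithmetic only. It makes two assumptions that are not stated in the text: that the adjustment is a Bonferroni correction, and that the tail to integrate over is chosen by the caller (which tail is appropriate depends on the sign convention used for the free energies). The function names and the null values are hypothetical.

```python
def empirical_p_value(null_values, observed, tail="lower"):
    """Fraction of null values at least as extreme as the observed value.

    tail="lower" counts null values <= observed; tail="upper" counts values >= observed.
    """
    if tail == "lower":
        hits = sum(1 for v in null_values if v <= observed)
    elif tail == "upper":
        hits = sum(1 for v in null_values if v >= observed)
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    return hits / len(null_values)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value (assumed correction method), capped at 1."""
    return min(1.0, p_value * n_tests)


# Made-up null distribution of folding free energies for illustration only.
null_energies = [-10, -12, -15, -20, -8, -9, -11, -13, -30, -14]
p = empirical_p_value(null_energies, observed=-20, tail="lower")
print(p, bonferroni(p, n_tests=6))
```

Because six repeat copies are tested, a raw P value must be below roughly 0.0083 to remain under 0.05 after a Bonferroni adjustment of this kind.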

AEP-Dependent Maturation of Cyclic Knottins from TIPTOP2

To test whether AEP is critical for the maturation of cyclic knottins, we expressed TIPTOP2 in Arabidopsis using the strong seed-specific promoter of OLEOSIN (Parmenter et al., 1995). We transformed the OLEOSIN:TIPTOP2 construct into wild-type Arabidopsis and an aep null mutant that has lesions in all four Arabidopsis AEP genes (Kuroyanagi et al., 2005). The knottin TI-5 is not preceded by Asn or Asp, and so we expected it to be matured in an AEP-independent manner (Figure 5A). If TIPTOP2 is correctly processed in Arabidopsis, it will yield the peptides TI-1 (3480.97 D), TI-2 (3453.00 D), TI-4 (3410.88 D), and the N-terminally pyrolated TI-5 (3306.86 D). Peptides were extracted from T2 seeds and analyzed by matrix-assisted laser desorption/ionization (MALDI)–mass spectrometry (MS).

Figure 5.

In Vivo Processing of TIPTOP2.

(A) Schematic of TIPTOP2 showing the Asn residue preceding each cyclic knottin domain, the terminal Asp of each cyclic knottin domain, and the Lys residue preceding the terminal knottin.

(B) MALDI-MS analysis of seed peptide extracts of either M. cochinchinensis or Arabidopsis containing OLEOSIN:TIPTOP2 in either wild-type (WT) or aep null mutant backgrounds. The identities of M. cochinchinensis masses 3378.5 and 3434.9 are not known; those that match known peptides are labeled. The asterisks in the OLEOSIN:TIPTOP2 in wild-type spectra denote misprocessed peptides. For TI-5, this mass is consistent with failure to pyrolate (+17 D), whereas for TI-1, TI-2, and TI-4, the masses marked by 1*, 2*, and 4*, respectively, are +18-D masses consistent with noncyclized peptide. For comparison, nontransgenic wild-type and aep null mutant profiles are shown. See Supplemental Figure 9 online for MALDI-MS spectra with a broader mass range.

(C) Ions within the LC-MS data of the same extracts with ranges 827.2 to 827.4, 853.0 to 853.2, 863.5 to 863.7, and 870.5 to 870.7 D confirmed each peak in Arabidopsis matches its counterpart in M. cochinchinensis. The peak with the asterisk marks what we suspect is a TI-1 isomer. For fully annotated LC-MS traces, see Supplemental Figure 10 online.

[See online article for color version of this figure.]

In a wild-type background, we detected all the predicted cyclic masses as well as a mass for TI-5 containing a pyroglutamic acid residue (Figure 5B), suggesting Arabidopsis could process all TIPTOP2-derived knottins correctly. For each peptide, we also detected +18-D masses consistent with unligated peptide and a +17-D mass consistent with nonpyrolated TI-5 (i.e., a free N terminus). Presence of nonligated peptides was also reported in Mylne et al. (2011), when sunflower PawS1 was similarly expressed in Arabidopsis using the OLEOSIN promoter. The efficiency of cyclization in Arabidopsis varied between TIPTOP2-derived peptides. Assuming identical MS ionization strengths by cyclic and acyclic versions of the same peptide, the cyclization efficiency by Arabidopsis as judged by the peak height of cyclic to a combined peak height for cyclic and unligated peptide masses was ∼45% for TI-1, ∼70% for TI-2, and ∼55% for TI-4 (Figure 5B). Pyrolation efficiency for TI-5 by Arabidopsis was ∼80%. The cyclization efficiencies from TIPTOP2 are much higher than that of SFTI-1, which is ∼5% when expressed from an OLEOSIN:PawS1 construct (Mylne et al., 2011).
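The mass shifts and efficiency estimate described above reduce to simple arithmetic, sketched below. The sketch uses the nominal +18-D (unligated) and +17-D (nonpyrolated) shifts as stated in the text and the expected masses listed for TIPTOP2-derived peptides; the function names and the example peak heights are hypothetical and are not measured values.

```python
# Nominal mass shifts (in D) as stated in the text.
UNLIGATED_SHIFT = 18.0      # noncyclized peptide relative to the cyclic form
NONPYROLATED_SHIFT = 17.0   # free N terminus relative to the pyrolated form

# Expected masses (in D) of correctly processed peptides, from the text.
EXPECTED_MASS = {"TI-1": 3480.97, "TI-2": 3453.00, "TI-4": 3410.88, "TI-5": 3306.86}


def misprocessed_mass(peptide):
    """Expected mass of the unligated (cyclic peptides) or nonpyrolated (TI-5) form."""
    shift = NONPYROLATED_SHIFT if peptide == "TI-5" else UNLIGATED_SHIFT
    return round(EXPECTED_MASS[peptide] + shift, 2)


def cyclization_efficiency(cyclic_height, unligated_height):
    """Cyclic peak height divided by the combined cyclic and unligated peak heights.

    Assumes equal ionization of the cyclic and unligated forms, as in the text.
    """
    total = cyclic_height + unligated_height
    if total <= 0:
        raise ValueError("combined peak height must be positive")
    return cyclic_height / total


print(misprocessed_mass("TI-1"), misprocessed_mass("TI-5"))
print(cyclization_efficiency(70.0, 30.0))  # hypothetical peak heights
```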

The OLEOSIN:TIPTOP2 construct in an aep null mutant background revealed none of the masses for either cyclic or unligated TI-1, TI-2, or TI-4 (Figure 5B), suggesting AEP is indeed required for their release from TIPTOP2. By contrast, the mass for TI-5 remained detectable in the aep null mutant background (Figure 5B), confirming that AEP is not required to release this knottin from within TIPTOP2.

We confirmed the presence of all four TIPTOP2-encoded knottins in Arabidopsis seeds using LC-MS, using M. cochinchinensis knottins as controls (Figure 5C). In the aep null mutant background, LC-MS data confirmed the maturation of TI-5 and the absence of any detectable cyclic knottin. Therefore, TIPTOP2 requires AEP to mature its cyclic knottins, but maturation of the terminal knottin is AEP-independent.

Features Conserved between TIPTOP Proteins and Other Cyclic Peptide Precursors

Precursors for two other classes of plant cyclic peptide are known: the kalata-type and the PawS-derived peptides (Figure 6A). The kalata-type cyclic peptides are matured from precursors that differ greatly in their structure and have been found in the Rubiaceae (Craik, 2001; Jennings et al., 2001), Violaceae (Dutton et al., 2004), Fabaceae (Nguyen et al., 2011; Poth et al., 2011), and Solanaceae (Poth et al., 2012). The small cyclic peptides from sunflower are derived from bifunctional 2S seed storage albumin precursors (Mylne et al., 2011). The cyclic knottins are similar to kalata-type cyclic peptides in their size, number of disulfides, and knotted structure. They also have some similarities to sunflower SFTI-1—they are found in seeds and are TIs. Although the TIPTOP proteins that produce cyclic knottins are very different from those of the other two peptide classes (Figure 6A), alignment of the sequences flanking the mature peptide domain suggests they use a similar method of processing (Figure 6B).

Figure 6.

Independent Evolution of Plant Cyclic Peptides That Use the Same AEP-Mediated Processing.

(A) Partial angiosperm phylogeny based on rbcL sequences. Species known to contain AEP-mediated cyclic peptides and their precursors are in green (SFTI PDB#1SFI Helianthus, Asteraceae; O1 PDB#1NBJ, Viola, Violaceae; CterM PDB#2LAM, Clitoria, Fabaceae; kalata B1 PDB#1NB1, Oldenlandia, Rubiaceae; Petunia, Solanaceae; TI-2 PDB#1HA9, Momordica, Cucurbitaceae). A range of model plants is included, and their names are highlighted in orange. For a more complete angiosperm phylogeny, see Supplemental Figure 11 and Supplemental Table 6 online for the alignment used. For taxa representing each used family name, see Methods. aa, amino acids.

(B) BOXSHADE alignment of each cyclic peptide domain (orange) and its flanks. The alignment covers the three peptide classes that use this AEP-mediated processing and shows that the conserved cyclic peptide domain has a proto–N-terminal Gly and a proto–C-terminal Asn or Asp (Asx); Asp and Asn are the target residues for AEP. The residues trailing the proto–C-terminal Asn or Asp are usually a small side-chained residue at the P1′ position and either Leu or Ile at the P2′ position.

In TIPTOP proteins, each cyclic knottin domain is preceded by Asn and ends with Asp, as is the case for SFTI-1 within PawS1, where cleavage at these sites was shown to require AEP (Mylne et al., 2011). Another TIPTOP feature conserved with peptides in the other two classes is that the first residue of each cyclic knottin is always Gly. The prototerminal Asn or Asp in all three classes of cyclic peptide domain is typically followed by a small residue at the P1′ position, and at the P2′ position, all have a Leu or Ile residue. These three classes of plant cyclic peptides were all discovered based on bioactivities, but there seems to be a strong evolutionary bias toward a similar mode of processing involving AEP.
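The shared flanking features described above can be written down as a simple sequence pattern. The following is a minimal sketch, not the method used in this study: it scans a protein sequence for stretches that are preceded by Asn, begin with Gly, end with Asp or Asn, and are followed by a small P1′ residue and a P2′ Leu or Ile. The set of "small" residues (Gly, Ala, Ser), the length bounds, and the toy sequence are all assumptions made for illustration.

```python
import re

# Assumption: "small" P1' residues are taken to be Gly, Ala, or Ser.
SMALL = "GAS"


def find_candidate_domains(seq, min_len=10, max_len=40):
    """Return (start, domain) pairs for candidate AEP-processed domains.

    A candidate is preceded by Asn, starts with Gly, ends with Asp or Asn,
    and is followed by a small residue (P1') and then Leu or Ile (P2').
    The shortest qualifying domain is reported for each start position.
    """
    pattern = re.compile(
        rf"(?<=N)(?=(G[A-Z]{{{min_len - 2},{max_len - 2}}}?[DN])[{SMALL}][LI])"
    )
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


# Invented toy sequence with two embedded candidate domains (not a real TIPTOP).
toy = "MKANGSCPKILKRCRDGLTTNGACPRILMKCKNAIQQ"
for start, domain in find_candidate_domains(toy):
    print(start, domain)
```

A pattern like this only flags candidates; it says nothing about whether a given site is actually cleaved or ligated in planta.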

DISCUSSION

Here we have described the TIPTOP genes of M. cochinchinensis, each of which encodes a string of backbone-cyclized knottins and ends with the more usual acyclic knottin. The concatemeric arrangement of 150-base imperfect repeats in TIPTOP genes suggests that they have undergone extensive internal duplication. Previous studies of internal protein expansion, particularly for larger repeats, have attributed the origin of the repeats to faulty recombination (Marcotte et al., 1999). Although the first repeat-causing event is not well understood, expansion of repeats often continues from within the middle of a series of repeat sequences (Björklund et al., 2006). The sequences of TIPTOP repeats are consistent with these observations. For example, the TIs in TIPTOP3 are arranged 8-2-2-2-2-2-7-6, with the central repeats the most similar and with variants at the extremities (Figure 1; see Supplemental Figure 7 online).

At present, this consistency of TIPTOP repeat sequences with the features of known expanding proteins suggests genetic expansion. The squash TI class of knottins has been characterized thoroughly at the peptide level, and this class is so named because members are restricted to the squash (Cucurbitaceae) family. Showing that the lineage with the multiunit TIPTOPs diverged from other lineages that contain single-unit genes but no multiunit TIPTOPs would support our proposal that TIPTOPs evolved by expansion of a common ancestral single-unit squash TI precursor. Momordica is outside of a clade that includes most of cucurbit diversity, so the single-unit cDNA transcripts presented in Figure 1E from four other cucurbit species (tribes Trichosantheae and Benincaseae) cannot be used to confirm expansion, because these species’ lineages are all more closely related to one another than to Momordica, and thus might have experienced contraction from a multiknottin ancestor. Consequently, we cannot experimentally confirm expansion over the alternative possibility of gene contraction in Cucurbitaceae after the Momordica lineage diverged.

The feature that separates the TIPTOP proteins from all other cyclic peptide precursors is that they contain almost identical matured proteins but include both cyclic and acyclic versions. These dual topologies arising from within one protein have some similarities to the PawS proteins, from which a cyclic peptide and a linear heterodimeric napin-type albumin arise. However, in PawS proteins, the two matured products differ greatly in size, sequence, three-dimensional structure, and function (Mylne et al., 2011).

We found that the acyclic Momordica TIs have a pyroglutamic acid at their N terminus. A pyroglutamic acid can arise by conversion of either Glu or Gln. The TIPTOP genes encode Gln at their knottin proto–N termini. Conversion of Gln to pyroglutamic acid is a deamination reaction that confers resistance to degradation by aminopeptidases (Schilling et al., 2008). Structurally, we showed that the amino and carboxyl termini of the TI-5 knottin are exposed. An advantage that cyclization offers over pyrolation is it would provide resistance to amino- and carboxypeptidases. Even endoproteases have difficulty cleaving cyclic peptides containing internal disulfide bonds. In vitro and in vivo studies of disulfide-rich conotoxin drug leads (Clark et al., 2005; Clark et al., 2010) have shown that improved stability may be achieved by synthetic cyclization. Previous studies with TI-2 have shown that acyclic mimics have reduced trypsin inhibitory activity. TI-2 is 10 times more effective at inhibiting trypsin than an open chain TI-2 variant that lacks the ligating loop 6 (Avrutina et al., 2005). Of these two possibilities, it is unclear whether the evolved cyclic innovation in Momordica is providing cyclic knottins with greater in planta stability, higher activity, or perhaps both, compared with their linear homologs.

There are four known classes of gene-encoded backbone-cyclized peptides in the plant kingdom that are found in distantly related families. Of these, we believe only three are using AEP-mediated processing. Each of these classes arises from a different type of precursor. Most kalata-type cyclic peptides arise from dedicated precursors that may have one, two, or three peptide units (Jennings et al., 2001). The kalata-type cyclic peptides from the legume Clitoria ternatea (Nguyen et al., 2011; Poth et al., 2011) are encoded by a protein similar to pea (Pisum sativum) Pa1-albumin, except that the usual first of two albumin domains has been replaced by a kalata-type sequence. The PawS-derived cyclic peptides (Mylne et al., 2011) are embedded within unusual bifunctional proteins. The cyclic knottins here are derived from a concatemeric protein that seems to have arisen from internal expansion of a precursor for a knottin. Despite these different peptides and precursors, there seem to be significant shared features between the TIPTOP proteins and both the kalata-type and PawS-derived classes, whose maturation and proposed cyclization is by AEP. These shared features include the proto–N-terminal Gly, proto–C-terminal Asp/Asn, and the trailing small P1′ residue and P2′ Leu.

The fourth class of gene-encoded backbone-cyclized peptides in the plant kingdom is found in members of the Caryophyllaceae and Rutaceae plant families and highlights that this AEP-mediated mechanism is unlikely to be the only way plants can produce cyclic peptides. Condie et al. (2011) revealed that five to nine residue segetalins lacking disulfides emerge from short precursors. Critically, the segetalins and their precursors do not contain conspicuous Asp or Asn residues, the target residues of AEP. Furthermore, segetalin precursors share none of the other properties seen for AEP-mediated cyclic peptides.

Involvement of AEP in cyclic peptide maturation has previously been shown for SFTI-1, which is derived from the unusual seed storage preproalbumin PawS1. Not only is AEP well established as being essential for seed storage albumin maturation (Shimada et al., 2003; Gruis et al., 2004), but when PawS1 constructs were transformed into an Arabidopsis aep null mutant, the misprocessing detectable by MALDI-MS and proteomics analyses demonstrated that AEP was required to release the cyclic peptide domain at both prototermini (Mylne et al., 2011). The prototermini of the cyclic peptide in PawS1 are identical to those of the cyclic knottins within TIPTOP1-3. Each cyclic knottin domain is preceded by Asn and ends with Asp, both target residues for AEP (Hara-Nishimura et al., 1993; Hiraiwa et al., 1999). There are additional similarities when TIPTOPs are compared with PawS proteins and precursors of kalata-type cyclic peptides, namely the presence of a proto–N-terminal Gly in all cyclic peptide domains as well as the P1′ and P2′ residues trailing the cyclic peptide domains consisting of a small amino acid (often Gly) and an absolutely conserved Leu, respectively. This P2′ Leu has been shown to be essential for the cyclization of kalata B1 associated with processing from its precursor protein Oak1 (Gillon et al., 2008) but seems not to be critical for PawS1 when expressed in Arabidopsis (Mylne et al., 2011). The significance of these observations is that all three classes of backbone cyclic peptide are using the same AEP-mediated mechanism for their maturation.

We hypothesized that the convergence on involvement of AEP may in part be caused by the reactive thioester acyl intermediate that AEP (a Cys protease) will form after cleavage at the scissile peptide bond. After this cleavage, the conditions at the enzyme active site would be entropically ideal for the acyl intermediate to react with the N terminus of an unmasked Gly, instead of with water (Gillon et al., 2008; Mylne et al., 2011). This mechanism requires sequential cleavages that first unmask the Gly at the proto–N terminus of the peptide before the cleavage at the proto–C-terminal Asp. For PawS2, the Gly is unmasked by ER signal removal, but in PawS1, it is believed to be unmasked by preferential cleavage at Asn, the preferred substrate of AEP. The TIPTOP proteins all share the same Asn/Asp, which we believe permits sequential cleavage in PawS1. At the proto–C terminus of kalata B1, a role for AEP has been implied from transient transformation of tobacco (Nicotiana tabacum) with the kalata B1 precursor Oak1 with and without AEP-silencing constructs (Saska et al., 2007). However, which enzyme unmasks the Gly at the proto–N terminus is unknown. There is little, if any, sequence conservation at the residues preceding each kalata-type peptide domain, suggesting there may be several enzymes capable of removing the prodomain. The identity of the enzyme that releases the knottin at the end of each TIPTOP-predicted protein is equally uncertain. TI-5 and TI-6 are each preceded by Lys (Figure 1D), and TI-5 release is proven to be AEP-independent (Figure 5). In other squash TI precursors, the knottins are preceded by Gly or Ala (Ling et al., 1993). Unlike SFTI-1, which coopts AEP from its adjacent role in seed storage albumin maturation, the cyclic knottins derived from within the TIPTOP proteins have acquired a completely new requirement for AEP.

These commonalities in cyclic peptide–processing residues have been found for three structurally different classes of peptide. Not only do the peptides differ structurally, but they are also embedded within precursor proteins of very different architectures and in unrelated plant families (Figure 6). The kalata-type family of cyclic peptides are usually embedded in proteins that encode signal peptides and prodomains but no mature peptides other than kalata types, and these are from the phylogenetically distant families Rubiaceae (Jennings et al., 2001) and Violaceae (Dutton et al., 2004). An interesting exception is the kalata-type cyclic peptides found in the legume C. ternatea, which are encoded by a Pa1-like albumin protein, but one in which the Pa1b domain has been replaced (Nguyen et al., 2011; Poth et al., 2011). A very recent discovery is a new precursor structure that encodes kalata-type cyclic peptides in the Solanaceae (Poth et al., 2012). The TIPTOP proteins we found in Momordica (Cucurbitaceae) are distinct from those in kalata-bearing families Fabaceae and Violaceae (Figure 6). Six plant lineages that are phylogenetically quite distantly related have converged to use the same AEP-dependent processing to make three classes of cyclic peptides, and we believe this provides strong evidence of evolutionary parallelism.

In this context, the term parallelism refers to the independent evolution of the same derived trait via the same developmental changes, whereas convergent evolution refers to superficially similar traits that have a distinct developmental basis (Patterson, 1982; Yoon and Baum, 2004). These three peptide classes are all using AEP-mediated processing and have certain conserved residues. Importantly, the occurrence of parallelism shows that the path of evolution for a particular trait seems to be constrained to certain channels. We have proposed AEP is especially suited for performing ligation reactions; therefore, AEP might be the constraining evolutionary channel inferred by parallelism.

For the three cyclic peptide classes using this AEP-mediated processing, the founding peptide member for each class was discovered based on bioactivity, which does not bias discovery toward any particular biosynthetic mechanism. The PawS-derived peptide ring SFTI-1 was first identified by in-gel trypsin inhibition assays with the common sunflower H. annuus (Luckett et al., 1999). The founding member of the large kalata-type family was discovered as the uterotonic component of extracts of a traditional Congo medicine derived from a tea made from Oldenlandia affinis (Gran, 1970). The cyclic knottins TI-1 and TI-2 were discovered for TI activity in the traditional Chinese medicinal plant M. cochinchinensis (Hernandez et al., 2000). Despite their independent discovery, different precursors, and structural diversity, all three classes depend on AEP for their maturation. By International Union of Biochemistry and Molecular Biology definition EC 3.4, a protease catalyzes peptide bond hydrolysis, but based on thermodynamic reversibility, it can also catalyze peptide bond formation. More than 70 years ago, Max Bergmann used several proteases, including chymotrypsin, to induce bond formation in vitro (Bergmann and Fruton, 1938). More recently and relevant to our case, jack bean (Canavalia ensiformis) AEP was demonstrated to perform a transpeptidation reaction in vitro (Min and Jones, 1994). For structurally constrained peptide substrates, the N terminus is held close to the scissile bond during cleavage, favoring ligation that leads to cyclic peptides. Although plants typically contain hundreds of proteases (García-Lorenzo et al., 2006), evolution has produced cyclic peptides several times via AEP processing, suggesting that this type of protease is especially favorable for performing protein cyclization.

METHODS

Plant Material

Momordica cochinchinensis fruits were obtained from a commercial vendor of Vietnamese produce at the Footscray (Melbourne, Australia) markets. The seeds were removed from fruits and washed. The seed coat was removed initially with sandpaper and then, once the coat was broken in a large enough area, was peeled away. The mature embryos were washed briefly in water and dried on a paper towel.

Genomic DNA Extraction

100 mg of M. cochinchinensis leaf tissue was ground under liquid nitrogen to a fine powder, transferred to a 1.5-mL tube, and resuspended in 1 mL of cetyltrimethylammonium bromide buffer (140 mM sorbitol, 220 mM Tris-HCl, pH 8.0, 22 mM EDTA, pH 8.0, 800 mM sodium chloride, 1% sarkosyl, 0.8% cetyltrimethylammonium bromide). We incubated the mixture at 65°C for 15 min with occasional mixing. We added 0.4 mL of chloroform and mixed by inversion. The sample was centrifuged for 10 min at 17,000g, and the supernatant was transferred to a fresh tube. We added 0.7 mL of cold isopropyl alcohol and incubated the sample at −20°C for 30 min. After 30 min of centrifugation at top speed at 4°C, the dried pellet was dissolved in 0.3 mL of TE buffer (10 mM Tris, 5 mM EDTA, pH 8.0) and extracted with phenol:chloroform. The DNA was precipitated from the aqueous phase by addition of 0.1 volume of 3 M sodium acetate (pH 5.5) and two volumes of ethanol. The sample was mixed, incubated at −20°C for 1 h, and centrifuged at top speed at 4°C for 30 min. The pellet was washed with 1 mL of 70% ethanol, left to dry, and then dissolved in a final volume of 50 μL of TE buffer.

RNA Extraction

Three dehusked M. cochinchinensis seeds were ground under liquid nitrogen with glass beads to a fine powder. Before thawing, 0.3 mL of tissue powder was resuspended in 0.25 mL of acidic phenol (pH 4.3; Sigma-Aldrich) and 0.5 mL of 0.1 M Tris, pH 8.0, 5 mM EDTA, 0.1 M sodium chloride, 0.5% SDS, 1% 2-mercaptoethanol that had been preheated to 65°C. This mixture was vortexed for 20 min before addition of 0.25 mL of chloroform for an additional 10 min. After centrifugation at 17,000g, the supernatant was transferred to a new tube for a second extraction with one volume of 1:1 phenol:chloroform. The mixture was centrifuged again at 17,000g, and its supernatant was precipitated with 2.5 volumes of ethanol, 0.1 volume of 3 M sodium acetate, and incubation at −80°C for 15 min. The nucleic acid pellet was dissolved in 0.5 mL of water, and RNA was precipitated by addition of 0.5 mL of 4 M lithium chloride and incubation overnight at 4°C. After centrifugation at 17,000g for 10 min at 4°C, the RNA pellet was washed with 1 mL of 80% ethanol, dried, and resuspended in 60 μL of water.

Cloning of TIPTOP Genes

Genomic DNA (1 μg) was digested overnight with McrBC, which cleaves DNA containing methylcytosine. The digest was purified by QIAquick spin column (Qiagen) to remove digested, low–molecular-weight DNA. Because TI-2 is cyclic, its biological ligation point is impossible to know without cloning the gene. However, knowledge of cyclic peptide processing allowed us to previously postulate the sequence order TI-2 might have in its precursor protein (Daly et al., 2006). We designed degenerate primers to TI-2; namely PN02 (see Supplemental Table 3 online for all primer sequences) to the sense sequence of Gly-Gly-Val-Cys-Pro-Lys and PN10 to the reverse complement of the sequence Ile-Cys-Arg-Gly-Asn-Gly. The PCR products from this initial reaction were used in a second, nested PCR reaction with primers PN03 to the sense sequence of Val-Cys-Pro-Lys-Ile-Leu-Lys and PN11 to the reverse complement of the sequence Ala-Cys-Ile-Cys-Arg-Gly. The largest of several PCR products from this nested reaction encoded a full TI-2 encoding unit flanked by two partial TI-2 encoding units with the primers at its termini, suggesting the TI-2 precursor was multidomain and contained at least three units of TI-2. This DNA sequence was used to design a suite of specific primers for 5′ and 3′ RACE.
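To illustrate how a degenerate primer follows from a short peptide sequence, the sketch below back-translates a peptide into IUPAC-coded DNA and counts how many distinct oligonucleotides the mixture would contain. It is a hedged illustration only: the actual primer sequences are those listed in Supplemental Table 3 and are not reproduced here, the codon table covers only the residues mentioned above, and the single-codon codes for Arg (MGN) and Leu (YTN) are a common simplification that also admits a few non-cognate codons.

```python
# IUPAC degenerate codons (standard genetic code) for the residues used above.
# Assumption: Arg -> MGN and Leu -> YTN are over-inclusive one-codon shortcuts.
IUPAC_CODON = {
    "G": "GGN", "V": "GTN", "C": "TGY", "P": "CCN", "K": "AAR",
    "I": "ATH", "R": "MGN", "N": "AAY", "L": "YTN", "A": "GCN",
}

# Bases represented by each IUPAC symbol.
BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "H": "ACT", "D": "AGT", "N": "ACGT",
}

# Complement of each IUPAC symbol.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "M": "K", "K": "M",
    "H": "D", "D": "H", "N": "N",
}


def back_translate(peptide):
    """Back-translate a one-letter peptide into an IUPAC-degenerate DNA string."""
    return "".join(IUPAC_CODON[aa] for aa in peptide)


def reverse_complement(dna):
    """Reverse-complement an IUPAC-coded DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def degeneracy(dna):
    """Number of distinct oligonucleotides encoded by a degenerate sequence."""
    total = 1
    for base in dna:
        total *= len(BASES[base])
    return total


sense = back_translate("GGVCPK")                         # Gly-Gly-Val-Cys-Pro-Lys
antisense = reverse_complement(back_translate("ICRGNG"))  # Ile-Cys-Arg-Gly-Asn-Gly
print(sense, degeneracy(sense))
print(antisense, degeneracy(antisense))
```

In practice, degeneracy is often reduced by trimming the final wobble position or by choosing peptide stretches rich in low-degeneracy residues; whether that was done here is not stated in the text.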

We extracted RNA with 1:1 phenol:chloroform and selective precipitation of RNA by lithium chloride. 500 ng of total RNA was used to create 5′ and 3′ RACE libraries using the SMARTer RACE cDNA Amplification Kit (Clontech). The RACE libraries were amplified with specific primers designed against the gene fragment. Products were cloned from the PCR reactions of the 5′ RACE library with JM368 and 3′ RACE libraries with JM369 and JM371. The 5′ and 3′ RACE products were cloned into pGEM-T (Promega), sequenced, and aligned. Although it was clear from polymorphisms that more than one gene was being amplified, the 5′ UTR and 3′ UTRs were identical. To amplify full-length clones, we designed JM429 to the most 5′ region and a reverse primer JM430 immediately upstream from the polyA sequence in 3′ RACE clones. PCR amplification of the aforementioned 5′ cDNA library with these primers produced three products. Complete sequencing through all repeating sequence in a single pass required design of primers inside the ORF (JM437 and JM438). The three products each encoded a full ORF that included TI-2 as well as other novel peptides. The three genes were named TIPTOP1, TIPTOP2, and TIPTOP3, and they encode five, six, and eight peptides, respectively. For each TIPTOP gene, at least five independent clones were obtained. Independent cloning events were ensured by the observed loss of one to three nucleotides at the very 5′ end of either of the cloning primers in sequenced products.

PCR amplification using the JM377 and JM378 primer pair with genomic DNA produced the same three TIPTOP1 to TIPTOP3 products, revealing that these three TIPTOP genes lack introns (see Supplemental Figure 2A online). We detected faint PCR product bands below TIPTOP1 when genomic DNA (gDNA) was used as the PCR template. Upon cloning these faint bands, we found these DNAs encoded TIPTOP genes with fewer peptide units, but the DNA sequences matched one of TIPTOP1 to TIPTOP3, indicating these were truncated TIPTOP products produced as an artifact of PCR. This artifactual nature of this lower–molecular-weight laddering by PCR was especially noticeable (see Supplemental Figure 2B online) when a plasmid containing TIPTOP2 was PCR-amplified with primers JM439 and JM440 to make an OLEOSIN:TIPTOP2 construct (see Seed-Specific Expression of TIPTOP2 in Arabidopsis for details). PCR using JM377 and JM378 with cDNA (see Supplemental Figure 2A online) could not amplify additional TIPTOP genes.

Alignment of Squash TI Precursors and TIPTOP1 to TIPTOP3

The alignment shown in Figure 1D contains the predicted protein sequences for M. cochinchinensis TIPTOP1 to TIPTOP3 (HQ853490 to HQ853492). They are compared with six sequences from other Cucurbitaceae. The sequence labeled watermelon (Citrullus lanatus subsp vulgaris) is from translation of a cDNA sequence filed under GenBank accession number AI563213. The sequence labeled Trichosanthes kirilowii came from retranslation of a T. kirilowii cDNA GenBank accession number X82230, which encodes the knottin TGTI-II (Ling et al., 1993). The predicted protein in GenBank is lacking 16 upstream amino acids encoded by an earlier in-frame start codon. The additional 16 amino acids add a conserved ER signal sequence. The sequence labeled Loofah (Luffa aegyptiaca) came from a retranslation of cDNA GenBank accession number M98055. The predicted protein in M98055 similarly lacks 16 amino acids of the ER signal sequence. The protein sequence for Melon1 is supported by 57 Cucumis melo expressed sequence tags, including JG526994 (see Supplemental Table 4 online for the full list). The protein sequence for Melon2 is supported by 69 C. melo expressed sequence tags, including JG532730 (see Supplemental Table 5 online). The sequence labeled Cucumber (Cucumis sativus) comes from translation of a C. sativus cDNA sequence filed under GenBank accession CK758797.

Peptide Extraction and NMR Analysis

The M. cochinchinensis tissue used for RNA was also used to extract crude peptides, which were analyzed by LC-MS (Chan et al., 2009). The LC-MS data were analyzed for predicted masses to identify their retention times. The knottins TI-5 and TI-6 had identical predicted masses but one candidate peak in the LC-MS. Individual peptides were purified as described elsewhere (Chan et al., 2009). The peptide concentrations of pure fractions were quantified using a Nanodrop UV-spectrometer (NanoDrop Technologies) and prepared for sequencing by MS/MS. Briefly, 0.5 mg of peptide was dissolved in 0.5 mL of 100 mM ammonium bicarbonate (pH 8.1) and reduced with 25 μL of 100 mM DTT followed by incubation at 60°C under nitrogen gas for 30 min. A sample of 0.25 mL received 0.125 mL of 40 mg/mL tosyl phenylalanyl chloromethyl ketone–treated bovine trypsin (Sigma-Aldrich) or 0.125 mL of 40 μg/mL chymotrypsin and 0.125 mL of 40 μg/mL endoproteinase Glu-C and was incubated at 37°C for 3 h before quenching each digest with 10 μL of 0.5% formic acid. Samples were desalted using C18 ZipTips (Millipore) and eluted with 80% (v/v) acetonitrile 0.5% (v/v) formic acid. The digest fragments were examined by MALDI-time of flight MS. A Nanospray QSTAR Pulsar I QqTOF mass spectrometer (Applied Biosystems) was used to sequence all novel knottins by selecting doubly charged and triply charged precursor ions from an initial time of flight-MS scan, followed by MS/MS on each selected product ion. A capillary voltage of 900 V was used with a collision energy between 10 and 50 V, depending on the charge and size of the ions. Analyst QS 1.5 software was used for the processing and acquisition of data. For TI-5, it was necessary to analyze by NMR to distinguish it from TI-6. NMR analysis of TI-5 and TI-6 was performed using total correlation spectroscopy and nuclear Overhauser effect spectroscopy spectra. For TI-5, we obtained a solution structure using 1 mg of purified peptide in 90% H2O/10% D2O (v/v).
Analysis of the spectra was also used to distinguish TI-5 from TI-6 based on the Lys and Gln peaks. Structures of TI-5 were calculated using CYANA (Ikeya et al., 2006) and CNS (Brünger et al., 1997). A set of 50 structures was calculated, and the 20 lowest-energy structures were selected for further analysis, followed by structure analysis using the programs PROCHECK_NMR (Laskowski et al., 1996) and PROMOTIF (Hutchinson and Thornton, 1996) to generate statistical analysis.
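Selecting doubly and triply charged precursor ions relies on the standard relationship between a peptide's neutral monoisotopic mass and its observed m/z. The sketch below shows that arithmetic, together with the mass shifts expected for head-to-tail cyclization (loss of water) and for conversion of an N-terminal Gln to pyroglutamic acid (loss of ammonia). It is an illustration under stated assumptions: the 3500.0 Da mass is a placeholder and is not the mass of any knottin reported here.

```python
PROTON = 1.007276    # mass of a proton, Da
WATER = 18.010565    # monoisotopic H2O, lost on head-to-tail cyclization
AMMONIA = 17.026549  # monoisotopic NH3, lost when N-terminal Gln forms pyroGlu


def mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion for a given neutral monoisotopic mass."""
    return (neutral_mass + charge * PROTON) / charge


def neutral_mass_from_mz(mz_value, charge):
    """Neutral monoisotopic mass recovered from an observed m/z and charge."""
    return charge * (mz_value - PROTON)


linear_mass = 3500.0                     # hypothetical placeholder mass, Da
cyclic_mass = linear_mass - WATER        # backbone-cyclized form
pyroglu_mass = linear_mass - AMMONIA     # acyclic form with N-terminal pyroGlu

for label, mass in [("linear", linear_mass),
                    ("cyclic", cyclic_mass),
                    ("pyroGlu", pyroglu_mass)]:
    print(label, round(mz(mass, 2), 4), round(mz(mass, 3), 4))
```

The two modifications differ by about 0.98 Da, so a cyclic peptide and its pyroglutamyl acyclic counterpart are distinguishable by mass when the rest of the sequence is the same.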

Repeat Analysis of TIPTOP2

The TIPTOP2 cDNA sequence (TIPTOP2 is intronless) was submitted to the MEME server (Bailey and Elkan, 1994) at http://meme.sdsc.edu/meme/cgi-bin/meme.cgi using default settings, but with the “find palindromes” box checked. A high-scoring hit was the 50-mer regions shown in Figure 4A, which are imperfect palindromes. The reason that each 50-mer seems to be either on the plus or minus strand and not both was because the combined block diagrams in MEME do not display any motif occurrences that overlap. The putative hairpin models in Figure 4C were generated with CONTRAfold (http://contra.stanford.edu/contrafold) (Do et al., 2006) in default mode, except with “allow all possible base pairs” selected.

To establish the significance of the 50-mer repeats, we compared the folding free energies of each repeat to a histogram of random 50-mer folding free energies. Genomic resources for M. cochinchinensis are not available; therefore, to generate the histogram, we downloaded all unspliced ORFs for Arabidopsis from the Regulatory Sequence Analysis Tools website (http://rsat.ulb.ac.be) without masking repeats. We then extracted a random segment of size 50 bases from each of the unspliced ORFs whose length was at least 50 bases. This resulted in 35,176 DNA sequences. We then ran the RNAfold (Hofacker et al., 1994) algorithm to predict minimum energy secondary structures for 50-mer repeats found by MEME in the TIPTOP2 gene. To estimate the free energy of binding of a random ORF segment of the same length as the TIPTOP2 repeats, we then ran RNAfold on each of the 35,176 randomly chosen ORF segments. We plotted a histogram of the folding free energies of the randomly chosen ORF segments. In this plot (see Supplemental Figure 8B online), the y axis is the fraction of random ORFs with a given predicted folding free energy, and the x axis is the folding free energy in kcal/mol.
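The background sampling and comparison described above can be sketched as follows. This is a simplified illustration, not the scripts used in the study: it draws one random fixed-length segment from each sufficiently long sequence and then reports the fraction of background energies that are at least as low as an observed value. The folding free energies themselves are assumed to have been computed separately (for example with RNAfold) and supplied as plain numbers; the function names and example values are hypothetical.

```python
import random


def sample_segments(sequences, length, rng):
    """Draw one random segment of the given length from each sequence
    that is at least that long; shorter sequences are skipped."""
    segments = []
    for seq in sequences:
        if len(seq) < length:
            continue
        start = rng.randrange(len(seq) - length + 1)
        segments.append(seq[start:start + length])
    return segments


def fraction_at_or_below(background_energies, observed_energy):
    """Fraction of background folding free energies (kcal/mol) that are
    less than or equal to the observed energy (more negative = more stable)."""
    hits = sum(1 for energy in background_energies if energy <= observed_energy)
    return hits / len(background_energies)


# Hypothetical inputs: two toy "ORFs" (one too short) and made-up energies.
rng = random.Random(0)
orfs = ["ACGT" * 20, "ACG"]
print(sample_segments(orfs, 50, rng))
print(fraction_at_or_below([-10.0, -5.0, -3.0, -1.0], -5.0))
```

A large fraction means that random ORF segments fold at least as stably as the repeat, which is the sense in which a repeat's predicted structure would be judged not significant.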

To examine the TIPTOP2 repeats more closely, we ran MEME using the command “meme -dna -revcomp -mod anr -maxw 200 -minsites 2 -maxsites 20 -nmotifs 1 data/tiptop2.fasta.” This specifies that MEME look for de novo repeats of length up to 200 bases that occur between two and 20 times in the single sequence in TIPTOP2. This approach does not bias MEME’s search toward or away from DNA palindromes, which is important in the folding energy analysis described in the next paragraph. MEME found a repeat of 113 bases that occurs six times in the TIPTOP2 coding sequence.

To establish the significance of the 113-mer repeats, we compared the folding free energies of each repeat to a histogram of random 113-mer folding free energies in a similar approach to the 50-mers as detailed above. We generated a histogram from unspliced ORFs for Arabidopsis using a random segment of size 113 bases from each of the unspliced ORFs whose length was at least 113 bases. This resulted in 30,190 DNA sequences of length 113 bases, which we ran through RNAfold and compared the folding free energies to those of each 113-mer TIPTOP2 repeat.

Seed-Specific Expression of TIPTOP2 in Arabidopsis

The TIPTOP2 cDNA was amplified using JM439, which added a ClaI site followed by the translation initiation sequence ACA to the TIPTOP2 start ATG. At the 3′ end of TIPTOP2, the primer JM440 bound to the end of the TIPTOP2 3′ UTR and added a BamHI site. TIPTOP2 was put under control of the OLEOSIN promoter and was expressed in wild-type and aep Arabidopsis backgrounds as described elsewhere (Mylne et al., 2011). Peptides were extracted by grinding 100 T2 seeds under liquid nitrogen with glass beads, adding 0.25 mL of methanol and 0.25 mL of dichloromethane and separating the phases with 0.1 mL of 0.05% trifluoroacetic acid, then mixing for 2 min at room temperature. After 5 min of centrifugation at 17,000g, the aqueous phase was diluted fivefold to 10-fold in 50% acetonitrile 0.1% trifluoroacetic acid and was spotted for analysis by MALDI-MS.

Phylogenetic Analyses

To reconstruct the angiosperm phylogeny in Figure 6, we downloaded 187 rbcL sequences (see Supplemental Data Set 2 online) for 157 angiosperm families plus two outgroup gymnosperms from GenBank (http://www.ncbi.nlm.nih.gov/genbank/) and built a nucleotide alignment using MacClade v. 4.08 (http://www.macclade.org). In most cases, we included just one species to represent an entire family. We added all commonly used angiosperm model organisms based on the plant species listed in The Gene Index Project (http://compbio.dfci.harvard.edu/tgi/plant.html). A maximum likelihood (Felsenstein, 1973) tree search was performed using RAxML v.7.2.6 (Stamatakis et al., 2008) on the CIPRES cluster (http://www.phylo.org/). Based on the Akaike information criterion (Akaike, 1974) as implemented in jModeltest (Posada, 2008), we selected the general time-reversible + Γ model (six general time-reversible substitution rates, assuming gamma rate heterogeneity), with model parameters estimated over the duration of specified runs. We did not infer bootstrap values to assess statistical branch support, because the tree is only needed to visualize the general distribution of the discussed proteins across the angiosperms and not to test specific relationships between plant families. In all cases with only one species representing the family, the tips are labeled with the family name. The species used for placement of each family is detailed in Supplemental Table 6 online.
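Model choice by the Akaike information criterion rests on a simple formula, AIC = 2k − 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood; the model with the lowest AIC is preferred. The sketch below shows that comparison. The log-likelihood values are invented for illustration and are not results from this study, and the parameter counts cover only substitution-model parameters, on the assumption that the tree (and hence the number of branch lengths) is the same for every model.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_model(candidates):
    """Return the name of the candidate with the lowest AIC.

    `candidates` maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# Hypothetical log-likelihoods; parameter counts exclude branch lengths.
candidates = {
    "JC": (-1050.0, 0),      # no free substitution parameters
    "HKY+G": (-1010.0, 5),   # kappa + 3 base frequencies + gamma shape
    "GTR+G": (-1000.0, 9),   # 5 relative rates + 3 frequencies + gamma shape
}

for name, (lnl, k) in candidates.items():
    print(name, aic(lnl, k))
print("selected:", best_model(candidates))
```

Tools such as jModeltest also count branch lengths as parameters and can report corrected variants of the criterion; those details are left out of this sketch.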

For the phylogenetic analysis of the TIPTOP DNA repeats, we again performed a maximum likelihood tree search using RAxML v.7.2.6 on the CIPRES cluster. We aligned the repeats using the pairwise aligning function in MacClade v. 4.08 (for sequences and alignment, see Supplemental Data Set 1 online) and chose the general time-reversible + Γ model (six general time-reversible substitution rates, assuming gamma rate heterogeneity), with model parameters estimated over the duration of specified runs. The best maximum likelihood tree was obtained using the rapid bootstrap algorithm (RAxML option: -f a).

Accession Numbers

Sequence data from this article can be found in the EMBL/GenBank data libraries under accession numbers HQ853490 for M. cochinchinensis TIPTOP1, HQ853491 for TIPTOP2, HQ853492 for TIPTOP3, and JN819554 for M. sphaeroidea TIPTOP2 gDNA. Arabidopsis Genome Initiative locus identifiers referred to include At2g25940 (AEP1, α-vacuolar processing enzyme [VPE]); At1g62710 (AEP2, β-VPE); At4g32940 (AEP3, γ-VPE); At3g20210 (AEP4, δ-VPE); and At2g25890 (OLEOSIN). The atomic coordinates of TI-5 were deposited at Protein Data Bank (2LJS), and NMR restraints were deposited at Biological Magnetic Resonance Data Bank (17956).

Supplemental Data

The following materials are available in the online version of this article.

  • Supplemental Figure 1. Alignment of TIPTOP Repeating Domains.

  • Supplemental Figure 2. PCR with cDNA and gDNA Favor Amplification of TIPTOP1 to TIPTOP3.

  • Supplemental Figure 3. MS Data for TI-4.

  • Supplemental Figure 4. MS Data for TI-5.

  • Supplemental Figure 5. MS Data for TI-6.

  • Supplemental Figure 6. MS Data for TI-8.

  • Supplemental Figure 7. Analysis of the TIPTOP1 to TIPTOP3 Repeats.

  • Supplemental Figure 8. The Statistical Analysis of the Folding Free Energy of 50-mers Indicates the Imperfect Palindromes in TIPTOP2 Are Not Significant.

  • Supplemental Figure 9. A Wider Mass Range for the MALDI Spectra Shown Zoomed in for Figure 5B.

  • Supplemental Figure 10. Fully Labeled LC-MS Profile of Peptide Extracts from TIPTOP2 Expressing Arabidopsis.

  • Supplemental Figure 11. Angiosperm Phylogeny Based on rbcL Sequences.

  • Supplemental Table 1. MS/MS Product Ions for TI-4, TI-5, TI-6, and TI-8.

  • Supplemental Table 2. NMR and Refinement Statistics for MCoTI-V.

  • Supplemental Table 3. Primer Sequences Used in This Study.

  • Supplemental Table 4. Cucumis Sequences Supporting the Protein Sequence for Melon1.

  • Supplemental Table 5. Cucumis Sequences Supporting the Protein Sequence for Melon2.

  • Supplemental Table 6. Species Used for Placement of Families for Angiosperm Phylogeny.

  • Supplemental Data Set 1. NEXUS Format Text File of the Sequences and Alignment Used for the Phylogenetic Analysis of TIPTOP Repeats Shown in Supplemental Figure 7 Online.

  • Supplemental Data Set 2. NEXUS Format Text File of the rbcL Sequences and Alignment Used to Generate the Angiosperm Phylogeny Shown in Figure 6 and Supplemental Figure 11 Online.

Acknowledgments

We thank Ikuko Hara-Nishimura for aep null mutant seeds and Amy Argyros for technical assistance. This study was supported by a National Health and Medical Research Council grant (APP1009267). J.S.M. is an Australian Research Council Queen Elizabeth II Fellow (DP0879133) and The John S. Mattick Fellow. N.L.D. is a Queensland Smart State Fellow. D.J.C. is a National Health and Medical Research Council Professorial Fellow.

AUTHOR CONTRIBUTIONS

J.S.M. and D.J.C. designed research; J.S.M., L.Y.C., A.H.C., N.L.D., P.N., and L.C. performed research; J.S.M., L.Y.C., and N.L.D. analyzed data; H.S. provided materials and phylogenetic analyses; T.L.B. performed folding free energy analyses; J.S.M. and D.J.C. wrote the article.

Footnotes

  • The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantcell.org) is: David J. Craik (d.craik@imb.uq.edu.au).

  • www.plantcell.org/cgi/doi/10.1105/tpc.112.099085

  • [C] Some figures in this article are displayed in color online but in black and white in the print edition.

  • [W] Online version contains Web-only data.

Glossary

TI: trypsin inhibitor
AEP: asparaginyl endopeptidase
ER: endoplasmic reticulum
UTR: untranslated region
LC-MS: liquid chromatography–mass spectrometry
MS/MS: tandem mass spectrometry
ORF: open reading frame
MALDI: matrix-assisted laser desorption/ionization
MS: mass spectrometry
gDNA: genomic DNA
VPE: vacuolar processing enzyme

  • Received April 12, 2012.
  • Revised May 28, 2012.
  • Accepted June 29, 2012.
  • Published July 20, 2012.

References

  1. Akaike H. (1974). A new look at the statistical model identification. IEEE Trans. Automat. Contr. 19: 716–723.
  2. Avrutina O., Schmoldt H.-U., Gabrijelcic-Geiger D., Le Nguyen D., Sommerhoff C.P., Diederichsen U., Kolmar H. (2005). Trypsin inhibition by macrocyclic and open-chain variants of the squash inhibitor MCoTI-II. Biol. Chem. 386: 1301–1306.
  3. Bailey T.L., Elkan C. (1994). Fitting a mixture model by expectation maximization to discover motifs in biopolymers. In Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, R. Altman, D. Brutlag, P. Karp, R. Lathrop, and D. Searls, eds (Menlo Park, CA: AAAI Press), pp. 28–36.
  4. Bergmann M., Fruton J.S. (1938). Some synthetic and hydrolytic experiments with chymotrypsin. J. Biol. Chem. 124: 321–329.
  5. Björklund Å.K., Ekman D., Elofsson A. (2006). Expansion of protein domain repeats. PLoS Comput. Biol. 2: e114.
  6. Brünger A.T., Adams P.D., Rice L.M. (1997). New applications of simulated annealing in X-ray crystallography and solution NMR. Structure 5: 325–336.
  7. Camarero J.A., Kimura R.H., Woo Y.-H., Shekhtman A., Cantor J. (2007). Biosynthesis of a fully functional cyclotide inside living bacterial cells. ChemBioChem 8: 1363–1366.
  8. Chan L.Y., Wang C.K., Major J.M., Greenwood K.P., Lewis R.J., Craik D.J., Daly N.L. (2009). Isolation and characterization of peptides from Momordica cochinchinensis seeds. J. Nat. Prod. 72: 1453–1458.
  9. Chiche L., Heitz A., Gelly J.C., Gracy J., Chau P.T., Ha P.T., Hernandez J.F., Le-Nguyen D. (2004). Squash inhibitors: From structural motifs to macrocyclic knottins. Curr. Protein Pept. Sci. 5: 341–349.
  10. Clark R.J., Fischer H., Dempster L., Daly N.L., Rosengren K.J., Nevin S.T., Meunier F.A., Adams D.J., Craik D.J. (2005). Engineering stable peptide toxins by means of backbone cyclization: Stabilization of the alpha-conotoxin MII. Proc. Natl. Acad. Sci. USA 102: 13767–13772.
  11. Clark R.J., Jensen J., Nevin S.T., Callaghan B.P., Adams D.J., Craik D.J. (2010). The engineering of an orally active conotoxin for the treatment of neuropathic pain. Angew. Chem. Int. Ed. Engl. 49: 6545–6548.
  12. Claverie J.-M., Ogata H. (2003). The insertion of palindromic repeats in the evolution of proteins. Trends Biochem. Sci. 28: 75–80.
  13. Condie J.A., Nowak G., Reed D.W., Balsevich J.J., Reaney M.J.T., Arnison P.G., Covello P.S. (2011). The biosynthesis of Caryophyllaceae-like cyclic peptides in Saponaria vaccaria L. from DNA-encoded precursors. Plant J. 67: 682–690.
  14. Craik D.J. (2001). Plant cyclotides: Circular, knotted peptide toxins. Toxicon 39: 1809–1813.
  15. Craik D.J., Daly N.L., Bond T., Waine C. (1999). Plant cyclotides: A unique family of cyclic and knotted proteins that defines the cyclic cystine knot structural motif. J. Mol. Biol. 294: 1327–1336.
  16. Craik D.J., Daly N.L., Waine C. (2001). The cystine knot motif in toxins and implications for drug design. Toxicon 39: 43–60.
  17. Craik D.J., Mylne J.S., Daly N.L. (2010). Cyclotides: Macrocyclic peptides with applications in drug design and agriculture. Cell. Mol. Life Sci. 67: 9–16.
  18. Daly N.L., Clark R.J., Plan M.R., Craik D.J. (2006). Kalata B8, a novel antiviral circular protein, exhibits conformational flexibility in the cystine knot motif. Biochem. J. 393: 619–626.
  19. Do C.B., Woods D.A., Batzoglou S. (2006). CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics 22: e90–e98.
  20. Dutton J.L., Renda R.F., Waine C., Clark R.J., Daly N.L., Jennings C.V., Anderson M.A., Craik D.J. (2004). Conserved structural and sequence elements implicated in the processing of gene-encoded circular proteins. J. Biol. Chem. 279: 46858–46867.
  21. Felizmenio-Quimio M.E., Daly N.L., Craik D.J. (2001). Circular proteins in plants: Solution structure of a novel macrocyclic trypsin inhibitor from Momordica cochinchinensis. J. Biol. Chem. 276: 22875–22882.
  22. Felsenstein J. (1973). Maximum likelihood and minimum-steps methods for estimating evolutionary trees from data on discrete characters. Syst. Zool. 22: 240–249.
  23. García-Lorenzo M., Sjödin A., Jansson S., Funk C. (2006). Protease gene families in Populus and Arabidopsis. BMC Plant Biol. 6: 30.
  24. Gillon A.D., Saska I., Jennings C.V., Guarino R.F., Craik D.J., Anderson M.A. (2008). Biosynthesis of circular proteins in plants. Plant J. 53: 505–515.
  25. Göransson U., Luijendijk T., Johansson S., Bohlin L., Claeson P. (1999). Seven novel macrocyclic polypeptides from Viola arvensis. J. Nat. Prod. 62: 283–286.
  26. Gran L. (1970). An oxytocic principle found in Oldenlandia affinis DC. Medd. Nor. Farm. Selsk. 12: 173–180.
  27. Greenwood K.P., Daly N.L., Brown D.L., Stow J.L., Craik D.J. (2007). The cyclic cystine knot miniprotein MCoTI-II is internalized into cells by macropinocytosis. Int. J. Biochem. Cell Biol. 39: 2252–2264.
  28. Gruis D., Schulze J., Jung R. (2004). Storage protein accumulation in the absence of the vacuolar processing enzyme family of cysteine proteases. Plant Cell 16: 270–290.
  29. Hara-Nishimura I., Inoue K., Nishimura M. (1991). A unique vacuolar processing enzyme responsible for conversion of several proprotein precursors into the mature forms. FEBS Lett. 294: 89–93.
  30. Hara-Nishimura I., Takeuchi Y., Inoue K., Nishimura M. (1993). Vesicle transport and processing of the precursor to 2S albumin in pumpkin. Plant J. 4: 793–800.
  31. Heitz A., Hernandez J.F., Gagnon J., Hong T.T., Pham T.T., Nguyen T.M., Le-Nguyen D., Chiche L. (2001). Solution structure of the squash trypsin inhibitor MCoTI-II. A new family for cyclic knottins. Biochemistry 40: 7973–7983.
  32. Hernandez J.F., Gagnon J., Chiche L., Nguyen T.M., Andrieu J.P., Heitz A., Trinh Hong T., Pham T.T., Le Nguyen D. (2000). Squash trypsin inhibitors from Momordica cochinchinensis exhibit an atypical macrocyclic structure. Biochemistry 39: 5722–5730.
  33. Hiraiwa N., Nishimura M., Hara-Nishimura I. (1999). Vacuolar processing enzyme is self-catalytically activated by sequential removal of the C-terminal and N-terminal propeptides. FEBS Lett. 447: 213–216.
  34. Hofacker I.L., Fontana W., Stadler P.F., Bonhoeffer L.S., Tacker M., Schuster P. (1994). Fast folding and comparison of RNA secondary structures. Monatsh. Chem. 125: 167–188.
  35. Hutchinson E.G., Thornton J.M. (1996). PROMOTIF–a program to identify and analyze structural motifs in proteins. Protein Sci. 5: 212–220.
  36. Ikeya T., Terauchi T., Güntert P., Kainosho M. (2006). Evaluation of stereo-array isotope labeling (SAIL) patterns for automated structural analysis of proteins with CYANA. Magn. Reson. Chem. 44: S152–S157.
  37. Jennings C., West J., Waine C., Craik D., Anderson M. (2001). Biosynthesis and insecticidal properties of plant cyclotides: The cyclic knotted proteins from Oldenlandia affinis. Proc. Natl. Acad. Sci. USA 98: 10614–10619.
  38. Kuroyanagi M., Yamada K., Hatsugai N., Kondo M., Nishimura M., Hara-Nishimura I. (2005). Vacuolar processing enzyme is essential for mycotoxin-induced cell death in Arabidopsis thaliana. J. Biol. Chem. 280: 32914–32920.
  39. Laskowski R.A., Rullmann J.A., MacArthur M.W., Kaptein R., Thornton J.M. (1996). AQUA and PROCHECK-NMR: Programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR 8: 477–486.
  40. Ling M.H., Qi H.Y., Chi C.W. (1993). Protein, cDNA, and genomic DNA sequences of the towel gourd trypsin inhibitor. A squash family inhibitor. J. Biol. Chem. 268: 810–814.
  41. Luckett S., Garcia R.S., Barker J.J., Konarev A.V., Shewry P.R., Clarke A.R., Brady R.L. (1999). High-resolution structure of a potent, cyclic proteinase inhibitor from sunflower seeds. J. Mol. Biol. 290: 525–533.
  42. Marcotte E.M., Pellegrini M., Yeates T.O., Eisenberg D. (1999). A census of protein repeats. J. Mol. Biol. 293: 151–160.
  43. Min W., Jones D.H. (1994). In vitro splicing of concanavalin A is catalyzed by asparaginyl endopeptidase. Nat. Struct. Biol. 1: 502–504.
  44. Mylne J.S., Colgrave M.L., Daly N.L., Chanson A.H., Elliott A.G., McCallum E.J., Jones A., Craik D.J. (2011). Albumins and their processing machinery are hijacked for cyclic peptides in sunflower. Nat. Chem. Biol. 7: 257–259.
  45. Nguyen G.K., Zhang S., Nguyen N.T., Nguyen P.Q., Chiu M.S., Hardjojo A., Tam J.P. (2011). Discovery and characterization of novel cyclotides originated from chimeric precursors consisting of albumin-1 chain a and cyclotide domains in the Fabaceae family. J. Biol. Chem. 286: 24275–24287.
  46. Ogata H., Audic S., Barbe V., Artiguenave F., Fournier P.-E., Raoult D., Claverie J.-M. (2000). Selfish DNA in protein-coding genes of Rickettsia. Science 290: 347–350.
  47. Parmenter D.L., Boothe J.G., van Rooijen G.J., Yeung E.C., Moloney M.M. (1995). Production of biologically active hirudin in plant seeds using oleosin partitioning. Plant Mol. Biol. 29: 1167–1180.
  48. Patterson C. (1982). Morphological characters and homology. In Problems of Phylogenetic Reconstruction, K. Joysey and A. Friday, eds (London: Academic Press), pp. 21–74.
  49. Posada D. (2008). jModelTest: Phylogenetic model averaging. Mol. Biol. Evol. 25: 1253–1256.
  50. Poth A.G., Colgrave M.L., Lyons R.E., Daly N.L., Craik D.J. (2011). Discovery of an unusual biosynthetic origin for circular proteins in legumes. Proc. Natl. Acad. Sci. USA 108: 10127–10132.
  51. Poth A.G., Mylne J.S., Grassl J., Lyons R.E., Millar A.H., Colgrave M.L., Craik D.J. (June 14, 2012). Cyclotides associate with leaf vasculature and are the products of a novel precursor in Petunia (Solanaceae). J. Biol. Chem. http://dx.doi.org/10.1074/jbc.M112.370841.
  52. Saska I., Gillon A.D., Hatsugai N., Dietzgen R.G., Hara-Nishimura I., Anderson M.A., Craik D.J. (2007). An asparaginyl endopeptidase mediates in vivo protein backbone cyclization. J. Biol. Chem. 282: 29721–29728.
  53. Schaefer H., Renner S.S. (2010). A three-genome phylogeny of Momordica (Cucurbitaceae) suggests seven returns from dioecy to monoecy and recent long-distance dispersal to Asia. Mol. Phylogenet. Evol. 54: 553–560.
  54. Schilling S., Wasternack C., Demuth H.-U. (2008). Glutaminyl cyclases from animals and plants: A case of functionally convergent protein evolution. Biol. Chem. 389: 983–991.
  55. Schmidt E.E., Davies C.J. (2007). The origins of polypeptide domains. Bioessays 29: 262–270.
  56. Shimada T., et al. (2003). Vacuolar processing enzymes are essential for proper processing of seed storage proteins in Arabidopsis thaliana. J. Biol. Chem. 278: 32292–32299.
  57. Stamatakis A., Hoover P., Rougemont J. (2008). A rapid bootstrap algorithm for the RAxML Web servers. Syst. Biol. 57: 758–771.
  58. Thongyoo P., Bonomelli C., Leatherbarrow R.J., Tate E.W. (2009). Potent inhibitors of β-tryptase and human leukocyte elastase based on the MCoTI-II scaffold. J. Med. Chem. 52: 6197–6200.
  59. Thongyoo P., Jaulent A.M., Tate E.W., Leatherbarrow R.J. (2007). Immobilized protease-assisted synthesis of engineered cysteine-knot microproteins. ChemBioChem 8: 1107–1109.
  60. Yoon H.-S., Baum D.A. (2004). Transgenic study of parallelism in plant morphological evolution. Proc. Natl. Acad. Sci. USA 101: 6524–6529.
Cyclic Peptides Arising by Evolutionary Parallelism via Asparaginyl-Endopeptidase–Mediated Biosynthesis
Joshua S. Mylne, Lai Yue Chan, Aurelie H. Chanson, Norelle L. Daly, Hanno Schaefer, Timothy L. Bailey, Philip Nguyencong, Laura Cascales, David J. Craik
The Plant Cell Jul 2012, 24 (7) 2765-2778; DOI: 10.1105/tpc.112.099085

Copyright © 2019 by The American Society of Plant Biologists
