American Society of Plant Biologists
Deductions about the Number, Organization, and Evolution of Genes in the Tomato Genome Based on Analysis of a Large Expressed Sequence Tag Collection and Selective Genomic Sequencing
a Department of Plant Breeding and Department of Plant Biology, Cornell University, Ithaca, New York 14850 1 To whom correspondence should be addressed. E-mail sdt4{at}cornell.edu; fax 607-255-6683
Analysis of a collection of 120,892 single-pass ESTs, derived from 26 different tomato cDNA libraries and reduced to a set of 27,274 unique consensus sequences (unigenes), revealed that 70% of the unigenes have identifiable homologs in the Arabidopsis genome. Genes corresponding to metabolism have remained most conserved between these two genomes, whereas genes encoding transcription factors are among the fastest evolving. The majority of the 10 largest conserved multigene families share similar copy numbers in tomato and Arabidopsis, suggesting that the multiplicity of these families may have occurred before the divergence of these two species. An exception to this multigene conservation was observed for the E8-like protein family, which is associated with fruit ripening and has higher copy number in tomato than in Arabidopsis. Finally, six BAC clones from different parts of the tomato genome were isolated, genetically mapped, sequenced, and annotated. The combined analysis of the EST database and these six sequenced BACs leads to the prediction that the tomato genome encodes 35,000 genes, which are sequestered largely in euchromatic regions corresponding to less than one-quarter of the total DNA in the tomato nucleus.
Currently, the only plant genome to have been sequenced fully is that of Arabidopsis, a major milestone for plant biology. The availability of this sequence provides us with a detailed view of the gene content and genome organization of one plant species. Yet, the degree to which gene content, gene number, and genome organization are conserved among plant species remains unresolved. To answer these questions and to allow us to begin to understand the forces that have shaped plant genome evolution will require the sequencing of multiple plant genomes. Because of the relatively large size of most plant genomes and the associated high cost of sequencing, it is unlikely that we will have the full genomic sequence for many plant species in the near future.
A less expensive alternative is to sequence or partially sequence cDNA clones, which can reveal a substantial portion of the expressed genes of a genome at a fraction of the cost of genomic sequencing. As a result, extensive EST efforts are under way in a wide variety of plant species (National Science Foundation Plant Genome Research Program [http://www.nsf.gov/bio/dbi/dbi_pgr.htm]; Pennisi, 1998).
Solanaceae, the nightshade family, is the third most valuable crop family in the United States, exceeded only by the grasses and the legumes, and is the most valuable family in terms of vegetable crops. In addition to its economic value, the family is unique with respect to the number of species that have been domesticated and the wide variety of uses to which they have been put. Solanaceous species have been domesticated for edible fruit (tomato, eggplant, pepper, tomatillo, and tamarindo), leafy vegetables (S. macrocarpon in Africa), tubers (potato), secondary compounds (tobacco), and ornamental flowers (petunia, Nicotiana spp). Tomato is the centerpiece for genetic and molecular research for the Solanaceae, attributable in part to inherent features of the species, including diploidy, modestly sized genome (950 Mb), tolerance of inbreeding, amenability to genetic transformation, and the availability of well-characterized genetic resources. Through a National Science Foundation-funded project, we have generated a database for tomato comprising >120,000 ESTs (http://sgn.cornell.edu/; http://www.tigr.org/tdb/lgi). In addition, BAC clones corresponding to six selected regions of the tomato genome were sequenced. In this report, we describe the analysis of both the tomato EST database and the BAC sequences. Computational comparisons are made against the Arabidopsis genomic sequence and a similar high-density EST database from another dicot species, Medicago truncatula (http://www.tigr.org/tdb/mtgi/).
As a result of these analyses, we have been able to address a number of issues, including the content, number, and organization of genes in the tomato genome and the degree to which genes have diverged since tomato, Arabidopsis, and M. truncatula diverged from their last common ancestor.
Contig Assembly of ESTs and Establishment of a Tomato Unigene Set
EST data sets of randomly sequenced cDNA libraries are redundant for many gene transcripts. This redundancy approximately represents gene transcript levels in the tissues that were used for library construction and can be used to assemble ESTs into contiguous overlapping clusters, with each cluster potentially representing a single unique gene. A substantial number of the low-frequency transcripts occur as single ESTs (singletons) and hence are not incorporated into contig assemblies. The combined set of contigs and singletons is referred to as a unigene set. This unigene set is believed to represent the minimal gene content for a species, with the caveat that in certain instances multiple unigenes could represent a single gene transcript, for example, as a result of nonoverlapping EST sequences.
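The relationship between ESTs, contigs, singletons, and the unigene set can be sketched in a few lines. The sketch below assumes that pairwise overlaps between ESTs have already been detected by some upstream step; the function name `build_unigene_set` and the toy EST identifiers are hypothetical, and simple single-linkage grouping is used here for illustration only, not as a description of the clustering procedure actually applied.

```python
from collections import defaultdict

def build_unigene_set(est_ids, overlaps):
    """Group ESTs into clusters given pairwise overlaps (single linkage).

    est_ids: iterable of EST identifiers.
    overlaps: iterable of (est_a, est_b) pairs judged to overlap.
    Returns (contigs, singletons): clusters with two or more ESTs,
    and ESTs that overlap with nothing.
    """
    parent = {e: e for e in est_ids}

    def find(x):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of every overlapping pair.
    for a, b in overlaps:
        parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for e in parent:
        clusters[find(e)].append(e)

    contigs = [sorted(members) for members in clusters.values() if len(members) > 1]
    singletons = [members[0] for members in clusters.values() if len(members) == 1]
    return contigs, singletons

# Toy input (hypothetical identifiers): three overlapping ESTs and two lone ones.
ests = ["est1", "est2", "est3", "est4", "est5"]
pairs = [("est1", "est2"), ("est2", "est3")]
contigs, singletons = build_unigene_set(ests, pairs)
unigene_count = len(contigs) + len(singletons)  # contigs plus singletons
```

In this toy case the five ESTs collapse to one contig and two singletons, so the unigene count is smaller than the EST count, which is the kind of reduction the unigene set is meant to capture.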
In this study, a high stringency for matching was applied in the clustering to ensure a high level of confidence that each sequence in the unigene set represents a unique gene transcript. The specifications for clustering and unigene construction were as described in Quackenbush et al. (2000). From each library, between 2000 and 10,000 directional clones were sequenced from the 5' end; in addition, 5998 3' end sequences from a flower tissue library were included. Details of the individual libraries are available through the Solanaceae Genome Network. The data set of 120,892 ESTs was reduced to 27,274 unigenes, comprising approximately equal numbers of contigs (also referred to as "tentative consensus" sequences) and singletons (Table 1). The contig sequence length ranged from 107 to 3285 bp, with an average of 823 bp, and the singleton length ranged from 101 to 823 bp, with an average of 447 bp (Figure 1).
Functional Annotation of the Unigene Set
Annotation of the EST-derived unigene set was approached in two ways. First, a surrogate annotation approach was applied in which the unigene set was annotated on the basis of the existing annotation available for the proteome of Arabidopsis. BLASTX was used to screen the entire tomato unigene set against the subset of the Arabidopsis proteome to which functional categories have been assigned (Arabidopsis Genome Initiative, 2000; http://www.Arabidopsis.org). Tomato unigenes with an expect value (E-value) of <1.0 E-10 were assigned to the corresponding Arabidopsis annotation. In doing so, the assumption was made that functionality is transferable based on sequence conservation, to which there are many exceptions. Annotation followed the Munich Information Center for Protein Sequences (http://mips.gsf.de) role categorization.
A total of 65% of the unigenes did not have a significant Arabidopsis match at this threshold and thus were considered "unclassified." Another 5% had Arabidopsis matches, but their matching genes were listed as "classification unclear." Thus, only 30% of the 27,274 tomato unigenes were assigned a putative function using this method (Figure 2). As a control, a subset of
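The surrogate annotation step amounts to keeping, for each unigene, its best hit below the E-value cutoff and copying that hit's role category. The following is a minimal sketch under stated assumptions: hits are supplied as plain tuples rather than parsed from real BLASTX output, the function name `transfer_annotation` and all identifiers are hypothetical, and labeling hits without a listed category as "classification unclear" is an illustrative choice.

```python
def transfer_annotation(unigene_ids, hits, categories, max_evalue=1e-10):
    """Assign each unigene the role category of its best qualifying hit.

    unigene_ids: all unigene identifiers to annotate.
    hits: iterable of (unigene_id, subject_id, evalue) tuples.
    categories: dict mapping subject_id to a role category.
    max_evalue: hits must have an E-value strictly below this cutoff.
    """
    best = {}  # unigene_id -> (subject_id, evalue) of the lowest E-value hit
    for unigene, subject, evalue in hits:
        if evalue >= max_evalue:
            continue  # not significant at this threshold
        if unigene not in best or evalue < best[unigene][1]:
            best[unigene] = (subject, evalue)

    annotation = {}
    for unigene in unigene_ids:
        if unigene in best:
            # Assumption: a hit without a listed category is "classification unclear".
            annotation[unigene] = categories.get(best[unigene][0], "classification unclear")
        else:
            annotation[unigene] = "unclassified"
    return annotation

# Toy data with made-up identifiers.
toy_hits = [
    ("U1", "AtX", 1e-40),
    ("U1", "AtY", 1e-12),
    ("U2", "AtY", 1e-3),   # too weak to pass the cutoff
]
toy_categories = {"AtX": "metabolism", "AtY": "transcription"}
annotation = transfer_annotation(["U1", "U2", "U3"], toy_hits, toy_categories)
```

Here U1 inherits the category of its strongest hit, while U2 and U3 end up "unclassified" because neither has a hit below the cutoff. Transferring function on sequence similarity alone carries the same caveat stated in the text.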
This set of genes, referred to as conserved ortholog set markers, was annotated based on matches against the GenBank protein database (http://www.ncbi.nlm.nih.gov/Database/index.html). The assignment of functional categories was very similar for the entire unigene set and for the conserved ortholog set markers that were annotated manually, providing support for the surrogate annotation approach used for the entire unigene set (data not shown).
The largest proportion of functionally assigned unigenes fell into four role categories (r.c.): metabolism (r.c. 1), transcription (r.c. 4), cellular organization (r.c. 30), and cellular communication/signal transduction (r.c. 10). Together, these classes accounted for more than half of the assignable unigenes (Figure 2). These categories also are the largest for the Arabidopsis proteome and may represent a general tendency for all plant species (Arabidopsis Genome Initiative, 2000).
Comparing Tomato Gene Content with That of Other Plant Species
With these issues in mind, we computationally compared the tomato unigene set with the gene repertoire of Arabidopsis and M. truncatula. Tomato, Arabidopsis, and M. truncatula each belongs to a different plant family (Solanaceae, Brassicaceae, and Leguminosae, respectively). However, M. truncatula and Arabidopsis are much more closely related to each other than to tomato, which diverged from Arabidopsis and M. truncatula as much as 150 million years ago, early in the period of dicot diversification (Yang et al., 1999).
Tomato-Arabidopsis Comparisons
Nonetheless, because the E-values are summarized over a large number of sequence comparisons and used primarily to reveal general trends in the conservation of sequence and functionality, such drawbacks are unlikely to affect the overall conclusions. The analysis was made directly against the Arabidopsis genomic sequence (rather than the predicted proteome), so that genes previously unidentified in the Arabidopsis genome also could be detected via homology with tomato ESTs. Figure 3 displays the distribution of E-value matches in conjunction with functional role categories for the tomato unigene set. Nearly 70% of the tomato unigenes have significant matches at the amino acid level (E-value < 1.0 E-5) to one or more translated portions of the Arabidopsis genomic sequence. The majority (52%) of the tomato unigenes with matches to the Arabidopsis genome hit Arabidopsis genes for which no putative functions have been assigned. The highest proportion of these fell into categories that showed the weakest homology with their Arabidopsis counterparts. For example, for those unigenes that had weak homology with their Arabidopsis counterparts, 80% matched unclassified Arabidopsis genes (Figure 3). In contrast, for unigenes that had high homology (tBLASTX E-value < 1.0 E-100) with their Arabidopsis counterparts, only 20% matched unclassified Arabidopsis genes (Figure 3).
To further analyze the nature of both fast- and slow-evolving genes identified by the tomato-Arabidopsis comparisons, we simultaneously examined more closely the putative functional role and the degree of sequence similarity (to its closest Arabidopsis counterpart) of each tomato unigene (Figure 3). The goal of this exercise was to determine whether certain functional classes of genes have evolved more rapidly since tomato and Arabidopsis diverged from their last common ancestor. Such information might provide clues to which types of genes/gene functions are more constant across plant taxa (more ancestral gene functions) and which genes/gene functions tend to evolve rapidly as species evolve (more derived gene functions). A summary of this analysis is presented below.
Of the >27,000 tomato unigenes, 22% show very high conservation (E-value < 1.0 E-50) with Arabidopsis genes (Figure 3). Within this "slow-evolving" category, by far the highest proportion (24%) belonged to the metabolism category (r.c. 1; Figure 3). The proportion of genes assigned to this category decreased to 19 and 12%, respectively, as one moved to the "intermediate-evolving" (E < 1.0 E-15 to E
Identification of Tomato-Specific Genes
A large proportion of these 114 sequences (75%) revealed perfect matches with Escherichia coli DNA, bacteriophage Only 28 of the original 114 unigenes had no detectable counterpart in the Arabidopsis genome but matched genes from other species (as present in the GenBank protein database) (Table 2). Eleven of these unigenes (39%) corresponded to three gene families that appear to be specific to Solanaceae, having matches with other solanaceous plants but not with other plant families. These three Solanaceae-unique gene families are type II proteinase inhibitors (TC67527; six unigenes assigned), fruit-specific proteins/metallocarboxypeptidase inhibitors (TC63650; two unigenes assigned), and extensin-like proteins (TC63390; three unigenes assigned).
These three gene families appear to be specific to the Solanaceae; however, we cannot determine from these data whether these genes were lost in the Arabidopsis lineage or subjected to accelerated evolution (hence, their uniqueness to the Solanaceae lineage). Proteinase inhibitors (type II and metallocarboxypeptidase inhibitor) generally are known to be involved in resistance against herbivory (Johnson et al., 1989).
The other 17 tomato unigenes (not found in Arabidopsis but found in other species) had matches not only in solanaceous species (a member of the asterid clade) but also in species belonging to the rosid clade, to which Arabidopsis belongs (Chase et al., 1993).
Polyphenoloxidases (TC58703) generally are known to be involved in resistance against herbivory and are found in many plant species, including many rosids (such as apple and bean); therefore, they appear to have been lost specifically from the Arabidopsis genome or ancestral lineage (Cary et al., 1992).
Ornithine decarboxylase (TC67742) catalyzes the second step in putrescine biosynthesis from the amino acid Arg. Putrescine is a precursor for the biosynthesis of the polyamines spermine and spermidine, which are essential for the growth and development of plants. Interestingly, the pathway from L-Arg to putrescine involving ornithine decarboxylase is redundant in many plants with a pathway involving Arg decarboxylase (Kumar et al., 1997).
TC59945, TGSAY39TH, and TC69096 are tomato unigenes with a high degree of similarity to genes from species belonging to the rosid clade but lack matches in the Arabidopsis genome. They share high similarity with pathogenesis-related genes identified previously in potato and tomato (pSTH2 and TS-1; Matton and Brisson, 1989). TOVCB02THB shares considerable sequence similarity with a range of transcription factors. Matches are found with tobacco and maize and several nonplant eukaryotic genes, including mammalian species and Drosophila. This may represent a case in which genes were lost in the Arabidopsis lineage. TC59463 and TC66179 do not display matches with species in the rosid clade but have matches outside of the plant kingdom, which suggests a more complicated ancestry. TC59463 is highly similar to a pararetroviral sequence integrated into the tobacco genome. Integration of this type of pararetrovirus may be specific to members of the Solanaceae simply because of the host range for these viruses. TC66179 appears to be highly conserved in a rice gene and has some weak sequence similarity with a gene from human (E-value = 5 E-5), but no other matches with plant species were identified.
Evolutionary Comparison of Tomato Unigenes with Those in Arabidopsis and M. truncatula
A definitive answer to this question awaits the full genomic sequencing of a wide variety of species throughout the evolutionary tree of plants. However, in an attempt to elucidate this issue, we wondered if genes that are highly conserved between tomato and Arabidopsis also are highly conserved in other plants and whether genes that have diverged rapidly since tomato and Arabidopsis diverged from their last common ancestor are likely to have evolved rapidly in other plant lineages. To attempt to address these questions, we computationally compared the tomato unigene set not only with that of Arabidopsis but also with the comprehensive EST data set now available for M. truncatula, which belongs to a third dicot family, Leguminosae (http://www.tigr.org/tdb/mtgi). The entire tomato unigene set was compared with the entire M. truncatula gene index (http://www.tigr.org) at the amino acid level using tBLASTX. Figure 4 depicts these results in a manner whereby the similarity of tomato-M. truncatula matches can be compared with the similarity of tomato-Arabidopsis matches.
Approximately 90% of the genes that were most conserved between tomato and Arabidopsis (tBLASTX E-value < 1.0 E-100) have a highly significant detectable counterpart in M. truncatula with a tBLASTX E-value threshold of <1.00 E-20 (Figure 4). As one moves down to sets of genes that are less well conserved between Arabidopsis and tomato, the proportion with significant matches to M. truncatula genes decreases dramatically (Figure 4).
These results strongly suggest that genes well conserved between a given pair of plant species also are likely to be well conserved in other plant species, presumably as a result of strong negative selection pressure to maintain essential, and hence less mutable, gene functions (e.g., basic metabolism; see above). These results also support the notion that highly conserved orthologs detected in pairwise species comparisons will have similar conserved orthologs in other plant genomes, a finding that has implications for both comparative gene mapping and molecular phylogenetic studies in plants (Fulton et al., 2002). Finally, it is worth noting that virtually no tomato unigenes displayed a match with M. truncatula but not with Arabidopsis. The only exception was TC59945, which matched a pathogenesis-related gene isolated from potato (pSTH2). Good matches for this particular protein also are found in other members of the rosid clade, such as Prunus species, cowpea, and birch. However, as mentioned above, this gene was not detected in the Arabidopsis genome; hence, it may have been lost relatively recently in the Arabidopsis lineage. However, we cannot dismiss the possibility that this gene exists in Arabidopsis in one of the gene-poor centromeric regions that have not been sequenced fully.
Characteristics of Tomato Multigene Families
To have comparable results for comparisons, the Arabidopsis gene set (available from TAIR at http://www.Arabidopsis.org) was subjected to the same analysis using the same threshold. Figure 5 presents the distribution of unigenes into gene families of various copy numbers for both the tomato and Arabidopsis genomes. It is important to note that, because the unigene set does not represent all tomato genes (especially those with low expression levels), the copy numbers of tomato unigenes may be underestimated, which would account also for the large excess of singletons in the tomato gene copy number distribution.
Comparison of Multigene Families between Tomato and Arabidopsis
To address the question of whether gene copy number in tomato is correlated with the degree of conservation with other species, the gene copy numbers for tomato unigenes were plotted against the conservation (as determined by tBLASTX) with their Arabidopsis counterparts. In other words, are conserved genes more or less likely to be multicopy than less conserved genes?
As described above, genes were assigned a gene copy number based on matches with an E-value threshold of <1.0 E-20 using tBLASTX. Of the 27,274 tomato unigenes,
A second question that can be addressed with these data is whether the copy number of highly conserved genes is conserved between tomato and Arabidopsis. In an attempt to answer this question, the copy number of each tomato gene family (as determined by tBLASTX E-values of <1.00 E-30) was plotted against the copy number of the corresponding Arabidopsis gene family. The results demonstrate a significant correlation (r = 0.49) between the copy numbers of Arabidopsis and tomato multigene families. The results from this analysis are depicted in the form of a histogram in Figure 7.
To illustrate these observations, the genes with the 10 highest copy numbers for both the tomato unigene set and Arabidopsis are listed in Tables 3 and 4. Protein kinases and cytochrome P450s represent the largest families in tomato and represent very large families in Arabidopsis as well. Other large families in common are genes with similarity to Myb-like transcription factors and NAC domain-containing proteins (Souer et al., 1996).
The only group of unigenes, represented by TC58771, that seems to have a higher copy number in tomato than in Arabidopsis (39 versus 20 copies) is a group composed of E8-like genes (23 copies with E8-like protein as best match) and smaller numbers of different oxidoreductases such as dioxygenases (five), hydroxylases (three), flavonol synthases (three), and 1-aminocyclopropane-1-carboxylate oxidases (five). Although the function of E8-like proteins is unknown, they share extensive similarity with numerous oxidoreductases and are known to be expressed specifically during fruit ripening (Deikman and Fischer, 1988).
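A correlation of this kind is computed over paired family sizes, one pair per gene family. The sketch below assumes that r denotes the Pearson coefficient, which the text does not state explicitly; the function name `pearson_r` and the toy copy numbers are illustrative and are not the data behind the reported value.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical family sizes in two genomes, for illustration only.
tomato_copies = [2, 3, 5, 8, 13]
arabidopsis_copies = [4, 6, 10, 16, 26]
r = pearson_r(tomato_copies, arabidopsis_copies)
```

Because the toy Arabidopsis values are exactly twice the tomato values, the coefficient here is at its maximum; real family-size data are far noisier, and a moderate value such as the one reported indicates a looser but still positive relationship.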
Analysis of Gene Content and Organization Based on Genomic Sequencing
Variation of Gene Density across BACs
The predicted gene density of the various BACs varied from 5 kb/gene to 17 kb/gene, a more than threefold difference (Table 5). For the two BACs with the lowest gene density (2O7 and 47I13), 12 putative genes were identified, only 1 of which had an exact match in the EST unigene set. By contrast, for the other four BACs with higher predicted gene densities, nearly half of the putative genes had corresponding unigene matches. Besides being low in gene density, 2O7 and 47I13 were the only BACs that contained transposon-type sequences, mostly reverse transcriptase sequences similar to copia- and gypsy-like retrotransposons.
Gene Number Estimate Based on BAC Sequences and the EST Database
This is almost certainly a gross overestimate of the gene density (9.8 kb/gene; see below), being much larger than the gene content of any of the recently completely sequenced eukaryotic genomes, all of which contain <40,000 protein-coding sequences (Arabidopsis Genome Initiative, 2000).
A more accurate estimate of the total gene content of tomato can be made by comparing the size of the EST-derived unigene set and the percentage of predicted genes in genomic DNA (e.g., BAC sequences) that are represented by a unigene match. As described above, the current EST-derived unigene set is composed of
Hence, the actual number of unique genes represented by an EST-derived unigene set usually is less than the number of unigenes. For example, in Arabidopsis, the EST-derived unigene set leads to a 35% overestimate of the actual number of genes ultimately revealed by genomic sequencing (Arabidopsis Genome Initiative, 2000).
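The logic of this estimate can be written as a short calculation: scale the unigene count up by the fraction of genomically predicted genes that have a unigene match, then scale it down to correct for unigene redundancy. The sketch below is one possible reading, not the exact computation used. The function name `estimate_gene_number` is hypothetical, the matched fraction of 0.5 is an assumption (chosen because the text describes the unigene set as covering roughly half of the genes), and treating the 35% overestimate as a multiplicative reduction of 35% is also an assumption.

```python
def estimate_gene_number(n_unigenes, matched_fraction, unigene_overestimate):
    """Extrapolate total gene number from an EST-derived unigene set.

    n_unigenes: size of the unigene set.
    matched_fraction: fraction of genomically predicted genes that have a
        unigene match (assumed value; see the accompanying text).
    unigene_overestimate: fraction by which unigene counts are reduced to
        correct for several unigenes representing one gene (assumed reading).
    """
    if not 0 < matched_fraction <= 1:
        raise ValueError("matched_fraction must be in (0, 1]")
    if not 0 <= unigene_overestimate < 1:
        raise ValueError("unigene_overestimate must be in [0, 1)")
    # Unigenes expected if every gene were represented in the EST collection.
    extrapolated_unigenes = n_unigenes / matched_fraction
    # Correct for unigenes that represent the same gene more than once.
    return extrapolated_unigenes * (1 - unigene_overestimate)

# 27,274 unigenes is from the text; 0.5 and the use of 0.35 are assumptions.
estimate = estimate_gene_number(27_274, matched_fraction=0.5, unigene_overestimate=0.35)
```

Under these assumptions the result falls close to the 35,000 genes predicted in the abstract. Other readings of the correction (for example, dividing by 1.35 instead) would give a somewhat higher figure, so the sketch should be taken as illustrating the structure of the argument only.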
How well gene number, gene organization, and gene function in Arabidopsis will predict those of other plant species is unknown at present. In this respect, tomato is a useful species for comparison. It belongs to a plant family (Solanaceae) that diverged from the lineage leading to Arabidopsis as much as 150 million years ago, early in the radiation of dicots. Thus, by identifying features conserved between the tomato and Arabidopsis genomes, one would expect to identify gene/genome features that might be conserved in other plants. However, because of the long period of divergence between these species, one might expect to reveal trends in gene and genome divergence, which also can be instructive about plant genome evolution. For these reasons, we analyzed and annotated a large EST data set and genomic sequences of BAC clones and used the results to make deductions about the composition and organization of the tomato genome. These analyses were made using Arabidopsis, and to a lesser degree M. truncatula, as points of comparison.
Characteristics of the Tomato Gene Repertoire
Examination of the tomato gene content also provides evidence for selective gene loss in the Arabidopsis lineage. For example, polyphenoloxidases and ornithine decarboxylase are found in tomato as well as many other plant taxa but not in Arabidopsis. Thus, Arabidopsis probably has lost some gene functions still retained in other plant species and hence is not a ready model for the exploration of such functions.
As a lower limit, we estimate that at least 50% of the tomato genes belong to multigene families. This estimate is based on the observation that approximately half of the EST-derived tomato unigenes have significant matches to one or more other unigenes. However, the actual proportion of tomato genes that belong to multigene families probably is larger than this because the unigene set is estimated to represent no more than half of the tomato genes. It is worth noting that previous studies based on probing of random cDNAs on genomic DNA gel blots led to the estimate that 47% of tomato genes belong to multigene families (Bernatzky and Tanksley, 1986).
Because DNA gel blot hybridization cannot readily detect genes with >30% nucleotide divergence, such estimates will be inherently less than those derived from computational comparisons that do not have this restriction. Overall, however, the percentage of genes that belong to multigene families in tomato does not appear to be significantly higher than that in Arabidopsis (65%). This observation lends support to the hypothesis that the evolutionary lineage leading to tomato did not experience any recent whole-genome duplication events (e.g., polyploidy). Rather, any whole-genome duplications occurred in the distant past, near the time that tomato and Arabidopsis diverged from their last common ancestor (Ku et al., 2000). The copy numbers of specific multigene families are correlated significantly between tomato and Arabidopsis, which may be a result of the duplication/diversification of these families before the divergence of the tomato and Arabidopsis lineages. Alternatively, selection pressure may have been exerted independently in each lineage to maintain a relatively stable copy number, even in the face of continuing duplication and deletion. An exception was found for the E8 gene family, whose functions are not well elucidated but that often is associated with tomato fruit development/ripening. In this instance, the E8 gene family is larger in tomato than in Arabidopsis and may reflect a more complex fruit development/ripening process in tomato compared with Arabidopsis.
Characteristics of Plant Gene Evolution as Deduced from Comparisons of Tomato, Arabidopsis, and M. truncatula
Although genes encoding metabolism appear to evolve more slowly, genes encoding transcription factors appear to diverge more rapidly among species. The transcription factor category nearly doubles in frequency as one moves from the slow-evolving to the fast-evolving category (Figure 3). This result suggests that changes in gene regulation (through the accelerated evolution of transcription factors) have been a significant force in plant evolution. The sequencing of the Arabidopsis genome revealed that plants have developed a number of transcription factor gene families that are unique to plants and not present in other eukaryotic lineages and that plants contain a significantly greater proportion of transcription factors that exhibit rapid evolution rates in regions outside the core conserved domains (Arabidopsis Genome Initiative, 2000).
The finding that transcription factors, as a group, evolve more rapidly than other classified groups of genes also is consistent with the idea that the morphological evolution of species is highly dependent on changes in the regulatory patterns of gene expression (King and Wilson, 1975).
Tomato Genes Probably Are Concentrated in Euchromatin, Which Represents the Minority of DNA in the Tomato Genome
The fact that the gene densities on all of these BACs are much higher than necessary to account for the estimated 35,000 tomato genes suggests that genes are contained on only a relatively small portion of the tomato chromosomes. Tomato chromosomes, like those of other species in the Solanaceae family, are composed of centromeric heterochromatin with more distal euchromatic regions (Khush and Rick, 1968).
From the current study, the two BACs with the lowest gene densities (15 and 17 kb/gene) probably are located in or near the centromeric heterochromatin and also contain numerous retrotransposon-like sequences, which are characteristic of the centromeric regions of Arabidopsis (Figure 8) (Arabidopsis Genome Initiative, 2000).
If 7 kb/gene is characteristic of euchromatin, which constitutes 23% of the tomato genome (Peterson et al., 1996
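The arithmetic behind this argument is straightforward to reproduce. The sketch below uses only figures quoted in the text (a 950-Mb genome, 23% euchromatin, 7 kb/gene in euchromatin, and the 9.8 kb/gene figure mentioned earlier); the function name `genes_in_fraction` is hypothetical, and applying a single density uniformly across a region is a simplifying assumption.

```python
GENOME_SIZE_KB = 950_000          # 950 Mb, as stated in the text
EUCHROMATIN_FRACTION = 0.23       # share of the genome that is euchromatin
EUCHROMATIN_KB_PER_GENE = 7       # density taken as typical of euchromatin
OVERALL_BAC_KB_PER_GENE = 9.8     # density quoted for the sequenced BACs

def genes_in_fraction(genome_kb, fraction, kb_per_gene):
    """Number of genes that fit in a fraction of the genome at a given density."""
    if kb_per_gene <= 0:
        raise ValueError("kb_per_gene must be positive")
    return genome_kb * fraction / kb_per_gene

# Genes that the euchromatin alone could hold at 7 kb/gene.
euchromatic_genes = genes_in_fraction(
    GENOME_SIZE_KB, EUCHROMATIN_FRACTION, EUCHROMATIN_KB_PER_GENE
)

# For comparison: the BAC-derived density applied to the whole genome.
uniform_genes = genes_in_fraction(GENOME_SIZE_KB, 1.0, OVERALL_BAC_KB_PER_GENE)
```

On these figures, the euchromatin alone could hold roughly 31,000 genes, most of the estimated 35,000, whereas applying the BAC-derived density to the entire genome would imply close to 97,000 genes. That contrast is consistent with the view that genes are concentrated in a minority of the DNA.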
cDNA Library Construction
The primary phage cDNA libraries were constructed and excised into bacterial cultures as phagemids according to the manufacturer's instructions (Stratagene; http://www.stratagene.com). The bacterial cultures were arrayed subsequently into 384-well plates and used for sequencing. Specific information pertaining to the individual libraries can be found online at http://sgn.cornell.edu. From each library, between 2500 and 10,000 high-quality sequence runs were produced, with various success rates for different libraries. The library collection was designed to maximize gene discovery and currently consists of >26 different libraries capturing genes expressed in different tissue types and developmental stages or expressed during pathogen-elicited responses.
All sequencing of the cDNA clones was performed at the Institute for Genomic Research (http://www.tigr.org); the entire data set used in this report can be downloaded from the Solanaceae Genome Network through anonymous file transfer protocol (http://sgn.cornell.edu). The unigene set was constructed in accordance with TIGR's gene indexes, as described in Quackenbush et al. (2000).
Data Sets Used for Analyses
BAC Sequences
The tomato EST database was derived from cDNA libraries constructed at Cornell University and sequenced at TIGR. Sequencing of BAC T62O11 was accomplished by Richard McCombie and Manpreet Katari at the Cold Spring Harbor Laboratory. Thanks to Anne Frary and Todd Vision for critical review of the manuscript. This work was supported by Grant DBI-9872617 from the Plant Genome Program of the National Science Foundation to S.T. (Cornell University), J.G. (U.S. Department of Agriculture/Agricultural Research Service), and G.M. (Boyce Thompson Institute for Plant Research) and by Grant DBI-9813392 to TIGR.
Article, publication date, and citation information can be found at www.plantcell.org/cgi/doi/10.1105/tpc.010478. Received November 2, 2001; accepted April 18, 2002.
Adam, D. (2000). Now for the hard ones. Nature 408, 792793.[CrossRef][Medline] Arabidopsis Genome Initiative. (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796815.[CrossRef][Medline] Arumuganathan, K., and Earle, E.D. (1991). Nuclear DNA content of some important plant species. Plant Mol. Biol. Rep. 9, 208219. Bernatzky, R., and Tanksley, S.D. (1986). Majority of random cDNA clones correspond to single loci in the tomato genome. Mol. Gen. Genet. 203, 814.[CrossRef]
Budiman, M.A., Mao, L., Wood, T.C., and Wing, R.A. (2000). A deep-coverage tomato BAC library and prospects toward development of an STC framework for genome sequencing. Genome Res. 10, 129136. Cary, J.W., Lax, A.R., and Flurkey, W.H. (1992). Cloning and characterization of cDNAs coding for Vicia faba polyphenol oxidase. Plant Mol. Biol. 20, 245253.[CrossRef][Web of Science][Medline] Chase, M.W., Soltis, D.E., Olmstead, R.G., Morgan, D., Les, D.H., Mishler, B.D., Duvall, M.R., Price, R.A., Hills, H.G., and Qiu, Y.-L. (1993). Phylogenetics of seed plants: An analysis of nucleotide sequences from the plastid gene rbcL. Ann. Mo. Bot. Gard. 80, 528580.[CrossRef] Deikman, J., and Fischer, R. (1988). Interaction of a DNA binding factor with the 5'-flanking region of an ethylene-responsive fruit ripening gene from tomato. EMBO J. 7, 33153320.[Web of Science][Medline]
Doebley, J., and Lukens, L. (1998). Transcriptional regulators and the evolution of plant form. Plant Cell 10, 1075–1082.
Duan, X., Li, X., Xue, Q., Abo-el-Saad, M., Xu, D., and Wu, R. (1996). Transgenic rice plants harboring an introduced potato proteinase inhibitor II gene are insect resistant. Nat. Biotechnol. 14, 494–498.
Frary, A., Nesbitt, T.C., Grandillo, S., Knaap, E., Cong, B., Liu, J., Meller, J., Elber, R., Alpert, K.B., and Tanksley, S.D. (2000). Fw2.2: A quantitative trait locus key to the evolution of tomato fruit size. Science 289, 85–88.
Fulton, T., van der Hoeven, R., Eannetta, N., and Tanksley, S. (2002). Identification, analysis, and utilization of conserved ortholog set markers for comparative genomics in higher plants. Plant Cell 14, 1457–1467.
International Human Genome Sequencing Consortium. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860–921.
Johnson, R., Narvaez, J., An, G., and Ryan, C. (1989). Expression of proteinase inhibitors I and II in transgenic tobacco plants: Effects on natural defense against Manduca sexta larvae. Proc. Natl. Acad. Sci. USA 86, 9871–9875.
Keys, D.N., Lewis, D.L., Selegue, J.E., Pearson, B.J., Goodrich, L.V., Johnson, R.L., Gates, J., Scott, M.P., and Carroll, S.B. (1999). Recruitment of a hedgehog regulatory circuit in butterfly eyespot evolution. Science 283, 532–534.
Khush, G.S., and Rick, C.M. (1968). Cytogenetic analysis of the tomato genome by means of induced deficiencies. Chromosoma 23, 452–484.
Khush, G.S., Rick, C.M., and Robinson, R.W. (1964). Genetic activity in a heterochromatic chromosome segment of the tomato. Science 145, 1432–1434.
King, M.C., and Wilson, A.C. (1975). Evolution at two levels in humans and chimpanzees. Science 188, 107–116.
Ku, H.K., Vision, T., Liu, J., and Tanksley, S.D. (2000). Comparing sequenced segments of the tomato and Arabidopsis genomes: Large-scale duplication followed by selective gene loss creates a network of synteny. Proc. Natl. Acad. Sci. USA 97, 9121–9126.
Kumar, A., Altabella, T., Taylor, M.A., and Tiburcio, A.F. (1997). Recent advances in polyamine research. Trends Plant Sci. 2, 124–130.
Lagercrantz, U., and Axelsson, T. (2000). Rapid evolution of the family of CONSTANS like genes in plants. Mol. Biol. Evol. 17, 1499–1507.
Matton, D.P., and Brisson, N. (1989). Cloning, expression, and sequence conservation of pathogenesis-related gene transcripts of potato. Mol. Plant-Microbe Interact. 2, 325–331.
Murata, M., Tsurutani, M., Hagiwara, S., and Homma, S. (1997). Subcellular location of polyphenol oxidase in apples. Biosci. Biotechnol. Biochem. 61, 1495–1499.
Paterson, A.H., Bowers, J.E., Burow, M.D., Draye, X., Elsik, C.G., Jiang, C.-X., Katsar, C.S., Lan, T.-H., Lin, Y.-R., Ming, R., and Wright, R.J. (2000). Comparative genomics of plant chromosomes. Plant Cell 12, 1523–1540.
Pennisi, E. (1998). A bonanza for plant genomics. Science 282, 652–654.
Peterson, D.G., Pearson, W.R., and Stack, S.M. (1998). Characterization of the tomato (Lycopersicon esculentum) genome using in vitro and in situ DNA reassociation. Genome 41, 346–356.
Peterson, D.G., Price, H.J., Johnston, J.S., and Stack, S.M. (1996). DNA content of heterochromatin and euchromatin in tomato (Lycopersicon esculentum) pachytene chromosomes. Genome 39, 77–82.
Quackenbush, J., Liang, F., Holt, I., Pertea, G., and Upton, J. (2000). The TIGR gene indices: Reconstruction and representation of expressed gene sequences. Nucleic Acids Res. 28, 141–145.
Rick, C.M. (1975). The tomato. In Handbook of Genetics, Vol. 2, R.C. King, ed (New York: Plenum Press), pp. 247–280.
Souer, E., van Houwelingen, A., Kloos, D., Mol, J., and Koes, R. (1996). The no apical meristem gene of Petunia is required for pattern formation in embryos and flowers and is expressed at meristem and primordia boundaries. Cell 85, 159–170.
Stern, D.L. (2000). Evolutionary developmental biology and the problem of variation. Evolution 54, 1079–1091.
Tanksley, S.D., et al. (1992). High density molecular linkage maps of the tomato and potato genomes. Genetics 132, 1141–1160.
Tiburcio, A.F., Altabella, T., Borrell, A., and Masgrau, C. (1997). Polyamine biosynthesis and its regulation. Physiol. Plant. 100, 664–674.
Wang, R., Stec, A., Hey, J., Lukens, L., and Doebley, J. (1999). The limits of selection during maize domestication. Nature 398, 236–239.
Yang, Y.W., Lai, K.N., Tai, P.Y., and Li, W.H. (1999). Rates of nucleotide substitution in angiosperm mitochondrial DNA sequences and dates of divergence between Brassica and other angiosperm lineages. J. Mol. Evol. 48, 597–604.