Letter to the Editor
Open Access

Araport Lives: An Updated Framework for Arabidopsis Bioinformatics

Asher Pasha, Shabari Subramaniam, Alan Cleary, Xingguo Chen, Tanya Berardini, Andrew Farmer, Christopher Town, Nicholas Provart
Asher Pasha
Bio-Analytic Resource for Plant Biology, Department of Cell and Systems Biology/Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario M5S 3B2, Canada
Shabari Subramaniam
The Arabidopsis Information Resource/Phoenix Bioinformatics, Fremont, California 94538
Alan Cleary
National Center for Genome Resources, Santa Fe, New Mexico 87505
Xingguo Chen
The Arabidopsis Information Resource/Phoenix Bioinformatics, Fremont, California 94538
Tanya Berardini
The Arabidopsis Information Resource/Phoenix Bioinformatics, Fremont, California 94538
Andrew Farmer
National Center for Genome Resources, Santa Fe, New Mexico 87505
Christopher Town
J. Craig Venter Institute, Rockville, Maryland 20850
Nicholas Provart
Bio-Analytic Resource for Plant Biology, Department of Cell and Systems Biology/Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario M5S 3B2, Canada

Published September 2020. DOI: https://doi.org/10.1105/tpc.20.00358

© 2020 American Society of Plant Biologists. All rights reserved.

Conceived as a replacement for The Arabidopsis Information Resource (TAIR), whose retirement was then anticipated, the Araport project was funded by the U.S. National Science Foundation (NSF) in 2013 to develop a new, extensible framework for Arabidopsis (Arabidopsis thaliana) bioinformatics that would facilitate data integration through the federation of distributed informatics resources. Although it was recommended as a 5-year award, funds were initially provided for 2 years, with a subsequent 6-month extension to allow for the development of a continuation plan acceptable to NSF. When funds were exhausted in 2016 with the continuation award still awaiting a decision, the Araport site continued in maintenance mode, with minimal input from legacy personnel at each institution and no data updates. The renewal request was finally declined in December 2018, leaving Araport with an uncertain future. In light of the critical importance of these services to the scientific community, a group of interested researchers (see Appendix) met in March 2019 to discuss options and propose a solution. A working group evolved from those in attendance at that meeting and has since met monthly to solidify and coordinate the execution of these nascent plans. The results have been encouraging and are described here to inform and inspire the larger plant science community.

Given the complete absence of external funding, it was agreed that, rather than try to perpetuate the entire Araport ecosystem, efforts should be directed toward maintaining the most attractive and most used features, namely ThaleMine and JBrowse, by transferring them to new ownership for perpetuation. Thus it was agreed that an updated version of ThaleMine would be established at the Bio-Analytic Resource (BAR) for Plant Biology at the University of Toronto under the leadership of Nicholas Provart and an updated version of JBrowse would be established as part of TAIR under the Phoenix Bioinformatics umbrella overseen by Tanya Berardini and Eva Huala.

JBROWSE

The JBrowse functionality provided by Araport has been successfully moved to TAIR. Araport had been running version 1.11.6 of the software with a set of tracks that included community submissions. Some of the tracks at the legacy location were no longer functioning after the underlying software (ADAMA) connecting them to outside resources lost support. TAIR installed the latest JBrowse version (1.16.6), replicated the tracks that were functional at Araport, and restored access to the nonfunctional tracks. In addition, two sets of newly integrated community-submitted tracks are now visible in this genome browser. One is a set of 41 tracks representing a multipronged gene expression experiment to track the response to various abiotic stresses from Lee and Bailey-Serres (2019). The other is a set of 4 tracks based on Cap Analysis of Gene Expression (CAGE) experiments to determine promoter bidirectionality performed by Thieffry et al. (2019). The CAGE data are visualized using the Stranded View plugin (Hofmeister and Schmitz, 2018), which allows separation of the display of expression values into plus and minus strands in a single track. New community tracks continue to be added, and existing track information is updated as new data become available.

THALEMINE

ThaleMine at the BAR was completely rebuilt using the latest InterMine software. The legacy ThaleMine instance had not been updated since 2016 and was running InterMine version 1.8.5, which was not forward-compatible with the latest InterMine version (4.2.0). At the time of writing, the most recent versions of publicly available data have been loaded, as listed in Table 1.

Table 1

Data Sources for a New Instance of ThaleMine

As with any instance of InterMine, the BAR’s version of ThaleMine at https://bar.utoronto.ca/thalemine/ continues to support application programming interface functionalities, in addition to the extensive web-based query options. It is also compatible with the InterMine BlueGenes interface.
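
As an illustration of the programmatic access mentioned above, the following minimal sketch builds a request URL for the standard InterMine query web service without sending it. It rests on assumptions not stated in this letter: that the service root is the site URL followed by `/service` (the usual InterMine convention), that the data model is named `genomic`, and that `Gene.primaryIdentifier` and `Gene.symbol` are available fields. The gene identifier AT1G01010 is only an example value.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Assumption: InterMine sites conventionally expose web services under
# "<site>/service"; this exact path is not given in the letter.
SERVICE_ROOT = "https://bar.utoronto.ca/thalemine/service"


def build_query_xml(views, path, value, model="genomic"):
    """Return an InterMine PathQuery XML string with one equality constraint.

    The model name and the field paths passed in are assumptions about the
    ThaleMine data model, used here for illustration only.
    """
    view_attr = quoteattr(" ".join(views))
    return (
        f"<query model={quoteattr(model)} view={view_attr}>"
        f'<constraint path={quoteattr(path)} op="=" value={quoteattr(value)}/>'
        "</query>"
    )


def build_results_url(query_xml, fmt="json", size=10):
    """Return the URL of the query-results endpoint for the given query."""
    params = urlencode({"query": query_xml, "format": fmt, "size": size})
    return f"{SERVICE_ROOT}/query/results?{params}"


# Example: ask for the identifier and symbol of a single gene.
xml = build_query_xml(
    ["Gene.primaryIdentifier", "Gene.symbol"],
    "Gene.primaryIdentifier",
    "AT1G01010",
)
url = build_results_url(xml)
print(url)
```

The resulting URL could be fetched with any HTTP client. Building the URL separately from the network call keeps the query easy to inspect and to adapt to other fields.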

GENOME CONTEXT VIEWER

As part of the unsuccessful renewal proposal, some aspects of the Araport Comparative Genomics functionalities were planned to be addressed with an instance of the Genome Context Viewer (GCV) software developed at the National Center for Genome Resources (NCGR) by Andrew Farmer and Alan Cleary (Cleary and Farmer, 2018). This viewer was originally developed as part of an NSF-funded initiative for federating disparate legume-focused information resources. It provides services to enable the dynamic comparison of multiple genomes on the basis of their shared functional elements (e.g., genes) and provides an intuitive and powerful user interface for exploring similarities and differences among a set of genomic segments with respect to element content and arrangement. A version of the GCV has been installed and is now running from NCGR (https://gcv-arabidopsis.ncgr.org) as the third component of the “second-generation” Araport (see Figure 1). This version of the GCV integrates the Arabidopsis Columbia reference genome (TAIR10/Araport11) with genomes from several other data sources, including two sets of newly assembled Arabidopsis genomes of various accessions (colloquially often called ecotypes) from Jiao and Schneeberger (2020) and from the 1001 Genomes project of Detlef Weigel and colleagues (Felix Bemm, Christian Kubica, and Detlef Weigel, personal communication), as well as a number of Brassicaceae genomes from Phytozome and the Brassicaceae Map Alignment Project initiative. The viewer provides convenient links to related resources for genes and genomic regions, thereby facilitating traversal into the other components of the reconfigured Araport project as well as other relevant tools. The gene family classifications utilized by the current instance are based on PANTHER 14.1 (Mi et al., 2013), and links are provided to the trees developed for these families by the PhyloGenes project (phylogenes.org).

Figure 1.

Screenshot of New Arabidopsis GCV Showing a Region with Two Clusters of Germin-Like Proteins (PTHR31238 Gene Family, Denoted Here as Purple).

The central cluster shows extensive copy number variation among annotations from 14 Arabidopsis genomes and the closely related Arabidopsis lyrata genome (labeled araly.scaffold_7), as highlighted by the asterisks along the bottom. Other apparent copy number variations and presence/absence events can easily be observed.
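
The idea of comparing genomic segments by their gene family content can be illustrated with a small sketch. This is not the GCV's own algorithm; it only assumes that each segment can be written as an ordered list of gene family identifiers. The track names and the families `FAM_A` and `FAM_B` are hypothetical placeholders, and only PTHR31238 comes from the figure legend above.

```python
from collections import Counter


def family_copy_numbers(track):
    """Count how many genes on one track belong to each gene family."""
    return Counter(track)


def shared_family_score(query, candidate):
    """Size of the multiset overlap in gene-family content between two tracks."""
    overlap = family_copy_numbers(query) & family_copy_numbers(candidate)
    return sum(overlap.values())


# Hypothetical tracks: each list is the order of gene families along a segment.
tracks = {
    "reference": ["FAM_A", "PTHR31238", "PTHR31238", "PTHR31238", "FAM_B"],
    "accession_1": ["FAM_A", "PTHR31238", "PTHR31238", "FAM_B"],
    "accession_2": ["FAM_A", "PTHR31238", "PTHR31238", "PTHR31238",
                    "PTHR31238", "FAM_B"],
}

query = tracks["reference"]
scores = {name: shared_family_score(query, t) for name, t in tracks.items()}
copies = {name: family_copy_numbers(t)["PTHR31238"] for name, t in tracks.items()}

for name in tracks:
    # Report similarity to the reference and the size of the family cluster.
    print(name, scores[name], copies[name])
```

In this toy example the differing PTHR31238 counts across tracks correspond to the kind of copy number variation that the viewer displays graphically.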

LONG LIVE ARAPORT!

To establish continuity between the original Araport and these new functionalities, http://araport.org/ is now hosted at the BAR, where visitors are presented with links to the new and maintained versions of ThaleMine, JBrowse, and the GCV. With these new sites operational, the original Araport site hosted at the Texas Advanced Computing Center has been shut down because of security issues related to the legacy versions of the packages used by the original site. We expect that the new Araport in its various component parts will continue to be widely used not just by Arabidopsis researchers but by the wider plant community.

In summary, a grassroots effort by committed community members has built upon the resources developed by the Araport project to provide continuity of Araport’s most used and useful features. It is gratifying to see that the vision of the 2012 white paper (International Arabidopsis Informatics Consortium, 2012), which suggested a future for Arabidopsis informatics as a community effort accomplished by a federation of independent community members, has in a modest way come to pass. March 2020 saw 10,376 views of the ThaleMine landing page, indicating wide uptake by the community. That said, this rescue effort is not really a sustainable solution. Data curation and database maintenance are of vital importance and, notwithstanding TAIR’s successful subscription model, are worthy of support by national funding agencies for the continued success of plant research in the United States and worldwide.

Acknowledgments

We are especially grateful to the scientists at the Texas Advanced Computing Center, in particular Erik Ferlanti, John Fonner, and Matt Vaughn, for continuing to host and maintain Araport long after its “use-by” date. We thank Vivek Krishnakumar for providing insights and advice on the inner workings of Araport during the transition period. Araport was supported by grants from the National Science Foundation (grant DBI-1262414) and the Biotechnology and Biological Sciences Research Council (grant BB/L027151/1). Development of the GCV was supported by USDA-ARS project funding for the Legume Information System and the National Science Foundation (grant IOS-1444806 to A.F.). The J. Craig Venter Institute workshop that launched the Araport recovery effort was supported by the U.S. National Science Foundation (MCB Award 1062348, made to the U.S. members of the International Arabidopsis Informatics Consortium Steering Committee). The BAR is supported through a grant to N.P. from the Natural Sciences and Engineering Research Council of Canada and from Genome Canada/Ontario Genomics. TAIR is managed by the nonprofit Phoenix Bioinformatics Corporation and is supported through institutional, lab, and personal subscriptions.

APPENDIX

List of participants, “Future of Araport” meeting held at the J. Craig Venter Institute (JCVI) in Rockville, Maryland, March 25 and 26, 2019.

Tanya Berardini, Phoenix Bioinformatics

Agnes Chan, JCVI

Yongwook Choi, JCVI

Andrew Farmer, NCGR

Erik Ferlanti, Texas Advanced Computing Center

Eva Huala, Phoenix Bioinformatics

Vivek Krishnakumar, formerly JCVI

Sean May, University of Nottingham

Asher Pasha, University of Toronto

Nicholas Provart, University of Toronto

David Somers, Ohio State University

Chris Town, JCVI

Eve Wurtele, Iowa State University

Remotely via BlueJeans:

Sam Hokin, NCGR

Eric Lyons, University of Arizona

Todd Michael, JCVI

Footnotes

  • www.plantcell.org/cgi/doi/10.1105/tpc.20.00358

  • 1 These authors contributed equally to this work.

  • [OPEN] Articles can be viewed without a subscription.

  • Received May 11, 2020.
  • Revised June 25, 2020.
  • Accepted July 17, 2020.
  • Published July 22, 2020.

References

Arabidopsis Genome Initiative (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796–815.

Ashburner, M., et al. (2000). Gene Ontology: Tool for the unification of biology. Nat. Genet. 25: 25–29.

Berardini, T.Z., et al. (2004). Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135: 745–755.

Brady, S.M., and Provart, N.J. (2009). Web-queryable large-scale data sets for hypothesis generation in plant biology. Plant Cell 21: 1034–1051.

Chatr-Aryamontri, A., et al. (2017). The BioGRID interaction database: 2017 update. Nucleic Acids Res. 45: D369–D379.

Cheng, C.-Y., Krishnakumar, V., Chan, A., Schobel, S., and Town, C.D. (2016). Araport11: A complete reannotation of the Arabidopsis thaliana reference genome. bioRxiv 047308.

Cleary, A., and Farmer, A. (2018). Genome Context Viewer: Visual exploration of multiple annotated genomes using microsynteny. Bioinformatics 34: 1562–1564.

Goodstein, D.M., Shu, S., Howson, R., Neupane, R., Hayes, R.D., Fazo, J., Mitros, T., Dirks, W., Hellsten, U., Putnam, N., and Rokhsar, D.S. (2012). Phytozome: A comparative platform for green plant genomics. Nucleic Acids Res. 40: D1178–D1186.

Hofmeister, B.T., and Schmitz, R.J. (2018). Enhanced JBrowse plugins for epigenomics data visualization. BMC Bioinformatics 19: 159.

Huala, E., et al. (2001). The Arabidopsis Information Resource (TAIR): A comprehensive database and web-based information retrieval, analysis, and visualization system for a model plant. Nucleic Acids Res. 29: 102–105.

International Arabidopsis Informatics Consortium (2012). Taking the next step: Building an Arabidopsis information portal. Plant Cell 24: 2248–2256.

Jiao, W.-B., and Schneeberger, K. (2020). Chromosome-level assemblies of multiple Arabidopsis genomes reveal hotspots of rearrangements with altered evolutionary dynamics. Nat. Commun. 11: 989.

Kerrien, S., et al. (2012). The IntAct molecular interaction database in 2012. Nucleic Acids Res. 40: D841–D846.

Lee, T.A., and Bailey-Serres, J. (2019). Integrative analysis from the epigenome to translatome uncovers patterns of dominant nuclear regulation during transient stress. Plant Cell 31: 2573–2595.

Maglott, D., Ostell, J., Pruitt, K.D., and Tatusova, T. (2007). Entrez Gene: Gene-centered information at NCBI. Nucleic Acids Res. 35: D26–D31.

Mi, H., Muruganujan, A., and Thomas, P.D. (2013). PANTHER in 2013: Modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res. 41: D377–D386.

Mitchell, A.L., et al. (2019). InterPro in 2019: Improving coverage, classification and access to protein sequence annotations. Nucleic Acids Res. 47: D351–D360.

Obayashi, T., Okamura, Y., Ito, S., Tadaka, S., Aoki, Y., Shirota, M., and Kinoshita, K. (2014). ATTED-II in 2014: Evaluation of gene coexpression in agriculturally important plants. Plant Cell Physiol. 55: e6.

Thieffry, A., Bornholdt, J., Ivanov, M., Brodersen, P., and Sandelin, A. (2019). Characterization of Arabidopsis thaliana promoter bidirectionality and antisense RNAs by depletion of nuclear RNA decay enzymes. bioRxiv 809194.

UniProt Consortium (2007). The Universal Protein Resource (UniProt). Nucleic Acids Res. 35: D193–D197.

Winter, D., Vinegar, B., Nahal, H., Ammar, R., Wilson, G.V., and Provart, N.J. (2007). An “Electronic Fluorescent Pictograph” browser for exploring and analyzing large-scale biological data sets. PLoS One 2: e718.
The Plant Cell Sep 2020, 32 (9) 2683-2686; DOI: 10.1105/tpc.20.00358
