© 2015 American Society of Plant Biologists. All rights reserved.
RNA is decorated by various chemical modifications that affect the stability and localization of this fragile molecule. These diverse, highly conserved, covalent modifications also help RNA perform its crucial roles in the cell. Over 100 RNA modifications have been identified, primarily in noncoding RNAs (e.g., tRNA and rRNA). Less is known about the chemical modification of mRNA, although there is much interest in uncovering the epitranscriptome, i.e., biochemical changes that regulate gene expression through modifying mRNA (reviewed in Lee et al., 2014). Identifying such changes isn’t easy, as mRNA likely uses the same modifying enzymes as noncoding RNAs, making it difficult to uncover the roles of these modifications through the conventional route of genetic knockdown.

Most methods for detecting RNA modifications involve transcribing RNA into cDNA with reverse transcriptase and looking for base substitutions, as this enzyme often incorporates an alternate base in place of the encoded one when it encounters a modified nucleotide. Ryvkin et al. (2013) took this method to new heights with the development of HAMR (high-throughput annotation of modified ribonucleotides). This clever technique rests on the observation that all high-throughput library preparation protocols require the conversion of RNA to cDNA by reverse transcriptase. Single-nucleotide substitutions can therefore be detected in huge RNA-seq data sets, enabling rapid, reliable detection of RNA modifications that affect the Watson-Crick base-pairing edge throughout the transcriptome, even in mRNA.
Vandivier et al. (2015) used the powerful HAMR pipeline to explore the epitranscriptome of Arabidopsis thaliana (see figure). The authors based their analysis on a set of uniquely mapping reads from unopened Arabidopsis flower buds obtained in parallel using three RNA-seq approaches: sequencing of small RNA via smRNA-seq, sequencing of poly(A)+-selected RNA via RNA-seq, and analysis of uncapped, degrading RNA via global mapping of uncapped and cleaved transcripts (GMUCT). The reads were mapped to the reference genome, and mismatches at each base were tabulated. The significance of the mismatches was then tested, and changes likely representing sequencing errors, single nucleotide polymorphisms, and RNA editing sites were filtered out. This investigation revealed RNA modifications at three different levels of the transcriptome.
Scheme for identifying covalent, HAMR-predicted modifications in the Arabidopsis transcriptome. Reads from the three libraries are mapped to the reference genome, and mismatches (red bases) for each base (bold) are tabulated. After two rounds of testing by HAMR, predicted modifications are classified based on a training set of known tRNA modifications from yeast. Light-blue circles represent 7-methylguanosine caps on mRNA. (Figure courtesy of B.D. Gregory.)
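The tabulate-then-test logic in the scheme can be illustrated with a short Python sketch. It is not the HAMR implementation. The function names, the fixed `error_rate`, and the `alpha` cutoff are hypothetical. The form of the second test (asking whether reads outside the two most common bases exceed the error rate, so that a clean two-base pattern is treated as a SNP or editing site) is an assumption made here for illustration, since the text states only that two rounds of testing are performed.

```python
from collections import Counter
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p); suitable for modest read depths."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def classify_site(ref_base, read_bases, error_rate=0.01, alpha=0.05):
    """Label one reference position from the bases observed in reads covering it.

    error_rate and alpha are illustrative assumptions, not values from the study.
    """
    counts = Counter(read_bases)
    depth = sum(counts.values())
    if depth == 0:
        return "no coverage"

    # Round 1: are there more mismatches than sequencing error alone would explain?
    mismatches = depth - counts[ref_base]
    if binomial_tail(mismatches, depth, error_rate) >= alpha:
        return "consistent with sequencing error"

    # Round 2 (assumed form): a SNP or editing site should show at most two bases,
    # so reads outside the two most common bases should stay near the error rate.
    top_two = sum(n for _, n in counts.most_common(2))
    if binomial_tail(depth - top_two, depth, error_rate) >= alpha:
        return "consistent with SNP or editing"

    return "candidate modification"
```

For example, `classify_site("A", "A" * 60 + "G" * 40)` is labeled as consistent with a SNP or editing site, whereas a position where the mismatches are spread over several alternate bases is kept as a candidate modification.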
Among the identified RNA modifications, an average of 1207 HAMR-predicted modifications per million accessible bases were detected in the GMUCT data set compared with 602 in the smRNA-seq data set and 15 in the RNA-seq data set, suggesting that uncapped, degrading mRNAs are strongly enriched for these modifications. The authors began to explore this rich resource, finding that modifications in degrading mRNA tend to localize to the coding sequence and 3′ untranslated regions. By contrast, those in stable mRNA are primarily located in alternatively spliced introns, perhaps because these modifications help regulate alternative splicing, an intriguing possibility that warrants further study.
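The size of the enrichment follows directly from the averages quoted above. The sketch below normalizes a raw count to modifications per million accessible bases and compares the three reported averages. The helper name `per_million` is hypothetical, and the comparison uses only the averages given in the text, with no statistical testing.

```python
def per_million(n_modifications, accessible_bases):
    """Normalize a raw modification count to a rate per million accessible bases."""
    return n_modifications * 1_000_000 / accessible_bases


# Average HAMR-predicted modifications per million accessible bases, as reported above.
reported_rates = {"GMUCT": 1207, "smRNA-seq": 602, "RNA-seq": 15}

# Fold enrichment of the GMUCT (uncapped, degrading RNA) library over each library.
fold_over = {name: reported_rates["GMUCT"] / rate for name, rate in reported_rates.items()}

for name, fold in fold_over.items():
    print(f"GMUCT vs {name}: {fold:.1f}-fold")
```

On these averages, the GMUCT library carries roughly twice the modification density of the smRNA-seq library and roughly 80 times that of the poly(A)+-selected RNA-seq library.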
The RNA modifications were classified by a machine learning algorithm that uses the substitution patterns from a yeast (Saccharomyces cerevisiae) smRNA-seq data set at known tRNA modification sites as its training set. Certain classes of chemical modifications tended to occur in stable mRNAs, whereas others tended to occur in uncapped, degrading transcripts, representing a possible cause (or consequence) of RNA turnover. Finally, Gene Ontology analysis revealed that many RNA modifications mark mRNAs encoding proteins involved in stress responses. These findings shed light on the newly revealed Arabidopsis epitranscriptome, with many more discoveries yet to come.
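The idea of classifying a site by its substitution pattern can be sketched as follows. The text does not say which machine learning algorithm was used, so a one-nearest-neighbor rule is swapped in here as the simplest stand-in. The training vectors and the labels `class_1` and `class_2` are invented placeholders, not values from the yeast tRNA data set.

```python
from math import dist

# Hypothetical training set: each entry pairs the fraction of reads showing
# A, C, G, T at a known modified site with a placeholder class label.
# These numbers are made up for illustration only.
TRAINING = [
    ((0.55, 0.05, 0.10, 0.30), "class_1"),
    ((0.60, 0.04, 0.08, 0.28), "class_1"),
    ((0.10, 0.05, 0.70, 0.15), "class_2"),
    ((0.12, 0.08, 0.65, 0.15), "class_2"),
]


def substitution_profile(counts):
    """Convert per-base read counts at one site into fractions ordered A, C, G, T."""
    total = sum(counts.get(base, 0) for base in "ACGT")
    if total == 0:
        raise ValueError("site has no read coverage")
    return tuple(counts.get(base, 0) / total for base in "ACGT")


def predict_class(profile, training=TRAINING):
    """Return the label of the training site whose profile is closest (Euclidean)."""
    _, label = min(training, key=lambda item: dist(profile, item[0]))
    return label
```

A site with counts `{"A": 58, "C": 4, "G": 9, "T": 29}` lies closest to the first pair of toy profiles and is therefore assigned `class_1`. In a real analysis, the training profiles would come from the known tRNA modification sites described above.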