- © 2016 American Society of Plant Biologists. All rights reserved.
Abstract
High-throughput approaches for profiling the 5′ ends of RNA degradation intermediates on a genome-wide scale are frequently applied to analyze and validate cleavage sites guided by microRNAs (miRNAs). However, the complexity of the RNA degradome other than miRNA targets is currently largely uncharacterized, and this limits the application of RNA degradome studies. We conducted a global analysis of 5′-truncated mRNA ends that mapped to coding sequences (CDSs) of Arabidopsis thaliana, rice (Oryza sativa), and soybean (Glycine max). Based on this analysis, we provide multiple lines of evidence to show that the plant RNA degradome contains in vivo ribosome-protected mRNA fragments. We observed a 3-nucleotide periodicity in the position of free 5′ RNA ends and a bias toward the translational frame. By examining conserved peptide upstream open reading frames (uORFs) of Arabidopsis and rice, we found a predominance of 5′ termini of RNA degradation intermediates that were separated by a length equal to a ribosome-protected mRNA fragment. Through the analysis of RNA degradome data, we discovered uORFs and CDS regions potentially associated with stacked ribosomes in Arabidopsis. Furthermore, our analysis of RNA degradome data suggested that the binding of Arabidopsis ARGONAUTE7 to a noncleavable target site of miR390 might directly hinder ribosome movement. This work demonstrates an alternative use of RNA degradome data in the study of ribosome stalling.
INTRODUCTION
Steady state levels of RNA are controlled by relative rates of transcription and RNA degradation. Most mRNAs in eukaryotes possess a 7-methylguanosine cap at the 5′ terminus and a poly(A) tail at the 3′ terminus, which are crucial for translation and RNA stability. The loss of the 5′ cap or the 3′ poly(A) tail abolishes mRNA translation and promotes mRNA degradation (Gallie, 1991). Uncapped 5′ ends of mRNAs are degraded by 5′-3′ exoribonucleases (XRNs), whereas deadenylated mRNAs are degraded by the exosome from the 3′ end (Lebreton and Séraphin, 2008; Houseley and Tollervey, 2009). Alternatively, deadenylation can also trigger decapping of mRNA, followed by degradation from the 5′ end (Muhlrad et al., 1994).
Translation plays a crucial role in controlling mRNA stability and is required to eliminate aberrant mRNAs through several distinct mechanisms (Shoemaker and Green, 2012). Nonsense-mediated mRNA decay (NMD) is often initiated when a ribosome encounters a premature termination codon (PTC) upstream of an exon junction complex. The interaction between NMD factors on termination factors and exon junction complexes promotes the degradation of PTC-containing transcripts. The movement of ribosomes on mRNA molecules can be stopped externally by a stable RNA structure or internally by the particular peptide that is encoded by the mRNA. The transcripts associated with stalled ribosomes are degraded by a specialized RNA surveillance pathway called no-go decay, which may result in endonucleolytic cleavage upstream of the stalled ribosomes (Doma and Parker, 2006). Nonstop decay also targets transcripts with stalled ribosomes, but the stalling is due to the lack of an in-frame stop codon (Frischmeyer et al., 2002; van Hoof et al., 2002). Many eukaryotic mRNAs possess short open reading frames (ORFs) in the 5′ untranslated region (UTR), which can be translated and potentially regulate mRNA stability and translation. The stop codon of a translated upstream open reading frame (uORF) might be recognized as a PTC and thus initiate NMD. Some uORFs encoding peptides conserved across species have been shown to stall ribosomes at uORF stop codons, resulting in the repression of downstream main ORF translation and acceleration of RNA degradation in a few cases (Gaba et al., 2005; Uchiyama-Kadokura et al., 2014).
In eukaryotes, small RNAs of 20 to 30 nucleotides play a key role in regulating gene expression through the RNA interference pathway. Most animal microRNAs (miRNAs) have a seed region, spanning the second to the seventh or the eighth nucleotide, which can base pair perfectly with the 3′ UTR of a target mRNA (Bartel, 2009). The targeting of animal miRNAs is often associated with translation repression, deadenylation, and mRNA decay using exoribonucleases. By contrast, plant miRNAs are highly complementary to their targets, and cleavage in the middle of target sites is the major mode of plant miRNA action (Rhoades et al., 2002). However, growing evidence indicates that plant miRNAs can also inhibit the translation of their targets (Brodersen et al., 2008; Iwakawa and Tomari, 2013; Li et al., 2013b, 2013c; Liu et al., 2013). In vitro assays of RNA-induced silencing complexes containing Arabidopsis thaliana ARGONAUTE1 (AGO1) demonstrated that plant miRNAs can repress translation initiation uncoupled with deadenylation or mRNA decay (Iwakawa and Tomari, 2013). Furthermore, miRNA binding sites in ORFs can hinder the movement of ribosomes (Iwakawa and Tomari, 2013). In Arabidopsis, miRNA-mediated translation repression occurs in the endoplasmic reticulum and requires the endoplasmic reticulum protein ALTERED MERISTEM PROGRAM1 (Li et al., 2013c).
High-throughput approaches for genome-wide profiling of RNA degradation intermediates that possess a free monophosphate at the 5′ terminus have been developed by several groups and are variously named parallel analysis of RNA ends (PARE) (German et al., 2008), degradome sequencing (Addo-Quaye et al., 2008), genome-wide mapping of uncapped transcripts (Gregory et al., 2008), and 5′ P sequencing (5Pseq) (Pelechano et al., 2015). Because intact mRNAs generally possess a 5′ cap that blocks their ligation to RNA adaptors, truncated 5′ RNA ends with a free monophosphate can be selectively sequenced by directly ligating poly(A) RNA with RNA adaptors. PARE and degradome sequencing have been widely applied in the identification of small RNA-guided cleavage sites in various plant species (Addo-Quaye et al., 2008; German et al., 2008; Zhou et al., 2010; Shamimuzzaman and Vodkin, 2012; Zhao et al., 2012; Li et al., 2013a). Specific PARE has been developed for the study of plant miRNA processing by specifically amplifying miRNA processing intermediates (Bologna et al., 2013). Although these approaches have been used to profile mRNA degradation intermediates in mutants impaired in XRNs or proteins with endonucleolytic activity (German et al., 2008; Harigaya and Parker, 2012; Schmidt et al., 2015), the interpretations of some results remain challenging because the complexity of the RNA degradome is currently still largely uncharacterized.
Previously, we showed that plant RNA degradome data potentially contain footprints of RNA binding proteins in the 3′ UTR (Hou et al., 2014). Pelechano et al. (2015) also demonstrated that yeast 5Pseq data contain in vivo ribosome footprints, the products of cotranslation mRNA decay. The change in these ribosome-protected termini captured by 5Pseq could reflect ribosome dynamics without the problems caused by translational inhibitors sometimes used in the generation of in vitro ribosome footprints. Codons associated with paused ribosomes in response to oxidative stress were identified from the analysis of yeast 5Pseq data. Cotranslation mRNA decay was also demonstrated to be mediated by XRN4 and involved in the reprogramming of gene expression under heat stress in Arabidopsis (Merret et al., 2013, 2015). Here, we further demonstrate that footprints of ribosomes are widespread in the RNA degradome of various plant species. Our genome-wide analysis of RNA degradation fragments revealed uORFs, coding sequence (CDS) regions, and noncleavable miRNA target sites that are potentially associated with stacked ribosomes. Our findings thus expand the application of plant RNA degradome data for the elucidation of posttranscriptional gene regulation beyond small RNA-guided cleavage.
RESULTS
Signatures of Ribosome Footprints Were Observed in the RNA Degradome
Our previous analysis of plant RNA degradome data revealed positional enrichment of 5′-truncated RNA ends in the proximity of motifs recognized by RNA binding proteins. This suggests that RNA binding proteins attached to RNA may hinder RNA degradation and result in protected RNA fragments (Hou et al., 2014). Therefore, we suspected that the binding of ribosomes to mRNA may also protect mRNA from in vivo degradation and leave ribosome footprints in the RNA degradome. To explore this possibility, we first used PARE data to plot the positional distribution of 5′-truncated mRNA ends in the junctions of the CDSs and UTRs. We predicted that, if PARE can capture ribosome footprints in the same way as ribosome profiling (Ribo-Seq), which delineates ribosome positions by generating ribosome-protected mRNA fragments through in vitro nuclease digestion (Ingolia, 2010), we would observe a 3-nucleotide periodicity in the CDSs. A 3-nucleotide periodicity reflects the stepwise movement of ribosomes during active translation and has been reported in the analyses of Ribo-Seq data derived from multiple species (Ingolia et al., 2009, 2011; Guo et al., 2010; Liu et al., 2013; Bazzini et al., 2014; Juntawong et al., 2014; Vasquez et al., 2014). Consistent with our prediction, 5′ ends of truncated mRNA (PARE data) generated from Arabidopsis seedlings and inflorescences show strong 3-nucleotide phasing in the 3′ terminus of the CDSs but not in the proximal region of the 3′ UTR (Figure 1A). A 3-nucleotide periodicity in the CDS region was also observed in the CDSs for PARE reads generated from rice (Oryza sativa) inflorescences and soybean (Glycine max) seeds (Figure 1A). Besides this phasing pattern, the three species all show preferential accumulation of PARE reads in the translational frame (frame 1) of annotated CDSs (Figure 1B). Although the enrichment in the translational frame is relatively small in PARE data compared with that in yeast Ribo-Seq data reported previously (Ingolia et al., 2009), the proportion of PARE reads falling in the translational frame is significantly higher than that in the other two frames in the three replicates of Arabidopsis inflorescence PARE data (Supplemental Figure 1). Similar to the previous finding in the analysis of Ribo-Seq data (Liu et al., 2013; Bazzini et al., 2014; Juntawong et al., 2014), PARE data of these three plant species also showed an evident increase in the number of reads at positions 16 and 17 nucleotides upstream of stop codons, a pattern consistent with the deceleration of ribosome movement during translational termination (Figure 1A). These common features shared between PARE data and Ribo-Seq data strongly suggest both the presence of in vivo ribosome-protected mRNA fragments in the plant RNA degradome and the occurrence of cotranslational RNA decay in plants.
5′-Truncated RNA Ends Show a 3-Nucleotide Periodicity and Frame Bias in the CDS.
(A) The positional distribution of 5′ ends of truncated RNA mapped to the regions surrounding the start codon and the stop codon of Arabidopsis, rice and soybean CDS. PARE data of Arabidopsis seedlings and inflorescences were generated by this study and PARE data of rice inflorescences and soybean seeds were published by Zhou et al. (2010) and Song et al. (2011). Blue bars indicate positions falling in the translational frame (frame 1) of annotated CDS if 5′-truncated ends represent the 5′ edge of a ribosome and the distance from the 5′ edge of a ribosome to the first base of the A site is 17 nucleotides. Red arrowheads beneath the graphs represent the first nucleotides in the start codon (left side) or the stop codon (right side). The illustration at the bottom shows the size of an mRNA fragment protected by a plant ribosome and the position of ribosomes decoding start and stop codon. CDS, dark blue; UTR, light blue; E, the exit site; P, the peptidyl site; A, the aminoacyl site.
(B) The proportion of 5′-truncated RNA ends mapped to complete CDS in all three frames. frame 1, the translational frame of TAIR annotated CDS; frames 2 and 3, the frames offset +1 and +2 from frame 1.
Regular and Conserved RNA Degradation Patterns Were Found in Conserved Peptide uORFs
Several uORFs in fungi, plants, and animals that encode conserved peptides are able to block ribosomes at stop codons (Wang and Sachs, 1997; Raney et al., 2000; Gaba et al., 2001; Hayden and Jorgensen, 2007; Hood et al., 2007; Uchiyama-Kadokura et al., 2014). Among them, the conserved peptide uORFs (CPuORFs) of yeast CPA1 and an Arabidopsis gene producing S-ADENOSYLMETHIONINE DECARBOXYLASE (SAMDC/AdoMetDC1) have been demonstrated to induce mRNA decay through the NMD pathway (Gaba et al., 2005; Uchiyama-Kadokura et al., 2014). To provide additional evidence that the RNA degradome contains ribosome footprints, we examined the positional distribution of 5′-truncated mRNA ends (PARE reads) derived from several CPuORFs of Arabidopsis. In SAMDC, the position 16 nucleotides upstream of the uORF stop codon shows a predominant accumulation of PARE reads derived from seedlings but only a weak enrichment of PARE reads derived from inflorescences (Supplemental Figure 2). Besides SAMDC, we also examined PARE reads that mapped to the CPuORFs in a small group of bZIP genes that regulate the translation of downstream ORFs in response to sucrose concentration (Wiese et al., 2004). Interestingly, these bZIP CPuORFs possess a ladder of PARE peaks at intervals of ∼30 nucleotides, which is the size of a ribosome-protected fragment in Arabidopsis (Liu et al., 2013; Juntawong et al., 2014) (Figure 2A). Counting from the 3′ end of bZIP CPuORFs, the first and second PARE peaks are positioned ∼16 and 46 nucleotides upstream of uORF stop codons, with a few reads present in the 30-nucleotide window between these two peaks. A third PARE peak at position −76 that was an additional 30 nucleotides upstream was observed in bZIP2 and bZIP11 CPuORFs (Figure 2A). The 30-nucleotide phasing of PARE peaks in the 3′ end of CPuORFs provides strong evidence to support the notion that PARE captures the degradation fragments protected by an array of stacked ribosomes. These three PARE peaks likely delineate the 5′ ends of degradation fragments protected by one, two, or three adjacent ribosomes stalled at a CPuORF stop codon, respectively. Notably, the 30-nucleotide phasing was not evident in these bZIP genes when we analyzed in vitro ribosome-protected mRNA fragments of two Ribo-Seq data sets that were generated by independent groups (Liu et al., 2013; Juntawong et al., 2014) (Figure 2A; Supplemental Figure 3).
bZIP uORFs Accumulate a Ladder of 5′-Truncated RNA Ends at 30-Nucleotide Intervals.
(A) The positional distribution of 5′ ends of truncated RNA generated by PARE and ribosome-protected mRNA fragments generated by Ribo-Seq in Arabidopsis bZIP2, bZIP11, and bZIP53 5′ UTRs. The PARE data of Arabidopsis inflorescences and the Ribo-Seq data of light-treated seedlings were retrieved from the data sets published by German et al. (2008) and Liu et al. (2013) respectively. The PARE data of Arabidopsis seedlings were generated by this study. Regions of CPuORFs are shown as dark blue lines under the graphs. The PARE peaks at positions 16, 46, and 76 nucleotides upstream of the uORF stop codon are highlighted in red. TP10M, tags per 10 million; TP50M, tags per 50 million.
(B) Modified RLM 5′ RACE assays of bZIP2 and bZIP11 transcripts in Arabidopsis seedlings treated with or without 6% sucrose. MYB65, a target of miR159, was used as a positive control for the modified RLM 5′ RACE assay. The brackets indicate the PCR products excised and cloned for Sanger sequencing (left panels). In total, 36 and 16 clones were sequenced for bZIP2 and bZIP11, respectively. The positional distribution of 5′-truncated RNA ends increased under 6% sucrose treatments is shown for the bZIP2 and bZIP11 5′ UTRs (right panels). Regions of CPuORFs are shown as dark blue lines under the graphs. TSS, transcriptional start site.
Because bZIP CPuORFs were reported to repress the translation of the downstream ORF under high sucrose concentration (Wiese et al., 2004), we used modified RNA ligase-mediated rapid amplification of cDNA ends (RLM 5′ RACE) to test whether treatment with 6% sucrose would affect the accumulation of degradation fragments truncated at these bZIP CPuORFs. Indeed, a larger number of 5′ ends of degradation intermediates corresponding to the first or second peak in bZIP2 and bZIP11 CPuORFs were detected in Arabidopsis seedlings treated with 6% sucrose compared with the untreated control plants (Figure 2B). This implies either an increase in ribosomes arrested at these two uORFs or the enhancement of RNA degradation. Taken together, these results indicate that degradation fragments appear to reflect the dynamics of ribosomes on uORFs.
Although 64 CPuORFs (Hayden and Jorgensen, 2007) have been identified in Arabidopsis by sequence comparison and annotated in The Arabidopsis Information Resource database (TAIR10 annotation), most of them have not been demonstrated to arrest ribosomes at specific positions. To further explore whether ribosome stalling at stop codons is a common mechanism underlying the regulation mediated by Arabidopsis CPuORFs, we globally analyzed the distribution of PARE reads mapped to the 3′ end of the 64 Arabidopsis CPuORFs. Some Arabidopsis CPuORFs predominantly accumulate PARE reads at position 16 nucleotides upstream of stop codons and some show an additional peak at position −45, −46, or −47 (Figure 3A). Besides bZIP genes, CPuORFs in genes encoding several basic helix-loop-helix-type transcription factors, a trehalose-6-phosphate phosphatase, two methyltransferases, and an unknown protein were found to possess PARE peaks at these two specific sites (Supplemental Figure 4), implying the stalling of ribosomes at these uORF stop codons. The analysis of 35 rice CPuORFs showed the enrichment of PARE reads at the same sites (positions −16 and −46) (Figure 3A), suggesting that ribosome stalling at the stop codon of some CPuORFs is a conserved mechanism across species. To know whether ribosome profiling could capture the signature of ribosome stacking in CPuORFs as PARE, we also analyzed in vitro ribosome-protected mRNA fragments of two Ribo-Seq data sets with the number of total reads greatly exceeding that of PARE data sets we used (Liu et al., 2013; Juntawong et al., 2014). Intriguingly, although position 16 or 17 nucleotides upstream of the uORF stop codon shows an enrichment of in vitro ribosome-protected mRNA fragments, the enrichment at these positions is less prominent and does not accompany the enrichment at the positions near 30 nucleotides upstream (Figure 3B). The same analysis on genome-wide predicted uORFs having lengths greater than 60 nucleotides showed no preferential accumulation of PARE reads for Arabidopsis or rice and Arabidopsis Ribo-Seq reads if predicted uORFs overlapping CPuORFs were excluded (Figures 3A and 3B). Based on the analysis of PARE data that harbor ribosome footprints, most Arabidopsis and rice uORFs may not cause ribosome stalling at stop codons in the same way as many CPuORFs, at least under the conditions and in the tissues that the PARE data were generated.
Site-Specific Enrichment of 5′-Truncated RNA Ends Is Evident in CPuORFs.
(A) Clustered heat maps of 5′-truncated RNA ends mapped in a 55-nucleotide region upstream of the stop codon of CPuORFs and predicted uORFs in wild-type Arabidopsis inflorescences and rice seedlings. Predicted uORFs overlapping CPuORFs are not included in the heat maps of predicted uORFs. Arabidopsis and rice PARE data used in this analysis were published by German et al. (2008) and Li et al. (2010), respectively.
(B) Clustered heat maps of 5′ RNA ends protected by ribosomes in a 55-nucleotides region upstream of the stop codon of CPuORFs and predicted uORFs in wild-type Arabidopsis. The Ribo-Seq data sets of light-treated and normoxic (in air) seedlings used in this analysis were published by Liu et al. (2013) and Juntawong et al. (2014), respectively.
In (A) and (B), the first nucleotides of the stop codon is assigned position 0, and the color of data points represents the Peak_Index value, which is calculated by dividing the number of PARE or Ribo-Seq reads starting at the position indicated by the number of total reads in a 31-nucleotide flanking region. The numbers of annotated CPuORFs and predicted uORFs (indicated in parentheses) and uORFs included in heat maps are shown above the heat maps.
Regulatory uORFs Were Identified Using the Patterns of RNA Degradation Fragments
We suspected that a few regulatory uORFs may not be conserved between Arabidopsis and rice or that they might have been missed in the previous search because of low sequence homology (Hayden and Jorgensen, 2007). Therefore, analysis of the RNA degradome might provide an alternative approach for the identification of regulatory uORFs that have the potential to stall ribosomes. We thus reverse searched for uORFs using PARE reads peaking at the regions 16 to 17 and 45 to 47 nucleotides upstream of the stop codon of predicted uORFs. In addition to four CPuORFs reported previously (Hayden and Jorgensen, 2007), we identified four Arabidopsis uORFs with two predominant PARE peaks representing two tandem ribosomes stacking at stop codons (Table 1; Supplemental Figure 5). Further analyses were then performed on these three novel uORFs to investigate sequence conservation and their regulatory functions.
A 99-nucleotide uORF in the 5′ UTR of CBL-INTERACTING PROTEIN KINASE6 (CIPK6) shows two PARE peaks at positions 16 and 46 nucleotides upstream of the stop codon like CPuORFs in bZIP genes (Figures 2 and 4A). We then compared the distribution of reads obtained through Ribo-Seq and PARE on CIPK6. The Ribo-Seq data of light-treated seedlings show a corresponding peak at position 16 nucleotides but not 46 nucleotides upstream of the stop codon (Figure 4A). Moreover, position −17 accumulates more reads than position −16 in Ribo-Seq data, whereas position −16 has the highest accumulation of PARE reads. Predominant accumulation of ribosome-protected 5′ ends at positions 16 and 46 upstream of CIPK6 stop codon was not observed in the Ribo-Seq data of normoxia (in air) seedlings (Supplemental Figure 6). Although the CIPK6 uORF was not annotated in the TAIR database, it encodes a conserved peptide and has been identified previously based on sequence homology by two groups (Takahashi et al., 2012; Vaughn et al., 2012). Similarly, the conserved uORF in soybean CIPK6 also possesses PARE peaks in 30-nucleotide increments at the same positions relative to the uORF stop codon (Figure 4B), implying that ribosome stalling might be a conserved mechanism for CIPK6 uORF regulation.
CIPK6 CPuORF Possesses Footprints of Stacked Ribosomes and Represses Downstream ORF Expression in Various Tissues.
(A) The positional distribution of 5′ ends of truncated RNA generated by PARE and ribosome-protected mRNA fragments generated by Ribo-Seq in Arabidopsis CIPK6 5′ UTR. The PARE data of Arabidopsis inflorescences and the Ribo-Seq data of light-treated seedlings plotted were retrieved from the data sets published by German et al. (2008) and Liu et al. (2013), respectively. TP10M, tags per 10 million.
(B) The positional distribution of 5′-truncated RNA ends generated by PARE in soybean CIPK6 5′ UTR. The soybean PARE data plotted were retrieved from the data set published by Song et al. (2011). In (A) and (B), regions of CPuORFs are shown as dark-blue lines under the graphs and the first nucleotide of the stop codon is assigned position 0. The PARE or Ribo-Seq peaks at positions 16, 46, and 76 nucleotides upstream of the uORF stop codon are highlighted in red.
(C) Histochemical staining of randomly selected Arabidopsis T1 transgenic plants carrying a GUS reporter gene driven by Arabidopsis CIPK6 promoter with wild-type [CIPK6pro:UTR(WT):GUS] and deleted uORF [CIPK6pro:UTR(ΔuORF):GUS]. The red arrowhead indicates the site mutated. Bar = 1 cm.
(D) Comparison of GUS activity between wild-type and ΔuORF transgenic plants. The amount of total protein was used for the normalization of GUS activity.
(E) Comparison of GUS mRNA level between wild-type and ΔuORF transgenic plants by qRT-PCR. The mRNA level of UBQ5 was used for the normalization of GUS mRNA level. In (D) and (E), each bar represents the mean of measurements derived from five independent T1 transgenic plants ± se relative to the measurement of the wild type. *P < 0.05; ns, no significant difference (two-tailed Student’s t test, n = 5).
The function of Arabidopsis CIPK6 uORF in repressing downstream ORF expression was validated via transient expression assays by Ebina et al. (2015). We confirmed the regulatory function of Arabidopsis CIPK6 uORF by generating stable transgenic lines that harbored a reporter gene encoding GUS driven by the Arabidopsis CIPK6 promoter containing a wild-type 5′ UTR or a uORF-deleted (ΔuORF) 5′ UTR in which the start codon was converted into a stop codon (Figure 4C). Overall, the ΔuORF transgenic lines showed a higher level of expression of the reporter gene in most tissues (Figure 4C; Supplemental Figure 7), indicating that the regulation mediated by CIPK6 uORF is widespread even under normal growth conditions. Because an approximate 6-fold difference (P = 0.02, two-tailed Student’s t test) was detected in the comparison of GUS activity, but the change of GUS mRNA level was less than 2-fold and not statistically significant (Figures 4D and 4E), CIPK6 uORF likely controls downstream ORF expression mainly at the translational level.
The other two candidates of ribosome stalling uORFs we identified are located in MYB34 and MYB51, which belong to the same clade of MYB transcription factors involved in the regulation of glucosinolate biosynthesis (Celenza et al., 2005; Gigolashvili et al., 2007; Frerigmann and Gigolashvili, 2014). Although MYB34 uORF is longer than MYB51 uORF, they both possess two predominant PARE peaks at positions 16 and 46 nucleotides upstream of uORF stop codons (Figure 5A). The translation of these two MYB uORFs is supported by the higher density of Ribo-Seq reads in the predicted ORFs compared with that in the flanking regions (Liu et al., 2013) (Figure 5A). However, the analysis of in vitro ribosome-protected mRNA fragments on these two MYB uORFs with two Ribo-Seq data sets (Liu et al., 2013; Juntawong et al., 2014) showed no preferential accumulation at these two sites (Figure 5A; Supplemental Figure 6). The peptide sequences encoded by these two uORFs are highly conserved at the C terminus within the mustard (Brassicaceae) family (Figure 5B). Glucosinolates are sulfur- and nitrogen-containing secondary metabolites that are found mainly in plant species in the order of Brassicales (Grubb and Abel, 2006), explaining the absence of conserved uORFs in many other plant species. The negative impact of MYB34 CPuORF on the mRNA level of MYB34 was demonstrated previously in a study of a mutant containing a PTC in this uORF (Bender and Fink, 1998). We confirmed the negative regulation of MYB34 uORF on the expression of the downstream reporter gene by converting the start codon into a stop codon in transient expression assays (Figure 5C). The abolition of MYB51 uORF slightly increased the expression of the reporter gene, but the change was not statistically significant under the conditions we used for transient expression assays (Figure 5C). The identification of MYB34 and MYB51 uORFs validates the use of RNA degradome data in discovering lineage or species specific regulatory uORFs.
Lineage-Specific CPuORFs in MYB34 and MYB51 Generate 5′-Truncated RNA Ends Separated in 30 Nucleotides and Negatively Regulate Downstream ORF Expression.
(A) The positional distribution of 5′ ends of truncated RNA generated by PARE and ribosome-protected mRNA fragments generated by Ribo-Seq in the 5′ UTR of MYB34 and MYB51. The PARE data of seedlings were generated by this study and the Ribo-Seq data plotted were retrieved from the data set published by Liu et al. (2013). Regions of CPuORFs are shown as dark-blue lines under the graphs and the first nucleotide of the stop codon is assigned position 0. The PARE or Ribo-Seq peaks at positions 16 and 46 upstream of the uORF stop codon are highlighted in red. TP50M, tags per 50 million.
(B) Alignment of peptides encoded by MYB34 and MYB51 uORFs in the Brassicaceae family. Ath, Arabidopsis thaliana; Aly, Arabidopsis lyrata; Bra, Brassica rapa; Bst, Boechera stricta; Cru, Capsella rubella; Esa, Eutrema salsugineum. The alignment is colored according to residue conservation: red, identical residues; orange, conserved residues; pink, block of similar residues.
(C) Transient expression assays of MYB34 and MYB51 uORF regulatory function using LUC reporter constructs in protoplasts. The reporter constructs of the wild type and deleted uORF (ΔuORF) are illustrated. The red arrowhead indicates the site mutated. The LUC reporter constructs were cotransfected with a control of a GUS gene driven by a 35S promoter. The LUC activity was first normalized to GUS activity and then to the value of wild-type construct. Each bar represents the mean of measurements derived from six independent protoplast transfections ± se relative to the measurement of the wild type. *P < 0.05; ns, no significant difference (two-tailed Student’s t test, n = 6).
Ribosome Stacking Was Predicted in CDSs Using the RNA Degradome Data
Besides CPuORFs, the nascent peptide encoded by a CDS region in the first exon of Arabidopsis CGS1, known as the MTO1 region, has been reported to block ribosome elongation and induce RNA decay in response to S-adenosyl-l-methionine (AdoMet) (Onouchi et al., 2005). Similar to the degradation fragments we observed in bZIP CPuORFs (Figure 2), Haraguchi et al. (2008) detected a ladder of truncated 5′ termini separated in length by ∼30-nucleotide increments in the MTO1 region together with the 5′ upstream region after treatment with AdoMet. Moreover, the previous study demonstrated that the truncated mRNA ends are defined by the 5′ edges of stalled ribosomes in an array. Therefore, we assumed that the RNA degradome could also be used in the identification of ribosome stacking occurring in the CDS region. To validate this idea, we first compared the PARE peaks around the MTO1 region with the 5′ ends of degradation intermediates reported previously (Yamashita et al., 2014). Although the plants we used for PARE library construction were grown in soil without AdoMet treatment, the major PARE peaks were in close proximity to the 5′ termini of degradation intermediates reported previously (Supplemental Figure 8).
Next, we performed phasing analysis on PARE peaks in the CDSs for intervals from 20 to 40 nucleotides. In the analysis of Arabidopsis seedling PARE data, the number of phased regions decreased when the length of intervals increased (Supplemental Figure 9A). However, the numbers of phased regions identified at intervals of 32 to 40 nucleotides and 28 nucleotides were significantly lower than that at the interval of 30 nucleotides in the inflorescence PARE data (Supplemental Figure 9B). To eliminate the phased regions occurring by chance, we discarded candidates identified only in a single inflorescence sample. This resulted in four phased regions at the intervals of 29 and 30 nucleotides and one to three phased regions at the other intervals (Supplemental Figure 9C). Based on this result, we thus discovered four Arabidopsis protein-coding genes possessing a region potentially associated with protection signatures of stacked ribosomes (Figure 6). PARE peaks with prominent phasing were identified in genes encoding plastidic type I signal peptidase 2B (PLSP2B), a pentatricopeptide repeat protein, a RING/U-box superfamily protein, and an unknown protein (Figure 6). However, Ribo-Seq peaks separated in a 30-nucleotide interval were not detected in these regions, although the position of the most 3′ phased PARE peak identified in PLSP2B showed predominant accumulation of Ribo-Seq reads (Liu et al., 2013; Juntawong et al., 2014) (Figure 6; Supplemental Figure 10). Notably, the A sites of stalled ribosomes putatively associated with these four regions were all predicted to fall in frame 2 but not in the translational frame annotated in TAIR (frame 1) if we assumed that the distance from the ribosome-protected 5′ end to the presumed A site was 17 nucleotides according to the result shown in Figure 1. Stalling in the non-translational frame may suggest that the last pausing ribosome in the four CDS regions we identified is arrested during the step of translocation but not decoding.
Analysis of PARE Data Reveals CDS Regions Potentially Associated with Stacked Ribosomes.
The positional distribution of 5′ ends of truncated RNA generated by PARE and ribosome-protected mRNA fragments generated by Ribo-Seq in CDS regions potentially associated with stacked ribosomes. PARE data of Arabidopsis inflorescences plotted were generated by this study while the Ribo-Seq data plotted were retrieved from the data set published by Liu et al. (2013). The 30-nucleotide phased PARE peaks and the corresponding Ribo-Seq peaks are highlighted in red with coordinates indicated above. TP50M, tags per 50 million.
Analysis of 5′-Truncated mRNA Ends Upstream of miRNA-Guided Cleavage Sites
Plant miRNAs can guide cleavage in the middle of target sites, resulting in truncated mRNA fragments (Rhoades et al., 2002). However, growing evidence suggests that plant miRNA can repress target translation (Brodersen et al., 2008; Iwakawa and Tomari, 2013; Li et al., 2013b, 2013c; Liu et al., 2013), although the underlying mechanism is not well characterized. Because our analysis of the RNA degradome revealed the footprints of stalled ribosomes in uORFs and the CDSs, we predicted that a similar analysis of miRNA target genes would be useful for elucidating whether the binding of plant miRNAs can directly block the movement of ribosomes in planta. Therefore, we investigated the distribution of PARE reads in a 55-nucleotide region upstream of putative miRNA-guided cleavage sites in Arabidopsis and rice. Unlike the predominant accumulation of PARE reads at positions 16 and 46 nucleotides upstream of uORF stop codons (Figure 2), no position-specific enrichment was found in this region except at the putative miRNA-guided cleavage sites (Figure 7). This result thus suggests that the major mechanism by which plant miRNAs repress target translation may not be by acting as physical barriers to hinder the movement of ribosomes.
No Site-Specific Enrichment of 5′-Truncated RNA Ends Is Detected in the Proximal Region Upstream of miRNA-Guided Cleavage Sites.
Clustered heat maps of 5′-truncated RNA ends mapped to a 55-nucleotide region upstream of miRNA-guided cleavage sites in seedlings and inflorescences of Arabidopsis and rice. The Arabidopsis PARE data plotted were generated by this study, while the rice PARE data plotted were retrieved from the data sets published by Li et al. (2010) and Zhou et al. (2010). The presumed miRNA-guided cleavage site is assigned position 0 and the color represents the value of Peak_Index, which is calculated by dividing the reads starting at the position indicated by the total reads in a 31-nucleotide flanking region. The numbers of known miRNA target sites (indicated in parentheses) and target sites possessing PARE reads and included in heat maps are shown above heat maps.
Discovery of Potential Footprints of Ribosomes Hindered by AGO7 in Arabidopsis TAS3
Surprisingly, three Arabidopsis TAS3 genes shared a highly similar but unusual pattern of truncated 5′ RNA ends upstream of the noncleavable target site of miR390 (Figure 8A). A PARE peak was located immediately or four nucleotides upstream of the first base of noncleavable target sites of miR390. In the proximal region upstream of this peak, there were three additional PARE peaks arranged at regular intervals of 28 to 30 nucleotides. This regular degradation pattern in TAS3 highly resembled that detected in bZIP CPuORFs (Figure 2). Therefore, we assumed that the peak adjacent to the noncleavable miRNA site might be the footprint of AGO7, which loads miR390, whereas the upstream peaks likely also represented ribosome footprints. Intriguingly, TAS3 was annotated to be noncoding RNA because of lack of a long ORF and the production of conserved trans-acting siRNAs (tasiRNAs) when targeted by miR390 (Allen et al., 2005).
Arabidopsis TAS3 Genes Accumulate AGO7-Dependent but RDR6-Independent Phased 5′-Truncated RNA Ends Upstream of Noncleavable miR390 Target Sites.
(A) The distribution of 5′ ends of truncated RNA generated by PARE and ribosome-protected mRNA fragments generated by Ribo-Seq in three Arabidopsis TAS3 genes. The PARE data of seedlings plotted were generated by this study and the Ribo-Seq data of light-treated seedlings plotted were retrieved from the data set published by Liu et al. (2013). The predominant PARE and Ribo-Seq peaks are highlighted in red and marked with their distances to the first nucleotides of noncleavable miR390 target sites. TP50M, tags per 50 million.
(B) Comparison of 5′-truncated RNA ends generated in the region upstream of the noncleavable miR390 target site of TAS3a in wild type, rdr6, and ago7 by the modified RLM 5′ RACE assay. The bracket indicates the PCR products excised and cloned for Sanger sequencing (left panel). In total, 12 and 13 clones were sequenced for the wild type and rdr6, respectively. The positional distribution of 5′-truncated RNA ends revealed by the modified RLM 5′ RACE assay is plotted relative to the non-cleavable miR390 target site (right panel). The cleavage target of miR159, MYB65, is used as a positive control for the modified RLM 5′ RACE assay. In (A) and (B), the first nucleotide of the noncleavable miR390 target site is assigned position 0.
(C) Comparison of TAS3a transcript levels in the wild type, rdr6, and ago7 by qRT-PCR. The mRNA level of UBQ5 was used for the normalization of TAS3a mRNA levels. Each bar represents the mean of measurements derived from four biological replicates ± se relative to the measurement of the wild type. *P < 0.05; **P < 0.01 (two-tailed Student’s t test, n = 4).
(D) Comparison of TAS3a tasiRNA produced in the wild type, rdr6, and ago7 by RNA gel blot with U6 as the loading control. Numbers to the left of the blot show sizes in nucleotides. Antisense DNA oligonucleotides were used as probes for U6 and a TAS3 tasiRNA.
To determine whether TAS3 might serve as a template for translation, we first predicted ORFs in three Arabidopsis TAS3 genes. All three TAS3 genes have an ORF that terminates 7 to 11 nucleotides upstream of the noncleavable site of miR390 (Figure 8A). The coding nature of the predicted ORFs in three TAS3 genes was evaluated with Ribo-Seq data published previously (Liu et al., 2013; Juntawong et al., 2014). The predicted ORFs in TAS3a and TAS3b have relatively dense ribosome-protected fragments compared with other regions, supporting the coding ability (Figure 8A; Supplemental Figure 11). Moreover, the major peaks revealed by Ribo-Seq were close to the predominant PARE peaks in TAS3a and TAS3b. This result suggested that translated ORFs were present upstream of the noncleavable target site of miR390 and 5′-truncated RNA ends mapped to this region were likely ribosome footprints. If these PARE peaks represent the footprints of ribosomes hindered by miR390-AGO7 complex but not the cleavage remnants directed by tasiRNAs, they should disappear in the ago7 but be sustained in the mutant of RNA-DEPENDENT RNA POLYMERASE6 (RDR6), which acts downstream of AGO7 in the tasiRNA biogenesis pathway (Mallory and Vaucheret, 2010). As predicted, the result of modified RLM 5′ RACE clearly showed that degradation fragments with 5′ ends mapped to these positions were barely detected in ago7 (Figure 8B). On the other hand, the amounts of these degradation fragments were comparable between wild-type and rdr6 plants, although both ago7 and rdr6 showed increased amounts of TAS3a transcripts compared with the wild type and no tasiRNA production (Figures 8B to 8D). The PARE data of wild-type, rdr6, and ago7 inflorescences showed the same result as the modified RLM 5′ RACE assay. All PARE peaks in the ORFs of three TAS3 genes vanished in ago7 except the one at position −58 in TAS3b (Supplemental Figure 12). The result of mutant analysis therefore indicates that the formation of these 5′-truncated RNA ends upstream of the noncleavable target site of miR390 depends on AGO7 but is independent of tasiRNA. Taken together, these results imply that noncleavable targeting of AGO7 may arrest ribosomes as a road block in planta. Because the last major PARE peaks in the ORFs in TAS3a, TAS3b, and TAS3c are located 19, 14, and 16 nucleotides upstream of the first nucleotide of the stop codons respectively, the miR390-AGO7 complex appears to arrest a ribosome in the step of elongation on TAS3a but in the step of termination on TAS3b and TAS3c.
DISCUSSION
The RNA degradome is composed of degradation products of endoribonucleases and XRNs acting through diverse pathways. Here, we not only demonstrate that ribosome-protected mRNA ends are pervasive in the RNA degradome, but also demonstrate exciting new applications of RNA degradome data in the study of posttranscriptional gene regulation. Predominant accumulation of 5′-truncated mRNA ends in a 30-nucleotide interval likely represents an array of stacked ribosomes in the uORF, CDS, or the upstream region of noncleavable binding sites of AGO7 (Figure 9). Genome-wide analysis of 5′-truncated mRNA ends mapped to uORFs suggests that many CPuORFs may repress downstream ORF expression by stalling ribosomes at CPuORF stop codons, whereas the majority of predicted uORFs lack this ability (Figure 3). In addition, the analysis of 5′-truncated mRNA ends occurring upstream of miRNA target sites uncovered the signature of ribosome stacking in the noncleavable target site of miR390 but not other cleavable target sites (Figures 7 and 8). Novel regulatory uORFs and CDS regions with potential to cause ribosome stalling were identified through the analysis of the RNA degradome (Figures 4 to 6, Table 1).
Schematic Representation of RNA Degradation Fragments Protected by Stacked Ribosomes in Distinct Regions.
(A) Degradation signatures caused by stacked ribosomes upstream of uORF stop codons. Position 0 is the first nucleotide of the uORF stop codon.
(B) Degradation signatures caused by stacked ribosomes in the CDS. Position 0 is the 3′ edge of the most 3′ stalled ribosome.
(C) Degradation signatures caused by stacked ribosomes upstream of a noncleavable miRNA target site bound by AGO7. Position 0 is the first nucleotide of the miRNA target site.
Predominant peaks of 5′-truncated RNA ends are indicated with vertical lines and their distances to the putative sites causing ribosome stalling (highlighted in red) or to the 3′ edge of the most 3′ stalled ribosome (highlighted in blue) are marked below. The height of vertical lines represents the abundance of 5′-truncated RNA ends. 60S, 60S subunit of ribosome; 40S, 40S subunit of ribosome.
Comparison of Ribosome Footprints in RNA Degradome Data and Ribo-Seq Data
Accumulation of degradation intermediates starting at the 5′ edge or the A site of stalled ribosomes has been reported in bacteria, yeast, and Arabidopsis (Hayes and Sauer, 2003; Sunohara et al., 2004; Doma and Parker, 2006; Haraguchi et al., 2008), and it has been proposed that endoribonucleases are involved in the production of these degradation fragments. However, a recent genome-wide survey of monophosphorylated 5′ RNA ends in yeast showed that XRN1 is also involved in the accumulation of 5′-truncated RNA ends protected by ribosomes during RNA decay (Pelechano et al., 2015). Using the plant degradome data, we identified an array of 5′-truncated RNA ends separated at 30-nucleotide intervals in uORFs and CDSs, implying an array of stacked ribosomes (Figures 2 and 4 to 6). However, the 30-nucleotide phasing pattern was not observed in Ribo-Seq data for the regions we identified from RNA degradome analysis. The discrepancy between Ribo-Seq data and RNA degradome data is likely due to the different procedures of library preparation, although both methods could capture ribosome-protected mRNA fragments. A typical Ribo-Seq protocol includes a step to select fragments protected by a single ribosome but not stacked disomes or trisomes (Ingolia, 2010). Therefore, a region associated with stacked ribosomes may rather show a low density of protected fragments in Ribo-Seq data. On the other hand, degradome sequencing, PARE, genome-wide mapping of uncapped transcripts, or 5Pseq profiles truncated RNA termini without monosome selection and can thus capture fragments protected by stacked ribosomes. In addition, in the absence of monosome selection and translation inhibitor treatment to block ribosomes on mRNA, these methods may also allow the identification of fragments protected by ribosomes on distinct translation states.
A previous global study of yeast ribosome footprints demonstrated that ribosome-protected fragments fall into two major groups (28 to 30 nucleotides long and 20 to 22 nucleotides long) that are stabilized by different translation inhibitors (Lareau et al., 2014). The size of the protected fragments reflects the configuration and translation state of the ribosomes. In the three Arabidopsis TAS3 genes, we detected PARE peaks arranged at 24- to 30-nucleotide intervals immediately upstream of the first base of noncleavable miR390 target sites (Figure 8A). The shorter intervals between two PARE peaks may imply a ribosome pausing at the translocation stage instead of the decoding stage, which should lead to a 30-nucleotide protected fragment in Arabidopsis. In addition to technical differences, the interplay between ribosome stalling and RNA decay likely enhances the signal of stalled ribosomes in RNA degradome data. By contrast, the signal of stalled ribosomes might be embedded in the footprints of active ribosomes in Ribo-Seq data. In summary, the use of degradome data for the identification of regions associated with stacked ribosomes may outperform the use of Ribo-Seq data in some cases.
Investigation of Regulatory uORFs Using the RNA Degradome
Comprehensive identification of regulatory uORFs has been challenging as predicted uORFs may not be translated or regulate downstream ORF expression through distinct mechanisms (Barbosa et al., 2013). Prior to the development of Ribo-Seq, translated uORFs with regulatory functions were often identified by mutant screening or evolutionary conservation (Hill and Morris, 1993; Delbecq et al., 1994; Wiese et al., 2004; Imai et al., 2006; Hayden and Jorgensen, 2007). Although Ribo-Seq data have been used in the identification of translated uORFs (Fritsch et al., 2012; Liu et al., 2013; Ingolia et al., 2014), the application of Ribo-Seq data to the identification of ribosome stalling uORFs has not been reported. In this study, we show that some CPuORFs overaccumulate 5′-truncated RNA ends with a signature of stacked ribosomes (Figure 2A). We further demonstrate that the accumulation of 5′-truncated RNA ends in bZIP CPuORFs is enhanced in response to a high concentration of sucrose (Figure 2B). The data thus support the previous hypothesis that the conserved peptides encoded by these bZIP CPuORFs can stall ribosomes at stop codons in response to sucrose concentration (Wiese et al., 2004) and suggest the application of degradome data in the study of regulatory uORFs. Through the global analysis of 5′-truncated RNA ends occurring in predicted uORFs, we identified novel ribosome stalling uORFs in Arabidopsis. Notably, uORFs in MYB34 and MYB51 are conserved and specific in the Brassicaceae family (Figure 5B). Since RNA degradome data of diverse plant species are available in the public domain, these data sets might be very useful for the identification of lineage or species specific ribosome stalling uORFs as evolutionary conservation is not required for this type of analysis.
Analysis of the RNA Degradome Allows Dissection of Ribosome Pausing
Besides uORFs, CDS regions also possess 30-nucleotide phased 5′-truncated mRNA ends (Figure 6). Some regions likely encode nascent peptides which can stall ribosomes during translation elongation as the MTO1 region in CGS1 or are upstream of structured regions that can block the movement of ribosomes (Onouchi et al., 2005). The one with the last peak of 5′-truncated RNA ends located 16 nucleotides upstream of main ORF stop codons as in the degradation pattern observed in CPuORFs may cause ribosome pausing during translation termination. A previous study showed that the binding of EUKARYOTIC RELEASE FACTOR 1 (eRF1) to three types of stop codons causes a conformational change in the ribosome and a 2-nucleotide shift toward the 3′ end of mRNA (Kryuchkova et al., 2013). Because the position 17 nucleotides upstream of the main ORF stop codon is in frame with the 3-nucleotide periodicity observed in the CDSs (Figure 1A), the specific accumulation of 5′-truncated mRNA ends at this position suggests the pausing of a ribosome with the A site at the stop codon. On the other hand, the 5′-truncated mRNA ends peak 16 nucleotides upstream of the stop codon may imply the stalling of ribosomes after the conformational change of ribosomes induced by the binding of eRF1. The integration of the RNA degradome data, protein sequences, and RNA structures will bring new insights into the regulation of ribosome pausing.
Exploration of Plant miRNA-Mediated Translational Repression with RNA Degradome
Unlike animal miRNAs, which mainly target the 3′ UTR, the majority of plant miRNAs bind to their target CDSs through nearly perfect base pairing. Although cleavage sites in the middle of plant miRNA target sites have been extensively validated by degradome data in many species, translational repression mediated by plant miRNAs has also been demonstrated through multiple approaches. Previous analysis of Arabidopsis Ribo-Seq data showed that miRNA targets have lower translational efficiency compared with non-miRNA targets (Liu et al., 2013). However, no preferential accumulation of ribosome footprints was observed in the region upstream of miRNA target sites in Arabidopsis Ribo-Seq data (Liu et al., 2013). Because Ribo-Seq has some drawbacks in detecting stacked ribosomes or ribosomes pausing at the translocation stage as we have discussed, we used RNA degradome data to reexamine whether plant miRNAs can directly stall ribosomes. Consistent with the previous observation in Ribo-Seq data (Liu et al., 2013), no site-specific enrichment of 5′-truncated RNA ends was found in the region upstream of Arabidopsis and rice miRNA target sites (Figure 7). This strengthens the notion that directly blocking ribosome movement through the binding of RNA-induced silencing complex contributes little to plant miRNA-mediated translational repression in planta. Nevertheless, the binding of AGO7 to a well characterized noncleavable target site of miR390 seems to hinder the movement of ribosomes and cause ribosome stacking as the upstream regions show signatures of stacked ribosomes in the same way as bZIP CPuORFs (Figures 2 and 8). The same degradation pattern corresponding to ribosome stacking among the three Arabidopsis TAS3 genes suggests that this unique configuration of ORFs and noncleavable miR390 target sites might be crucial for tasiRNA production. Although the function of ORFs upstream of the miR390 noncleavable site has not been tested, positioning a target site of miR173, another well-known tasiRNA trigger bound to AGO1 (Cuperus et al., 2010), within 10 nucleotides of a stop codon of an upstream ORF was shown to enhance the production of artificial tasiRNA (Zhang et al., 2012). Moreover, a recently published article also demonstrated that an ORF surrounding the miR173 target site on TAS2 is translated and plays a crucial role in tasiRNA production (Yoshikawa et al., 2016). That signatures of stacked ribosomes exist close to the miR390 noncleavable target site revealed in this study strongly suggests that translation contributes to tasiRNA biogenesis through a conserved mechanism regardless of miRNA triggers or AGO involved.
A bottleneck in the development of RNA degradome data applications is the complex composition of mRNA degradation intermediates, complicating interpretation. The discovery of ribosome footprints in RNA degradome data opens up the possibility of new applications of such data in posttranscriptional gene regulations beyond the validation of miRNA-guided cleavage. It should be possible to apply the RNA degradome data analyses demonstrated in this study to many other plant species that have RNA degradome data available, enabling deeper insights into ribosome stalling and mechanisms of RNA decay.
METHODS
Plant Materials and Growth Conditions
Arabidopsis thaliana (ecotype Col-0) used in this study was grown in soil or on 0.8% Bacto-agar plates containing half-strength Murashige and Skoog medium (pH 5.7) and 1% sucrose under a 16/8-h light/dark cycle with an irradiance of 50 to 90 μmol photons m–2 s–1 at 22°C. For the generation of PARE data, 11-d-old seedlings and inflorescences of wild-type Arabidopsis were used for total RNA isolation. For the analysis of 5′-truncated RNA ends generated from bZIP2 and bZIP11 by the modified RLM 5′ RACE assay, 10-d-old wild-type seedlings were transferred into liquid half-strength Murashige and Skoog medium with or without 6% sucrose. After incubation in growth chambers with rotary shaking at 40 rpm under constant light for 24 h, the seedlings were collected for RNA extraction. The inflorescences of Arabidopsis wild type, ago7 (GK-824A08-025510), and rdr6 (CS24285) were harvested for the analysis of 5′-truncated RNA ends generated from TAS3a with a modified RLM 5′ RACE assay and for PARE library construction.
PARE Library Construction and Sequencing
Total RNA isolated by PureLink Plant RNA Reagent (Thermo Fisher) and MaxTract high-density gel tubes (Qiagen) was used for PARE library construction following the protocol published previously (Zhai et al., 2014). PARE libraries were constructed with ∼80 μg total RNA and then sequenced on the Illumina HiSeq 2500 platform.
Modified RLM 5′ RACE Assay
Modified RLM 5′ RACE assay was performed to detect 5′-truncated RNA ends using GeneRacer Kit (Thermo Fisher). First, 2 to 3 μg of total RNA isolated by PureLink Plant RNA Reagent and MaxTract high-density gel tubes was ligated with the 5′ RNA adapter and then reversely transcribed with the oligo(dT) primer. Next, cDNA was used as the template for PCR analysis with a GeneRacer 5′ primer and a gene-specific primer. Nested PCR was performed with a GeneRacer 5′ nested primer and a gene-specific nested primer if no PCR products were detected in the primary PCR. Amplified products of expected size were gel purified, cloned into pJET1.2/blunt cloning vector or pCR4-TOPO TA vector (Thermo Fisher), and sequenced. A target of miR159, MYB65, was included as the positive control for the modified RLM 5′ RACE assay. The primers are listed in Supplemental Table 1.
Analysis of 5′-Truncated RNA End Distribution
In addition to the in-house PARE data of Arabidopsis, previously published PARE data of Arabidopsis, rice (Oryza sativa), and soybean (Glycine max) downloaded from the Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) were also analyzed in this study. Accession numbers are given at the end of Methods. Trimmed reads were mapped to the corresponding genomes or gene sequences downloaded from TAIR (https://www.arabidopsis.org/; TAIR 10), the MSU Rice Genome Annotation Project (http://rice.plantbiology.msu.edu/; release 6.1), and Phytozome (https://phytozome.jgi.doe.gov/; Phytozome v11.0: Gmax_275_Wm82.a2.v1) with Bowtie (http://bowtie-bio.sourceforge.net/; v1.0.0). Only perfectly matched 20-nucleotide sequences were used in the following metagene analyses and sequences of low complexity (repeats ≥ 15 nucleotides) or with ≥10 hits in the genome were excluded. The abundance of PARE sequences was assigned to the position corresponding to the first nucleotide of the sequence. Known Arabidopsis and rice CPuORF and miRNA target sites were retrieved from previous reports (Hayden and Jorgensen, 2007; Zheng et al., 2012).
For metagene analysis of PARE reads in the regions flanking the start codon and the stop codon of annotated CDSs, the abundance of PARE reads at each position in the defined regions on a transcript was first normalized by dividing the value by the sum of PARE reads starting in the defined region. Then, the relative abundance at each position across the defined region was calculated as the sum of normalized abundance of PARE reads starting at the same position for all genes. For heat maps of PARE read distribution in regions upstream of uORF stop codons or miRNA guided cleavage sites, we normalized the abundance of PARE reads at each position by dividing the value by the sum of PARE reads starting in a 31-nucleotide window flanking the indicated position. The distribution of normalized PARE abundance was then clustered using Ward’s method with R package (https://www.r-project.org/; version 2.15.2) and displayed as heat maps.
Ribo-Seq Data Analysis
Previously published Ribo-Seq data sets of Arabidopsis were downloaded from the GEO database. Accession numbers are given at the end of Methods. Trimmed reads of length ≥20 nucleotides were mapped to gene sequences downloaded from TAIR with Bowtie. Two-nucleotide mismatches were allowed for mapping Ribo-Seq reads of normoxia seedlings, whereas perfect matches were used in the mapping of Ribo-Seq reads of light-treated seedlings. The abundance of Ribo-Seq sequences was assigned to the position corresponding to the first nucleotide of the sequence.
Identification of uORFs with Footprints of Stacked Ribosomes
RNA ends protected by monosome and disome pausing at uORF stop codons should result in PARE peaks at positions 16 and 46 nucleotides upstream of the stop codon. A custom Perl script was used to identify uORFs possessing this degradation signature with in-house Arabidopsis PARE data (Supplemental Script 1). First, the sequences of Arabidopsis 5′ UTRs based on TAIR10 annotation were used to predict uORFs by looking for an ATG codon paired with the nearest in-frame stop codon. Next, PARE reads were mapped to the predicted uORFs by this Perl script. Candidates of uORFs with a signature of ribosome stacking at the stop codon were selected based on the following criteria. To capture the signature of ribosome stacking, the distribution of PARE reads was evaluated in two 31-nucleotide regions upstream of the stop codon. The first region spans positions −1 to −31 and the second region spans positions −32 to −62, with the first nucleotide of the uORF stop codon is set to 0. In the first region, the most abundant peak (the first major peak) was required to be at position −16 or −17 and with the number of raw reads ≥3. In the second region, the most abundant peak (the second major peak) was required to be located at position −45, −46, or −47. Moreover, the abundance of the major peaks was required to be at least 2-fold higher than that of the second most abundant peak in the same region but outside the possible positions for major peaks.
Identification of Degradation Signatures Representing Stacked Ribosomes in the CDSs
To identify degradation signatures representing stacked ribosomes in the CDSs, we first mapped trimmed PARE sequences to Arabidopsis CDSs based on TAIR10 annotation with Bowtie. Because the protected degradation termini caused by three stacked ribosomes are three predominant PARE peaks separated by ∼30 nucleotides, we calculated two values, Peak_Abundance and Peak_Index, to evaluate the predominance of PARE reads accumulated at each position. Peak_Abundance was defined as the sum of PARE reads starting at the indicated position together with the positions one nucleotide upstream and downstream. Peak_Index was calculated by dividing the Peak_Abundance by the total PARE reads starting in a 31-nucleotide window flanking the indicated position. Positions with Peak_Index ≥ 0.3 were selected as the first major peak and were reset to 0 in further analysis. The second and third major peaks were required to be at the regions between −29 and −31, and −59 and −61 with Peak_Index ≥ 0.3. In addition, the abundance of three major peaks had to be the highest in the region for the calculation of Peak_Index and with raw reads ≥ 3. The abundance of the three major peaks also needed to be 2-fold higher than that of the second most abundant peak falling in the 31-nucleotide window but outside the possible positions for major peaks. To ensure that the degradation signature caused by stacked ribosomes was prominent among all degradation events within a gene, a region was selected only if all three major peaks were ranked in the top 1% of all positions with regard to PARE abundance. To eliminate the phased regions occurring by chance, the candidates uncovered from only a single sample were removed. The in-house Perl script for PARE peak analysis in CDSs is provided as Supplemental Script 2. The same criteria were applied for the phasing analysis of PARE peaks for intervals from 20 to 40 nucleotides.
Generation of Transgenic Lines
A 1.2-kb DNA fragment of Arabidopsis CIPK6 promoter region plus 5′ UTR was cloned into gateway vector pHGWFS7 upstream a GUS reporter gene through Gateway LR Clonase II Enzyme mix (Thermo Fisher). Site-directed mutagenesis was applied to change the start codon of CIPK6 CPuORF to a stop codon. The two constructs with wild-type uORF and ΔuORF were transformed into wild-type Arabidopsis through the floral dip method (Zhang et al., 2006). Primers used to clone the 1.2kb CIPK6 promoter fragment are listed in Supplemental Table 1.
GUS Activity Assay and GUS Staining
The extract of ground tissues extracted by GUS extraction buffer (50 mM NaHPO4, pH 7.0, 10 mM 2-mercaptoethanol, 10 mM Na2EDTA, 0.1% sodium lauryl sarcosine, and 0.1% Triton X-100) was mixed with MUG buffer (4-methylumbellifery β-d-glucuronide) at 37°C for 20 min, and then the reaction was terminated by adding stop buffer (0.2 M sodium carbonate). The fluorescence intensity of 4-methylumbelliferone was measured by fluorometer at 450 nm when excited at 365 nm. The amount of 4-methylumbelliferone was calculated with the standard curve and then normalized to the amount of total protein which was measured using the Bradford dye binding method with the Bio-Rad Labs protein assay kit. The GUS transgenic plants were stained in GUS staining solution [0.1 M NaPO4, pH 7.0, 10 mM EDTA, 0.1% Triton X-100, 1 mM K3Fe(CN)6, and 2 mM X-Gluc] at 37°C overnight. After staining, samples were washed with 50% ethanol until chlorophyll was removed, which took either overnight or several days depending on the tissue. Then, the stained samples were observed under a stereomicroscope (SteREO Lumar.V12; Zeiss) and photographed with a digital camera (AxioCam MRc; Zeiss).
Quantitative RT-PCR
Two micrograms of total RNA was used as a template for reverse transcription with ToolsQuant II Fast RT kit (Biotools). The resulting cDNA was diluted 20-fold and 5 μL was used for qRT-PCR in a 20 μL reaction with SYBR Green PCR Master Mix (Applied Biosystems) on a 7500 Fast Real-Time PCR System (Applied Biosystems) using the following program: 20 s at 95°C, followed by 40 cycles of 3 s at 95°C, and 30 s at 60°C, with an additional melt curve stage consisting of 15 s at 95°C, 1 min at 60°C, and 15 s at 95°C. TAS3a and GUS expression levels were normalized to the level of UBQ5 expression and were averaged from at least three independent biological samples, followed by normalizing to the corresponding value of the wild type. The primers used are listed in Supplemental Table 1.
Protoplast Transient Assay
For construction of reporter plasmids, MYB34 and MYB51 5′ UTRs were amplified from Arabidopsis genomic DNA and cloned into pJD301 between the cauliflower mosaic virus 35S promoter and the firefly luciferase (LUC) coding region using the NcoI restriction site. To abolish MYB34 and MYB51 uORFs, the start codon of uORFs was converted into a stop codon by site-directed mutagenesis. Primers used in the cloning of MYB34 and MYB51 5′ UTR sequences are listed in Supplemental Table 1. Arabidopsis mesophyll protoplasts were isolated from 3- to 4-week-old rosette leaves following the method described previously (Wu et al., 2009). Equal amounts (20 µg) of reporter plasmids and internal control plasmids containing the 35S promoter driving a GUS gene were cotransfected into 105 protoplasts in a PEG solution (40% polyethylene glycol 4000, 0.2 M mannitol, and 0.1 M CaCl2) at room temperature for 5 to 10 min. The transfected protoplasts were incubated at 22°C in the dark for 16 h and then lysed with Cell Culture Lysis Reagent (Promega). LUC activity was measured with a Luciferase Assay System (Promega) according to the manufacturer’s instructions and normalized to GUS activity, which was measured as described above.
RNA Gel Blot of Small RNA
Small RNA gel blot analysis was performed as described previously (Lee et al., 2015). The probes used in the detection of small RNAs derived from TAS3 and U6 are listed in Supplemental Table 1.
Alignment of Peptides Encoded by MYB34 and MYB51 uORFs
The sequences of MYB34 and MYB51 genes in different plant species were identified by a BLASTP search and downloaded from the Phytozome database (https://phytozome.jgi.doe.gov). The 5′ UTR sequences of orthologous genes retrieved for uORF analysis with Serial Cloner (http://serialbasics.free.fr/; version 2.6) are listed in Supplemental Data Set 1. Peptides encoded by MYB34 and MYB51 uORF were then aligned with the use of Vector NTI software (Thermo Fisher).
Accession Numbers
The PARE data generated in this study as well as public PARE data are available in the GEO database under series GSE77549 (Arabidopsis in-house PARE data) and accession numbers GSM280226 (public Arabidopsis inflorescence PARE data), GSM647200 (public soybean seed PARE data), GSM434596 (public rice seedling PARE data), and GSM476257 (public rice young inflorescence PARE data). The Arabidopsis Ribo-Seq data sets used in this study are available in the GEO database under the accession numbers GSM1226369 (light-treated seedlings) and GSM1224475 (normoxia seedlings). Sequences of individual genes used in PARE data analysis or functional assays and mutants used can be found in the TAIR or Phytozome databases under the following accession numbers: AT3G02470 for Ath-SAMDC, AT2G18160 for Ath-bZIP2, AT4G34590 for Ath-bZIP11, AT3G62420 for Ath-bZIP53, AT4G30960 for Ath-CIPK6, Glyma09g14090 for Gly-CIPK6, AT5G60890 for Ath-MYB34, AT1G18570 for Ath-MYB51, AT3G01120 for Ath-CGS1, AT3G17185 for Ath-TAS3a, AT5G49615 for Ath-TAS3b, AT5G57735 for Ath-TAS3c, AT1G69440 for Ath-ago7, and AT3G49500 for Ath-rdr6.
Supplemental Data
Supplemental Figure 1. 5′-Truncated RNA Ends Show a 3-Nucleotide Periodicity and Significant Frame Bias in the CDS.
Supplemental Figure 2. Overaccumulation of 5′-Truncated RNA Ends at the 3′ End of Arabidopsis SAMDC uORF.
Supplemental Figure 3. The Comparison of PARE and Ribo-Seq Read Distribution in Arabidopsis bZIP uORFs.
Supplemental Figure 4. Site-Specific Enrichment of 5′-Truncated RNA Ends in Arabidopsis CPuORFs.
Supplemental Figure 5. PARE Read Distribution in Arabidopsis uORFs with RNA Degradation Signatures Representing Ribosome Stacking at Stop Codons.
Supplemental Figure 6. The Comparison of PARE and Ribo-Seq Read Distribution in Arabidopsis CIPK6, MYB34, and MYB51 uORFs.
Supplemental Figure 7. Negative Regulation of CIPK6 uORF in Reporter Gene Expression in Various Tissues.
Supplemental Figure 8. The Comparison of PARE Read Distribution with the 5′ Ends of the Degradation Intermediates Previously Identified around the MTO1 Region of CGS1.
Supplemental Figure 9. Phasing Analysis of PARE Peaks in the CDS.
Supplemental Figure 10. The Comparison of PARE and Ribo-Seq Read Distribution in CDS Regions with 30-Nucleotide Phased PARE Peaks.
Supplemental Figure 11. The Comparison of PARE and Ribo-Seq Read Distribution in Arabidopsis TAS3 Genes.
Supplemental Figure 12. Three Arabidopsis TAS3 Genes Accumulate AGO7-Dependent but RDR6-Independent Phased 5′-Truncated RNA Ends Upstream of Noncleavable miR390 Target Sites.
Supplemental Table 1. Sequences of Primers for Cloning, Modified RLM 5′ RACE, qRT-PCR, and Probes for RNA Gel Blot Analysis.
Supplemental Data Set 1. Sequences for the Analysis of Conserved uORFs in MYB34 and MYB51.
Supplemental Script 1. In-House Perl Script for PARE Peak Analysis in uORF.
Supplemental Script 2. In-House Perl Script for PARE Peak Analysis in CDSs.
Acknowledgments
We thank Shu-Hsing Wu and Tzyy-Jen Chiou of Academia Sinica for helpful discussions and Ming-Che Shih for supporting Hsiao-Chun Chou. We also thank the Academia Sinica Agricultural and Biotechnology Research Center core facilities for help with transgenic plants and protoplast assays and Miranda Loney for English editing of this article. This work was supported by Academia Sinica.
AUTHOR CONTRIBUTIONS
H.-M.C. designed the research. C.-Y.H., H.-C.C., and H.-M.C. performed the sequence data analyses. W.-C.L., H.-C.C., A.-P.C., and S.-J.C. carried out experiments. C.-Y.H., W.-C.L., and H.-M.C. wrote the article. All authors read and approved the final manuscript.
Footnotes
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantcell.org) is: Ho-Ming Chen (homing{at}gate.sinica.edu.tw).
[OPEN] Articles can be viewed without a subscription.
Glossary
- XRN
- exoribonuclease
- NMD
- nonsense-mediated mRNA decay
- PTC
- premature termination codon
- ORF
- open reading frame
- UTR
- untranslated region
- uORF
- upstream open reading frame
- miRNA
- microRNA
- PARE
- parallel analysis of RNA ends
- 5Pseq
- 5′ P sequencing
- CDS
- coding sequence
- CPuORF
- conserved peptide uORF
- AdoMet
- S-adenosyl-l-methionine
- tasiRNA
- trans-acting siRNA
- GEO
- Gene Expression Omnibus
- Received April 12, 2016.
- Revised September 14, 2016.
- Accepted October 13, 2016.
- Published October 14, 2016.