This protocol provides an easy-to-follow workflow to conduct poly(A) RNA purification, bisulfite conversion, and library preparation using standardized equipment for a biological sample of interest.
RNA post-transcriptional modifications in various types of RNA transcripts are associated with diverse RNA regulation in eukaryotic cells. Aberrant RNA 5-methylcytosine modifications and the dysregulated expression of RNA methyltransferases have been shown to be associated with various diseases, including cancers. Transcriptome-wide bisulfite-sequencing was developed to characterize the positions and the quantitative cytosine methylation levels in the bisulfite-converted RNA at single-base resolution. This protocol details two rounds of poly(A) RNA purification, three cycles of bisulfite reaction, and library preparation to allow the transcriptome-wide mapping of mRNA 5-methylcytosine modification sites. Assessing RNA quantity and quality after each main reaction is essential to monitor RNA integrity and is a critical step for ensuring high-quality sequencing libraries. Ideally, the procedures can be completed within three days. With high-quality total RNA as the input, this protocol can be used to build robust bisulfite-mRNA libraries for next-generation sequencing from the sample of interest.
Among over 150 types of post-transcriptional modifications1, 5-methylcytosine (m5C) modification has been identified in various types of RNAs, including ribosomal RNA, transfer RNA, messenger RNA, microRNA, long non-coding RNA, vault RNA, enhancer RNA, and small Cajal body-specific RNAs2. RNA m5C is associated with diverse biological and pathological mechanisms such as regulating plant root development3, viral gene expression4, and cancer progression5. The aim of this protocol is to provide a streamlined pipeline to characterize the transcriptome-wide mRNA m5C modification profile of biological samples at different developmental stages or in the disease setting. Transcriptome-wide bisulfite-sequencing was developed to characterize the positions and the quantitative cytosine methylation levels in the bisulfite-converted RNA at single-base resolution6,7,8,9. This is particularly useful when studying the association of m5C with gene expression and RNA fate in the regulatory mechanisms of cells. In mammalian cells, there are two known m5C readers: ALYREF recognizes m5C in the nucleus and serves as an mRNA nucleus-to-cytosol transporter10, while YBX1 recognizes m5C in the cytoplasm and increases mRNA stability11. Aberrant m5C mRNAs related to immune pathways were reported in systemic lupus erythematosus CD4+ T cells12. Studies have revealed an association between mRNA m5C modification and modulation of cancer immunity and cancer progression13,14. Hence, mapping the m5C modification profile on mRNA can provide crucial information to elucidate the potential regulatory machinery.
To investigate the functional roles of RNA m5C modification under certain biological conditions, the bisulfite conversion-based (bsRNA-seq) and antibody affinity enrichment-based approaches such as m5C-RIP-seq, miCLIP-seq, and 5-Aza-seq can be combined with a high-throughput sequencing platform to provide efficient detection of targeted regions and sequences with m5C modifications on a transcriptome-wide scale15,16. The advantage of this protocol is that it provides a comprehensive RNA m5C landscape at single-base resolution, whereas the antibody affinity enrichment-based approach relies on the availability of high-quality antibodies and achieves only single-fragment resolution of the m5C methylation landscape17.
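To make the single-base readout concrete, the following minimal Python sketch illustrates the principle behind bisulfite-based detection: unmethylated cytosines are deaminated and read as T in the sequencing reads, whereas m5C is protected and remains C, so the methylation level at a reference cytosine can be estimated as C / (C + T). The function names and the toy sequence and read counts are illustrative assumptions and are not part of this protocol or of any specific analysis tool.

```python
def bisulfite_convert(rna_seq, methylated_positions):
    """Simulate ideal bisulfite conversion as it would appear in cDNA reads.

    Unmethylated C is read as T; 0-based positions listed in
    methylated_positions are treated as m5C and remain C.
    """
    protected = set(methylated_positions)
    out = []
    # Reads are reported in DNA alphabet, so U is written as T.
    for i, base in enumerate(rna_seq.upper().replace("U", "T")):
        if base == "C" and i not in protected:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def methylation_level(c_reads, t_reads):
    """Fraction of reads supporting C at a reference cytosine: C / (C + T)."""
    covered = c_reads + t_reads
    if covered == 0:
        return None  # no coverage, level undefined
    return c_reads / covered


# Toy example (hypothetical sequence and counts).
converted = bisulfite_convert("ACGCUCA", methylated_positions=[3])
level = methylation_level(c_reads=12, t_reads=48)
print(converted, level)
```

In this idealized model every unmethylated cytosine converts; in practice, incomplete conversion produces residual C calls, which is why a conversion control is needed.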
All RNA samples will be processed with two rounds of mRNA enrichment using oligo(dT) beads, three cycles of bisulfite reaction, and sequencing library preparation. To monitor RNA quality, each RNA sample will be examined by capillary gel electrophoresis before and after the mRNA purification and bisulfite reaction procedures to assess fragment distribution. Before sequencing, the purified libraries will be examined for PCR amplicon quality and DNA fragment size distribution by capillary gel electrophoresis, and for overall quantity by fluorescent dye-based quantitative assays. The system can also be used to analyze a broad spectrum of biological samples such as agricultural produce, isolated virions, cell lines, model organisms, and pathological specimens.
1. Poly(A) RNA purification
NOTE: Use total RNA treated with DNase I, and examine the total RNA quality and integrity by capillary or conventional gel electrophoresis before proceeding to poly(A) RNA purification. Investigators should be able to identify the 28S and 18S rRNA bands in the high molecular weight field and the 5.8S rRNA band in the low molecular weight field without any significant smear in the electropherogram. The purification steps essentially follow the manufacturer's instructions with minor modifications indicated in the specific steps. See the Table of Materials for details related to all materials and instruments used in this protocol.
2. Bisulfite conversion
NOTE: Perform all centrifugation steps at room temperature. The procedures essentially follow the manufacturer's instructions, with an additional step 2.2 for adding the spike-in control mRNA sequence(s) before the bisulfite reaction in step 2.3.
3. Bisulfite-treated mRNA library preparation
NOTE: Follow section 4 of the library preparation instruction protocol for use with purified mRNA or rRNA-depleted RNA. The first priming step should follow the FFPE RNA protocol since the bisulfite treatment fragments the RNA. Perform every step in the laminar flow hood and add the reaction mixture on an ice-chilled cooling rack.
A series of bsRNA-seq libraries from cell lines19 were generated by following the procedures in this report (Figure 1). After total RNA was purified from cell line samples with DNase treatment and its quality was checked by gel electrophoresis and UV-Vis spectrophotometry (A260/A280), the RNA sample could proceed to poly(A) RNA enrichment. To determine whether the double purification could remove the majority of ribosomal RNA, the purification efficiency of poly(A) RNA was assessed by a capillary electrophoresis total RNA assay that automatically calculates the rRNA contamination percentage (Figure 2). Based on the RNA fragment peak pattern and the rRNA contamination percentage, RNA samples purified twice showed decreased ribosomal RNA contamination, from 6.5% to 2.2% in AsPC-1 and from 2.6% to 1.1% in BxPC-3. With human pancreatic cancer cell lines, two rounds of bead purification recovered poly(A)-enriched RNA amounting to an average of 2.8 ± 1% of the total RNA input. Therefore, poly(A) RNA enrichment by oligo(dT) beads with two purification rounds minimized ribosomal RNA and represents a feasible mRNA enrichment approach for the downstream experiments.
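The arithmetic behind these enrichment metrics can be sketched in a few lines of Python. The rRNA contamination percentages below are the values reported above for the first and second elutions; the function names and the input masses passed to `recovery_percent` are hypothetical and serve only to illustrate the calculation.

```python
def relative_reduction(before_pct, after_pct):
    """Fraction of the rRNA contamination removed by the second purification round."""
    return (before_pct - after_pct) / before_pct


def recovery_percent(poly_a_ng, total_rna_ng):
    """Poly(A) RNA recovered, expressed as a percentage of the total RNA input."""
    return 100.0 * poly_a_ng / total_rna_ng


# rRNA contamination (%) after one and two rounds, as reported in the text.
rrna = {"AsPC-1": (6.5, 2.2), "BxPC-3": (2.6, 1.1)}
reductions = {cell: relative_reduction(e1, e2) for cell, (e1, e2) in rrna.items()}

for cell, value in reductions.items():
    print(f"{cell}: {value:.1%} of residual rRNA removed by the second round")

# Hypothetical masses: 100 ng poly(A) RNA recovered from 4,000 ng total RNA.
example_recovery = recovery_percent(100, 4000)
print(f"recovery: {example_recovery:.1f}% of input")
```

With the reported values, the second round removes roughly two thirds of the residual rRNA signal in AsPC-1 and somewhat more than half in BxPC-3.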
Bisulfite conversion protocols have been reported in several studies on RNA 5-methylcytosine modification (Table 1). The bisulfite reaction can be performed with a user-reconstituted reaction mixture consisting of 40% sodium bisulfite and 600 mM hydroquinone in aqueous solution (pH 5.1) at 75 °C for 4 hours10,20,21. Alternatively, a more stringent bisulfite treatment of the RNA sample may be performed with a commercial kit, for which a number of studies from other groups and ours6,22,23,24 extended the bisulfite reaction time to three incubation cycles. Since the three-cycle incubation step efficiently converts unmethylated cytosine in the in vitro transcribed and structurally more-folded Escherichia coli 16S rRNA segment8, the three-cycle incubation step was also applied in this protocol.
With two rounds of poly(A) enrichment and the three-cycle bisulfite conversion, RNA quality before and after the reaction was assessed by capillary electrophoresis. The RNA size distribution after the bisulfite treatment showed a peak of ~200-500 nucleotides because of fragmentation caused by the bisulfite reaction (Figure 3). The fragmented bsRNA is therefore suitable for library preparation without another fragmentation step. A total of 8-10 ng of bsRNA was used as input to construct the library according to this protocol, with 9 to 11 cycles in the final PCR amplification of adapter-ligated DNA. Capillary electrophoresis of the amplicon indicated successful library preparation, with only a small portion of primer remaining and no over-amplification peak (Figure 4). To check the conversion efficiency in each sample library, unmethylated firefly luciferase RNA was added to the sample as a spike-in control sequence before the bisulfite treatment. After the sequencing reads were aligned to the reference sequence using the meRanTK toolkit25, an average of 2,440 reads (0.006% of total reads) mapped to the spike-in sequence, and the overall C-to-T conversion rate reached an average of 99.81%; the conversion status can be viewed in the Integrative Genomics Viewer (Figure 5).
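As an illustration of how a spike-in conversion rate of this kind can be computed, the Python sketch below counts converted (T) and unconverted (C) calls at reference cytosines of an unmethylated control. It is a simplified model that assumes ungapped, forward-strand alignments given as (start, sequence) pairs; it is not the meRanTK implementation, and the function names, reference, and reads are hypothetical.

```python
def count_spike_in_calls(reference, reads):
    """Count T (converted) and C (unconverted) calls at reference cytosines.

    reads: iterable of (start, sequence) pairs, 0-based start, ungapped,
    forward strand (simplifying assumptions for this sketch).
    """
    converted = unconverted = 0
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C":
                continue  # only reference cytosines are informative
            if base == "T":
                converted += 1
            elif base == "C":
                unconverted += 1
    return converted, unconverted


def conversion_rate(converted, unconverted):
    """C-to-T conversion rate on an unmethylated control: T / (C + T)."""
    total = converted + unconverted
    if total == 0:
        raise ValueError("no cytosine coverage on the spike-in control")
    return converted / total


# Toy spike-in reference and reads (hypothetical).
reference = "ACCGTCAC"
reads = [(0, "ATTGTTAT"), (1, "TCGTTA"), (4, "TTAT")]

calls = count_spike_in_calls(reference, reads)
rate = conversion_rate(*calls)
print(calls, f"{rate:.2%}")
```

Because the spike-in carries no m5C, every residual C call represents a conversion failure (or a sequencing error), so the rate provides an upper bound on the false-positive rate expected at endogenous sites.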
Figure 1: Workflow diagram of bisulfite mRNA sequencing. The pipeline consists of three main protocols: mRNA enrichment, bisulfite conversion, and library preparation. Abbreviations: bsRNA = bisulfite mRNA; QC = quality control.
Figure 2: Quality control results of poly(A)-enriched RNA samples with capillary electrophoresis. (A) The representative capillary electrophoresis profile of the total RNA from BxPC-3 cells. (B) The representative capillary electrophoresis profiles of samples processed with oligo(dT) purification from AsPC-1 and BxPC-3 cells. mRNA E1, mRNA elution from one-time purification; mRNA E2, mRNA elution from two-times purification. The rRNA contamination estimates were calculated by the electrophoresis operation system that controls the total RNA analysis assay. (C) The capillary electrophoresis profile of the Ladder markers was obtained in the same run with samples presented in panel B.
Figure 3: Quality assessment of RNA samples by capillary electrophoresis. The representative capillary electrophoresis profiles of (A) the total RNA, (B) the poly(A)-enriched mRNA from two times purification, and (C) the bisulfite-treated mRNA from the BxPC-3 cells. Abbreviations: B_empty = BxPC-3 cells transduced with empty vector; B_shScr = BxPC-3 cells transduced with the scramble sequences.
Figure 4: Quality assessment of bsRNA-seq libraries. (A-C) Three sets of representative capillary electrophoresis profiles of constructed bsRNA-seq libraries amplified by PCR for 9 or 11 cycles using bsRNA from a set of BxPC-3 cells. The estimated input quantity of the bisulfite-treated RNA was determined by a high-sensitivity RNA quantification assay. A total of 8, 8.4, and 9.6 ng of bsRNA were used in the (A) BxPC-3 mock, (B) BxPC-3 empty, and (C) BxPC-3 shNSUN2 samples, respectively. The marker peaks at 35 bp and 10,380 bp represent the lower and upper internal standards in the electrophoresis profile of each sample. A prominent peak at around 100 bp and a faint peak at around 127 bp indicate the PCR primers (<100 bp) and the adaptor dimer (~127 bp), respectively, which remained in the eluate after PCR cleanup. Abbreviations: BxPC-3 mock = untransduced BxPC-3 cells; BxPC-3 empty = BxPC-3 cells transduced with the empty vector control; BxPC-3 shNSUN2 = BxPC-3 cells transduced with the shNSUN2 constructs.
Figure 5: The sequence alignment profiles of each bsRNA-seq library mapped to the spike-in-control reference sequence. The representative alignment profiles of the BxPC-3 mock, empty, scramble, and shNSUN2 bsRNA-seq libraries to the firefly luciferase spike-in control reference sequence (indicated as the "Z" gene); 2 representative regions were shown. Grey bars indicate all reads that present matched consensus nucleotides at the indicated base position to the reference sequence. Red bars highlight the thymidine (T) identity at the base position of the reads; blue bars indicate those reads having the cytosine (C) identity at the position.
Table 1: Comparison of bisulfite reaction protocols and the conversion rate.
In this protocol, a detailed pipeline of poly(A) enrichment, bisulfite conversion, and library preparation was achieved by utilizing standardized components. Further sequencing analysis provided the identification of mRNA 5-methylcytosine in samples of interest.
The most critical factor is the quality of the starting material, total RNA, since RNA degradation reduces the recovery rate of poly(A) RNA purification. The sample should be handled carefully and RNase contamination avoided before the poly(A) RNA purification step. Another crucial part of the procedure is the number of PCR cycles in library preparation. The choice of cycle number depends on the quantity of bisulfite-treated mRNA used for library preparation and on whether one or two size selections are performed before PCR. Determining the optimal cycle number for a given cell line or tissue therefore requires a trial run in which the PCR product size distribution is assessed by capillary electrophoresis.
In this protocol, two rounds of oligo(dT) bead purification were used to enrich and purify poly(A) RNA, which eliminated the majority of ribosomal RNAs and other RNAs. Other methods, such as oligo-based depletion of rRNA26, remove ribosomal RNA while keeping other RNA types in the sample. With such methods, the 5-methylcytosine modifications present in other RNA types can also be analyzed, extending the understanding of 5-methylcytosine features in mRNA and other RNAs.
The bisulfite reaction is known to be unable to distinguish m5C from other modifications, including 5-hydroxymethylcytosine (hm5C), 3-methylcytidine (m3C), and N4-methylcytidine (m4C)27. However, an experimental design that includes knockdown or knockout of an m5C RNA methyltransferase should provide informative evidence for high-confidence regulated m5C sites. Additionally, validation using different methods, such as antibody-based enrichment followed by sequencing or PCR, or LC-MS/MS-based methods, can help authenticate the candidate m5C sites16. The emerging technique of direct Nanopore RNA sequencing also offers potential identification of RNA modifications via computational analysis of the current signal or base-called 'error' features28,29,30.
A recent study by Johnson et al. showed that adding a fragmentation step to mitochondrial RNA samples after three rounds of bisulfite conversion gave a higher conversion rate and increased the yield of cDNA library product31. That study specifically analyzed the conversion rate and the quality of reads mapped to the mitochondrial genome, not the mRNA transcriptome. Hence, Table 1 has been updated to indicate whether each published report included an RNA fragmentation step so that readers can easily replicate the protocol. Another study by Zhang et al. demonstrated that using an in vitro transcribed modification-free RNA library as a negative control efficiently eliminates false-positive signals from bisulfite treatment32. Recent studies have compared different protocols used for library construction33. Users can choose to run the bsRNA-seq pipeline with or without an additional fragmentation step and an in vitro transcribed modification-free RNA library to customize a suitable workflow.
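The idea of using a modification-free library as a negative control can be sketched as a simple filter: a candidate site is kept only if it passes a minimum methylation level in the sample and shows negligible signal in the control. This minimal Python sketch is not the method of the cited study; the function name, both thresholds, the choice to keep sites that lack control coverage, and the toy site table are all assumptions made for illustration.

```python
def filter_candidate_sites(sample_levels, control_levels,
                           min_level=0.1, max_control_level=0.01):
    """Keep candidate m5C sites that are not explained by conversion artifacts.

    sample_levels / control_levels: dicts mapping (transcript, position)
    to methylation level (0-1). Thresholds are hypothetical defaults.
    Sites with no coverage in the control are kept (an assumption).
    """
    kept = {}
    for site, level in sample_levels.items():
        if level < min_level:
            continue  # too weak to call in the sample
        control = control_levels.get(site)
        if control is not None and control > max_control_level:
            continue  # also "methylated" in the modification-free control
        kept[site] = level
    return kept


# Toy data (hypothetical sites and levels).
sample = {
    ("geneA", 101): 0.35,
    ("geneA", 250): 0.40,
    ("geneB", 17): 0.05,
    ("geneC", 9): 0.22,
}
control = {("geneA", 101): 0.00, ("geneA", 250): 0.30}

kept_sites = filter_candidate_sites(sample, control)
print(kept_sites)
```

A stricter variant could discard sites without control coverage instead of keeping them; which behavior is appropriate depends on the sequencing depth of the control library.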
The transcriptome-wide identification of m5C by different principles could strengthen the findings of the modification landscapes and provide more understanding of the regulatory mechanisms of RNA modifications in healthy or disease models.
The authors have nothing to disclose.
This work was supported by the National Science and Technology Council of Taiwan (NSTC 111-2314-B-006-003).
Name of Material/Equipment | Company | Catalog Number | Comments/Description |
Agilent 2100 Electrophoresis Bioanalyzer System | Agilent, Santa Clara, CA | | RNA quality detection |
AMPure XP beads | Beckman Coulter | A63881 | purify DNA |
Bioanalyzer DNA high sensitivity kit | Agilent, Santa Clara, CA | 5067-4626 | DNA quality detection |
Bioanalyzer RNA 6000 Pico kit | Agilent, Santa Clara, CA | 5067-1513 | RNA quality detection |
DiaMag02 – magnetic rack | Diagenode, Denville, NJ | B04000001 | assist library preparation |
DiaMag1.5 – magnetic rack | Diagenode, Denville, NJ | B04000003 | assist poly(A) RNA purification |
Dynabeads mRNA DIRECT purification kit | Thermo Fisher Scientific, Waltham, MA | 61011 | poly(A) RNA purification; Wash Buffer 1 and Wash Buffer 2 |
Ethanol | J.T.Baker | 64-17-5 | |
EZ RNA methylation kit | Zymo, Irvine, CA | R5002 | bisulfite treatment |
Firefly luciferase mRNA | Promega, Madison, WI, USA | L4561 | spike-in control sequence |
KAPA Library Quantification Kits | Roche, Switzerland | KK4824 | library quantification |
NanoDrop spectrophotometer | Thermo Fisher Scientific, Waltham, MA | | Total RNA quantity detection |
NEBNext Multiplex Oligos for Illumina (Index Primers Set 1) | New England Biolabs, Ipswich, MA | E7335S | library preparation |
NEBNext Ultra Directional RNA Library Prep Kit for Illumina | New England Biolabs, Ipswich, MA | E7760S | library preparation |
Nuclease-free Water | Thermo Fisher Scientific | AM9932 | |
P2 pipetman | Thermo Fisher Scientific, Waltham, MA | 4641010 | |
Qubit 2.0 fluorometer | Thermo Fisher Scientific, Waltham, MA | | RNA quantity detection |
Qubit dsDNA HS Assay Kit | Thermo Fisher Scientific, Waltham, MA | Q32854 | DNA quantity detection |
Qubit RNA HS Assay Kit | Thermo Fisher Scientific, Waltham, MA | Q32852 | RNA quantity detection |