MicroRNAs (miRNAs) are a widely conserved class of regulatory molecules. Here we describe a miRNA cloning method that relies upon two potent ligation steps followed by high-throughput sequencing. Our method permits accurate genome-wide quantitation of miRNAs.
MiRNA cloning and high-throughput sequencing, termed miR-Seq, stands alone as a transcriptome-wide approach to quantify miRNAs with single-nucleotide resolution. This technique captures miRNAs by attaching 3’ and 5’ oligonucleotide adapters to miRNA molecules and allows de novo miRNA discovery. Coupled with powerful next-generation sequencing platforms, miR-Seq has been instrumental in the study of miRNA biology. However, significant biases introduced by the oligonucleotide ligation steps have prevented miR-Seq from being employed as an accurate quantitation tool. Previous studies demonstrate that biases in current miR-Seq methods often lead to inaccurate miRNA quantification, with errors of up to 1,000-fold for some miRNAs1,2. To resolve the biases imparted by RNA ligation, we have developed a small RNA ligation method that achieves ligation efficiencies of over 95% for both the 3’ and 5’ ligation steps. Benchmarking this improved library construction method with equimolar or differentially mixed synthetic miRNAs consistently yields read numbers with less than two-fold deviation from the expected value. Furthermore, this high-efficiency miR-Seq method permits accurate genome-wide miRNA profiling from in vivo total RNA samples2.
High-throughput sequencing-based methodologies have been widely applied to many biological samples in recent years, greatly expanding our understanding of the molecular complexity of biological systems3,4. However, preparation of RNA samples for high-throughput sequencing often imparts biases specific to the employed methodology, limiting the potential utility of these powerful techniques. These method-specific biases have been well documented for ligation-based small-RNA library preparations1,2,5,6. They can result in up to 1,000-fold variation in read numbers for equimolar synthetic miRNAs, making inference of miRNA abundance from sequencing data highly variable and error-prone.
Studies focusing on the properties of phage-derived T4 RNA ligases have documented that the enzymes exhibit nucleotide-based preferences7, which manifest as biased libraries in high-throughput sequencing experiments1,2,8. To minimize the biases imparted by RNA ligases, multiple strategies have been employed: macromolecular crowding9, randomizing the adapter nucleotide sequence proximal to the ligation site6, and employing high concentrations of ligation adapter2. By combining these three approaches, we have developed a workflow for unbiased preparation of small RNA libraries compatible with high-throughput sequencing (Figure 1). For direct comparisons between current protocols and our optimized method, please refer to our recent report2. This optimized method yields ligation efficiencies of greater than 95% at both the 3’ and 5’ steps and permits the unbiased ligation of small RNA molecules from synthetic and biological samples2.
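The benchmarking idea referred to above, comparing read numbers from a synthetic miRNA pool against the expected value, can be illustrated with a short calculation. The following is a minimal sketch and not the analysis pipeline used in our report; the function names, the miRNA labels, and the read counts are hypothetical and chosen only for illustration. It assumes an equimolar pool unless expected fractions are supplied.

```python
import math


def fold_deviation(read_counts, expected_fractions=None):
    """Return {name: observed_fraction / expected_fraction} for a synthetic pool.

    read_counts: dict mapping miRNA name -> read count.
    expected_fractions: dict mapping miRNA name -> expected share of reads.
        If omitted, an equimolar pool is assumed (equal share for each miRNA).
    """
    total = sum(read_counts.values())
    if total <= 0:
        raise ValueError("read_counts must contain at least one read")
    if expected_fractions is None:
        share = 1.0 / len(read_counts)
        expected_fractions = {name: share for name in read_counts}
    return {
        name: (count / total) / expected_fractions[name]
        for name, count in read_counts.items()
    }


def max_abs_log2_deviation(ratios):
    """Largest |log2(observed/expected)|; a value of 1.0 equals two-fold."""
    if any(r <= 0 for r in ratios.values()):
        return math.inf  # a miRNA with zero reads deviates without bound
    return max(abs(math.log2(r)) for r in ratios.values())


# Hypothetical read counts for three equimolar synthetic miRNAs.
counts = {"synth_miR_A": 1200, "synth_miR_B": 800, "synth_miR_C": 1000}
ratios = fold_deviation(counts)
within_two_fold = max_abs_log2_deviation(ratios) < 1.0
print(ratios, within_two_fold)
```

Working on a log2 scale treats over- and under-representation symmetrically, so a single threshold of 1.0 corresponds to the two-fold criterion in either direction.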
NOTE: It is critical to maintain RNase-free conditions during the entire procedure.
1. Adenylation of 3’ Linker
2. Linker Ligation to 3’ End of miRNA
3. Linker Ligation to 5’ End of miRNA-3’ Linker Hybrid
4. Reverse Transcription/cDNA Synthesis
5. Library Preparation
The first anticipated result is a single-nucleotide increase in the size of the DNA oligonucleotide that was adenylated by Mth RNA ligase (Figure 2). Following 3’ ligation, the acrylamide gel (Figure 3) should show sharp, high-molecular-weight bands in the 100-300 nucleotide region, indicating that the total RNA sample is of high quality (not degraded). One should also observe a very bright signal at the 25 nucleotide region of the gel, which is excess, unligated 3’ linker. It may also be helpful to include a no-input-RNA control in the 3’ ligation. This control indicates the purity of the 3’ linker and can be carried through to the PCR step to indicate the cycle number at which non-specific signal becomes problematic. There is no diagnostic for the 5’ ligation; however, Figure 4 shows a reaction identical to the one described, carried out with radio-labeled RNA, which indicates the importance of the high concentration of PEG8000 employed. Finally, after PCR and native PAGE, a DNA product of 146 base pairs can be observed, consistent with the combined size of the adapter sequences, a miRNA insert, and additional sequence from the extended PCR primers. It is important to note that if the proper number of PCR cycles (which is determined empirically) is exceeded, the product lacking a miRNA insert may obscure the desired amplicon. Figure 5 shows a result from 5 pmol of synthetic miRNA; for 2 μg of total RNA, 13-18 cycles of PCR are typically necessary.
Figure 1. Schematic of the workflow for 2-step ligation small-RNA library preparation. Shown are general steps for preparation of an unbiased miR-Seq library. Shown in black is 3’ DNA ligation adapter, which is activated via adenylation in step 1, miRNA is shown in green and is ligated to the 3’ adapter in step 2, and 5’ RNA ligation adapter is in blue which is ligated to the chimeric molecule in step 3. Numbered steps are described in detail in the Protocol section, italics indicate enzymatic steps.
Figure 2. Adenylation of 3’ Linker with Mth RNA-Ligase. 18% PAGE-Urea gel from adenylation of DNA oligonucleotide to be used for subsequent 3’ ligation reaction, (+) indicates sample that was incubated with Mth RNA ligase, (-) indicates sample incubated without enzyme, and (m) indicates the size standards (numbers shown at right in base pairs). At left is a schematic of the oligonucleotide species present. Asterisk indicates larger DNA products generated by aberrant oligonucleotide synthesis.
Figure 3. 3’ Ligation of Synthetic and Total RNA Samples. A) 15% PAGE-Urea gel of 3’ ligation reactions, (m) indicates size standards (numbers at left indicate base pairs), (No RNA) indicates a reaction with no input RNA and 3’ linker that was not gel-purified, (+) indicates ligation performed with 5 pmol of synthetic miRNA and a 3’ linker which was gel-purified, (2 and 1 μg) indicate the amount of mouse total RNA subjected to 3’ ligation with the same, gel-purified linker, asterisk indicates excess 3’ linker. B) Autoradiogram of similar 3’ ligation reaction performed with P32 5’ end labeled synthetic miRNA. Number at top indicates time (hr) the reaction was allowed to proceed, and at bottom indicates the amount ligated as a percentage of the sum of unligated and ligated.
Figure 4. 5’ Ligation of Radio-labeled miRNA-3’Linker Hybrid. Autoradiogram of 5’ ligation reaction performed with P32 5’ end labeled synthetic miRNA-3’ linker hybrid. Number at top indicates amount of PEG used in the reaction, and number at bottom indicates the amount ligated as a percentage of the sum of unligated and ligated. Lines at left are a schematic of the ligated molecules where miRNA is in green, 5’ and 3’ adapters are blue and black, respectively.
Figure 5. PCR-derived small-RNA DNA Library. 8% Native PAGE of PCR products generated from miR-Seq ligation procedure. Numbers at top indicate cycles of PCR, numbers at side indicate molecular size standards. Each DNA species from the PCR is identified at right. Grid pattern is due to auto-fluorescence of specimen dish.
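The ligation percentages reported beneath the autoradiograms in Figures 3B and 4 are defined as the ligated signal divided by the sum of the unligated and ligated signals. A minimal sketch of that arithmetic follows; the function name and the band intensities are hypothetical, and quantification of band signal (e.g., by phosphorimaging) is assumed to have been done beforehand.

```python
def ligation_efficiency(ligated_signal, unligated_signal):
    """Percent ligated = ligated / (ligated + unligated) * 100."""
    if ligated_signal < 0 or unligated_signal < 0:
        raise ValueError("band signals must be non-negative")
    total = ligated_signal + unligated_signal
    if total == 0:
        raise ValueError("no signal detected in either band")
    return 100.0 * ligated_signal / total


# Hypothetical band intensities in arbitrary units, not measured data.
example_pct = ligation_efficiency(ligated_signal=960.0, unligated_signal=40.0)
print(f"{example_pct:.1f}% ligated")
```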
The methodology described herein makes use of several key variables to maximize ligation efficiencies, namely high concentrations of PEG, randomized linkers, and high concentrations of linkers2,6,9. This approach permits reliably quantitative sequencing libraries from total RNA samples2. We have conducted multiple titrations of input RNA and have concluded that the preceding methodology is best suited for total RNA amounts in the 1-8 μg range (data not shown). When amounts in the 10-500 ng range are used, a majority of the read space is consumed by adapter concatemers and by sequences from the bacteria from which the RNA ligases are purified. However, it is worth noting that in these low-input experiments the miRNA species present, though low in read number, are still reflective of actual amounts when compared to identical higher-input samples. To ensure that the miRNA profiles observed from total RNA samples are reflective of actual amounts, and to allow comparison across different samples, we routinely include 10-fold dilutions of three synthetic calibrator RNAs12. We have found that the inclusion of 0.1, 0.01, and 0.001 pmol of distinct calibrator RNAs yields useful read numbers for confirming the quantitative nature of the assay.
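One way to use the calibrator read numbers is to check that they scale in proportion to the amount added. The sketch below fits a least-squares line to log10(reads) versus log10(pmol); a slope near 1 is consistent with proportional recovery across the 10-fold dilution series. This is an illustrative assumption about how such a check could be done, not the procedure from our report; the function name and the read counts are hypothetical, while the calibrator amounts are those given above.

```python
import math


def loglog_slope(amounts_pmol, read_counts):
    """Least-squares slope of log10(reads) against log10(input amount)."""
    if len(amounts_pmol) != len(read_counts) or len(amounts_pmol) < 2:
        raise ValueError("need at least two paired amounts and read counts")
    if any(a <= 0 for a in amounts_pmol) or any(r <= 0 for r in read_counts):
        raise ValueError("amounts and read counts must be positive")
    xs = [math.log10(a) for a in amounts_pmol]
    ys = [math.log10(r) for r in read_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("input amounts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


calibrator_pmol = [0.1, 0.01, 0.001]   # amounts stated in the text
calibrator_reads = [50000, 5200, 480]  # hypothetical read counts
slope = loglog_slope(calibrator_pmol, calibrator_reads)
print(f"log-log slope: {slope:.3f}")
```

Fitting on a log scale gives each dilution equal weight; on a linear scale the 0.1 pmol calibrator would dominate the fit.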
As noted in Figure 2 (see asterisk), synthetic oligonucleotides frequently contain aberrant molecules that are distinct in size from the sequence ordered. If these products co-purify with the miRNA-3’ linker hybrid, it may be necessary to PAGE-purify the 3’ linker following Mth adenylation. We have found this to be helpful for reducing the amount of non-specific PCR product observed in negative control samples.
Although the methodology described above was benchmarked for quantitative preparation of miRNA libraries2, we fully anticipate that it is readily applicable to more general procedures as well, namely RNA-seq and CLIP-Seq (Cross-Linking Immunoprecipitation) library preparation. In fact, we have applied some of the same principles to CLIP-Seq experiments, resulting in robust library PCR relative to previously published reports13 (data not shown). Finally, as serum miRNAs gain traction as a medical diagnostic tool, we anticipate that the preceding methodology will aid greatly in the identification of the most robust and reliable miRNA biomarkers.
The authors have nothing to disclose.
The authors would like to thank members of the Yi laboratory especially Zhaojie Zhang for fruitful discussions regarding linker design and ligation efficiencies, as well as the American Cancer Society for supporting this work through a postdoctoral fellowship (#125209) to J.E.L. Research reported in this publication was also supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health under Award Number R01AR059697 (to R.Y.) and a research grant from the Linda Crnic Institute for Down Syndrome. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Name of Reagent/ Equipment | Company | Catalog Number | Comments/Description |
3' Linker (5' phosphorylated, 3' blocked) | Integrated DNA Technologies | custom | |
5' Linker | Integrated DNA Technologies | custom | 5' blocked, HPLC Purified |
T4RNL2 (1-249 K227Q) | New England Biolabs | M0351S | Specialized for ligation of pre-adenylated DNA adapters |
10X Ligation Buffer (without ATP) | New England Biolabs | Included with M0351S | |
10X Ligation Buffer (with ATP) | New England Biolabs | Included with M0204L | |
RNaseOUT | Invitrogen | 10777-019 | |
Polyethylene Glycol (mol. wt. 8000) | New England Biolabs | Included with M0204L | |
Nuclease-free water | Ambion | AM9937 | We have found water collected from a distillation apparatus to be of equivalent quality. |
T4RNL1 | New England Biolabs | M0204L | |
Superscript III RT kit | Invitrogen | 18080-051 | |
Phusion PCR kit | New England Biolabs | M0530S | |
Illumina RP1 Primer | Integrated DNA Technologies | custom | Sequence information available from Illumina |
Illumina RT Primer | Integrated DNA Technologies | custom | Sequence information available from Illumina |
Illumina Index Primer(s) | Integrated DNA Technologies | custom | Sequence information available from Illumina |
40% Acrylamide | Fisher Scientific | BP14081 | |
Urea | Sigma Aldrich | U6504 | |
Ammonium persulfate | Sigma Aldrich | A3678 | |
Tetramethylethylenediamine (TEMED) | Sigma Aldrich | T9281 | |
2X Denaturing RNA loading buffer | New England Biolabs | Included with M0351S | |
Razor blades | VWR | 55411-050 | |
SpinX Centricon Tubes | Costar | CLS8161 | |
Low Retention Microfuge tubes | Fisher Scientific | 02-681-320 | |
SYBR Gold | Invitrogen | S-11494 | |
Adenylation Kit | New England Biolabs | E2610L | |