Absolute quantification RNA sequencing (AQRNA-seq) is a technology developed to quantify the landscape of all small RNAs in biological mixtures. Here, both the library preparation and data processing steps of AQRNA-seq are demonstrated, quantifying changes in the transfer RNA (tRNA) pool in Mycobacterium bovis BCG during starvation-induced dormancy.
AQRNA-seq provides a direct linear relationship between sequencing read counts and small RNA copy numbers in a biological sample, thus enabling accurate quantification of the pool of small RNAs. The AQRNA-seq library preparation procedure described here uses custom-designed sequencing linkers and a step that reduces the methyl modifications of RNA that block reverse transcription processivity, which increases the yield of full-length cDNAs. In addition, a detailed implementation of the accompanying bioinformatics pipeline is presented. AQRNA-seq is demonstrated here through a quantitative analysis of the 45 tRNAs in Mycobacterium bovis BCG harvested on 5 selected days across a 20-day time course of nutrient deprivation and 6 days of resuscitation. Ongoing efforts to improve the efficiency and rigor of AQRNA-seq are also discussed. These include exploring methods to obviate gel purification for mitigating primer dimer issues after PCR amplification and to increase the proportion of full-length reads to enable more accurate read mapping. Future enhancements to AQRNA-seq will focus on facilitating automation and high-throughput implementation of this technology for quantifying all small RNA species in cell and tissue samples from diverse organisms.
Next-generation sequencing (NGS), also known as massively parallel sequencing, is a DNA sequencing technology that involves DNA fragmentation, ligation of adaptor oligonucleotides, polymerase chain reaction (PCR)-based amplification, sequencing of the DNA, and reassembly of the fragment sequences into a genome. The adaptation of NGS to sequence RNA (RNA-seq) is a powerful approach to identify and quantify RNA transcripts and their variants1. Innovative developments in RNA library preparation workflows and bioinformatic analysis pipelines, coupled with advancements in laboratory instrumentation, have expanded the repertoire of RNA-seq applications, progressing beyond exome sequencing into advanced functional omics such as non-coding RNA profiling2, single cell analysis3, spatial transcriptomics4,5, and alternative splicing analysis6, among others. These advanced RNA-seq methods reveal complex RNA functions through quantitative analysis of the transcriptome in normal and diseased cells and tissues.
Despite these advances in RNA-seq, several key technical features limit the quantitative power of the method. While most RNA-seq methods allow precise and accurate quantification of changes in the levels of RNAs between experimental variables (i.e., biological samples and/or physiological states), they cannot provide quantitative comparisons of the levels of RNA molecules within a sample. For example, most RNA-seq methods cannot accurately quantify the relative number of copies of individual tRNA isoacceptor molecules in a cellular pool of expressed tRNAs. As highlighted in the companion publication7, this limitation to RNA-seq arises from several features of RNA structure and the biochemistry of library preparation. For example, the activity of the ligation enzymes used to attach the 3'- and 5'-end sequencing linkers to RNA molecules is strongly influenced by the identity of the terminal nucleotides of the RNA and the sequencing linkers. This leads to large variations in efficiencies of linker ligations and profound artifactual increases in sequencing reads8,9,10.
A second set of limitations arises from the inherent structural properties of RNA molecules. Specifically, RNA secondary structure formation and dynamic changes in the dozens of post-transcriptional RNA modifications of the epitranscriptome can cause polymerase fall-off or mutation during reverse transcription. These errors result in incomplete or truncated cDNA synthesis or altered RNA sequence. While both of these phenomena can be exploited to map secondary structures or some modifications, they degrade the quantitative accuracy of RNA-seq if subsequent library preparation steps fail to capture truncated cDNAs or if data processing discards mutated sequences that do not match a reference dataset11,12. Furthermore, the immense chemical, length, and structural diversity of RNA transcripts, as well as the lack of tools to uniformly fragment long RNAs, diminishes the applicability of most RNA-seq methods to all RNA species13.
The AQRNA-seq (absolute quantification RNA sequencing) method has been developed to remove several of these technical and biological constraints that limit quantitative accuracy7. By minimizing sequence-dependent biases in capture, ligation, and amplification during RNA sequencing library preparation, AQRNA-seq achieves superior linearity compared to other methods, accurately quantifying 75% of a reference library of 963 miRNAs within 2-fold accuracy. This linear correlation between sequencing read count and RNA abundance is also observed in an analysis of a variable-length pool of RNA oligonucleotide standards and in reference to orthogonal methods like northern blotting. Establishing linearity between sequencing read count and RNA abundance enables AQRNA-seq to achieve accurate, absolute quantification of all RNA species within a sample.
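As a rough illustration of what "within 2-fold accuracy" can mean for a reference library, the sketch below normalizes read counts to fractions and counts how many species fall within a chosen fold of their expected fraction. This is an assumption about how such a metric could be computed, not the scoring procedure of the companion publication; the function names, the equimolar expectation, and the read counts are all hypothetical.

```python
import math


def normalize(counts):
    """Convert raw read counts to fractions of the total."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("total read count must be positive")
    return [c / total for c in counts]


def fraction_within_fold(observed, expected, fold=2.0):
    """Fraction of species whose observed/expected ratio lies in [1/fold, fold].

    Species with a non-positive observed or expected value count as misses.
    """
    if len(observed) != len(expected) or not observed:
        raise ValueError("observed and expected must be non-empty and equal in length")
    log_limit = math.log2(fold)
    hits = 0
    for obs, exp in zip(observed, expected):
        if obs > 0 and exp > 0 and abs(math.log2(obs / exp)) <= log_limit:
            hits += 1
    return hits / len(observed)


# Hypothetical example: four species pooled at equimolar amounts,
# so each is expected to contribute 25% of the reads.
reads = [100, 220, 45, 90]      # made-up read counts
expected = [0.25] * len(reads)  # assumed equimolar input
score = fraction_within_fold(normalize(reads), expected)
print(f"{score:.0%} of species within 2-fold of expectation")
```

Working on the log2 scale treats over- and under-representation symmetrically, so a species at half its expected fraction is penalized the same as one at twice its expected fraction.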
This article describes the protocol for the AQRNA-seq library preparation workflow and the accompanying downstream data analytics pipeline. The method was applied to elucidate the dynamics of tRNA abundance during starvation-induced dormancy and subsequent resuscitation in the Mycobacterium bovis bacille de Calmette et Guérin (BCG) model of tuberculosis. Results are presented for the exploratory visualization of the sequencing data, along with subsequent clustering and differential expression analyses that revealed discernible patterns in tRNA abundance associated with various phenotypes.
The AQRNA-seq library preparation workflow is designed to maximize the capture of RNAs within a sample and minimize polymerase fall-off during reverse transcription7. Through a two-step linker ligation, novel DNA oligos (Linker 1 and Linker 2) are ligated in excess to fully complement the RNA within the sample. Excess linkers can be efficiently removed with RecJf, a 5' to 3' exonuclease specific to single-stranded DNAs, leaving the ligated products intact. In addition, AlkB treatment reduc…
The authors have nothing to disclose.
The authors of the present work are grateful to the authors of the original paper describing the AQRNA-seq technology7. This work was supported by grants from the National Institutes of Health (ES002109, AG063341, ES031576, ES031529, ES026856) and the National Research Foundation of Singapore through the Singapore-MIT Alliance for Research and Technology Antimicrobial Resistance IRG.
Name | Company | Catalog Number | Comments |
2-ketoglutarate | Sigma-Aldrich | 75890 | Prepare a working solution (1 M) and store it at -20 °C |
2100 Bioanalyzer Instrument | Agilent | G2938C | |
5'-deadenylase (50 U/μL) | New England Biolabs | M0331S (component #: M0331SVIAL) | Store at -20 °C |
Adenosine 5'-Triphosphate (ATP) | New England Biolabs | M0437M (component #: N0437AVIAL) | NEB M0437M contains T4 RNA Ligase 1 (30 U/μL), T4 RNA Ligase Reaction Buffer (10X), PEG 8000 (1X), and ATP (100 mM); prepare a working solution (10 mM) and store it at -20 °C |
AGAROSE GPG/LE | AmericanBio | AB00972-00500 | Store at ambient temperature |
Ammonium iron(II) sulfate hexahydrate | Sigma-Aldrich | F2262 | Prepare a working solution (0.25 M) and store it at -20 °C |
Bioanalyzer Small RNA Analysis | Agilent | 5067-1548 | The Small RNA Analysis is used for checking the quality of input RNAs and the efficiency of enzymatic reactions (e.g., Linker 1 ligation) |
Bovine Serum Albumin (BSA; 10 mg/mL) | New England Biolabs | B9000 | This product was discontinued on 12/15/2022 and is replaced with Recombinant Albumin, Molecular Biology Grade (NEB B9200). |
Chloroform | Macron Fine Chemicals | 4441-10 | |
Demethylase | ArrayStar | AS-FS-004 | Demethylase comes with the rtStar tRNA Pretreatment & First-Strand cDNA Synthesis Kit (AS-FS-004) |
Deoxynucleotide (dNTP) Solution Mix | New England Biolabs | N0447L (component #: N0447LVIAL) | This dNTP Solution Mix contains equimolar concentrations of dATP, dCTP, dGTP and dTTP (10 mM each) |
Digital Dual Heat Block | VWR Scientific Products | 13259-052 | Heating block is used with the QIAquick Gel Extraction Kit |
DyeEx 2.0 Spin Kit | Qiagen | 63204 | Effective at removing short remnants (e.g., oligos less than 10 bp in length) |
Electrophoresis Power Supply | Bio-Rad Laboratories | PowerPac 300 | |
Eppendorf PCR Tubes (0.5 mL) | Eppendorf | 0030124537 | |
Eppendorf Safe-Lock Tubes (0.5 mL) | Eppendorf | 022363611 | |
Eppendorf Safe-Lock Tubes (1.5 mL) | Eppendorf | 022363204 | |
Eppendorf Safe-Lock Tubes (2 mL) | Eppendorf | 022363352 | |
Ethyl alcohol (Ethanol), Pure | Sigma-Aldrich | E7023 | The pure ethanol is used with the Oligo Clean and Concentrator Kit from Zymo Research |
Gel Imaging System | Alpha Innotech | FluorChem 8900 | |
Gel Loading Dye, Purple (6X), no SDS | New England Biolabs | N0556S (component #: B7025SVIAL) | NEB N0556S contains Quick-Load Purple 50 bp DNA Ladder and Gel Loading Dye, Purple (6X), no SDS |
GENESYS 180 UV-Vis Spectrophotometer | Thermo Fisher Scientific | 840-309000 | The spectrophotometer is used for measuring oligo concentrations using Beer's law |
HEPES | Sigma-Aldrich | H4034 | Prepare a working solution (1 M; pH = 8 with NaOH) and store it at -20 °C |
Hydrochloric acid (HCl) | VWR Scientific Products | BDH3028 | Prepare a working solution (5 M) and store it at ambient temperature |
Isopropyl Alcohol (Isopropanol), Pure | Macron Fine Chemicals | 3032-16 | Isopropanol is used with the QIAquick Gel Extraction Kit |
L-Ascorbic acid | Sigma-Aldrich | A5960 | Prepare a working solution (0.5 M) and store it at -20 °C |
Microcentrifuge | Eppendorf | 5415D | |
NanoDrop 2000 Spectrophotometer | Thermo Fisher Scientific | ND-2000 | |
NEBuffer 2 (10X) | New England Biolabs | M0264L (component #: B7002SVIAL) | NEB M0264L contains RecJf (30 U/μL) and NEBuffer 2 (10X); store at -20 °C |
Nuclease-Free Water (not DEPC-Treated) | Thermo Fisher Scientific | AM9938 | |
Oligo Clean & Concentrator Kit | Zymo Research | D4061 | Store at ambient temperature |
PEG 8000 (50% solution) | New England Biolabs | M0437M (component #: B1004SVIAL) | NEB M0437M contains T4 RNA Ligase 1 (30 U/μL), T4 RNA Ligase Reaction Buffer (10X), PEG 8000 (1X), and ATP (100 mM); prepare a working solution (10 mM) and store it at -20 °C |
Peltier Thermal Cycler | MJ Research | PTC-200 | |
Phenol:chloroform:isoamyl alcohol 25:24:1 pH = 5.2 | Thermo Fisher Scientific | J62336 | |
PrimeScript Buffer (5X) | TaKaRa | 2680A | |
PrimeScript Reverse Transcriptase | TaKaRa | 2680A | |
QIAquick Gel Extraction Kit | Qiagen | 28704 | Use of this kit requires a heating block and isopropanol |
Quick-Load Purple 100 bp DNA Ladder | New England Biolabs | N0551S (component #: N0551SVIAL) | |
Quick-Load Purple 50 bp DNA Ladder | New England Biolabs | N0556S (component #: N0556SVIAL) | NEB N0556S contains Quick-Load Purple 50 bp DNA Ladder and Gel Loading Dye, Purple (6X), no SDS |
RecJf (30 U/μL) | New England Biolabs | M0264L (component #: M0264LVIAL) | NEB M0264L contains RecJf (30 U/μL) and NEBuffer 2 (10X); store at -20 °C |
RNase Inhibitor (murine; 40 U/μL) | New England Biolabs | M0314L (component #: M0314LVIAL) | Store at -20 °C |
SeqAMP DNA Polymerase | TaKaRa | 638509 | TaKaRa 638509 contains SeqAMP DNA Polymerase and SeqAMP PCR Buffer (2X) |
SeqAMP PCR Buffer (2X) | TaKaRa | 638509 | TaKaRa 638509 contains SeqAMP DNA Polymerase and SeqAMP PCR Buffer (2X) |
Shrimp Alkaline Phosphatase (1 U/μL) | New England Biolabs | M0371L (component #: M0371LVIAL) | |
Sodium hydroxide (NaOH) | Sigma-Aldrich | S5881 | Prepare a working solution (5 M) and store it at ambient temperature |
T4 DNA Ligase (400 U/μL) | New England Biolabs | M0202L (component #: M0202LVIAL) | NEB M0202L contains T4 DNA Ligase (400 U/μL) and T4 DNA Ligase Reaction Buffer (10X) |
T4 DNA Ligase Reaction Buffer (10X) | New England Biolabs | M0202L (component #: B0202SVIAL) | NEB M0202L contains T4 DNA Ligase (400 U/μL) and T4 DNA Ligase Reaction Buffer (10X) |
T4 RNA Ligase 1 (30 U/μL) | New England Biolabs | M0437M (component #: M0437MVIAL) | NEB M0437M contains T4 RNA Ligase 1 (30 U/μL), T4 RNA Ligase Reaction Buffer (10X), PEG 8000 (1X), and ATP (100 mM) |
T4 RNA Ligase Reaction Buffer (10X) | New England Biolabs | M0437M (component #: B0216SVIAL) | NEB M0437M contains T4 RNA Ligase 1 (30 U/μL), T4 RNA Ligase Reaction Buffer (10X), PEG 8000 (1X), and ATP (100 mM) |
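
The spectrophotometer entry in the table above notes that oligo concentrations are measured using Beer's law, A = ε · c · l. A minimal sketch of that arithmetic follows; the function name, the extinction coefficient, and the dilution factor are hypothetical placeholders, and the real extinction coefficient should be taken from the oligo supplier's specification sheet.

```python
def oligo_concentration_um(absorbance, extinction_coeff, path_length_cm=1.0,
                           dilution_factor=1.0):
    """Return the stock oligo concentration in micromolar via Beer's law.

    absorbance       : measured absorbance of the (diluted) sample
    extinction_coeff : molar extinction coefficient in L/(mol*cm)
    path_length_cm   : cuvette path length in cm
    dilution_factor  : fold dilution applied before measurement
    """
    if extinction_coeff <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    # Beer's law rearranged: c = A / (epsilon * l), in mol/L
    molar = absorbance / (extinction_coeff * path_length_cm)
    # Correct for dilution and convert mol/L to micromolar
    return molar * dilution_factor * 1e6


# Hypothetical example: absorbance of 0.5 for a 10-fold diluted oligo with an
# assumed extinction coefficient of 250,000 L/(mol*cm) in a 1 cm cuvette.
stock_um = oligo_concentration_um(0.5, 250_000, path_length_cm=1.0,
                                  dilution_factor=10)
print(f"Stock concentration: {stock_um:.1f} uM")
```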