Using machine learning and high-throughput RNA sequencing to classify the precursors of small non-coding RNAs.

Title: Using machine learning and high-throughput RNA sequencing to classify the precursors of small non-coding RNAs.
Publication Type: Journal Article
Year of Publication: 2014
Authors: Ryvkin, P, Leung, YY, Ungar, LH, Gregory, BD, Wang, L-S
Journal: Methods
Volume: 67
Issue: 1
Pagination: 28-35
Date Published: 2014 May 01
ISSN: 1095-9130
Keywords: Algorithms; Animals; Artificial Intelligence; Base Sequence; Decision Trees; Entropy; High-Throughput Nucleotide Sequencing; Humans; Inverted Repeat Sequences; Molecular Sequence Data; Nucleic Acid Conformation; RNA Processing, Post-Transcriptional; RNA, Small Untranslated; Sequence Analysis, RNA
Abstract

Recent advances in high-throughput sequencing allow researchers to examine the transcriptome in more detail than ever before. Using a method known as high-throughput small RNA-sequencing, we can now profile the expression of small regulatory RNAs such as microRNAs and small interfering RNAs (siRNAs) with a great deal of sensitivity. However, there are many other types of small RNAs (<50 nt) present in the cell, including fragments derived from snoRNAs (small nucleolar RNAs), snRNAs (small nuclear RNAs), scRNAs (small cytoplasmic RNAs), tRNAs (transfer RNAs), and transposon-derived RNAs. Here, we present a user's guide for CoRAL (Classification of RNAs by Analysis of Length), a computational method for discriminating between different classes of RNA using high-throughput small RNA-sequencing data. Not only can CoRAL distinguish between RNA classes with high accuracy, but it also uses features that are relevant to small RNA biogenesis pathways. By doing so, CoRAL can give biologists a glimpse into the characteristics of different RNA processing pathways and how these might differ between tissue types, biological conditions, or even different species. CoRAL is available at http://wanglab.pcbi.upenn.edu/coral/.

DOI: 10.1016/j.ymeth.2013.10.002
Alternate Journal: Methods
PubMed ID: 24145223
PubMed Central ID: PMC3991776
Grant List: U01 AG032984 / AG / NIA NIH HHS / United States
P30 AG010124 / AG / NIA NIH HHS / United States
R01 GM099962 / GM / NIGMS NIH HHS / United States
T32 HG000046 / HG / NHGRI NIH HHS / United States
P50 NS053488 / NS / NINDS NIH HHS / United States
U24 AG041689 / AG / NIA NIH HHS / United States