CoRAL: predicting non-coding RNAs from small RNA-sequencing data.

Title: CoRAL: predicting non-coding RNAs from small RNA-sequencing data.
Publication Type: Journal Article
Year of Publication: 2013
Authors: Leung, YY, Ryvkin, P, Ungar, LH, Gregory, BD, Wang, L-S
Journal: Nucleic Acids Res
Volume: 41
Issue: 14
Pagination: e137
Date Published: 2013 Aug
ISSN: 1362-4962
Keywords: Algorithms; Artificial Intelligence; Classification; Humans; RNA, Small Untranslated; Sequence Analysis, RNA
Abstract

The surprising observation that virtually the entire human genome is transcribed means we know little about the function of many emerging classes of RNAs, except their astounding diversities. Traditional RNA function prediction methods rely on sequence or alignment information, which are limited in their abilities to classify the various collections of non-coding RNAs (ncRNAs). To address this, we developed Classification of RNAs by Analysis of Length (CoRAL), a machine learning-based approach for classification of RNA molecules. CoRAL uses biologically interpretable features including fragment length and cleavage specificity to distinguish between different ncRNA populations. We evaluated CoRAL using genome-wide small RNA sequencing data sets from four human tissue types and were able to classify six different types of RNAs with ∼80% cross-validation accuracy. Analysis by CoRAL revealed that microRNAs, small nucleolar and transposon-derived RNAs are highly discernible and consistent across all human tissue types assessed, whereas long intergenic ncRNAs, small cytoplasmic RNAs and small nuclear RNAs show less consistent patterns. The ability to reliably annotate loci across tissue types demonstrates the potential of CoRAL to characterize ncRNAs using small RNA sequencing data in less well-characterized organisms.
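
The abstract names fragment length and cleavage specificity as features but does not define them. The following is a minimal sketch of how such per-locus features might be computed from aligned small RNA reads. It is not CoRAL's actual implementation. The function names, the 15-35 nt length window, and the use of Shannon entropy over 5' start positions as a stand-in for cleavage specificity are all assumptions made for illustration.

```python
import math
from collections import Counter


def length_distribution(read_lengths, min_len=15, max_len=35):
    """Fraction of reads at each length in [min_len, max_len].

    The 15-35 nt window is an assumed range, not taken from the paper.
    """
    counts = Counter(n for n in read_lengths if min_len <= n <= max_len)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (max_len - min_len + 1)
    return [counts.get(n, 0) / total for n in range(min_len, max_len + 1)]


def positional_entropy(start_positions):
    """Shannon entropy (bits) of 5' start positions.

    Assumed proxy for cleavage specificity: low entropy means reads
    start at a few precise positions, high entropy means diffuse starts.
    """
    counts = Counter(start_positions)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def locus_features(reads):
    """Build a feature vector for one locus.

    reads: list of (start, length) tuples for reads mapped to the locus
    (a hypothetical input format).
    """
    lengths = [length for _, length in reads]
    starts = [start for start, _ in reads]
    return length_distribution(lengths) + [positional_entropy(starts)]
```

Feature vectors of this kind, one per locus with a known RNA class label, could then be passed to any standard multi-class classifier and scored by cross-validation, which is the evaluation style the abstract reports.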

DOI: 10.1093/nar/gkt426
Alternate Journal: Nucleic Acids Res.
PubMed ID: 23700308
PubMed Central ID: PMC3737537
Grant List: U01-AG032984 / AG / NIA NIH HHS / United States
R01-GM099962 / GM / NIGMS NIH HHS / United States
T32-HG000046 / HG / NHGRI NIH HHS / United States
U24-AG041689 / AG / NIA NIH HHS / United States
P30-AG10124 / AG / NIA NIH HHS / United States
P30 AG036468 / AG / NIA NIH HHS / United States