Motivation: MicroRNAs (miRNAs) are small ncRNAs participating in diverse cellular and physiological processes through the post-transcriptional gene regulatory pathway. Critically associated with miRNA biogenesis, the hairpin structure is a necessary feature for the computational classification of novel precursor miRNAs (pre-miRs). Though many of the abundant genomic inverted repeats (pseudo hairpins) can be filtered computationally, novel species-specific pre-miRs are likely to remain elusive.
Results: miPred is a de novo Support Vector Machine (SVM) classifier for identifying pre-miRs without relying on phylogenetic conservation. To achieve significantly higher sensitivity and specificity than existing (quasi) de novo predictors, it employs a Gaussian Radial Basis Function (RBF) kernel as a similarity measure for 29 global and intrinsic hairpin folding attributes. These attributes characterize a pre-miR at the dinucleotide sequence, hairpin folding, non-linear statistical thermodynamics and topological levels. Trained on 200 human pre-miRs and 400 pseudo hairpins, miPred achieves 93.50% (5-fold cross-validation accuracy) and 0.9833 (ROC score). Tested on the remaining 123 human pre-miRs and 246 pseudo hairpins, it reports 84.55% (sensitivity), 97.97% (specificity) and 93.50% (accuracy). Validated on 1918 pre-miRs across 40 non-human species and 3836 pseudo hairpins, it yields 87.65% (92.08%), 97.75% (97.42%) and 94.38% (95.64%) for the mean (overall) sensitivity, specificity and accuracy. Notably, A.mellifera, A.geoffroyi, C.familiaris, E.Barr, H.Simplex virus, H.cytomegalovirus, O.aries, P.patens, R.lymphocryptovirus, Simian virus and Z.mays are unambiguously classified with 100.00% (sensitivity) and 93.75% (specificity).
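The two quantitative ideas in the Results can be illustrated with a minimal Python sketch: the Gaussian RBF similarity between two attribute vectors, and how sensitivity, specificity and accuracy follow from confusion-matrix counts. This is not the miPred source code. The function names and the `gamma` parameter are illustrative, and the counts (104 of 123 pre-miRs and 241 of 246 pseudo hairpins correctly classified) are an assumption inferred from the reported human test-set percentages, not figures stated in the abstract.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF similarity: exp(-gamma * ||x - y||^2).

    x and y are equal-length attribute vectors; gamma is a
    hypothetical kernel width, not the value tuned for miPred.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from raw counts."""
    sensitivity = tp / (tp + fn)   # fraction of real pre-miRs recovered
    specificity = tn / (tn + fp)   # fraction of pseudo hairpins rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts inferred (assumption) from the reported human test set:
# 123 pre-miRs and 246 pseudo hairpins.
sens, spec, acc = classification_metrics(tp=104, fn=19, tn=241, fp=5)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

Under these assumed counts the three ratios round to the 84.55%, 97.97% and 93.50% reported for the human test set, which shows that the accuracy figure is the count-weighted combination of the other two.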
Availability: Data sets, raw statistical results and source codes are available at http://web.bii.a-star.edu.sg/~stanley/Publications
Contact: stanley@bii.a-star.edu.sg; santosh@bii.a-star.edu.sg
Supplementary information: Supplementary data are available at Bioinformatics online.