SpliceIT: A hybrid method for splice signal identification based on probabilistic and biological inference

Authors:
Andigoni Malousi;Ioanna Chouvarda;Vassilis Koutkias;Sofia Kouidou;Nicos Maglaveras
Affiliations:
Lab. of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Greece;Lab. of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Greece;Lab. of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Greece;Lab. of Biological Chemistry, Medical School, Aristotle University of Thessaloniki, Greece;Lab. of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Greece
Venue:
Journal of Biomedical Informatics
Year:
2010

Abstract

Splice sites define the boundaries of exonic regions and dictate protein synthesis and function. The splicing mechanism involves complex interactions among positional and compositional features of different lengths, which makes computational modeling of the underlying information, and with it the deciphering of splicing-inducing elements and alternative splicing factors, especially challenging. SpliceIT (Splice Identification Technique) is a hybrid method for splice site prediction that couples probabilistic modeling with discriminative computational or experimental features inferred from published studies, in two successive classification steps. In the first step, a Gaussian support vector machine (SVM) is trained on the probabilistic profile extracted using two alternative position-dependent feature selection methods. In the second step, the resulting predictions are combined with known species-specific regulatory elements to induce a tree-based model. Performance evaluation on human and Arabidopsis thaliana splice site datasets shows that SpliceIT is highly accurate compared to current state-of-the-art predictors in terms of the maximum sensitivity/specificity tradeoff, without compromising space complexity and in a time-effective way. The source code and supplementary material are available at: http://www.med.auth.gr/research/spliceit/.
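
To make the idea of a position-dependent probabilistic profile more concrete, the following is a minimal sketch and not the SpliceIT implementation. It assumes the simplest possible profile: independent per-position nucleotide frequencies, turned into one log-odds feature per position. The abstract does not describe the two feature selection methods actually used, so the function names (position_probs, log_odds_profile), the pseudocount, and the toy sequences are all hypothetical and serve only as illustration. A feature vector of this kind is the sort of input a Gaussian SVM could be trained on in a first classification step.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def position_probs(seqs, pseudocount=1.0):
    """Estimate per-position nucleotide probabilities with additive smoothing.

    Assumes all sequences are aligned windows of equal length.
    """
    length = len(seqs[0])
    total = len(seqs) + pseudocount * len(ALPHABET)
    probs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        probs.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return probs


def log_odds_profile(window, pos_probs, neg_probs):
    """Return one feature per position: log P(base | true site) / P(base | decoy)."""
    return [
        math.log(p[b] / n[b])
        for b, p, n in zip(window, pos_probs, neg_probs)
    ]


# Toy aligned windows around a candidate donor site (invented for illustration).
# Every window carries GT at offsets 3-4, so those positions are uninformative.
true_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT"]
decoys = ["CTTGTCCAT", "AACGTTTGC", "TGAGTCCTA", "GCTGTATCC"]

pos_probs = position_probs(true_sites)
neg_probs = position_probs(decoys)

# Feature vectors that a downstream classifier (e.g. an SVM) could consume.
profile = log_odds_profile("CAGGTAAGT", pos_probs, neg_probs)
decoy_profile = log_odds_profile("CTTGTCCAT", pos_probs, neg_probs)

print([round(x, 2) for x in profile])
print(round(sum(profile), 2), round(sum(decoy_profile), 2))
```

In this toy setting the summed log-odds is positive for the consensus-like window and negative for the decoy, while the invariant GT positions contribute nothing. A real profile would presumably also capture dependencies between positions, which this independent-position simplification ignores.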