SpliceIT: A hybrid method for splice signal identification based on probabilistic and biological inference

  • Authors:
  • Andigoni Malousi;Ioanna Chouvarda;Vassilis Koutkias;Sofia Kouidou;Nicos Maglaveras

  • Affiliations:
  • Lab. of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Greece;Lab. of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Greece;Lab. of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Greece;Lab. of Biological Chemistry, Medical School, Aristotle University of Thessaloniki, Greece;Lab. of Medical Informatics, Medical School, Aristotle University of Thessaloniki, Greece

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2010

Abstract

Splice sites define the boundaries of exonic regions and dictate protein synthesis and function. The splicing mechanism involves complex interactions among positional and compositional features of different lengths. Computational modeling of this underlying information is especially challenging when the aim is to decipher splicing-inducing elements and alternative splicing factors. SpliceIT (Splice Identification Technique) is a hybrid method for splice site prediction that couples probabilistic modeling with discriminative computational or experimental features inferred from published studies, applied in two successive classification steps. In the first step, a Gaussian support vector machine (SVM) is trained on the probabilistic profile extracted using two alternative position-dependent feature selection methods. In the second step, the resulting predictions are combined with known species-specific regulatory elements to induce a tree-based model. Performance evaluation on human and Arabidopsis thaliana splice site datasets shows that SpliceIT is highly accurate compared with current state-of-the-art predictors in terms of the maximum sensitivity/specificity tradeoff, without compromising space complexity and in a time-effective way. The source code and supplementary material are available at: http://www.med.auth.gr/research/spliceit/.
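
The abstract does not specify how the probabilistic profile is computed, so the following is only an illustrative sketch of one common position-dependent representation: a per-position nucleotide probability table turned into log-odds features. The function names (`position_profile`, `log_odds_features`), the pseudocount, and the toy sequences are assumptions made for this example and are not taken from SpliceIT. Features of this kind are the sort of input a first-stage classifier such as the Gaussian SVM could be trained on.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def position_profile(seqs, pseudocount=1.0):
    """Per-position nucleotide probabilities with additive smoothing.

    Hypothetical helper: the pseudocount avoids zero probabilities
    for bases that never occur at a position in the training windows.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = len(seqs) + pseudocount * len(ALPHABET)
    profile = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        profile.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return profile


def log_odds_features(seq, pos_profile, neg_profile):
    """One value per position: log of P(base | true site) / P(base | decoy)."""
    return [
        math.log(pos_profile[i][b] / neg_profile[i][b])
        for i, b in enumerate(seq)
    ]


# Toy, made-up windows around a donor-like GT dinucleotide (not real data).
true_sites = ["AGGTAA", "AGGTGA", "CGGTAA", "AGGTAT"]
decoys = ["ACGTCA", "TTGTCC", "GAGTTC", "CTGTAG"]

pos = position_profile(true_sites)
neg = position_profile(decoys)

features = log_odds_features("AGGTAA", pos, neg)  # feature vector for one window
score = sum(features)                             # simple additive log-odds score
decoy_score = sum(log_odds_features("TTGTCC", pos, neg))
```

In this toy setting the candidate window scores above zero and the decoy below zero. In a two-step scheme like the one described, such a vector would be passed to the first classifier rather than thresholded directly.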