Transmembrane segments prediction and understanding using support vector machine and decision tree

Authors:
Jieyue He;Hae-Jin Hu;Robert Harrison;Phang C. Tai;Yi Pan
Affiliations:
Department of Computer Science, Southeast University, Nanjing 210096, China and Department of Computer Science, Georgia State University, Atlanta, GA 30303-4110, USA;Department of Computer Science, Georgia State University, Atlanta, GA 30303-4110, USA;Department of Computer Science, Georgia State University, Atlanta, GA 30303-4110, USA and Department of Biology, Georgia State University, Atlanta, GA 30303-4110, USA;Department of Biology, Georgia State University, Atlanta, GA 30303-4110, USA;Department of Computer Science, Georgia State University, Atlanta, GA 30303-4110, USA
Venue:
Expert Systems with Applications: An International Journal
Year:
2006

Abstract

In recent years, many studies have focused on improving the accuracy of transmembrane segment prediction, and significant results have been achieved. Despite these results, existing methods cannot explain how a learning result is reached or why a prediction decision is made. Such explanations are important for the acceptance of machine learning technology in bioinformatics applications such as protein structure prediction. Support vector machines (SVMs) have shown strong generalization ability in a number of application areas, including protein structure prediction, but they are black-box models and hard to understand. Decision trees, on the other hand, provide insightful interpretation but have lower prediction accuracy. In this paper, we present an innovative approach to rule generation for understanding the prediction of transmembrane segments that integrates the merits of both SVMs and decision trees. This approach combines SVMs with decision trees into a new algorithm called SVM_DT. Experiments on transmembrane segment prediction with a low-resolution test data set of 165 entries show that the comprehensibility of SVM_DT is much better than that of SVMs and that the test accuracy of the generated rules is also high. Rules with confidence values over 90% have an average prediction accuracy of 93.4%. We also found that the confidence and prediction accuracy values of the rules generated by SVM_DT are quite consistent. We believe that SVM_DT can be used not only to predict transmembrane segments but also to understand the predictions. The predictions and their interpretation can be used to guide biological experiments.
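
The abstract compares each rule's confidence with its prediction accuracy but does not define either quantity. The sketch below assumes a common reading: confidence is the fraction of training samples covered by a rule whose label matches the rule's prediction, and prediction accuracy is the same fraction measured on held-out test samples. The rule, the feature values, and the names `rule_stats` and `hydrophobic_window` are hypothetical illustrations, not the SVM_DT implementation or data from the paper.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A sample is (feature vector, label); label 1 = transmembrane, 0 = not.
# The features here are made-up hydrophobicity-like scores (an assumption).
Sample = Tuple[Sequence[float], int]


def rule_stats(
    condition: Callable[[Sequence[float]], bool],
    predicted_label: int,
    samples: List[Sample],
) -> Tuple[int, Optional[float]]:
    """Return (number of samples the rule covers, fraction of those it labels correctly)."""
    covered = [(x, y) for x, y in samples if condition(x)]
    if not covered:
        return 0, None  # the rule never fires on this data
    correct = sum(1 for _, y in covered if y == predicted_label)
    return len(covered), correct / len(covered)


# Hypothetical rule: "if the mean score of the window exceeds 0.5, predict transmembrane".
def hydrophobic_window(x: Sequence[float]) -> bool:
    return sum(x) / len(x) > 0.5


# Toy data, invented for illustration only.
train: List[Sample] = [
    ([0.9, 0.8, 0.7], 1),
    ([0.6, 0.7, 0.9], 1),
    ([0.8, 0.6, 0.5], 1),
    ([0.7, 0.6, 0.6], 0),
    ([0.1, 0.2, 0.3], 0),
]
test: List[Sample] = [
    ([0.9, 0.9, 0.8], 1),
    ([0.2, 0.1, 0.4], 0),
    ([0.6, 0.6, 0.7], 1),
]

train_coverage, confidence = rule_stats(hydrophobic_window, 1, train)
test_coverage, accuracy = rule_stats(hydrophobic_window, 1, test)
print(f"confidence on training data: {confidence} ({train_coverage} samples covered)")
print(f"accuracy on test data: {accuracy} ({test_coverage} samples covered)")
```

Under these assumptions, saying that confidence and prediction accuracy are "quite consistent" would mean the two fractions stay close for each rule, so a rule's training confidence can serve as an estimate of how far its predictions can be trusted.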