Machine learning for HIV-1 protease cleavage site prediction

Authors:
Alessandra Lumini;Loris Nanni
Affiliations:
DEIS, IEIIT - CNR, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy;DEIS, IEIIT - CNR, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy
Venue:
Pattern Recognition Letters
Year:
2006

Abstract

Recently, several works have approached the HIV-1 protease specificity problem by applying classifier creation and combination methods from the field of machine learning, known as ensemble methods. However, it remains difficult for researchers to choose the best method because no effective comparison is available. For the first time, we present an extensive study of methods for feature extraction, feature transformation and multiclassifier systems (MCS) applied to HIV-1 protease cleavage site prediction. In this work we report an experimental comparison of several learning systems coupled with different feature representations. We confirm previous results stating that linear classifiers outperform non-linear classifiers when orthonormal encoding is used, but we also show that with the Karhunen-Loève transform the performance of neural networks is comparable to that of linear support vector machines. Finally, we propose a new hierarchical approach that, for the first time, combines ideas derived from machine learning methodologies with a knowledge base specific to this problem. This approach achieves a drastic error reduction with respect to linear classifiers: the error rate decreases from 9.1% using a linear SVM to 6.6% using our new hierarchical classifier based on a set of pattern rules.
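
To make the two feature representations named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes peptides are fixed-length strings over the 20 standard amino acids (octamers are used here as an assumption; the abstract does not state the window length), encodes each residue as a one-hot vector (orthonormal encoding), and approximates the Karhunen-Loève transform by projecting the centered data onto its leading principal directions via an SVD. The function names and the toy peptide strings are hypothetical and are not taken from the paper's dataset.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def orthonormal_encode(peptide):
    """One-hot encode each residue and concatenate into one binary vector."""
    vec = np.zeros(len(peptide) * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide):
        vec[pos * len(AMINO_ACIDS) + INDEX[aa]] = 1.0
    return vec


def kl_transform(X, n_components):
    """Project the rows of X onto the top principal directions (PCA via SVD)."""
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are orthonormal directions, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


# Toy octamers for illustration only (assumed length 8, no labels implied).
peptides = ["SQNYPIVQ", "ARVLAEAM", "ATIMMQRG", "AAAAAAAA"]
X = np.array([orthonormal_encode(p) for p in peptides])  # shape (4, 160)
Z = kl_transform(X, n_components=2)                      # shape (4, 2)
```

Under these assumptions each octamer becomes a 160-dimensional binary vector with exactly eight non-zero entries, and the transformed matrix `Z` could then be passed to any classifier, such as a neural network or a linear SVM.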