Identifying Bacterial Virulent Proteins by Fusing a Set of Classifiers Based on Variants of Chou's Pseudo Amino Acid Composition and on Evolutionary Information

Authors:
Loris Nanni;Alessandra Lumini;Dinesh Gupta;Aarti Garg
Affiliations:
University of Padua, Padua;University of Bologna, Cesena;International Centre for Genetic Engineering and Biotechnology, New Delhi;International Centre for Genetic Engineering and Biotechnology, New Delhi
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2012

Abstract

A reliable method for predicting bacterial virulent proteins has several important applications in research aimed at finding novel drug targets and vaccine candidates and at understanding virulence mechanisms in pathogens. In this work, we have studied several feature extraction approaches for representing proteins and propose a novel bacterial virulent protein prediction method based on an ensemble of classifiers, where the features are extracted directly from the amino acid sequence and from the evolutionary information of a given protein. We have evaluated and compared several ensembles obtained by combining six feature extraction methods and several classification approaches based on two general-purpose classifiers (i.e., Support Vector Machine and a variant of input decimated ensemble) and their random subspace versions. An extensive evaluation was performed according to a blind testing protocol, in which the parameters of the system are optimized using the training set and the system is validated on three independent data sets, allowing selection of the best-performing system and demonstrating the validity of the proposed method. The blind test results show that, although the best-performing stand-alone method is not the same in every independent data set, the fusion of different methods enhances prediction efficiency in all of the tested independent data sets.
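
The abstract names three building blocks without giving details: sequence-derived features, random subspace ensembles, and fusion of classifier outputs. The sketch below is only a rough illustration of how those pieces might fit together, not the authors' implementation. Plain amino acid composition is used as a simpler stand-in for Chou's pseudo amino acid composition, the averaging (sum rule) fusion is an assumption since the abstract does not state the combination rule, and all function names are hypothetical.

```python
import random
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid.

    Assumption: a stand-in for the richer pseudo amino acid composition
    variants mentioned in the title; non-standard residues are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def random_subspaces(n_features, n_subspaces, fraction, seed=0):
    """Draw one random subset of feature indices per ensemble member.

    Each member of a random subspace ensemble would be trained only on
    the features selected for it.
    """
    rng = random.Random(seed)
    size = max(1, int(round(fraction * n_features)))
    return [sorted(rng.sample(range(n_features), size))
            for _ in range(n_subspaces)]


def sum_rule(score_lists):
    """Fuse classifier scores for the same samples by averaging them.

    score_lists[i][j] is the score that classifier i gives to sample j.
    Assumption: the paper's actual fusion rule may differ.
    """
    n_classifiers = len(score_lists)
    return [sum(scores) / n_classifiers for scores in zip(*score_lists)]
```

In use, each subspace from `random_subspaces` would select the columns fed to one base classifier (for example an SVM), and `sum_rule` would combine the scores those classifiers produce on an independent test set.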