ProClusEnsem: Predicting membrane protein types by fusing different modes of pseudo amino acid composition

Authors:
Jingyan Wang;Yongping Li;Quanquan Wang;Xinge You;Jiaju Man;Chao Wang;Xin Gao
Affiliations:
Mathematical and Computer Sciences and Engineering Division, King Abdullah University of Science and Technology, Jeddah 21534, Saudi Arabia and Shanghai Institute of Applied Physics, Chinese Acade ...;Shanghai Institute of Applied Physics, Chinese Academy of Science, 2019 Jialuo Road, Jiading District, Shanghai 201800, PR China and Shanghai Key Laboratory of Intelligent Information Processing, ...;Shanghai Institute of Applied Physics, Chinese Academy of Science, 2019 Jialuo Road, Jiading District, Shanghai 201800, PR China;Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China and Key Laboratory of High Performance Computing and Stochastic Inf ...;Key Laboratory of High Performance Computing and Stochastic Information Processing, Ministry of Education of China, College of Mathematics and Computer Science, Hunan Normal University, Changsha, ...;Department of Biomedical Engineering, Oregon Health and Science University, 20000 NW Walker Rd., Beaverton, OR 97006, USA;Mathematical and Computer Sciences and Engineering Division, King Abdullah University of Science and Technology, Jeddah 21534, Saudi Arabia
Venue:
Computers in Biology and Medicine
Year:
2012

Abstract

Knowing the type of an uncharacterized membrane protein often provides a useful clue in both basic research and drug discovery. With the explosion of protein sequences generated in the post-genomic era, determining membrane protein types by experimental methods is expensive and time-consuming. It has therefore become important to develop automated methods for finding the possible types of membrane proteins, and various computational prediction methods have been proposed. They extract protein feature vectors, such as PseAAC (pseudo amino acid composition) and PsePSSM (pseudo position-specific scoring matrix), to represent the protein sequence, and then learn a distance metric for the KNN (K nearest neighbor) or NN (nearest neighbor) classifier to predict the final type. Most of these metrics are learned using linear dimensionality reduction algorithms such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Such a metric is global: the same metric is applied to all the proteins in the dataset. In effect, these methods assume that the proteins follow a uniform distribution that a linear dimensionality reduction algorithm can capture. We question this assumption and instead learn local metrics, each optimized for a local subset of the proteins. The metric learning procedure is iterated with protein clustering. A novel ensemble distance metric is then obtained by combining the local metrics through Tikhonov regularization. Experimental results on a benchmark dataset demonstrate the feasibility and effectiveness of the proposed algorithm, named ProClusEnsem.
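
The general pipeline described in the abstract (a sequence is turned into a feature vector, then classified by nearest neighbors under a distance metric) can be illustrated with a minimal sketch. This is not the ProClusEnsem algorithm: it uses plain amino acid composition rather than full PseAAC, the diagonal metric weights are set by hand rather than learned, and the function names, toy sequences, and labels "type-A" and "type-B" are all hypothetical.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Return the 20-dimensional amino acid frequency vector of a sequence.

    Assumption: plain composition only, without the sequence-order
    terms that full PseAAC adds.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def weighted_distance(x, y, weights):
    """Euclidean distance under a diagonal metric (one weight per feature)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def knn_predict(query, train_vectors, train_labels, weights, k=1):
    """Majority vote among the k training vectors closest to the query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: weighted_distance(query, pair[0], weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up sequences and labels, for illustration only.
train = [
    ("LLLLAAAAIIVV", "type-A"),
    ("LLAAIIVVLLAA", "type-A"),
    ("KKKKDDDDEERR", "type-B"),
    ("KKDDEERRKKDD", "type-B"),
]
vectors = [aa_composition(s) for s, _ in train]
labels = [lab for _, lab in train]

# Hand-set identity metric; a learned metric would replace these weights.
weights = [1.0] * len(AMINO_ACIDS)

prediction = knn_predict(aa_composition("LLIIVVAALLII"), vectors, labels, weights, k=3)
print(prediction)
```

In this sketch a single weight vector plays the role of the global metric that the abstract criticizes. A local-metric variant would keep one weight vector per cluster of training proteins and combine them, which is the part the paper addresses and this example does not attempt.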