Computers in Biology and Medicine
Knowing the type of an uncharacterized membrane protein often provides a useful clue in both basic research and drug discovery. With the explosion of protein sequences generated in the post-genomic era, determining membrane protein types by experimental methods is expensive and time-consuming. It has therefore become important to develop automated methods for predicting the possible types of membrane proteins. To this end, various computational membrane protein prediction methods have been proposed. They extract feature vectors, such as PseAAC (pseudo amino acid composition) and PsePSSM (pseudo position-specific scoring matrix), to represent the protein sequence, and then learn a distance metric for the KNN (K-nearest-neighbor) or NN (nearest-neighbor) classifier to predict the final type. Most of these metrics are learned using linear dimensionality reduction algorithms such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Such metrics are global: a single metric is shared by all the proteins in the dataset. In effect, they assume that the proteins follow a uniform distribution that can be captured by a linear dimensionality reduction algorithm. We question this assumption and instead learn local metrics, each optimized for a local subset of the proteins. The learning procedure is iterated together with the protein clustering. A novel ensemble distance metric is then obtained by combining the local metrics through Tikhonov regularization. Experimental results on a benchmark dataset demonstrate the feasibility and effectiveness of the proposed algorithm, named ProClusEnsem.
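The abstract does not give the details of ProClusEnsem, so the following is only a minimal sketch of the general idea (cluster the proteins, learn one metric per cluster, combine the local metrics with Tikhonov-regularized weights), not the authors' implementation. Everything specific here is an assumption made for illustration: plain k-means for the clustering, a regularized inverse covariance (Mahalanobis matrix) as the local metric, pairwise same-class/different-class targets for fitting the combination weights, synthetic 4-dimensional vectors in place of PseAAC or PsePSSM features, and all function names. The sketch runs a single clustering-then-metric pass rather than iterating the two steps.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain k-means (assumed clustering step); returns a cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def local_metric(Xc, eps=1e-3):
    """Assumed local metric: inverse of the regularized covariance of one cluster."""
    d = Xc.shape[1]
    if len(Xc) > 1:
        cov = np.atleast_2d(np.cov(Xc, rowvar=False))
    else:
        cov = np.zeros((d, d))  # degenerate cluster: fall back to a scaled identity
    return np.linalg.inv(cov + eps * np.eye(d))

def pair_distances(M_list, X, pairs):
    """Squared distance of every pair under every local metric, shape (P, K)."""
    diff = X[pairs[:, 0]] - X[pairs[:, 1]]
    return np.stack(
        [np.einsum("pi,ij,pj->p", diff, M, diff) for M in M_list], axis=1
    )

def tikhonov_weights(D, t, lam=1.0):
    """Closed-form minimizer of ||D w - t||^2 + lam * ||w||^2."""
    K = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(K), D.T @ t)

# Synthetic stand-in for protein feature vectors: two classes, 4 features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 4)), rng.normal(3, 1, (30, 4))])
y = np.array([0] * 30 + [1] * 30)

n_clusters = 2
labels = kmeans(X, k=n_clusters)
M_list = [local_metric(X[labels == j]) for j in range(n_clusters)]

# Pairwise targets: 0 for same-class pairs, 1 for different-class pairs.
i, j = np.triu_indices(len(X), k=1)
pairs = np.column_stack([i, j])
t = (y[i] != y[j]).astype(float)

D = pair_distances(M_list, X, pairs)
w = tikhonov_weights(D, t, lam=1.0)

# Ensemble metric as a weighted sum of local metrics. The weights are not
# constrained to be non-negative, so positive semi-definiteness is not guaranteed.
M_ens = sum(wk * Mk for wk, Mk in zip(w, M_list))
```

The resulting `M_ens` could then be used as the matrix of a Mahalanobis-style distance inside a KNN classifier. The regularization strength `lam` trades off fitting the pairwise targets against keeping the combination weights small.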