Dimensionality reduction of protein mass spectrometry data using random projection

Authors:
Chen Change Loy;Weng Kin Lai;Chee Peng Lim
Affiliations:
Grid Computing and Bioinformatics Lab, MIMOS Berhad, Kuala Lumpur, Malaysia;Grid Computing and Bioinformatics Lab, MIMOS Berhad, Kuala Lumpur, Malaysia;School of Electrical & Electronic Engineering, University of Science Malaysia, Nibong Tebal, Penang, Malaysia
Venue:
ICONIP'06 Proceedings of the 13th international conference on Neural Information Processing - Volume Part II
Year:
2006

Abstract

Protein mass spectrometry (MS) pattern recognition has recently emerged as a new method for cancer diagnosis. Unfortunately, classification performance may degrade owing to the extremely high dimensionality of the data. This paper investigates the use of Random Projection (RP) for dimensionality reduction of protein MS data. The effectiveness of RP is analyzed and compared against that of Principal Component Analysis (PCA) using three classification algorithms: Support Vector Machine, Feed-forward Neural Network, and K-Nearest Neighbour. Three real-world cancer data sets are employed to evaluate the performance of RP and PCA. In these experiments, RP achieved classification performance better than, or at least comparable to, that of PCA, provided the dimensionality of the projection matrix was sufficiently large. This paper also explores the use of RP as a pre-processing step prior to PCA. The results show that performing RP prior to PCA significantly reduces computation time without sacrificing classification accuracy.
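As an illustration of the RP-then-PCA pipeline described in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes a Gaussian random matrix scaled by 1/sqrt(k) (the abstract does not say which random matrix was used), computes PCA through a plain SVD, and runs on synthetic data. The function names `random_projection` and `pca`, the matrix sizes, and the seeds are all hypothetical choices made for the example.

```python
import numpy as np


def random_projection(X, k, seed=0):
    """Project the rows of X (n_samples x d) onto k dimensions.

    Assumption: a dense Gaussian matrix scaled by 1/sqrt(k), which
    approximately preserves pairwise Euclidean distances.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R


def pca(X, n_components):
    """Return the scores of X on its leading principal components."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # project onto top components


# Synthetic stand-in for MS spectra: 60 samples, 5000 features.
data_rng = np.random.default_rng(42)
X = data_rng.standard_normal((60, 5000))

X_rp = random_projection(X, k=500)       # cheap first reduction
X_rp_pca = pca(X_rp, n_components=10)    # PCA on the much smaller matrix

# Ratio of one pairwise distance after and before projection.
dist_ratio = np.linalg.norm(X_rp[0] - X_rp[1]) / np.linalg.norm(X[0] - X[1])
```

In this sketch the SVD inside `pca` operates on a 60 x 500 matrix instead of a 60 x 5000 one, which is the kind of saving that would make RP attractive as a pre-processing step. Whether accuracy is preserved depends on the data and on the choice of `k`.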