Dimensionality reduction of protein mass spectrometry data using random projection

  • Authors:
  • Chen Change Loy;Weng Kin Lai;Chee Peng Lim

  • Affiliations:
  • Grid Computing and Bioinformatics Lab, MIMOS Berhad, Kuala Lumpur, Malaysia;Grid Computing and Bioinformatics Lab, MIMOS Berhad, Kuala Lumpur, Malaysia;School of Electrical & Electronic Engineering, University of Science Malaysia, Nibong Tebal, Penang, Malaysia

  • Venue:
  • ICONIP'06 Proceedings of the 13th international conference on Neural Information Processing - Volume Part II
  • Year:
  • 2006

Abstract

Protein mass spectrometry (MS) pattern recognition has recently emerged as a new method for cancer diagnosis. Unfortunately, classification performance may degrade owing to the very high dimensionality of the data. This paper investigates the use of Random Projection (RP) for dimensionality reduction of protein MS data. The effectiveness of RP is analyzed and compared against Principal Component Analysis (PCA) using three classification algorithms, namely Support Vector Machine, Feed-forward Neural Networks and K-Nearest Neighbour. Three real-world cancer data sets are employed to evaluate the performance of RP and PCA. In these investigations, the RP method demonstrated classification performance better than, or at least comparable to, that of PCA, provided that the dimensionality of the projection matrix is sufficiently large. This paper also explores the use of RP as a pre-processing step prior to PCA. The results show that performing RP prior to PCA significantly reduces computation time without sacrificing classification accuracy.
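
As a rough illustration of the random projection idea described in the abstract, the sketch below maps high-dimensional vectors to a lower-dimensional space by multiplying them with a random matrix. It is not the authors' implementation: the Gaussian entries scaled by 1/sqrt(k), the toy sizes (n = 4, d = 500, k = 200), the synthetic uniform data, and all function names are assumptions made for this example, and the abstract does not state which projection matrix the paper uses.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """Return a d x k matrix with i.i.d. N(0, 1/k) entries (an assumed choice)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]


def project(X, R):
    """Multiply each row of X (n x d) by R (d x k), giving an n x k result."""
    k = len(R[0])
    out = []
    for x in X:
        y = [0.0] * k
        for xi, r_row in zip(x, R):
            for j in range(k):
                y[j] += xi * r_row[j]
        out.append(y)
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


rng = random.Random(0)   # fixed seed so the run is repeatable
n, d, k = 4, 500, 200    # hypothetical toy sizes, not taken from the paper

# Synthetic stand-in for spectra: n samples with d intensity values each.
X = [[rng.random() for _ in range(d)] for _ in range(n)]

R = random_projection_matrix(d, k, rng)
Y = project(X, R)        # reduced representation, n x k

# Compare one pairwise distance before and after projection.
ratio = euclidean(Y[0], Y[1]) / euclidean(X[0], X[1])
print(f"distance ratio after projection: {ratio:.3f}")
```

With this scaling, pairwise distances are approximately preserved on average, so the printed ratio should be close to 1, and it tends to move closer to 1 as k grows. That is one intuition for why a sufficiently large projected dimensionality matters. Unlike PCA, the projection matrix is drawn without looking at the data, which is where the computational saving comes from.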