Penalized principal component analysis of microarray data

Authors:
Vladimir Nikulin;Geoffrey J. McLachlan
Affiliations:
Department of Mathematics, University of Queensland;Department of Mathematics, University of Queensland
Venue:
CIBB'09 Proceedings of the 6th international conference on Computational intelligence methods for bioinformatics and biostatistics
Year:
2009

Abstract

Microarray data are high-dimensional: the expression levels of thousands of genes are measured in a much smaller number of samples, which presents challenges that affect the validity of the analytical results. Hence attention has to be given to some form of dimension reduction to represent the data in terms of a smaller number of variables. The latter are often chosen to be linear combinations of the original variables (genes), called metagenes. One commonly used approach is principal component analysis (PCA), which can be implemented via a singular value decomposition (SVD). However, for a high-dimensional matrix, SVD may be very expensive in terms of computational time. We propose to reduce the SVD task to an ordinary maximisation problem with a Euclidean norm, which may be solved easily using gradient-based optimisation. We demonstrate the effectiveness of this approach in the supervised classification of gene expression data.
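The idea of replacing an SVD by a gradient-based maximisation can be illustrated with a minimal sketch. The abstract does not state the objective or the penalty used in the paper, so the code below rests on an assumption: the first metagene direction is found by maximising the squared Euclidean norm of the projected, centred data over unit vectors, using projected gradient ascent. The function names `leading_component` and `make_toy_data`, the step size, the iteration count and the synthetic data are all hypothetical and serve only as illustration.

```python
import numpy as np


def leading_component(X, n_iter=500, lr=0.1, seed=0):
    """Approximate the first principal direction by projected gradient ascent.

    Assumed objective (not taken from the paper): maximise ||Xc w||^2 / n
    over unit vectors w, where Xc is X with its column means removed.
    """
    Xc = X - X.mean(axis=0)                 # centre each gene (column)
    n, p = Xc.shape
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(p)
    w /= np.linalg.norm(w)                  # random unit-length start
    for _ in range(n_iter):
        grad = (2.0 / n) * (Xc.T @ (Xc @ w))  # gradient of ||Xc w||^2 / n
        w = w + lr * grad                     # ascent step
        w /= np.linalg.norm(w)                # project back onto the unit sphere
    return w


def make_toy_data(n_samples=20, n_genes=200, seed=1):
    """Synthetic 'samples x genes' matrix with one dominant direction plus noise."""
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(n_genes)
    direction /= np.linalg.norm(direction)
    scores = rng.standard_normal(n_samples)
    noise = 0.3 * rng.standard_normal((n_samples, n_genes))
    return 5.0 * np.outer(scores, direction) + noise, direction


# Usage: one metagene value per sample, obtained without a full SVD.
X, _ = make_toy_data()
w = leading_component(X)
metagene = (X - X.mean(axis=0)) @ w
```

Each iteration uses only two matrix-vector products with the data matrix, which is the kind of saving over a full decomposition that the abstract alludes to. On this toy matrix the result can be checked against the first right singular vector returned by `np.linalg.svd`, up to sign.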