Independent component analysis-based penalized discriminant method for tumor classification using gene expression data

  • Authors:
  • De-Shuang Huang;Chun-Hou Zheng

  • Affiliations:
  • Intelligent Computing Lab, Institute of Intelligent Machines, Chinese Academy of Sciences PO Box 1130, Hefei, Anhui 230031, China;Intelligent Computing Lab, Institute of Intelligent Machines, Chinese Academy of Sciences PO Box 1130, Hefei, Anhui 230031, China

  • Venue:
  • Bioinformatics
  • Year:
  • 2006

Quantified Score

Hi-index 3.85

Abstract

Motivation: Microarrays can measure the expression levels of thousands of genes simultaneously. One important application of gene expression data is the classification of samples into categories. Combined with classification methods, this technology can support clinical management decisions for individual patients, e.g. in oncology. Standard statistical methods for classification or prediction do not work well when the number of variables p (genes) far exceeds the number of samples n. Existing statistical methodologies must therefore be modified, or new ones developed, for the analysis of microarray data.

Results: This paper proposes a new method for tumor classification using gene expression data. The method first employs independent component analysis to model the gene expression data and then applies an optimal scoring algorithm to classify them. Specifically, the approach first makes full use of the high-order statistical information contained in the gene expression data. Second, it employs regularized regression models to handle large numbers of correlated predictor variables. Finally, predictive models are developed for classifying tumors based on the entire gene expression profile. To show the validity of the proposed method, we apply it to classify four DNA microarray datasets involving various human normal and tumor tissue samples. The experimental results show that the method is efficient and feasible.

Availability: Matlab scripts are available on request.

Contact: dshuang@iim.ac.cn
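
The two-stage idea described under Results (independent component analysis followed by penalized optimal scoring) can be illustrated with a minimal NumPy sketch. This is not the authors' Matlab code, and several choices here are assumptions made only for illustration: the ICA step is a symmetric FastICA with a tanh nonlinearity that treats genes as observations, the penalty is a plain ridge term (identity penalty matrix), only the two-class case is handled, and the data are synthetic. All function names (`fit_ica`, `ica_features`, `fit_penalized_scoring`, `predict`, `make_data`) are hypothetical.

```python
import numpy as np

def sym_decorrelate(W):
    # Symmetric decorrelation: W <- (W W^T)^(-1/2) W, which makes W orthogonal.
    vals, vecs = np.linalg.eigh(W @ W.T)
    vals = np.maximum(vals, 1e-12)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fit_ica(X, k, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA (tanh) on X of shape (n_samples, p_genes).
    Genes are the observations; returns k components S of shape (k, p)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)      # center each sample across genes
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = np.sqrt(p) * Vt[:k]                     # whitened data: Z Z^T / p = I
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        W_new = (G @ Z.T) / p - (1.0 - G**2).mean(axis=1)[:, None] * W
        W_new = sym_decorrelate(W_new)
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return W @ Z

def ica_features(X, S):
    # Least-squares coefficients of each centered sample on the components.
    # Because S S^T / p = I, this reduces to Xc S^T / p.
    Xc = X - X.mean(axis=1, keepdims=True)
    return Xc @ S.T / S.shape[1]

def fit_penalized_scoring(A, y, lam=1.0):
    """Two-class optimal scoring with a ridge penalty (assumed penalty form)."""
    mu = A.mean(axis=0)
    Ac = A - mu
    n0, n1 = np.sum(y == 0), np.sum(y == 1)
    # Class scores with zero mean and unit variance over the training set.
    theta = np.where(y == 0, np.sqrt(n1 / n0), -np.sqrt(n0 / n1))
    beta = np.linalg.solve(Ac.T @ Ac + lam * np.eye(A.shape[1]), Ac.T @ theta)
    d = Ac @ beta
    centroids = np.array([d[y == 0].mean(), d[y == 1].mean()])
    return mu, beta, centroids

def predict(A, model):
    # Assign each sample to the class with the nearest discriminant centroid.
    mu, beta, centroids = model
    d = (A - mu) @ beta
    return np.argmin(np.abs(d[:, None] - centroids[None, :]), axis=1)

def make_data(rng, n_per_class, p, n_informative=50, shift=3.0):
    # Synthetic stand-in for expression data: heavy-tailed noise plus a
    # class-specific shift on the first n_informative genes.
    y = np.repeat([0, 1], n_per_class)
    X = rng.laplace(size=(2 * n_per_class, p))
    X[y == 1, :n_informative] += shift
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_data(rng, n_per_class=20, p=500)   # p far exceeds n
X_test, y_test = make_data(rng, n_per_class=10, p=500)

S = fit_ica(X_train, k=5)                 # components estimated on training data only
model = fit_penalized_scoring(ica_features(X_train, S), y_train, lam=1.0)

train_acc = np.mean(predict(ica_features(X_train, S), model) == y_train)
test_acc = np.mean(predict(ica_features(X_test, S), model) == y_test)
print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
```

In this sketch the components are estimated from the training samples only, and test samples are projected onto them afterwards, so no test information leaks into the model. The ridge parameter `lam` and the number of components `k` are arbitrary illustrative values; in practice they would presumably be chosen by cross-validation.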