On the number of partial least squares components in dimension reduction for tumor classification

  • Authors:
  • Xue-Qiang Zeng; Guo-Zheng Li; Geng-Feng Wu; Hua-Xing Zou

  • Affiliations:
  • Computer Center, Nanchang University, Nanchang, China and School of Computer Engineering & Science, Shanghai University, Shanghai, China; School of Computer Engineering & Science, Shanghai University, Shanghai, China; School of Computer Engineering & Science, Shanghai University, Shanghai, China; Computer Center, Nanchang University, Nanchang, China

  • Venue:
  • PAKDD'07 Proceedings of the 2007 international conference on Emerging technologies in knowledge discovery and data mining
  • Year:
  • 2007

Abstract

Dimension reduction is important in the analysis of gene expression microarray data, because the high dimensionality of these data sets hurts the generalization performance of classifiers. Partial Least Squares (PLS) based dimension reduction is frequently used, since it is designed to handle high-dimensional data sets and leads to satisfactory classification performance. This paper investigates how the number of PLS components affects generalization performance, and how classification performance relates to the regression quality of PLS on the training set. Experimental results show that the number of PLS components for classifiers can be determined automatically from the regression quality of the PLS latent variables.
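
To make the idea in the abstract concrete, the following is a minimal sketch of how a training-set regression quality curve might be used to pick the number of PLS components. It is not the authors' procedure: the abstract does not state the exact criterion, so the single-response NIPALS variant (PLS1), the use of cumulative R² as the "regression quality" measure, the stopping rule, the `min_gain` threshold, the function names, and the toy data are all assumptions made for illustration.

```python
import random


def pls1_r2_curve(X, y, max_components):
    """Cumulative training R^2 of y after 1..max_components PLS1 components.

    X is a list of rows (samples x genes); y is a list of numeric class labels.
    """
    n, p = len(X), len(X[0])
    # Center each column of X and the response y.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]
    total_ss = sum(v * v for v in f)
    if total_ss == 0:
        raise ValueError("y is constant; R^2 is undefined")
    curve = []
    for _ in range(max_components):
        # Weight vector: direction of covariance between residual X and residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:
            break  # nothing left in y that X can explain
        w = [v / norm for v in w]
        # Latent variable (score) for this component.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        curve.append(1.0 - sum(v * v for v in f) / total_ss)
    return curve


def choose_n_components(curve, min_gain=0.01):
    """Assumed rule: stop before the first component that adds < min_gain to R^2."""
    previous = 0.0
    for k, r2 in enumerate(curve, start=1):
        if r2 - previous < min_gain:
            return max(k - 1, 1)
        previous = r2
    return len(curve)


# Toy data (hypothetical): 20 samples, 50 "genes", the first 3 shifted by class.
random.seed(0)
labels = [i % 2 for i in range(20)]
data = [
    [random.gauss(0, 1) + (2.0 * lab if j < 3 else 0.0) for j in range(50)]
    for lab in labels
]
curve = pls1_r2_curve(data, labels, max_components=8)
n_components = choose_n_components(curve)
```

In this sketch the selected components would then serve as the reduced features passed to a classifier. Whether a gain threshold, an absolute R² cutoff, or some other criterion matches the paper's experiments cannot be determined from the abstract alone.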