Regularized discriminant analysis for high dimensional, low sample size data

  • Authors:
  • Jieping Ye; Tie Wang

  • Affiliations:
  • Arizona State University, Tempe, AZ; Arizona State University, Tempe, AZ

  • Venue:
  • Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2006

Abstract

Linear and Quadratic Discriminant Analysis have been widely used in many areas of data mining, machine learning, and bioinformatics. Friedman proposed a compromise between the two, called Regularized Discriminant Analysis (RDA), which has been shown to be more flexible in dealing with various class distributions. RDA regularizes the class covariance estimates through two regularization parameters, which are chosen to jointly maximize classification performance. The optimal pair of parameters is commonly estimated via cross-validation from a set of candidate pairs. This is computationally prohibitive for high-dimensional data, especially when the candidate set is large, which limits the application of RDA to low-dimensional data.

In this paper, a novel RDA algorithm for high-dimensional data is presented. It efficiently estimates the optimal regularization parameters from a large set of candidates. Experiments on a variety of datasets confirm the claimed theoretical estimate of the efficiency, and also show that, with a properly chosen pair of regularization parameters, RDA performs favorably in classification compared with other existing classification methods.
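To make the role of the two regularization parameters concrete, the following is a minimal NumPy sketch of Friedman-style RDA (not the efficient algorithm proposed in this paper): a parameter `lam` blends each class covariance with the pooled covariance (interpolating between QDA and LDA), and a parameter `gamma` shrinks the result toward a scaled identity matrix. All function and variable names here are illustrative assumptions, not from the paper.

```python
import numpy as np

def rda_fit(X, y, lam, gamma):
    """Fit a Friedman-style RDA model (illustrative sketch).

    lam   blends per-class covariance with the pooled covariance (QDA <-> LDA).
    gamma shrinks the blended covariance toward a scaled identity (ridge-like).
    """
    classes = np.unique(y)
    d = X.shape[1]
    means, covs, priors = {}, {}, {}
    pooled = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        means[c] = Xc.mean(axis=0)
        covs[c] = np.cov(Xc, rowvar=False)
        priors[c] = len(Xc) / len(X)
        pooled += priors[c] * covs[c]
    reg = {}
    for c in classes:
        # First regularization: interpolate between class and pooled covariance.
        S = (1 - lam) * covs[c] + lam * pooled
        # Second regularization: shrink toward a scaled identity matrix.
        S = (1 - gamma) * S + gamma * (np.trace(S) / d) * np.eye(d)
        reg[c] = S
    return classes, means, reg, priors

def rda_predict(model, X):
    """Assign each row of X to the class with the highest Gaussian
    discriminant score under the regularized covariances."""
    classes, means, covs, priors = model
    scores = []
    for c in classes:
        diff = X - means[c]
        _, logdet = np.linalg.slogdet(covs[c])
        Sinv = np.linalg.inv(covs[c])
        # Mahalanobis distance of each sample to the class mean.
        maha = np.einsum('ij,jk,ik->i', diff, Sinv, diff)
        scores.append(-0.5 * (logdet + maha) + np.log(priors[c]))
    return classes[np.argmax(np.stack(scores, axis=1), axis=1)]
```

In this naive form, choosing `(lam, gamma)` means refitting and cross-validating over every candidate pair, each refit involving d-by-d covariance inversions; that quadratic/cubic cost in the dimension is exactly what makes the approach prohibitive for high-dimensional, low-sample-size data and motivates the efficient parameter-estimation algorithm of the paper.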