PSOLDA: A particle swarm optimization approach for enhancing classification accuracy rate of linear discriminant analysis

  • Authors: Shih-Wei Lin; Shih-Chieh Chen
  • Affiliations: Department of Information Management, Chang Gung University, No. 259, Wen-Hwa 1st Road, Taoyuan 333, Taiwan, ROC; Department of Industrial Management, National Taiwan University of Science and Technology, No. 43, Keelung Road, Sec. 4, Taipei, Taiwan, ROC
  • Venue: Applied Soft Computing
  • Year: 2009

Abstract

Linear discriminant analysis (LDA) is a commonly used classification method that also provides weight information useful for constructing a classification model. However, real-world data sets generally have many features, not all of which benefit the classification results. If no feature selection algorithm is employed, classification can be unsatisfactory because of noise and high correlation between features. This study shows by example that feature selection influences LDA. The methods traditionally used with LDA to determine a beneficial feature subset are difficult to apply or cannot guarantee the best results when problems have a large number of features. Particle swarm optimization (PSO) is a powerful meta-heuristic technique in the artificial intelligence field; this study therefore proposes a PSO-based approach, called PSOLDA, to identify the beneficial features and enhance the classification accuracy rate of LDA. The performance of PSOLDA is measured by its classification accuracy rate on many public data sets. Where exhaustive enumeration is feasible, PSOLDA obtains the same optimal result. Exhaustive enumeration requires too much time to be applied when problems have a large number of features, so heuristic approaches such as forward feature selection, backward feature selection, and PCA-based feature selection are used instead. This study shows that the classification accuracy rates of PSOLDA were higher than those of these approaches on many public data sets.
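The idea of wrapping LDA in a PSO search over feature subsets can be illustrated with a minimal sketch. The abstract does not give the particle encoding, parameters, or validation scheme used in PSOLDA, so everything below is an assumption made for illustration: a binary PSO with a sigmoid transfer function, a two-class Fisher LDA with a small ridge term, an even/odd hold-out split, synthetic data, and the hypothetical names `make_data`, `lda_accuracy`, and `pso_select`. It should not be read as the authors' implementation.

```python
import math
import random
from itertools import product

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def make_data(n=120, n_features=6, seed=0):
    """Synthetic two-class data (assumption): features 0 and 1 are informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = (i // 2) % 2  # keeps both classes present in even and odd rows
        X.append([rng.gauss(1.5 * label if j < 2 else 0.0, 1.0) for j in range(n_features)])
        y.append(label)
    return X, y

def lda_accuracy(X, y, mask, ridge=1e-6):
    """Hold-out accuracy of two-class Fisher LDA using only the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    Z = [[row[j] for j in cols] for row in X]
    d = len(cols)
    train, held_out = range(0, len(Z), 2), range(1, len(Z), 2)
    means = {}
    for c in (0, 1):
        rows = [Z[i] for i in train if y[i] == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
    # Pooled within-class scatter; the ridge term (assumption) keeps it invertible.
    S = [[ridge if a == b else 0.0 for b in range(d)] for a in range(d)]
    for i in train:
        diff = [Z[i][a] - means[y[i]][a] for a in range(d)]
        for a in range(d):
            for b in range(d):
                S[a][b] += diff[a] * diff[b]
    w = solve(S, [means[1][a] - means[0][a] for a in range(d)])
    mid = [(means[0][a] + means[1][a]) / 2 for a in range(d)]
    hits = sum((sum(w[a] * (Z[i][a] - mid[a]) for a in range(d)) > 0) == (y[i] == 1)
               for i in held_out)
    return hits / len(held_out)

def pso_select(fitness, n_features, n_particles=10, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=1):
    """Binary PSO: each particle is a 0/1 feature mask; velocities set the bit probabilities."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    pos[0] = [1] * n_features  # seed one particle with the full feature set
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_features):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))  # clamp to avoid saturated probabilities
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

X, y = make_data()
fitness = lambda mask: lda_accuracy(X, y, mask)
best_mask, best_acc = pso_select(fitness, n_features=6)
all_acc = fitness([1] * 6)
# Exhaustive enumeration is only affordable here because there are 2**6 subsets.
exhaustive_acc = max(fitness(list(m)) for m in product([0, 1], repeat=6))
print(best_mask, best_acc, all_acc, exhaustive_acc)
```

Because one particle starts with every feature selected, the subset returned by this sketch is never worse on the hold-out rows than using all features, and it can never exceed the exhaustive optimum. Whether it reaches that optimum depends on the swarm size, iteration count, and random seed.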