Data dimensionality reduction with application to improving classification performance and explaining concepts of data sets

  • Authors:
  • Xiuju Fu; Lipo Wang

  • Affiliations:
  • Institute of High Performance Computing, Science Park 2, 117528, Singapore; School of Electrical and Electronic Engineering, Nanyang Technological University, Block S1, Nanyang Avenue, 639798, Singapore

  • Venue:
  • International Journal of Business Intelligence and Data Mining
  • Year:
  • 2005

Abstract

Data dimensionality reduction is usually carried out before patterns are input to classifiers. Selecting relevant data is desirable for obtaining good results in data mining, yet many data sets include irrelevant or redundant attributes that interfere with knowledge discovery. In this paper, we propose a rule-extraction method based on a novel separability-correlation measure (SCM) that ranks the importance of attributes. Based on the attribute ranking, the attribute subset that leads to the best classification results is selected and used as input to a classifier, which in our paper is an RBF neural network. The complexity of the classifier is thereby reduced and its classification performance improved. Our method then extracts rules from the classification results obtained with the reduced attribute set. Computer simulations show that our method yields smaller rule sets with higher accuracy than other methods.
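
The rank-then-select procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract does not define the SCM, so the score used here is a stand-in (the ratio of between-class to within-class variance per attribute), and a leave-one-out nearest-centroid classifier stands in for the RBF neural network. All function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def rank_attributes(X, y):
    """Rank attribute indices from most to least important.

    Assumption: the score is a between-class / within-class variance
    ratio, used as a stand-in for the paper's SCM (not defined here).
    """
    classes = sorted(set(y))
    n_attrs = len(X[0])
    scores = []
    for j in range(n_attrs):
        col = [row[j] for row in X]
        overall = mean(col)
        groups = [[v for v, c in zip(col, y) if c == k] for k in classes]
        between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups) / len(col)
        within = sum(len(g) * pvariance(g) for g in groups) / len(col)
        scores.append(between / (within + 1e-12))  # small constant avoids division by zero
    return sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)


def loo_accuracy(X, y, attrs):
    """Leave-one-out accuracy of a nearest-centroid classifier
    (a stand-in for the RBF network) restricted to the given attributes."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for k in set(y):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == k]
            centroids[k] = [mean([row[j] for row in rows]) for j in attrs]
        point = [X[i][j] for j in attrs]
        pred = min(
            centroids,
            key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
        )
        correct += pred == y[i]
    return correct / len(X)


def select_attributes(X, y):
    """Evaluate the top-m ranked attributes for m = 1..n and keep the best subset.
    Ties are resolved in favour of the smaller subset."""
    ranking = rank_attributes(X, y)
    best_acc, best_subset = -1.0, []
    for m in range(1, len(ranking) + 1):
        subset = ranking[:m]
        acc = loo_accuracy(X, y, subset)
        if acc > best_acc:
            best_acc, best_subset = acc, subset
    return ranking, best_subset, best_acc


# Toy data: attribute 0 separates the classes, attribute 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.0, 3.0], [0.3, 4.0],
     [1.0, 4.5], [1.1, 1.5], [0.9, 3.5], [1.2, 2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranking, best_subset, best_acc = select_attributes(X, y)
print(ranking, best_subset, best_acc)
```

On this toy data the informative attribute is ranked first and is selected on its own, since adding the noisy attribute cannot improve on the accuracy already reached. The rule-extraction step mentioned in the abstract is not sketched here because the abstract gives no details of it.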