An unsupervised feature selection framework based on clustering

  • Authors: Sheng-yi Jiang; Lian-xi Wang
  • Affiliations: School of Informatics, Guangdong University of Foreign Studies, Guangzhou, China (both authors)
  • Venue: PAKDD'11 Proceedings of the 15th international conference on New Frontiers in Applied Data Mining
  • Year: 2011

Abstract

Feature selection plays an important part in improving the quality of learning algorithms in machine learning and data mining. It has been widely studied in supervised learning, but has received comparatively little attention in unsupervised learning. In this work, a clustering-based framework built around an unsupervised feature selection algorithm is proposed. The framework addresses the problem of identifying important features in the original feature set without class information: features are ranked by an importance measure score, and the top-ranked features are selected. Theoretical analysis indicates that the time complexity of each algorithm is nearly linear in the size of the dataset and in its number of features. Experimental results on UCI datasets show that the algorithms in the framework, each using a different importance score, are able to identify the important features through clustering, and that the proposed approach obtains competitive results in terms of classification error rate and degree of dimensionality reduction when compared with state-of-the-art supervised and unsupervised feature selection approaches.
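The abstract does not give the importance measures or the clustering procedure, so the following Python sketch is only an illustration of the general idea (cluster the data without labels, score each feature against the resulting clusters, rank by score), not the authors' algorithm. Everything in it is an assumption: the function names `kmeans`, `feature_scores` and `rank_features` are hypothetical, plain k-means stands in for whatever clustering the paper uses, and the score (between-cluster variance divided by total variance, per feature) is one plausible importance measure among many. It also assumes the features are already on comparable scales.

```python
import random


def kmeans(rows, k, iters=20, seed=0):
    """Plain k-means on a list of numeric rows; returns one cluster label per row.

    Stand-in clustering step (an assumption, not taken from the paper).
    """
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign every row to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(r, centers[c])))
            for r in rows
        ]
        # Move each center to the mean of its members; keep it if it has none.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def feature_scores(rows, labels):
    """Hypothetical importance score: between-cluster / total sum of squares per feature."""
    n = len(rows)
    scores = []
    for col in zip(*rows):
        mean = sum(col) / n
        total = sum((x - mean) ** 2 for x in col)
        between = 0.0
        for c in set(labels):
            vals = [x for x, lab in zip(col, labels) if lab == c]
            between += len(vals) * (sum(vals) / len(vals) - mean) ** 2
        # A constant feature carries no information, so it scores 0.
        scores.append(between / total if total > 0 else 0.0)
    return scores


def rank_features(rows, k):
    """Cluster without class labels, then rank feature indices by descending score."""
    labels = kmeans(rows, k)
    scores = feature_scores(rows, labels)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data: feature 0 separates two groups, feature 1 does not.
data = [[0.0, 0.3], [0.2, 0.9], [0.1, 0.5], [10.0, 0.4], [10.2, 0.8], [9.9, 0.6]]
order, scores = rank_features(data, k=2)
print(order, scores)
```

Selecting features then amounts to keeping the first few indices in `order`. For a fixed number of clusters and iterations, each pass in this sketch visits every value a constant number of times, so its cost grows linearly with the number of rows and with the number of features.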