A non-parametric method for data clustering with optimal variable weighting

  • Authors:
  • Ji-Won Chung;In-Chan Choi

  • Affiliations:
  • Department of Industrial Systems and Information Engineering, Korea University, Anamdong, Seongbookku, Seoul, Republic of Korea;Department of Industrial Systems and Information Engineering, Korea University, Anamdong, Seongbookku, Seoul, Republic of Korea

  • Venue:
  • IDEAL'06 Proceedings of the 7th international conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2006

Abstract

Cluster analysis in data mining often deals with large-scale, high-dimensional data containing masking variables, so removing non-contributing variables is important both for accurate cluster recovery and for proper interpretation of the clustering results. Although the weights obtained by variable weighting methods can be used for variable selection (or elimination), on their own they hardly provide a clear guide for selecting variables for subsequent analysis. In addition, variable selection and variable weighting are closely interrelated with the choice of the number of clusters. In this paper, we propose a non-parametric data clustering method, based on W-k-means type clustering, that makes an automated, joint decision on selecting variables, determining variable weights, and deciding the number of clusters. Conclusions are drawn from computational experiments with random data and real-life data.
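
For readers unfamiliar with W-k-means type clustering, the following is a minimal sketch of the general idea: k-means alternated with a variable-weight update. It is not the method proposed in the paper. It does not select variables or choose the number of clusters. The function name `wkmeans`, the `init_centers` argument, the exponent `beta = 2`, and the fixed iteration count are assumptions made for illustration. The weight update shown is the commonly cited closed form, in which a variable's weight shrinks as its within-cluster dispersion grows and is set to zero when that dispersion is zero.

```python
import numpy as np

def wkmeans(X, init_centers, beta=2.0, n_iter=20):
    """Illustrative W-k-means-style loop (hypothetical helper, beta > 1).

    X            : (n, m) data matrix
    init_centers : (k, m) starting centers, supplied by the caller
    Returns (labels, centers, weights).
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    n, m = X.shape
    k = centers.shape[0]
    weights = np.full(m, 1.0 / m)  # start with equal variable weights

    for _ in range(n_iter):
        # Assignment step: weighted squared Euclidean distance to each center.
        sq = (X[:, None, :] - centers[None, :, :]) ** 2  # shape (n, k, m)
        labels = (sq * weights ** beta).sum(axis=2).argmin(axis=1)

        # Center step: mean of each cluster; an empty cluster keeps its center.
        for l in range(k):
            members = X[labels == l]
            if len(members) > 0:
                centers[l] = members.mean(axis=0)

        # Weight step: D[j] is the total within-cluster dispersion on variable j.
        D = ((X - centers[labels]) ** 2).sum(axis=0)
        weights = np.zeros(m)
        nz = D > 0  # variables with zero dispersion receive zero weight
        inv = D[nz] ** (-1.0 / (beta - 1.0))
        weights[nz] = inv / inv.sum()

    return labels, centers, weights
```

On data with two well-separated groups in the first two variables and a uniformly distributed third variable, a run started from one center near each group should give the third (masking) variable the smallest weight. As the abstract notes, such weights alone do not say where to draw the line between variables to keep and variables to drop.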