K-centers algorithm for clustering mixed type data

  • Authors:
  • Wei-Dong Zhao;Wei-Hui Dai;Chun-Bin Tang

  • Affiliations:
  • Software School, Fudan University, Shanghai, China;School of Management, Fudan University, Shanghai, China;School of Management, Fudan University, Shanghai, China

  • Venue:
  • PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

The K-modes and K-prototypes algorithms both apply the frequency-based update method for centroids, regarding attribute values with the highest frequency but neglecting other attribute values, which affects the accuracy of clustering results. To solve this problem, the K-centers clustering algorithm is proposed to handle mixed type data. As the extension to the K-prototypes algorithms, hard and fuzzy K-centers algorithm, focusing on effects of attribute values with different frequencies on clustering accuracy, a new update method for centroids is proposed in this paper. Experiments on many UCI machine-learning databases show that the K-centers algorithm can cluster categorical and mixed-type data more efficiently and effectively than the K-modes and K-prototypes algorithms.