Kernel k-means for categorical data

  • Authors: Julia Couto
  • Affiliations: James Madison University, Harrisonburg, VA
  • Venue: IDA'05: Proceedings of the 6th International Conference on Advances in Intelligent Data Analysis
  • Year: 2005

Abstract

Clustering categorical data is an important and challenging data analysis task. In this paper, we explore the use of kernel K-means to cluster categorical data. We propose a new kernel function based on Hamming distance that embeds categorical data in a constructed feature space, where the clustering is conducted. We experimentally evaluate the quality of the solutions produced by kernel K-means on real datasets. The results indicate that kernel K-means with the proposed kernel function can feasibly discover clusters embedded in categorical data.
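
The abstract does not state the kernel's definition, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes an illustrative kernel exp(-gamma * d_H(x, y)), where d_H is the Hamming distance (the number of attributes on which two records differ), and runs plain kernel K-means on the resulting kernel matrix. The names `hamming_kernel`, `kernel_kmeans`, and `gamma`, as well as the toy records, are hypothetical and chosen only for illustration.

```python
import math


def hamming_distance(x, y):
    """Number of attributes on which two equal-length records differ."""
    return sum(a != b for a, b in zip(x, y))


def hamming_kernel(x, y, gamma=0.3):
    """Illustrative kernel exp(-gamma * Hamming distance).

    This form is an assumption made for the sketch; the abstract does not
    define the kernel proposed in the paper.
    """
    return math.exp(-gamma * hamming_distance(x, y))


def kernel_kmeans(data, initial_labels, n_clusters, gamma=0.3, max_iter=100):
    """Refine an initial labeling with kernel K-means and return the labels."""
    n = len(data)
    # Kernel (Gram) matrix over all pairs of records.
    K = [[hamming_kernel(data[i], data[j], gamma) for j in range(n)]
         for i in range(n)]
    labels = list(initial_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c]
                   for c in range(n_clusters)]
        # Squared norm of each cluster mean in feature space.
        within = [sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
                  for m in members]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster has no mean to compare against
                cross = sum(K[i][j] for j in m) / len(m)
                # Squared feature-space distance from record i to the mean of c.
                d = K[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy records: two groups that share no attribute values with each other.
data = [
    ("a", "x", "p"), ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "r"), ("b", "z", "s"), ("b", "w", "r"),
]
# Start from a labeling in which the fourth record is deliberately misassigned.
initial = [0, 0, 0, 1, 1, 1, 1, 1]
labels = kernel_kmeans(data, initial, n_clusters=2, gamma=0.3)
print(labels)
```

In this toy run the misassigned fourth record moves back to the first group. Like ordinary K-means, the procedure only reaches a local optimum, so the outcome depends on the initial labeling and on the assumed width parameter `gamma`.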