Approximation algorithms for k-modes clustering

  • Authors:
  • Zengyou He;Shengchun Deng;Xiaofei Xu

  • Affiliations:
  • Department of Computer Science and Engineering, Harbin Institute of Technology, China;Department of Computer Science and Engineering, Harbin Institute of Technology, China;Department of Computer Science and Engineering, Harbin Institute of Technology, China

  • Venue:
  • ICIC'06 Proceedings of the 2006 international conference on Intelligent computing: Part II
  • Year:
  • 2006

Quantified Score

Hi-index 0.02

Abstract

In this paper, we study clustering with respect to the k-modes objective function, a natural formulation of clustering for categorical data. One of the main contributions of this paper is to establish the connection between k-modes and k-median, i.e., the optimum of k-median is at most twice the optimum of k-modes for the same categorical data clustering problem. Based on this observation, we derive a deterministic algorithm that achieves an approximation factor of 2. Furthermore, we prove that the distance measure in k-modes defines a metric. Hence, we are able to extend existing approximation algorithms for metric k-median to k-modes. Empirical results verify the superiority of our method.
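
The relationship stated in the abstract can be checked by brute force on a toy data set. The following is a minimal sketch, not the authors' algorithm. It assumes that the k-modes distance is the simple matching (Hamming) dissimilarity and that k-median restricts centers to data points; the abstract specifies neither. The function names and the sample records are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def matching_distance(x, y):
    """Number of attributes on which two categorical records differ."""
    return sum(a != b for a, b in zip(x, y))


def mode_of(cluster):
    """Per-attribute most frequent value of a non-empty cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


def kmodes_optimum(points, k):
    """Exhaustive k-modes optimum: try every labelling of the points."""
    best = float("inf")
    for labels in product(range(k), repeat=len(points)):
        cost = 0
        for c in range(k):
            cluster = [p for p, lab in zip(points, labels) if lab == c]
            if cluster:  # empty clusters contribute no cost
                m = mode_of(cluster)
                cost += sum(matching_distance(p, m) for p in cluster)
        best = min(best, cost)
    return best


def kmedian_optimum(points, k):
    """Exhaustive k-median optimum with centers chosen among the data points."""
    best = float("inf")
    for centers in combinations(points, k):
        cost = sum(min(matching_distance(p, c) for c in centers) for p in points)
        best = min(best, cost)
    return best


# Hypothetical toy records: (colour, size, shape).
points = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("green", "large", "square"),
]
k = 2

modes_opt = kmodes_optimum(points, k)
median_opt = kmedian_optimum(points, k)
print("k-modes optimum:", modes_opt)
print("k-median optimum:", median_opt)
print("within factor 2:", modes_opt <= median_opt <= 2 * modes_opt)
```

Under these assumptions, the factor of 2 follows from the triangle inequality: replacing each optimal mode by the data point nearest to it in its cluster at most doubles every point-to-center distance. The exhaustive search is exponential in the number of records and is intended only for illustration on very small inputs.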