Clustering high dimensional data: A graph-based relaxed optimization approach

  • Authors:
  • Chi-Hoon Lee;Osmar R. Zaïane;Ho-Hyun Park;Jiayuan Huang;Russell Greiner

  • Affiliations:
  • Computing Science Department, University of Alberta, Canada;Computing Science Department, University of Alberta, Canada;School of Electrical and Electronics Engineering, Chung Ang University, South Korea;School of Computer Science, University of Waterloo, Canada;Computing Science Department, University of Alberta, Canada

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2008

Abstract

Clustering is one of the most studied data mining tasks, yet it remains a challenging problem despite the many approaches that have been proposed. Graph-based approaches treat clustering as a global optimization problem, whereas many other methods are local. In this paper, we propose a novel graph-based algorithm, "GBR", that relaxes the objective of a well-defined method, improving accuracy while keeping the method simple. The primary motivation for the relaxation is to allow the reformulated objective to find well-distributed cluster indicators for complicated data instances. The relaxation yields an analytical solution, avoiding the approximate iterative methods adopted in many other graph-based approaches. Experiments on synthetic and real data sets show that the relaxation achieves excellent clustering results. Our key contributions are: (1) an analytical solution to the global clustering task, as opposed to approximate iterative approaches; (2) a very simple implementation using existing optimization packages; and (3) an algorithm that requires relatively less computation time, as a function of the number of data instances to cluster, than other well-defined methods in the literature.
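
The abstract does not state the GBR objective, so the following is not the GBR algorithm. It is a minimal sketch of the general idea of graph-based clustering through a relaxed objective, using the textbook spectral relaxation of the ratio cut: the discrete cluster indicator is relaxed to a real-valued vector, and the relaxed minimizer is the eigenvector of the graph Laplacian with the second-smallest eigenvalue (the Fiedler vector). The function name `fiedler_bipartition`, the Gaussian affinity, the bandwidth `sigma`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def fiedler_bipartition(X, sigma=1.0):
    """Split the rows of X into two clusters via the relaxed ratio-cut objective.

    Hypothetical helper for illustration only; not the paper's GBR method.
    """
    # Pairwise squared Euclidean distances between all points.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian affinity graph (an assumed similarity), without self-loops.
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order, so column 1 holds the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, eigenvectors = np.linalg.eigh(L)
    fiedler = eigenvectors[:, 1]
    # Round the relaxed real-valued indicator back to a discrete label.
    return (fiedler > 0).astype(int)

# Toy data (assumed): two well-separated Gaussian blobs of 20 points each.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=(3.0, 3.0), scale=0.3, size=(20, 2))
X = np.vstack([blob_a, blob_b])

labels = fiedler_bipartition(X)
```

Here `labels` should assign one value to the first 20 points and the other value to the last 20. Which blob receives 0 and which receives 1 depends on the sign of the eigenvector, which is arbitrary. Eigen-solvers of this kind are themselves iterative numerical routines, which is the family of approaches the abstract contrasts with its analytical solution.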