A Study on the Hierarchical Data Clustering Algorithm Based on Gravity Theory

  • Authors:
  • Yen-Jen Oyang; Chien-Yu Chen; Tsui-Wei Yang

  • Venue:
  • PKDD '01 Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery
  • Year:
  • 2001

Abstract

This paper examines the clustering quality and the time and space complexity of the hierarchical data clustering algorithm based on gravity theory. The gravity-based clustering algorithm simulates how N given nodes in a K-dimensional continuous vector space cluster under gravitational force, provided that each node is associated with a mass. One of the main issues studied is how the order of the distance term in the denominator of the gravity force formula affects clustering quality. The study reveals that, among the hierarchical clustering algorithms compared, only the gravity-based algorithm with a high-order distance term is neither biased towards spherical clusters nor subject to the well-known chaining effect. Since bias towards spherical clusters and the chaining effect are two major problems for clustering quality, eliminating both implies that high clustering quality is achieved. In terms of time and space complexity, the gravity-based algorithm has either lower time complexity or lower space complexity than the most well-known hierarchical data clustering algorithms, except single-link.
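
To make the role of the distance term concrete, the following is a minimal sketch and not the paper's simulation procedure. It assumes a generalized force magnitude of m1 * m2 / d**order and replaces the physical simulation with a greedy stand-in: repeatedly merge the two nodes with the strongest mutual attraction into one node at their center of mass. The names `attraction`, `merge_strongest_pair`, `gravity_clusters`, the unit initial masses, and the `order` parameter are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def attraction(m1, p1, m2, p2, order=2.0):
    """Generalized gravity magnitude (assumed form): m1 * m2 / distance**order."""
    d = math.dist(p1, p2)
    if d == 0.0:
        return math.inf  # coincident nodes attract without bound
    return m1 * m2 / d ** order


def merge_strongest_pair(nodes, order=2.0):
    """Merge the two nodes with the largest mutual attraction.

    Each node is a (mass, position, member_indices) tuple. The merged node
    carries the summed mass and sits at the pair's center of mass.
    """
    def pair_force(ij):
        (mi, pi, _), (mj, pj, _) = nodes[ij[0]], nodes[ij[1]]
        return attraction(mi, pi, mj, pj, order)

    i, j = max(combinations(range(len(nodes)), 2), key=pair_force)
    (mi, pi, si), (mj, pj, sj) = nodes[i], nodes[j]
    mass = mi + mj
    center = tuple((mi * a + mj * b) / mass for a, b in zip(pi, pj))
    rest = [n for k, n in enumerate(nodes) if k not in (i, j)]
    return rest + [(mass, center, si + sj)]


def gravity_clusters(points, n_clusters, order=2.0):
    """Greedy stand-in for the simulation: merge until n_clusters remain."""
    # Assumption: every input point starts with unit mass.
    nodes = [(1.0, tuple(p), [idx]) for idx, p in enumerate(points)]
    while len(nodes) > n_clusters:
        nodes = merge_strongest_pair(nodes, order)
    return sorted(sorted(members) for _, _, members in nodes)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(gravity_clusters(pts, n_clusters=2, order=2.0))
```

In this sketch, raising `order` makes distance dominate mass in the force term, so a heavy merged node is less able to pull in points that are far away. Whether that reproduces the behavior reported in the abstract depends on details of the actual algorithm that the abstract does not give.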