A Clustering Algorithm Based on Generalized Stars

  • Authors:
  • Airel Pérez Suárez; José E. Pagola

  • Affiliations:
  • Advanced Technologies Application Centre (CENATAV), 7a #21812 e/ 218 y 222, Rpto. Siboney, Playa. C.P. 12200, C. Habana, Cuba (both authors)

  • Venue:
  • MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
  • Year:
  • 2007

Abstract

In this paper we present Generalized Star (GStar), a new algorithm for document clustering. GStar generalizes the Star algorithm proposed by Aslam et al., which was recently improved by them and other researchers. We introduce a new concept of star that allows a different star-shaped form and yields better overlapping clusters. Evaluation experiments on standard document collections show that the proposed algorithm outperforms previously defined methods and obtains a smaller number of clusters. Since GStar is relatively simple to implement and efficient, we advocate its use for tasks that require clustering, such as information organization, browsing, topic tracking, and new topic detection.
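
For orientation, the baseline that GStar generalizes can be sketched as follows. This is a minimal illustration of the original Star algorithm's greedy star cover as it is commonly described (threshold the pairwise similarities, then repeatedly take the highest-degree uncovered document as a star center). It is not the GStar method, whose generalized star definition is not given in this abstract. The function names, the sparse-dict document representation, the use of cosine similarity, and the threshold parameter `sigma` are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def star_clusters(docs, sigma):
    """Greedy star cover of the sigma-thresholded similarity graph.

    Sketch of the baseline Star algorithm only (NOT GStar).
    Returns a list of (center, satellites) pairs of document indices.
    """
    n = len(docs)
    # Keep an edge only where similarity reaches the (assumed) threshold.
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if cosine(docs[i], docs[j]) >= sigma:
            adj[i].add(j)
            adj[j].add(i)

    covered = set()
    clusters = []
    # Visit vertices by decreasing degree; ties broken by index.
    for v in sorted(range(n), key=lambda v: (-len(adj[v]), v)):
        if v in covered:
            continue  # already a center or a satellite of some star
        # v becomes a star center; all its neighbours are satellites,
        # including ones that already belong to another star (overlap).
        clusters.append((v, sorted(adj[v])))
        covered.add(v)
        covered |= adj[v]
    return clusters


docs = [
    {"a": 1.0},
    {"a": 1.0, "b": 1.0},
    {"b": 1.0, "c": 1.0},
    {"c": 1.0, "d": 1.0},
    {"d": 1.0},
]
clusters = star_clusters(docs, sigma=0.4)
print(clusters)
```

On this toy collection the thresholded graph is a path over the five documents, and the sketch returns `[(1, [0, 2]), (3, [2, 4])]`: document 2 is a satellite of both stars, which is the kind of cluster overlap the abstract refers to.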