A New Graph-Based Algorithm for Clustering Documents

  • Authors:
  • Airel Pérez Suárez;José Fco. Martínez Trinidad;Jesús Ariel Carrasco Ochoa;José E. Medina Pagola

  • Venue:
  • ICDMW '08 Proceedings of the 2008 IEEE International Conference on Data Mining Workshops
  • Year:
  • 2008

Abstract

In this paper, a new algorithm for document clustering, called CStar, is presented. It improves on recently developed algorithms such as Generalized Star (GStar) and ACONS, which were originally proposed to reduce some drawbacks of earlier Star-like algorithms. CStar uses the Condensed Star-shaped Sub-graph concept defined by ACONS, but introduces a new heuristic that constructs a new cover of the thresholded similarity graph and reduces the drawbacks of GStar and ACONS. Experiments on standard document collections show that our proposal outperforms previously defined algorithms and other related algorithms used for document clustering.
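
The abstract does not spell out the CStar heuristic, so the following is only a minimal sketch of the background idea it builds on: a thresholded similarity graph covered by star-shaped sub-graphs. It assumes bag-of-words cosine similarity and a greedy, degree-ordered cover in the spirit of the earlier Star-like algorithms. The function names, the threshold name `beta`, and the toy documents are all hypothetical, and none of this reproduces CStar, GStar, or ACONS.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def thresholded_graph(docs, beta):
    """Connect documents i and j when their similarity is at least beta.

    Assumption: whitespace tokenization and raw term counts; a real
    system would likely use a weighting scheme such as TF-IDF.
    """
    vecs = [Counter(d.lower().split()) for d in docs]
    adj = {i: set() for i in range(len(docs))}
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vecs[i], vecs[j]) >= beta:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def greedy_star_cover(adj):
    """Greedy degree-ordered cover (illustrative, not the CStar heuristic).

    Vertices are visited from highest to lowest degree; an uncovered
    vertex becomes a star center and its neighbors become satellites.
    """
    covered = set()
    clusters = {}
    for v in sorted(adj, key=lambda u: (-len(adj[u]), u)):
        if v not in covered:
            clusters[v] = {v} | adj[v]
            covered |= clusters[v]
    return clusters


docs = [
    "graph clustering algorithm",
    "graph clustering method",
    "star cover graph clustering",
    "cooking pasta recipe",
    "pasta recipe sauce",
]
graph = thresholded_graph(docs, beta=0.5)
clusters = greedy_star_cover(graph)
```

In this toy run, the three graph-related documents end up in one star and the two cooking documents in another. Clusters produced this way may overlap in general, since a satellite can be adjacent to more than one center.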