Using symbolic objects to cluster web documents

  • Authors: Esteban Meneses; Oldemar Rodríguez-Rojas
  • Affiliations: Costa Rica Institute of Technology, Cartago, Costa Rica; University of Costa Rica, San José, Costa Rica
  • Venue: Proceedings of the 15th international conference on World Wide Web
  • Year: 2006

Abstract

Web clustering is useful for several activities on the WWW, from automatically building web directories to improving retrieval performance. However, because of the huge size of the web, a linear-time mechanism must be employed to cluster web documents. k-means is a classic algorithm used for this problem. We present a variant of the vector model to be used with the k-means algorithm. Our representation uses symbolic objects for clustering web documents. Initial experiments gave positive results, and the outlook for future work is promising.
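
The abstract does not say how the symbolic objects are built, so the following is only a minimal sketch under stated assumptions: each document is described by a histogram of relative term frequencies (one common kind of symbolic description), and plain k-means with squared Euclidean distance groups the histograms. The helper names (`to_symbolic`, `kmeans`), the toy vocabulary, and the sample documents are hypothetical and not taken from the paper. Each k-means pass costs time proportional to the number of documents, which is the linear behaviour the abstract asks for.

```python
from collections import Counter

def to_symbolic(text, vocabulary):
    """Assumed representation: relative frequency of each vocabulary term."""
    counts = Counter(w for w in text.lower().split() if w in vocabulary)
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in vocabulary]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def kmeans(objects, k, max_iterations=20):
    """Plain k-means; the first k objects seed the centroids (deterministic)."""
    centroids = [list(o) for o in objects[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: one pass over the documents, O(n * k * d).
        new_labels = [
            min(range(k), key=lambda c: squared_distance(o, centroids[c]))
            for o in objects
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [o for o, label in zip(objects, labels) if label == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids

# Hypothetical toy data, for illustration only.
vocabulary = ["football", "match", "goal", "team",
              "stock", "market", "shares", "bank"]
documents = [
    "football match goal football",
    "stock market shares stock",
    "goal match team",
    "market shares bank",
]
objects = [to_symbolic(d, vocabulary) for d in documents]
labels, centroids = kmeans(objects, k=2)
print(labels)  # [0, 1, 0, 1]
```

Seeding the centroids with the first k documents keeps the sketch deterministic; a real system would more likely use random or k-means++ style seeding and a sparse vector format.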