Automatic Topic Identification Using Webpage Clustering

  • Authors:
  • Xiaofeng He;Chris H. Q. Ding;Hongyuan Zha;Horst D. Simon

  • Venue:
  • ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
  • Year:
  • 2001

Abstract

Grouping webpages into distinct topics is one way to organize the large amount of retrieved information on the web. In this paper, we report that, based on a similarity metric which incorporates textual information, hyperlink structure and co-citation relations, an unsupervised clustering method can automatically and effectively identify relevant topics, as shown in experiments on several retrieved sets of webpages. The clustering method is a state-of-the-art spectral graph partitioning method based on the normalized cut criterion first developed for image segmentation.
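
As an illustration only, the normalized cut idea mentioned in the abstract can be sketched as a two-way spectral partition of a similarity graph. This is a minimal sketch of the standard relaxation, not the authors' implementation. The function name `normalized_cut_bipartition` and the toy matrix `W` are hypothetical. `W` stands in for a combined webpage similarity, since the abstract does not say how text, hyperlink and co-citation information are weighted.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Two-way split of a graph via the relaxed normalized cut.

    W: symmetric, non-negative similarity matrix whose row sums are
    all positive (assumed here; not checked).
    Returns an array of 0/1 cluster labels, one per node.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # weighted degree of each node
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetrically normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices
    _, eigvecs = np.linalg.eigh(L_sym)
    # Second-smallest eigenvector, rescaled so that it solves the
    # generalized problem (D - W) y = lambda * D y
    y = d_inv_sqrt * eigvecs[:, 1]
    # Split by sign; which side receives label 1 is arbitrary
    return (y > 0).astype(int)

# Toy similarity matrix (made up): pages 0-2 and pages 3-5 form two
# tightly linked groups joined by two weak cross links.
W = np.array([
    [0.00, 1.00, 1.00, 0.01, 0.00, 0.00],
    [1.00, 0.00, 1.00, 0.00, 0.00, 0.00],
    [1.00, 1.00, 0.00, 0.00, 0.00, 0.01],
    [0.01, 0.00, 0.00, 0.00, 1.00, 1.00],
    [0.00, 0.00, 0.00, 1.00, 0.00, 1.00],
    [0.00, 0.00, 0.01, 1.00, 1.00, 0.00],
])
labels = normalized_cut_bipartition(W)
print(labels)
```

On this toy graph the two tightly linked groups should land in different clusters. Identifying more than two topics would require applying the split recursively or using several eigenvectors; the abstract does not state which approach the paper takes.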