On Combining Link and Contents Information for Web Page Clustering

  • Authors:
  • Yitong Wang; Masaru Kitsuregawa

  • Venue:
  • DEXA '02 Proceedings of the 13th International Conference on Database and Expert Systems Applications
  • Year:
  • 2002

Abstract

Clustering is currently one of the most crucial techniques for dealing with (e.g., locating resources in and interpreting) the massive amount of heterogeneous information on the web, which is beyond a human being's capacity to digest. In this paper, we discuss the shortcomings of previous approaches and present a unifying clustering algorithm that clusters web search results for a specific query topic by combining link and contents information. In particular, we investigate how to combine link and contents analysis in the clustering process to improve the quality and interpretability of web search results. The proposed approach automatically clusters the web search results into high-quality, semantically meaningful groups in a concise, easy-to-interpret hierarchy with tagging terms. Preliminary experiments and evaluations show that the proposed approach is effective and promising.

Keywords: co-citation, coupling, anchor window, snippet
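
The abstract does not state the similarity measure itself. As a minimal sketch of what combining link and contents information could look like, the Python below assumes each result page is described by three sets: its in-links (shared in-links indicate co-citation), its out-links (shared out-links indicate coupling), and terms drawn from its snippet and anchor window. It blends three cosine similarities with equal weights. The field names, the `cosine` and `combined_similarity` helpers, and the weights are illustrative assumptions, not the authors' formulation.

```python
from math import sqrt


def cosine(a: set, b: set) -> float:
    """Cosine similarity of two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def combined_similarity(p: dict, q: dict,
                        w_in: float = 1 / 3,
                        w_out: float = 1 / 3,
                        w_terms: float = 1 / 3) -> float:
    """Weighted blend of link-based and content-based similarity.

    Assumed (hypothetical) page layout:
      in_links  - pages that link to this page (co-citation evidence)
      out_links - pages this page links to (coupling evidence)
      terms     - terms from the snippet and anchor window
    """
    return (w_in * cosine(p["in_links"], q["in_links"])
            + w_out * cosine(p["out_links"], q["out_links"])
            + w_terms * cosine(p["terms"], q["terms"]))


# Made-up example pages for a query such as "jaguar".
page_a = {"in_links": {"x", "y"}, "out_links": {"u", "v"}, "terms": {"jaguar", "car"}}
page_b = {"in_links": {"x", "y"}, "out_links": {"u", "w"}, "terms": {"jaguar", "car"}}
page_c = {"in_links": {"z"}, "out_links": {"k"}, "terms": {"jaguar", "cat"}}

if __name__ == "__main__":
    print(combined_similarity(page_a, page_b))
    print(combined_similarity(page_a, page_c))
```

With a pairwise score of this kind, any standard clustering procedure could group the search results. Pages that share both citations and vocabulary score higher than pages that share only a query term.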