Evaluating contents-link coupled web page clustering for web search results

  • Authors:
  • Yitong Wang;Masaru Kitsuregawa

  • Affiliations:
  • the University of Tokyo, Tokyo, Japan;the University of Tokyo, Tokyo, Japan

  • Venue:
  • Proceedings of the eleventh international conference on Information and knowledge management
  • Year:
  • 2002

Abstract

Clustering is currently one of the most crucial techniques for dealing with the massive amount of heterogeneous information on the web, for example for locating resources and interpreting information. Unlike clustering in other fields, web page clustering separates unrelated pages and groups pages related to a specific topic into semantically meaningful clusters, which is useful for discrimination, summarization, organization and navigation of unstructured web pages. We have proposed a contents-link coupled clustering algorithm that clusters web pages by combining content analysis and link analysis. In this paper, we study in particular the effects of out-links (from a web page), in-links (to a web page) and terms on the final clustering results, as well as how to combine these three parts effectively to improve the quality of the clustering results. We apply the algorithm to cluster web search results. Preliminary experiments and evaluations are conducted on various topics. The experimental results show that the proposed clustering algorithm is effective and promising.
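
The abstract does not spell out how out-links, in-links and terms are combined, so the following is only a minimal sketch of one plausible way to couple the three parts, not the authors' algorithm. It assumes each page is represented by three sparse count vectors and that page similarity is a weighted sum of per-part cosine similarities. The names `cosine` and `combined_similarity`, the dictionary keys `"out"`, `"in"` and `"terms"`, and the equal default weights are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty vector shares nothing with any other vector.
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(p, q, w_out=1 / 3, w_in=1 / 3, w_term=1 / 3):
    """Weighted sum of out-link, in-link and term similarity.

    Assumption: the weights are free parameters chosen by the user;
    the source does not state how the three parts are weighted.
    """
    return (
        w_out * cosine(p["out"], q["out"])
        + w_in * cosine(p["in"], q["in"])
        + w_term * cosine(p["terms"], q["terms"])
    )


# Hypothetical pages: URLs and terms are made up for the example.
page_a = {
    "out": Counter(["http://example.org/x", "http://example.org/y"]),
    "in": Counter(["http://example.org/z"]),
    "terms": Counter(["jaguar", "car"]),
}
page_b = {
    "out": Counter(["http://example.org/u"]),
    "in": Counter(["http://example.org/v"]),
    "terms": Counter(["jaguar", "car"]),
}

# page_a and page_b share terms but no links.
score = combined_similarity(page_a, page_b)
```

Varying `w_out`, `w_in` and `w_term` in such a sketch is one way to probe how much each part contributes to the grouping, which is the kind of question the paper sets out to study.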