Utilizing hyperlink transitivity to improve web page clustering

  • Authors:
  • Jingyu Hou;Yanchun Zhang

  • Affiliations:
  • Department of Mathematics and Computing, University of Southern Queensland, Toowoomba, Qld 4350, Australia;Department of Mathematics and Computing, University of Southern Queensland, Toowoomba, Qld 4350, Australia

  • Venue:
  • ADC '03 Proceedings of the 14th Australasian database conference - Volume 17
  • Year:
  • 2003

Abstract

The rapid growth in the size and complexity of the web often makes search results far from satisfactory, because search engines return a huge amount of information. Finding intrinsic relationships among web pages at a higher level, so that search results can be managed and retrieved efficiently, is becoming a challenging problem. In this paper, we propose an approach to measuring web page similarity that takes hyperlink transitivity and page importance into consideration. Based on this new similarity measurement, an effective hierarchical web page clustering algorithm is proposed. Preliminary evaluations show the effectiveness of the new similarity measurement and the improvement in web page clustering. The proposed page similarity, as well as the matrix-based hyperlink analysis methods, could be applied to other web-based research areas.
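The abstract does not give the similarity formula or the clustering procedure, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a 0/1 adjacency matrix, a hypothetical `decay` factor that down-weights longer link paths, cosine similarity over combined out-link and in-link profiles, and average-linkage merging with a hypothetical `threshold`. Page importance, which the paper also considers, is left out here. All function and parameter names are invented for illustration.

```python
from itertools import combinations


def transitive_weights(adj, decay=0.5, max_hops=3):
    """Return W = A + decay*A^2 + ... + decay^(max_hops-1)*A^max_hops.

    adj is a square 0/1 list-of-lists; adj[i][j] == 1 means page i links to page j.
    The decay factor and hop limit are assumptions made for this sketch.
    """
    n = len(adj)
    power = [row[:] for row in adj]                   # current power of A
    total = [[float(x) for x in row] for row in adj]  # running weighted sum
    factor = 1.0
    for _ in range(max_hops - 1):
        # Multiply the current power by A to extend every path by one hop.
        power = [[sum(power[i][k] * adj[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
        factor *= decay
        for i in range(n):
            for j in range(n):
                total[i][j] += factor * power[i][j]
    return total


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def page_similarity(adj, decay=0.5, max_hops=3):
    """Pairwise similarity from each page's out-link and in-link weight profile."""
    w = transitive_weights(adj, decay, max_hops)
    n = len(adj)
    profiles = [w[i] + [w[k][i] for k in range(n)] for i in range(n)]
    return [[1.0 if i == j else cosine(profiles[i], profiles[j])
             for j in range(n)] for i in range(n)]


def cluster(sim, threshold=0.5):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the most similar pair of clusters and stops once the
    best available merge falls below the threshold.
    """
    clusters = [[i] for i in range(len(sim))]

    def avg(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda p: avg(clusters[p[0]], clusters[p[1]]))
        if avg(clusters[x], clusters[y]) < threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy graph: pages 0 and 1 both link to page 2; pages 3 and 4 both link to page 5.
links = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
groups = cluster(page_similarity(links), threshold=0.5)
```

On this toy graph, pages that link to the same target end up grouped together, while the targets themselves stay separate because their link profiles do not overlap. Recording the order of merges instead of stopping at a threshold would give a full hierarchy rather than a flat grouping.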