HITS algorithm improvement using anchor-related text extracted by DOM structure analysis

  • Authors:
  • Yoshinori Hijikata; Bui Quang Hung; Masanori Otsubo; Shogo Nishida

  • Affiliations:
  • Osaka University, Osaka, Japan;Osaka University, Osaka, Japan;Osaka University, Osaka, Japan;Osaka University, Osaka, Japan

  • Venue:
  • Proceedings of the 2009 ACM Symposium on Applied Computing
  • Year:
  • 2009

Abstract

Kleinberg's HITS algorithm is widely used to rank web pages, but it suffers from the topic drift problem. Previous researchers have tried to solve this problem using anchor-related text. In a previous study, we proposed another type of anchor-related text, which is extracted by a deep analysis of the DOM structures of web pages. We call it DOM-based anchor-related text (DOM-text). In this paper, we investigate how effective DOM-text is for improving the HITS algorithm: we examine how much improvement it yields and compare it with other kinds of anchor-related text. The experimental results show that, among the kinds compared, DOM-text improves the HITS algorithm the most.
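
For readers unfamiliar with HITS, the following is a minimal sketch of its standard hub/authority iteration, not the authors' implementation. The optional `weights` argument is a hypothetical hook showing where a per-link relevance score (for example, one derived from anchor-related text) could be plugged in; the abstract does not specify how DOM-text is combined with HITS, so that part is an assumption. The function names `hits` and `normalize` are illustrative.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all scores are zero)."""
    norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
    return {node: s / norm for node, s in scores.items()}


def hits(edges, weights=None, iterations=50):
    """Run the HITS hub/authority iteration on a directed link graph.

    edges:   list of (source, target) hyperlinks.
    weights: optional dict mapping (source, target) to a non-negative
             link weight; ASSUMPTION: stands in for a text-based
             relevance score. Missing edges default to weight 1.0.
    Returns (hub, authority) dicts keyed by node.
    """
    weights = weights or {}
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority of a page: weighted sum of hub scores of pages linking to it.
        new_auth = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_auth[v] += weights.get((u, v), 1.0) * hub[u]
        new_auth = normalize(new_auth)

        # Hub score of a page: weighted sum of authority scores of pages it links to.
        new_hub = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_hub[u] += weights.get((u, v), 1.0) * new_auth[v]
        hub, auth = normalize(new_hub), new_auth

    return hub, auth
```

With uniform weights this reduces to the plain algorithm, in which every link counts equally regardless of whether it is on topic. Down-weighting links whose surrounding text is unrelated to the query is one way such text could limit topic drift.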