Improvement of HITS-based algorithms on web documents

  • Authors: Longzhuang Li; Yi Shang; Wei Zhang

  • Affiliations: University of Missouri-Columbia, Columbia, MO (all authors)

  • Venue: Proceedings of the 11th International Conference on World Wide Web
  • Year: 2002

Abstract

In this paper, we present two ways to improve the precision of HITS-based algorithms on Web documents. First, by analyzing the limitations of current HITS-based algorithms, we propose a new weighted HITS-based method that assigns appropriate weights to in-links of root documents. Then, we combine content analysis with HITS-based algorithms and study the effects of four representative relevance scoring methods, VSM, Okapi, TLS, and CDR, using a set of broad topic queries. Our experimental results show that our weighted HITS-based method performs significantly better than Bharat's improved HITS algorithm. When we combine our weighted HITS-based method or Bharat's HITS algorithm with any of the four relevance scoring methods, the combined methods are only marginally better than our weighted HITS-based method. Among the four relevance scoring methods, there is no significant quality difference when they are combined with a HITS-based algorithm.
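
To make the idea of a weighted HITS iteration concrete, the following is a minimal sketch in Python. It is not the authors' method: the abstract does not state how in-link weights are chosen, so the per-link weights here are supplied by the caller and are purely an assumption. The function names `weighted_hits` and `_normalize` are hypothetical and introduced only for illustration.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all scores are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def weighted_hits(edges, iterations=50):
    """Run a HITS-style iteration on a weighted link graph.

    edges: dict mapping (source, target) -> positive link weight.
           The weights are an assumption of this sketch; how they should
           be assigned is not specified in the abstract.
    Returns (authority, hub) dicts keyed by node.
    """
    nodes = sorted({node for edge in edges for node in edge})
    auth = {node: 1.0 for node in nodes}
    hub = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of pages linking in.
        new_auth = {node: 0.0 for node in nodes}
        for (src, dst), weight in edges.items():
            new_auth[dst] += weight * hub[src]

        # Hub update: weighted sum of the new authority scores of linked pages.
        new_hub = {node: 0.0 for node in nodes}
        for (src, dst), weight in edges.items():
            new_hub[src] += weight * new_auth[dst]

        auth = _normalize(new_auth)
        hub = _normalize(new_hub)

    return auth, hub


# Toy graph: "c" has two in-links, "d" has one, all with unit weight.
toy_edges = {("a", "c"): 1.0, ("b", "c"): 1.0, ("a", "d"): 1.0}
toy_auth, toy_hub = weighted_hits(toy_edges)
```

With unit weights this reduces to the standard HITS update, so the page with more in-links ("c") receives the higher authority score. Raising the weight on a single link shifts authority toward its target, which is the lever a weighted variant adjusts.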