Automatically combining ranking heuristics for HTML documents

  • Authors: Joaquin Rapela
  • Affiliations: IBM Almaden Research Center, San Jose, CA
  • Venue: Proceedings of the 3rd international workshop on Web information and data management
  • Year: 2001

Abstract

Current search engines use several criteria, or heuristics, to rank HTML documents. These heuristics must be combined into a ranking function that, given a text query, returns a ranked list of HTML documents. The standard approach is to build a weighted average by manually estimating the importance of each heuristic and assigning it a weight proportional to that estimate. In this paper we apply an automatic method for combining HTML ranking heuristics. We study the performance of the automatic method using recall/precision evaluations and, using collections of HTML documents with different characteristics, show that it finds weights tailored to the specific characteristics of each document collection.
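
To make the weighted-average combination concrete, the following is a minimal sketch in Python. It is illustrative only: the two heuristics (`title_match`, `body_term_frequency`), the helper names, the sample documents, and the weights are all assumptions introduced here, not taken from the paper, and the paper's automatic weight-finding method is not reproduced. The sketch shows only how a set of weights turns several heuristic scores into one ranked list, and how precision and recall can be computed for that list.

```python
import re
from typing import Callable, Dict, List, Set, Tuple

# A heuristic maps (query, html) to a score in [0, 1]. Both heuristics
# below are hypothetical stand-ins chosen for illustration.
Heuristic = Callable[[str, str], float]


def _words(text: str) -> List[str]:
    """Lowercase alphanumeric tokens of a string."""
    return re.findall(r"[a-z0-9]+", text.lower())


def body_term_frequency(query: str, html: str) -> float:
    """Fraction of the document's words (tags stripped) that are query terms."""
    words = _words(re.sub(r"<[^>]+>", " ", html))
    if not words:
        return 0.0
    terms = set(_words(query))
    return sum(w in terms for w in words) / len(words)


def title_match(query: str, html: str) -> float:
    """Fraction of query terms that appear in the <title> element."""
    match = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    terms = set(_words(query))
    if not match or not terms:
        return 0.0
    return len(terms & set(_words(match.group(1)))) / len(terms)


def rank(query: str,
         docs: Dict[str, str],
         heuristics: Dict[str, Heuristic],
         weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank documents by the weighted average of their heuristic scores."""
    total = sum(weights.values())
    scored = []
    for doc_id, html in docs.items():
        score = sum(weights[name] * h(query, html)
                    for name, h in heuristics.items()) / total
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def precision_recall_at_k(ranked_ids: List[str],
                          relevant: Set[str],
                          k: int) -> Tuple[float, float]:
    """Precision and recall over the top-k results of a ranked list."""
    hits = sum(doc_id in relevant for doc_id in ranked_ids[:k])
    return hits / k, hits / len(relevant)


HEURISTICS = {"title": title_match, "body": body_term_frequency}
DOCS = {
    "a": "<html><title>Web ranking</title><body>cats and dogs</body></html>",
    "b": "<html><title>Pets</title><body>ranking ranking web pages</body></html>",
}

# Two hand-picked weightings; changing the weights changes the ordering.
title_heavy = rank("web ranking", DOCS, HEURISTICS, {"title": 0.7, "body": 0.3})
body_heavy = rank("web ranking", DOCS, HEURISTICS, {"title": 0.1, "body": 0.9})
```

In this toy setup the two weightings order the same two documents differently, which is why the choice of weights matters and why a collection-specific choice may outperform a single manual estimate.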