Automatically combining ranking heuristics for HTML documents

Authors: Joaquin Rapela
Affiliations: IBM Almaden Research Center, San Jose, CA
Venue: Proceedings of the 3rd International Workshop on Web Information and Data Management
Year: 2001

Abstract

Current search engines rank HTML documents using several criteria, or heuristics. These heuristics must be combined into a single ranking function that, given a text query, returns a ranked list of HTML documents. The standard approach is to build a weighted average by manually estimating the importance of each heuristic and assigning it a weight proportional to that estimate. In this paper we apply an automatic method for combining HTML ranking heuristics. We study the performance of the automatic method using recall/precision evaluations, and, using collections of HTML documents with different characteristics, we show that it finds weights tailored to the specific characteristics of each document collection.
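
To make the weighted-average idea concrete, here is a minimal Python sketch. The abstract does not say which automatic method the paper uses, so the grid search below is only a stand-in chosen for illustration, not the author's algorithm. The three heuristic scores per document, the document ids, the relevance judgments, and every function name are hypothetical.

```python
from itertools import product


def combine(scores, weights):
    """Weighted average of the per-heuristic scores of one document."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def rank(docs, weights):
    """Return document ids sorted by combined score, best first."""
    return sorted(docs, key=lambda d: combine(docs[d], weights), reverse=True)


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank holding a relevant document."""
    hits, total = 0, 0.0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def fit_weights(docs, relevant, steps=5):
    """Illustrative stand-in for an automatic method: exhaustive grid search
    over weight vectors, keeping the first one with the best average precision."""
    n_heuristics = len(next(iter(docs.values())))
    grid = [i / (steps - 1) for i in range(steps)]
    best_weights, best_ap = None, -1.0
    for weights in product(grid, repeat=n_heuristics):
        if sum(weights) == 0:
            continue  # an all-zero weight vector defines no ranking
        ap = average_precision(rank(docs, weights), relevant)
        if ap > best_ap:
            best_weights, best_ap = weights, ap
    return best_weights, best_ap


# Hypothetical scores in [0, 1] for three made-up heuristics per document,
# e.g. (title match, body term frequency, inlink count).
docs = {
    "a": (0.9, 0.2, 0.1),
    "b": (0.1, 0.8, 0.3),
    "c": (0.8, 0.1, 0.9),
    "d": (0.2, 0.3, 0.2),
}
relevant = {"a", "c"}  # hypothetical relevance judgments for one query

best_weights, best_ap = fit_weights(docs, relevant)
print(best_weights, best_ap)
```

Running the same search on a collection with different characteristics would generally select different weights, which is the behaviour the abstract reports for the automatic method. An exhaustive grid grows exponentially with the number of heuristics, so it is practical only for a handful of them.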