Combination of document priors in web information retrieval

Authors:
Jie Peng;Iadh Ounis
Affiliations:
Department of Computing Science, University of Glasgow, United Kingdom;Department of Computing Science, University of Glasgow, United Kingdom
Venue:
ECIR'07 Proceedings of the 29th European conference on IR research
Year:
2007

Abstract

Query-independent features (also called document priors), such as the number of incoming links to a document, its PageRank, or the length of its associated URL, have been explored as a way to boost the retrieval effectiveness of Web Information Retrieval (IR) systems. Combining such query-independent features could further enhance retrieval performance. However, most current combination approaches are based on heuristics that ignore the possible dependence between the document priors. In this paper, we present a novel and robust method for combining document priors in a principled way, using a conditional probability rule derived from Kolmogorov's axioms. In particular, we compare the retrieval performance attainable with our prior-combination method against that of single priors and of a heuristic prior-combination method. Furthermore, we examine when and how document priors should be combined.
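
The abstract does not spell out the combination formula, so the following is only a minimal sketch of the general idea, assuming the conditional probability rule is the standard chain rule P(a, b) = P(a) * P(b | a). It contrasts that estimate with a product that treats the two priors as independent. The function names, the binned feature values ("short"/"long" URL, "high"/"low" inlinks), and the toy collection are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def joint_prior_chain_rule(docs, a, b):
    """Estimate P(a, b) as P(a) * P(b | a) from observed (a, b) pairs.

    `docs` is a list of (feature_a_bin, feature_b_bin) tuples, one per
    document. This is an illustrative estimator, not the paper's method.
    """
    n = len(docs)
    count_a = Counter(x for x, _ in docs)
    count_ab = Counter(docs)
    if n == 0 or count_a[a] == 0:
        return 0.0
    p_a = count_a[a] / n
    p_b_given_a = count_ab[(a, b)] / count_a[a]
    return p_a * p_b_given_a


def joint_prior_independent(docs, a, b):
    """Estimate P(a, b) as P(a) * P(b), ignoring any dependence."""
    n = len(docs)
    if n == 0:
        return 0.0
    p_a = sum(1 for x, _ in docs if x == a) / n
    p_b = sum(1 for _, y in docs if y == b) / n
    return p_a * p_b


# Hypothetical toy collection: (URL-length bin, inlink-count bin) per document.
toy_docs = [
    ("short", "high"),
    ("short", "high"),
    ("short", "low"),
    ("long", "low"),
]

dependent = joint_prior_chain_rule(toy_docs, "short", "high")
independent = joint_prior_independent(toy_docs, "short", "high")
print(f"chain rule: {dependent:.3f}, independence product: {independent:.3f}")
```

In this toy collection, short URLs and high inlink counts co-occur more often than independence would predict, so the two estimates differ. That gap is the kind of dependence the abstract says heuristic combinations ignore.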