Query-independent features (also called document priors), such as the number of incoming links to a document, its PageRank, or the type of its associated URL, have been successfully integrated into Web Information Retrieval systems to enhance retrieval effectiveness. Combining several document priors could further enhance retrieval performance. However, most current approaches to combining priors are based on heuristics and often ignore the possible dependence between the document priors. In this paper, we present a novel and robust method for conditionally combining document priors in a principled way. The approach adjusts the distribution of document priors for one source of evidence according to the distribution of document priors for other sources of evidence. We investigate the retrieval performance attainable by our combination method, in comparison to the use of single priors and to a heuristic method for combining document priors that assumes the priors are independent. Furthermore, we investigate how sensitive the proposed method is to the training data. Using two standard Web test collections, including the large-scale .GOV2 test collection, we find that some of the document priors used in our experiments are highly correlated, suggesting that the dependency between document priors should indeed be taken into account. Through extensive experiments on these two large-scale collections, we observe that our proposed conditional combination method is overall effective and robust.
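To illustrate why dependence between priors matters, the following is a minimal sketch and not the method proposed in the paper. It assumes two hypothetical binned features (`a` and `b`, for example a link-count bin and a URL-type bin), a toy training set invented for illustration, and simple relative-frequency estimates. It contrasts a product of single-feature priors, which assumes independence, with a prior estimated on the joint feature bins.

```python
from collections import Counter


def estimate_priors(docs):
    """Estimate relevance priors from (bin_a, bin_b, is_relevant) tuples.

    Returns three dicts of relative frequencies:
    P(rel | a), P(rel | b), and P(rel | a, b).
    """
    n_a, r_a = Counter(), Counter()
    n_b, r_b = Counter(), Counter()
    n_ab, r_ab = Counter(), Counter()
    for a, b, rel in docs:
        n_a[a] += 1
        n_b[b] += 1
        n_ab[(a, b)] += 1
        if rel:
            r_a[a] += 1
            r_b[b] += 1
            r_ab[(a, b)] += 1
    p_a = {k: r_a[k] / n for k, n in n_a.items()}
    p_b = {k: r_b[k] / n for k, n in n_b.items()}
    p_ab = {k: r_ab[k] / n for k, n in n_ab.items()}
    return p_a, p_b, p_ab


def independent_prior(p_a, p_b, a, b):
    """Heuristic combination: multiply single-feature priors."""
    return p_a[a] * p_b[b]


def joint_prior(p_ab, a, b):
    """Dependence-aware combination: prior estimated on the joint bin."""
    return p_ab[(a, b)]


# Hypothetical toy data in which the two features are perfectly correlated.
training = (
    [("high", "high", True)] * 4
    + [("high", "high", False)] * 6
    + [("low", "low", True)] * 1
    + [("low", "low", False)] * 9
)

p_a, p_b, p_ab = estimate_priors(training)

# Ratio of the prior for a "high" document to that of a "low" document.
ratio_independent = (
    independent_prior(p_a, p_b, "high", "high")
    / independent_prior(p_a, p_b, "low", "low")
)
ratio_joint = joint_prior(p_ab, "high", "high") / joint_prior(p_ab, "low", "low")
print(ratio_independent, ratio_joint)
```

In this toy setting the second feature carries no information beyond the first, yet the independent product counts the same evidence twice and favours "high" documents by a factor of roughly 16 instead of roughly 4. This is the kind of distortion that a dependence-aware combination is meant to avoid.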