Relevance based language models. Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
Computational Statistics & Data Analysis - Nonlinear Methods and Data Mining.
Parsimonious language models for information retrieval. Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
Estimation and use of uncertainty in pseudo-relevance feedback. SIGIR '07: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
A cluster-based resampling method for pseudo-relevance feedback. Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.
Geometric representations for multiple documents. Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval.
Positional relevance model for pseudo-relevance feedback. Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval.
Learning-based pseudo-relevance feedback for patent retrieval. IRFC '12: Proceedings of the 5th Conference on Multidisciplinary Information Retrieval.
High performance query expansion using adaptive co-training. Information Processing and Management: An International Journal.
Pseudo-relevance feedback (PRF) is one of the most effective practices in information retrieval. In particular, PRF via the relevance model (RM) has been widely used due to its theoretical soundness and effectiveness. In a PRF scenario, an underlying relevance model is inferred by combining the language models of the top retrieved documents, where each document's contribution is assumed to be proportional to its score for the initial query. However, it is not clear that selecting the top retrieved documents solely by their initial retrieval scores is the optimal strategy for query expansion. We show that a document's initial score is not a good indicator of its effectiveness for query expansion. Our experiments show that if we could estimate the true effectiveness of the top retrieved documents, we could obtain almost a 50% improvement over RM. Based on this observation, we introduce various document features that can be used to estimate the effectiveness of documents. Our experiments on the TREC Robust collection show that the proposed features are good predictors, and that PRF using the effectiveness predictors achieves statistically significant improvements over RM.
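The relevance-model estimation the abstract refers to can be sketched as follows: each term's probability under the relevance model is the term's maximum-likelihood probability in each top-ranked document, weighted by that document's (normalized) query-likelihood score. This is a minimal illustrative sketch, not the authors' implementation; the function name, the token-list input format, and the use of unsmoothed maximum-likelihood document models are assumptions made for brevity.

```python
from collections import Counter

def relevance_model(top_docs, query_likelihoods):
    """Illustrative RM1-style estimate: p(w|R) ~ sum_d p(w|d) * p(q|d).

    top_docs          -- list of tokenized documents (lists of terms)
    query_likelihoods -- retrieval scores p(q|d) for those documents
    Returns a dict mapping each term to its relevance-model probability.
    """
    rm = Counter()
    total_score = sum(query_likelihoods)
    for tokens, score in zip(top_docs, query_likelihoods):
        weight = score / total_score          # document weight proportional to its score
        tf = Counter(tokens)
        doc_len = len(tokens)
        for term, count in tf.items():
            rm[term] += weight * (count / doc_len)  # p(w|d) via maximum likelihood
    return dict(rm)

# Expansion terms would then be the top-k terms by relevance-model weight.
```

The paper's proposal amounts to replacing the `query_likelihoods` weights above with weights derived from learned effectiveness predictors, rather than from the initial retrieval scores.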