SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
WWW '03 Proceedings of the 12th international conference on World Wide Web
Improving the estimation of relevance models using large external corpora
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Information Systems
Query performance prediction in web search environments
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Selecting good expansion terms for pseudo-relevance feedback
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Improved query difficulty prediction for the web
Proceedings of the 17th ACM conference on Information and knowledge management
Finding relevant information of certain types from enterprise data
Proceedings of the 20th ACM international conference on Information and knowledge management
Hi-index | 0.00 |
Enterprise intranets are often sparse in nature, with limited use of alternative lexical representations between authors, making query expansion (QE) ineffective. Hence, for some enterprise search queries, it can be advantageous to instead use the well-known collection enrichment (CE) method to gather higher quality pseudo-feedback documents from a more diverse external resource. However, it is not always clear for which queries the collection enrichment technique should be applied. In this paper, we study two different approaches, namely a predictor-based approach and a divergence-based approach, to decide on when to apply CE. We thoroughly evaluate both approaches on the TREC Enterprise track CERC test collection and its corresponding topic sets, in combination with three different external resources and nine different query performance predictors. Our results show that both approaches are effective to selectively apply CE for enterprise search. In particular, the divergence-based approach leads to consistent and marked retrieval improvements over the systematic application of QE or CE on all external resources.