A study of selective collection enrichment for enterprise search

  • Authors:
  • Jie Peng;Craig Macdonald;Ben He;Iadh Ounis

  • Affiliations:
  • University of Glasgow, Glasgow, United Kingdom;University of Glasgow, Glasgow, United Kingdom;University of Glasgow, Glasgow, United Kingdom;University of Glasgow, Glasgow, United Kingdom

  • Venue:
  • Proceedings of the 18th ACM conference on Information and knowledge management
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Enterprise intranets are often sparse in nature, with limited use of alternative lexical representations between authors, making query expansion (QE) ineffective. Hence, for some enterprise search queries, it can be advantageous to instead use the well-known collection enrichment (CE) method to gather higher quality pseudo-feedback documents from a more diverse external resource. However, it is not always clear for which queries the collection enrichment technique should be applied. In this paper, we study two different approaches, namely a predictor-based approach and a divergence-based approach, to decide on when to apply CE. We thoroughly evaluate both approaches on the TREC Enterprise track CERC test collection and its corresponding topic sets, in combination with three different external resources and nine different query performance predictors. Our results show that both approaches are effective to selectively apply CE for enterprise search. In particular, the divergence-based approach leads to consistent and marked retrieval improvements over the systematic application of QE or CE on all external resources.