Improved latent concept expansion using hierarchical markov random fields

  • Authors:
  • Hao Lang;Donald Metzler;Bin Wang;Jin-Tao Li

  • Affiliations:
  • Chinese Academy of Sciences, Beijing, China;University of Southern California, Marina del Rey, CA, USA;Chinese Academy of Sciences, Beijing, China;Chinese Academy of Sciences, Beijing, China

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most existing query expansion approaches for ad-hoc retrieval adopt overly simplistic textual representations that treat documents as bags of words and ignore inherent document structure. These simple representations often lead to incorrect independence assumptions in the proposed approaches and result in limited retrieval effectiveness. In this paper, we propose a novel query expansion technique that models the various types of dependencies that exist between original query terms and expansion terms within a robust, unified framework. The proposed model is called Hierarchical Markov random fields (HMRFs), based on Latent Concept Expansion (LCE). By exploiting implicit (or explicit) hierarchical structure within documents, HMRFs can incorporate hierarchical interactions which are important for modeling term dependencies in an efficient manner. Our rigorous experimental evaluation carried out using several TREC data sets shows that our proposed query expansion technique consistently and significantly outperforms the current state-of-the-art query expansion approaches, including relevance-based language models and LCE.