Exploiting syntactic structure of queries in a language modeling approach to IR

  • Authors:
  • Munirathnam Srikanth;Rohini Srihari

  • Affiliations:
  • State University of New York at Buffalo, Buffalo, NY;State University of New York at Buffalo, Buffalo, NY

  • Venue:
  • CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Natural Language Processing (NLP) techniques have been explored to enhance the performance of Information Retrieval (IR) methods with varied results. Most efforts in using NLP techniques have been to identify better index terms for representing documents. This use in the indexing phase of IR has implicit effect on retrieval performance. However, the explicit use of NLP techniques during the retrieval or information seeking phase has been restricted to interactive or dialogue systems. Recent advances in IR are based on using Statistical Language Models (SLM) to represent documents and ranking them based on their model generating a given user query. This paper presents a novel method for using NLP techniques on user queries, specifically, a syntactic parse of a query, in the statistical language modeling approach to IR. In the proposed method, named Concept Language Models, a query is viewed as a sequence of concepts and a concept as a sequence terms. The paper presents different approximations to estimate the concept and term probabilities and compute the query likelihood estimate for documents. Some empirical results on TREC test collections comparing Concept Language Models with smoothed N-gram language models are presented.