Learning to find answers to questions on the Web

  • Authors:
  • Eugene Agichtein;Steve Lawrence;Luis Gravano

  • Affiliations:
  • Columbia University;NEC Research Institute;Columbia University

  • Venue:
  • ACM Transactions on Internet Technology (TOIT)
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

We introduce a method for learning to find documents on the Web that contain answers to a given natural language question. In our approach, questions are transformed into new queries aimed at maximizing the probability of retrieving answers from existing information retrieval systems. The method involves automatically learning phrase features for classifying questions into different types, automatically generating candidate query transformations from a training set of question/answer pairs, and automatically evaluating the candidate transformations on target information retrieval systems such as real-world general purpose search engines. At run-time, questions are transformed into a set of queries, and reranking is performed on the documents retrieved. We present a prototype search engine, Tritus, that applies the method to Web search engines. Blind evaluation on a set of real queries from a Web search engine log shows that the method significantly outperforms the underlying search engines, and outperforms a commercial search engine specializing in question answering. Our methodology cleanly supports combining documents retrieved from different search engines, resulting in additional improvement with a system that combines search results from multiple Web search engines.