Searching question and answer archives

Authors:
W. Bruce Croft;Jiwoon Jeon
Affiliations:
University of Massachusetts Amherst;University of Massachusetts Amherst
Venue:
Searching question and answer archives
Year:
2007

Citing: 0
Cited by: 3

Retrieval models for question and answer archives
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Data-driven text features for sponsored search click prediction
Proceedings of the Third International Workshop on Data Mining and Audience Intelligence for Advertising

Improving ad relevance in sponsored search
Proceedings of the third ACM international conference on Web search and data mining

Abstract

Archives of questions and answers are a valuable information source, yet little research has been done to exploit them. We propose a new type of information retrieval system that answers users' questions by searching question and answer archives. The proposed system has several advantages over current web search engines: it accepts natural language questions instead of keyword queries, and it returns answers directly instead of lists of documents. The two most important challenges in implementing the system are finding questions that are semantically similar to the user's question and estimating the quality of answers. We propose a translation-based retrieval model to overcome the word mismatch problem between questions. Our model combines the advantages of the IBM machine translation model and the query likelihood language model, and it shows significantly improved retrieval performance over state-of-the-art retrieval models. We also show that collections of question and answer pairs are good linguistic resources for learning reliable word-to-word translation relationships. To avoid returning bad answers to users, we build an answer quality predictor based on statistical machine learning techniques. By combining the quality predictor with the translation-based retrieval model, our system successfully returns relevant, high-quality answers to the user.
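
The abstract does not give the model's formula, so the following is only a minimal sketch of how word-to-word translation probabilities might be mixed into a query likelihood language model to bridge word mismatch between questions. The interpolation form, the weights `beta` and `lam`, the function names, and the hand-written translation table `trans` are all assumptions made for illustration; they are not taken from the thesis, where the translation probabilities are learned from question and answer pairs.

```python
import math
from collections import Counter


def ml_prob(word, counts, total):
    """Maximum-likelihood estimate of `word` given term counts."""
    return counts[word] / total if total else 0.0


def translation_lm_score(query, doc, coll_counts, coll_total, trans,
                         beta=0.5, lam=0.2):
    """Log-probability of generating `query` from archived question `doc`.

    Assumed form (not from the source):
      P(w|doc) = (1-lam) * [(1-beta) * P_ml(w|doc)
                            + beta * sum_t trans[(w, t)] * P_ml(t|doc)]
                 + lam * P_ml(w|collection)
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for w in query:
        direct = ml_prob(w, doc_counts, doc_len)
        # Probability mass reaching w by "translating" words in the document.
        translated = sum(trans.get((w, t), 0.0) * ml_prob(t, doc_counts, doc_len)
                         for t in doc_counts)
        mixed = (1 - beta) * direct + beta * translated
        p = (1 - lam) * mixed + lam * ml_prob(w, coll_counts, coll_total)
        score += math.log(p) if p > 0 else float("-inf")
    return score


def rank(query, archive, trans, beta=0.5, lam=0.2):
    """Return (score, index) pairs for archived questions, best first."""
    coll_counts = Counter(w for q in archive for w in q)
    coll_total = sum(coll_counts.values())
    scored = [(translation_lm_score(query, q, coll_counts, coll_total,
                                    trans, beta, lam), i)
              for i, q in enumerate(archive)]
    return sorted(scored, reverse=True)


# Toy archive and a hypothetical translation table P(query word | archive word).
archive = [["how", "to", "fix", "a", "flat", "tire"],
           ["best", "way", "to", "cook", "rice"]]
trans = {("repair", "fix"): 0.6, ("tyre", "tire"): 0.8}
query = ["repair", "tyre"]

ranked = rank(query, archive, trans)
```

In this toy setup the query shares no words with either archived question, so with `beta=0.0` (plain query likelihood) neither question receives a finite score, while the translation term lets the first question match. An answer quality estimate could in principle be folded in as a prior on each archived pair, but how the thesis combines the two is not stated in the abstract.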