Finding similar questions in collaborative question answering archives: toward bootstrapping-based equivalent pattern learning

  • Authors: Tianyong Hao; Eugene Agichtein

  • Affiliations: Department of Chinese, Translation and Linguistics, City University of Hong Kong, Hong Kong, China; Mathematics and Computer Science Department, Emory University, Atlanta, USA

  • Venue: Information Retrieval
  • Year: 2012

Abstract

Many questions submitted to Collaborative Question Answering (CQA) sites are similar to questions that have already been answered. We propose a precise approach to automatically finding answers to such questions by identifying "equivalent" questions that were submitted and answered in the past. Our method automatically generates equivalent question patterns by grouping together questions that previously received the same answers. The generated patterns serve as seed patterns for a new bootstrapping-based learning method, which uses them to match more questions and thereby extract a large number of additional equivalent patterns. The resulting patterns can be applied to match a new question to an equivalent one that has already been answered, and thus to suggest potential answers automatically. We experimented with this approach on a large collection of more than 200,000 real questions drawn from the Yahoo! Answers archive, automatically acquiring over 16,991 groups of equivalent question patterns. These patterns allow our method to obtain over 57% recall and over 54% precision when automatically suggesting an answer to new questions, significantly improving over baseline methods.
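
The pipeline outlined in the abstract (seed patterns from questions that share an answer, bootstrapped expansion, then matching a new question to an answered one) can be illustrated with a toy Python sketch. This is not the authors' implementation. The single `<X>` slot, the pre-annotated topics in the seed data, the exact-string answer comparison, and all function names (`generalize`, `seed_groups`, `bootstrap`, `suggest_answer`) are assumptions made only for illustration.

```python
import re
from collections import defaultdict

SLOT = "<X>"  # hypothetical slot marker standing in for the question topic

def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())

def generalize(question, topic):
    """Turn a question into a pattern by replacing its topic with SLOT.
    Returns None unless the topic occurs exactly once as whole words."""
    padded, key = f" {normalize(question)} ", f" {normalize(topic)} "
    if padded.count(key) != 1:
        return None
    return padded.replace(key, f" {SLOT} ").strip()

def pattern_regex(pattern):
    """Compile a one-slot pattern into a regex that captures the topic."""
    head, _, tail = pattern.partition(SLOT)
    return re.compile(re.escape(head) + r"(\w+(?: \w+)*)" + re.escape(tail))

def seed_groups(seeds):
    """Group patterns of (question, topic, answer) seeds by shared answer."""
    by_answer = defaultdict(set)
    for question, topic, answer in seeds:
        pattern = generalize(question, topic)
        if pattern:
            by_answer[answer].add(pattern)
    # An answer yields an equivalence group only if it links 2+ patterns.
    return [p for p in by_answer.values() if len(p) > 1]

def bootstrap(groups, archive, max_rounds=10):
    """Grow each group: known patterns find (topic, answer) instances, and
    other questions with that topic and answer contribute new patterns."""
    groups = [set(g) for g in groups]
    questions = [(normalize(q), a) for q, a in archive]
    for _ in range(max_rounds):
        grew = False
        for group in groups:
            instances = set()
            for pattern in group:
                rx = pattern_regex(pattern)
                for q, a in questions:
                    m = rx.fullmatch(q)
                    if m:
                        instances.add((m.group(1), a))
            for topic, answer in instances:
                for q, a in questions:
                    new = generalize(q, topic) if a == answer else None
                    if new and new not in group:
                        group.add(new)
                        grew = True
        if not grew:  # fixed point reached: no new patterns this round
            break
    return groups

def suggest_answer(question, groups, archive):
    """Return the answer of an equivalent archived question, or None."""
    q = normalize(question)
    answered = {normalize(old): ans for old, ans in archive}
    for group in groups:
        for pattern in group:
            m = pattern_regex(pattern).fullmatch(q)
            if not m:
                continue
            for other in group:
                candidate = other.replace(SLOT, m.group(1))
                if candidate in answered:
                    return answered[candidate]
    return None

# Invented toy data; "A1".."A3" stand for answer identifiers.
seeds = [("How do I cook rice?", "rice", "A1"),
         ("What is the best way to cook rice?", "rice", "A1")]
archive = [("How do I cook quinoa?", "A2"),
           ("Any tips for cooking quinoa?", "A2"),
           ("Any tips for cooking lentils?", "A3")]
groups = bootstrap(seed_groups(seeds), archive)
print(sorted(groups[0]))
print(suggest_answer("What is the best way to cook lentils?", groups, archive))
```

In this toy run the seed group holds two patterns, bootstrapping adds "any tips for cooking <X>" through the shared quinoa answer, and the new lentils question is then matched to the archived lentils question. A realistic system would need far more robust pattern generalization and answer comparison than the string operations used here.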