Question answering using statistical language modelling

  • Authors:
  • Matthias H. Heie;Edward W. D. Whittaker;Sadaoki Furui

  • Affiliations:
  • Department of Computer Science, Tokyo Institute of Technology, Tokyo 152-8552, Japan;Department of Computer Science, Tokyo Institute of Technology, Tokyo 152-8552, Japan;Department of Computer Science, Tokyo Institute of Technology, Tokyo 152-8552, Japan

  • Venue:
  • Computer Speech and Language
  • Year:
  • 2012

Abstract

In this paper we present a statistical approach to question answering (QA). Our motivation is to build robust systems for many languages without the need for highly tuned linguistic modules. Consequently, word tokens and web data are used extensively, but neither explicit linguistic knowledge nor annotated data is incorporated. A mathematical model for answer retrieval and answer classification is derived. Experiments are conducted by searching for answers in the AQUAINT corpus, as well as in web data. Retrieval from web data, which benefits from the redundancy inherent in the web, outperforms retrieval from a fixed corpus, where there are typically relatively few answer occurrences for any given question. We participated with an implementation of this framework in the TREC 2006 QA evaluations, where we ranked 9th among 27 participants on the factoid task.