Looking under the hood: tools for diagnosing your question answering engine

  • Authors and affiliations:
  • Eric Breck, The MITRE Corporation, Bedford, MA
  • Marc Light, The MITRE Corporation, Bedford, MA
  • Gideon S. Mann, Johns Hopkins University, Baltimore, MD
  • Ellen Riloff, University of Utah, Salt Lake City, UT
  • Brianne Brown, Bryn Mawr College, Bryn Mawr, PA
  • Pranav Anand, Harvard University, Cambridge, MA
  • Mats Rooth, Cornell University, Ithaca, NY
  • Michael Thelen, University of Utah, Salt Lake City, UT

  • Venue:
  • ODQA '01 Proceedings of the workshop on Open-domain question answering - Volume 12
  • Year:
  • 2001


Abstract

In this paper we analyze two question answering tasks: the TREC-8 question answering task and a set of reading comprehension exams. First, we show that Q/A systems perform better when there are multiple answer opportunities per question. Next, we analyze common approaches to two subproblems: term overlap for answer sentence identification, and answer typing for short answer extraction. We present general tools for analyzing the strengths and limitations of techniques for these subproblems. Our results quantify the limitations of both term overlap and answer typing in distinguishing between competing answer candidates.
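To make the two subproblems concrete, the following is a minimal sketch (not the authors' implementation) of the baseline techniques the abstract names: term overlap scores a candidate sentence by how many non-stopword question terms it shares with the question, and answer typing maps a question word to an expected answer category. The stopword list and type rules here are illustrative assumptions.

```python
import re

# Illustrative stopword list; real systems use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "is", "was",
             "what", "who", "when", "where", "which", "how", "did"}

def tokens(text):
    """Lowercased word tokens of `text`, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS

def term_overlap(question, sentence):
    """Count of non-stopword question terms that also appear in the sentence."""
    return len(tokens(question) & tokens(sentence))

def best_sentence(question, sentences):
    """Answer sentence identification: pick the highest-overlap sentence."""
    return max(sentences, key=lambda s: term_overlap(question, s))

def answer_type(question):
    """Answer typing: map the question word to an expected answer category.
    These few rules are a toy stand-in for a real answer-type taxonomy."""
    q = question.lower()
    if q.startswith("who"):
        return "PERSON"
    if q.startswith("when"):
        return "DATE"
    if q.startswith("where"):
        return "LOCATION"
    if q.startswith("how many") or q.startswith("how much"):
        return "QUANTITY"
    return "OTHER"
```

For example, for "Who wrote Hamlet?" the sketch prefers a sentence containing "wrote" and "Hamlet" over an unrelated one, and predicts a PERSON answer. The paper's point is precisely that such scores often cannot separate competing candidates: several sentences may tie on overlap, and many candidates may share the predicted type.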