Building a question answering test collection

  • Authors:
  • Ellen M. Voorhees; Dawn M. Tice

  • Affiliations:
  • National Institute of Standards and Technology, 100 Bureau Drive, STOP 8940, Gaithersburg, MD (both authors)

  • Venue:
  • SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2000


Abstract

The TREC-8 Question Answering (QA) Track was the first large-scale evaluation of domain-independent question answering systems. In addition to fostering research on the QA task, the track was used to investigate whether the evaluation methodology used for document retrieval is appropriate for a different natural language processing task. As with document relevance judging, assessors had legitimate differences of opinion as to whether a response actually answers a question, but comparative evaluation of QA systems was stable despite these differences. Creating a reusable QA test collection is fundamentally more difficult than creating a document retrieval test collection, since the QA task has no equivalent to document identifiers.