Examining the role of statistical and linguistic knowledge sources in a general-knowledge question-answering system

  • Authors:
  • Claire Cardie; Vincent Ng; David Pierce; Chris Buckley

  • Affiliations:
  • Cornell University, Ithaca, NY; Cornell University, Ithaca, NY; Cornell University, Ithaca, NY; SaBIR Research

  • Venue:
  • ANLC '00 Proceedings of the sixth conference on Applied natural language processing
  • Year:
  • 2000

Abstract

We describe and evaluate an implemented system for general-knowledge question answering. The system combines techniques for standard ad-hoc information retrieval (IR), query-dependent text summarization, and shallow syntactic and semantic sentence analysis. In a series of experiments we examine the role of each statistical and linguistic knowledge source in the question-answering system. In contrast to previous results, we find first that statistical knowledge of word co-occurrences as computed by IR vector space methods can be used to quickly and accurately locate the relevant documents for each question. The use of query-dependent text summarization techniques, however, provides only small increases in performance and severely limits recall levels when inaccurate. Nevertheless, it is the text summarization component that allows subsequent linguistic filters to focus on relevant passages. We find that even very weak linguistic knowledge can offer substantial improvements over purely IR-based techniques for question answering, especially when smoothly integrated with statistical preferences computed by the IR subsystems.
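
To make the first stage of the pipeline concrete, the following is a minimal sketch of vector space retrieval: documents and the question are turned into TF-IDF weight vectors and ranked by cosine similarity. It is a generic textbook formulation and not the authors' implementation; the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`), the whitespace tokenizer, and the particular IDF formula are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens; a real IR system would
    # typically also stem words and remove stopwords.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return per-document TF-IDF weight dicts and the IDF table."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Assumption: a smoothed log IDF, one of several common variants.
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(question, texts):
    """Return (score, document index) pairs, best match first."""
    docs = [tokenize(t) for t in texts]
    vectors, idf = tfidf_vectors(docs)
    # Question terms that never occur in the collection are ignored.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(question)).items() if t in idf}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


if __name__ == "__main__":
    texts = [
        "the capital of france is paris",
        "bananas are yellow fruit",
        "paris hosts the louvre museum",
    ]
    for score, index in rank("what is the capital of france", texts):
        print(f"{score:.3f}  {texts[index]}")
```

In a pipeline like the one described, a ranking of this kind would only select candidate documents; the summarization and linguistic filtering stages would then operate on the top-ranked results.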