Examining the role of statistical and linguistic knowledge sources in a general-knowledge question-answering system

Authors:
Claire Cardie;Vincent Ng;David Pierce;Chris Buckley
Affiliations:
Cornell University, Ithaca, NY;Cornell University, Ithaca, NY;Cornell University, Ithaca, NY;SaBIR Research
Venue:
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Year:
2000

Abstract

We describe and evaluate an implemented system for general-knowledge question answering. The system combines techniques for standard ad-hoc information retrieval (IR), query-dependent text summarization, and shallow syntactic and semantic sentence analysis. In a series of experiments, we examine the role of each statistical and linguistic knowledge source in the question-answering system. In contrast to previous results, we find first that statistical knowledge of word co-occurrences, as computed by IR vector space methods, can be used to quickly and accurately locate the relevant documents for each question. The use of query-dependent text summarization techniques, however, provides only small increases in performance and severely limits recall when the summaries are inaccurate. Nevertheless, it is the text summarization component that allows subsequent linguistic filters to focus on relevant passages. We find that even very weak linguistic knowledge can offer substantial improvements over purely IR-based techniques for question answering, especially when smoothly integrated with statistical preferences computed by the IR subsystems.
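
As a rough illustration of the first two stages named in the abstract (vector space retrieval followed by query-dependent sentence selection), the sketch below ranks texts against a question by tf-idf cosine similarity and then applies the same ranking to the sentences of the best document. It is a minimal sketch under stated assumptions, not the system evaluated in the paper: the tf-idf weighting, the smoothing in `idf_table`, the regex sentence splitter, the sample documents, and all function names are hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphanumeric runs (an assumption; no stemming or stop list).
    return re.findall(r"[a-z0-9]+", text.lower())


def idf_table(token_lists):
    # Smoothed inverse document frequency over the given collection.
    n = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(tokens, idf):
    # Sparse tf-idf vector; terms unseen in the collection are dropped.
    return {term: count * idf[term] for term, count in Counter(tokens).items() if term in idf}


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(question, texts):
    # Returns (score, index) pairs, best match first; ties keep input order.
    token_lists = [tokenize(t) for t in texts]
    idf = idf_table(token_lists)
    query = tfidf(tokenize(question), idf)
    scored = [(cosine(query, tfidf(tokens, idf)), i) for i, tokens in enumerate(token_lists)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


def best_sentence(question, document):
    # Query-dependent selection: pick the sentence most similar to the question.
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    top_score, top_index = rank(question, sentences)[0]
    return sentences[top_index]


# Hypothetical toy collection, invented for this example.
docs = [
    "The Eiffel Tower is in Paris. It opened in 1889.",
    "Ithaca is a city in New York.",
    "Bananas are rich in potassium.",
]
question = "Where is the Eiffel Tower?"
ranking = rank(question, docs)
print(ranking[0][1], best_sentence(question, docs[ranking[0][1]]))
```

The linguistic filters described in the abstract would operate on the sentences selected this way; they are not modelled here.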