Twenty-five Japanese question answering systems participated in NTCIR QAC2 subtask 1. Of these, our system SAIQA-QAC2 performed best, with MRR = 0.607. SAIQA-QAC2 improves on our previous system SAIQA-Ii, which achieved MRR = 0.46 on QAC1; the main improvements are in the answer-type determination module and the retrieval module. In general, a fine-grained answer taxonomy improves QA performance, but building an accurate answer extraction module for such a taxonomy is difficult: machine learning methods require a huge training corpus, and hand-crafted rules are hard to maintain. We therefore built a fine-grained system from a coarse-grained named entity recognizer and the Japanese lexicon “Nihongo Goi-taikei.” Our experiments show that named entity/numerical expression recognition and word sense-based answer extraction contributed most to the performance. In addition, we developed a new proximity-based document retrieval module that outperforms BM25; we also compared it with MultiText, a conventional proximity-based retrieval method developed for QA.
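
The MRR figures quoted above can be reproduced from system output as follows. This is a minimal sketch of Mean Reciprocal Rank: for each question, score 1/rank of the first correct answer in the ranked list (0 if no correct answer appears), then average over all questions. Function and variable names are illustrative.

```python
def mean_reciprocal_rank(ranked_answers, gold):
    """ranked_answers: one ranked list of answer strings per question.
    gold: one set of acceptable answers per question."""
    total = 0.0
    for answers, correct in zip(ranked_answers, gold):
        for rank, ans in enumerate(answers, start=1):
            if ans in correct:
                total += 1.0 / rank  # reciprocal rank of first hit
                break                # later hits do not count
    return total / len(ranked_answers)
```

For example, if a system ranks the correct answer second for one question and first for another, its MRR is (1/2 + 1) / 2 = 0.75.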
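
The core idea behind proximity-based retrieval of the kind compared here (e.g. MultiText) can be sketched as follows: a passage scores higher when it covers more query terms in a shorter span of text. This is a hypothetical illustration under that assumption, not the authors' actual scoring formula; the span-length scoring and all names are invented for the sketch.

```python
def shortest_cover_score(doc_tokens, query_terms):
    """Score a token sequence by its tightest span covering all query terms.

    Scans left to right, tracking the last position of each query term;
    whenever every term has been seen, the window from the earliest of
    those last positions to the current token is a cover, scored as
    (number of terms) / (span length). Returns the best such score,
    or 0.0 if the document never covers all terms.
    """
    terms = set(query_terms)
    best = 0.0
    last_pos = {}
    for i, tok in enumerate(doc_tokens):
        if tok in terms:
            last_pos[tok] = i
            if len(last_pos) == len(terms):
                span = i - min(last_pos.values()) + 1
                best = max(best, len(terms) / span)
    return best
```

Under this scoring, a document where the query terms appear adjacently beats one where they are scattered, which is the intuition that lets proximity-based methods outperform bag-of-words ranking such as BM25 on QA-style queries, where the answer tends to sit near the question's key terms.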