Architecture and evaluation of BRUJA, a multilingual question answering system

Authors:
M. Á. García-Cumbreras;F. Martínez-Santiago;L. A. Ureña-López
Affiliations:
SINAI Research Group, Computer Science Department, University of Jaén, Jaén, Spain;SINAI Research Group, Computer Science Department, University of Jaén, Jaén, Spain;SINAI Research Group, Computer Science Department, University of Jaén, Jaén, Spain
Venue:
Information Retrieval
Year:
2012

Abstract

Given a user question, the goal of a Question Answering (QA) system is to retrieve answers rather than full documents or even best-matching passages, as most Information Retrieval systems currently do. In this paper, we present BRUJA, a QA system for the management of multilingual collections. BRUJA works with three languages (English, Spanish and French). The BRUJA architecture is not formed by three monolingual QA systems; instead, it uses English as an interlingua to perform the usual QA tasks, such as question classification and answer extraction. In addition, BRUJA uses Cross-Language Information Retrieval (CLIR) techniques to retrieve relevant documents from a multilingual collection. On the one hand, we have more documents in which to find answers; on the other hand, we introduce noise into the system through the translations into the interlingua (English) and through the CLIR module. The question is whether the difficulty of managing three languages is worthwhile or whether a monolingual QA system delivers better results. We report on in-depth experimentation and show that our multilingual QA system obtains better results than its monolingual counterpart whenever it uses good translation resources and, especially, state-of-the-art CLIR techniques.
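
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the BRUJA implementation: every name here (`answer_pipeline`, `classify_question`, `retrieve`, `toy_translate`, `TOY_DICT`) is hypothetical, the translator is a word-by-word dictionary stub, retrieval is plain term overlap, and the merging step is a naive sort on raw scores. It also assumes that retrieved passages are translated back into English before answer extraction, which the abstract suggests but does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# (text, source_lang, target_lang) -> translated text
Translator = Callable[[str, str, str], str]


@dataclass
class Passage:
    lang: str     # language of the collection the passage came from
    text: str     # passage text, already translated into English
    score: float  # retrieval score in the source collection


def classify_question(question_en: str) -> str:
    """Toy English-only classifier keyed on the first question word."""
    words = question_en.lower().split()
    if not words:
        return "OTHER"
    return {"who": "PERSON", "where": "LOCATION", "when": "DATE"}.get(words[0], "OTHER")


def retrieve(query: str, docs: List[str]) -> List[Tuple[str, float]]:
    """Toy retrieval: score a document by the number of terms shared with the query."""
    terms = set(query.lower().split())
    hits = []
    for doc in docs:
        overlap = len(terms & set(doc.lower().split()))
        if overlap:
            hits.append((doc, float(overlap)))
    return hits


def answer_pipeline(question: str, q_lang: str,
                    collections: Dict[str, List[str]],
                    translate: Translator) -> Tuple[str, List[Passage]]:
    # 1. Translate the question into the interlingua (English) and classify it there.
    question_en = translate(question, q_lang, "en")
    answer_type = classify_question(question_en)
    # 2. CLIR step: query each collection in its own language.
    passages: List[Passage] = []
    for lang, docs in collections.items():
        query = translate(question_en, "en", lang)
        for text, score in retrieve(query, docs):
            # 3. Assumption: passages go back to English so one extractor can serve all.
            passages.append(Passage(lang, translate(text, lang, "en"), score))
    # 4. Naive placeholder merge: sort on raw scores across collections.
    passages.sort(key=lambda p: p.score, reverse=True)
    return answer_type, passages


# Hypothetical toy translation resource, for illustration only.
TOY_DICT = {
    ("es", "en"): {"quién": "who", "escribió": "wrote"},
    ("en", "es"): {"who": "quién", "wrote": "escribió"},
    ("en", "fr"): {"who": "qui", "wrote": "écrit"},
    ("fr", "en"): {"qui": "who", "écrit": "written", "a": "has"},
}


def toy_translate(text: str, src: str, tgt: str) -> str:
    """Word-by-word dictionary lookup; unknown words pass through unchanged."""
    if src == tgt:
        return text
    table = TOY_DICT.get((src, tgt), {})
    return " ".join(table.get(w, w) for w in text.lower().split())


COLLECTIONS = {
    "en": ["shakespeare wrote hamlet", "paris is in france"],
    "es": ["shakespeare escribió hamlet"],
    "fr": ["shakespeare a écrit hamlet"],
}

answer_type, passages = answer_pipeline(
    "quién escribió hamlet", "es", COLLECTIONS, toy_translate
)
```

In this sketch, each translation step is a point where noise can enter, which is the trade-off the abstract raises: the multilingual setup sees more candidate passages, but only pays off if the translation and retrieval components are good enough.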